Editor’s note: This is one installment in a series about effectively managing BigQuery costs. Check out the other posts on choosing between BigQuery pricing models and how to properly size your slots.

BigQuery has several built-in features and capabilities to help you save on costs, manage spend, and get the most out of your data warehouse resources. In this blog, we’ll dive into Reservations, BigQuery’s platform for cost and workload management. In short, BigQuery Reservations enables you to:

- Quickly purchase and deploy BigQuery slots
- Assign slots to various parts of your organization
- Switch your organization from bytes processed to a flat-rate pricing model

Customers on the flat-rate pricing model purchase compute capacity, measured in slots, and can run any number of queries using this capacity. The flat-rate pricing model is a great alternative to the bytes processed pricing model, as it gives you more cost predictability and control. Think of slots as compute nodes: the more slots you have, the more horsepower you have for your queries.

Getting started with Reservations

Getting going with BigQuery Reservations is very easy and low-risk. We introduced Flex Slots, which are charged per second and can be canceled after only 60 seconds, so you can run an experiment for the price of a cup of coffee! Here’s how to get started:

1. Simply go into the BigQuery UI and click on “Reservations.” From there choose “Buy Slots.”
2. In the purchase flow, choose “Flex Slots” as your commitment type and “500” as your size. If you’ve never bought slots before, you’ll be prompted to default your organization to flat-rate. Opt in if you want all your projects to start using your purchased slots automatically.
3. Confirm your purchase. In a few seconds, your capacity should be confirmed and deployed.
4. Go into the “Assignments” tab and assign any of your projects, or even your entire organization, to the “default” reservation.
This tells BigQuery that those projects are on the slots pricing model, rather than bytes processed. Voila!

Once you’re done with your test, simply delete all assignments and commitments. A 15-minute test will cost you just $5.

Using BigQuery Reservations

Once you set up Reservations, BigQuery automatically makes sure that your usage is efficient. Any provisioned slot that’s idle across your organization is available elsewhere in your organization to be used. That’s right, any idle or slack capacity is always available for you to use. This means that no matter how big or small your organization is, you get economy of scale benefits, without the penalty of creating wasteful compute silos.

To increase capacity, all you need to do is buy more slots. Once your purchase is confirmed and slots are deployed, BigQuery automatically starts using this additional capacity for all your queries in flight. There’s no pausing work or waiting for new queries to start. It all happens quickly and seamlessly.

Likewise, to decrease capacity, simply cancel an existing slot commitment. If you were using that capacity, BigQuery will simply pause those bits of work. Your queries won’t fail, and at worst they’ll just slow down.

Head over to documentation on slots to learn more about what BigQuery slots are and how they are distributed to do work.

Using Reservations for workload management

BigQuery Reservations is built for simplicity, first and foremost. That said, it’s a highly configurable platform that helps complex organizations manage their entire BigQuery operations in one place.

It’s typical for an organization administrator to want to isolate and compartmentalize their departments or workloads. For example, you may have a “business” department, an “IT” department, and a “marketing” department, and you’d like each department to have their own set of BigQuery resources, like this:

In the above example, you could set up your Reservations as follows:

- You purchase a 1000-slot commitment.
This is your organization’s total processing capacity.
- You earmark 500 slots for “business,” 300 slots for “IT,” and 200 slots for “marketing” by creating a reservation for each.
- You assign Google Cloud folder “business_folder,” and any other Google Cloud project that the business department is using, to the “business” reservation.
- You assign Google Cloud folder “IT” and project “it_project” to the “IT” reservation.
- You assign “dashboard_proj,” the Google Cloud project that the marketing team uses for Looker dashboards, to the “marketing” reservation.

We mentioned earlier that idle capacity is seamlessly shared across your organization. In the above example, if at this moment the “business” reservation has 20 idle slots, they are automatically available to “IT” and “marketing.” As soon as the “business” reservation wants them back, they’re pre-empted from “IT” and “marketing.” Pre-emption is graceful: queries slow down and accelerate seamlessly, rather than error out.

Reservations also enables you to centrally manage your entire organization, mitigating the risk of “shadow IT” and unbounded spend. Only folks with bigquery.resourceAdmin, bigquery.admin, or owner roles set at the org level can dictate which projects and folders are assigned to which reservations.

Cost attribution back to each department may be important to you. Simply query the INFORMATION_SCHEMA jobs tables for the reservation_id field and aggregate over slots consumed to report on what portion of the total bill is attributable to each team. To make this even easier, in the coming weeks you’ll see project-level cost attribution in the Google Cloud billing console.

When to use Reservations

Let’s unpack some examples of how you could set up Reservations for specific use cases.

If you have a dev, test, or QA workload, you may only want it to have access to a small amount of resources, and you may not want it to leverage any idle capacity. In this instance, you could create a reservation “dev” with 50 slots and set ignore_idle_slots to true.
This way, the reservation will not use any idle capacity in the system beyond its own 50 slots.

If you have a batch processing workload, and you’d like it to only run when there’s slack in the overall system, you can create a reservation “batch” with 0 slots. Any query in this reservation will sit queued up waiting for slack capacity, and will only make forward progress while slack capacity is available.

Suppose you have a reservation that is used to generate Looker dashboards, and you know that every Monday between 9 and 11 in the morning this dashboard experiences higher than normal demand. You may set up a scheduled job (via cron or any other scheduling tool) to increase the size of this reservation at 9am, and reduce it back at 11am.

Using Google Cloud folders for advanced configuration

Google Cloud supports organizations and folders, a powerful way to map your organization to Google Cloud Identity and Access Management (Cloud IAM). Child folders acquire properties of their parent folders, unless explicitly specified otherwise, and users with access to parent folders automatically acquire access to all child folders and their resources.

BigQuery Reservations can be used in conjunction with folders to manage complex organizations.

Consider the above scenario:

- Folder C is set up for a specific department in the organization.
- Org admin has IAM credentials to the entire organization.
- Folder admin has IAM credentials to Folder C (and hence Folder E as well).
- Folder admin wants to control her department’s BigQuery costs and resources autonomously.
- Org admin is the central IT department that oversees security and budget compliance.
- Folder D represents another department in the organization, managed by org admin.

To configure BigQuery for this organization, do the following:

- Folder admin sets up BigQuery Reservations in Folder C.
- Folder admin assigns Folder C and any projects she owns to her reservations.
- Org admin sets up BigQuery Reservations in a project in Folder D, and in a project
tied to the organization.
- Org admin assigns Folder D and any projects he owns to his reservations in Folder D.
- Org admin assigns the entire organization to the reservations at org level.

With the above setup, folder admin is able to self-manage BigQuery for Folder C and Folder E, and org admin is able to manage BigQuery for every folder in their organization, including Folder C and Folder D. The only caveat is that in this configuration, idle slots are not shared between reservations in Folder C, Folder D, and the organization node.

With BigQuery Reservations, managing your BigQuery costs and your workloads is easy. And BigQuery Reservations offers the power and flexibility to meet the goals of the most complex organizations out there while maximizing efficiency and minimizing waste. To learn more about BigQuery Reservations, head over to the documentation.
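As a rough illustration of the cost attribution idea described earlier (aggregate slots consumed per reservation_id, then split the bill proportionally), here is a minimal Python sketch. It does not call BigQuery. It assumes you have already exported job rows from the INFORMATION_SCHEMA jobs tables into dictionaries. The `total_slot_ms` field name, the simplified reservation names, the sample numbers, and the `attribute_costs` helper are all assumptions made for this example, not part of the original setup.

```python
from collections import defaultdict


def attribute_costs(job_rows, total_bill):
    """Split total_bill across reservations in proportion to slot usage.

    job_rows: iterable of dicts with "reservation_id" and "total_slot_ms"
              keys (an assumed export format for this sketch).
    total_bill: the flat-rate bill to attribute, in any currency unit.
    """
    slot_ms_by_reservation = defaultdict(int)
    for row in job_rows:
        slot_ms_by_reservation[row["reservation_id"]] += row["total_slot_ms"]

    total_slot_ms = sum(slot_ms_by_reservation.values())
    if total_slot_ms == 0:
        # Nothing ran, so there is no usage to attribute.
        return {res: 0.0 for res in slot_ms_by_reservation}

    return {
        res: total_bill * slot_ms / total_slot_ms
        for res, slot_ms in slot_ms_by_reservation.items()
    }


# Hypothetical job rows for the three departments in the example above.
sample_jobs = [
    {"reservation_id": "business", "total_slot_ms": 3_000_000},
    {"reservation_id": "business", "total_slot_ms": 2_000_000},
    {"reservation_id": "IT", "total_slot_ms": 3_000_000},
    {"reservation_id": "marketing", "total_slot_ms": 2_000_000},
]

shares = attribute_costs(sample_jobs, total_bill=10_000.0)
for reservation, cost in sorted(shares.items()):
    print(f"{reservation}: {cost:.2f}")
```

Note that this splits the bill by slots actually consumed, not by slots reserved. Because idle slots are shared, a department that regularly borrows idle capacity will be attributed more than its reserved share, which may or may not be the chargeback policy you want.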
Source: Google Cloud Platform