Presto Resource Pools enable you to break up your available compute resources into manageable chunks. You can then organize their usage across projects, groups, or use cases.
Resource Pools are helpful for the following challenges:
- You have a team of analysts who often see significant queuing during the work day because large scheduled queries consume the account’s full resources for long periods of time.
- You have critical SLA queries that must always have resources available to run, and you want to ensure that other queries (scheduled or ad hoc) don’t get in the way.
Resource pools are only available for accounts with 5 or more Presto Compute Units allocated.
Setup
This feature is enabled upon request. Contact support or your primary account representative if you want to use Presto Resource Pools.
Resource Pool Functionality
Based on the number of Presto Compute Units your account has, you are able to use up to a specified maximum of the following:
- Concurrent Query Limit
- Memory Limits (per query)
- Split Compute Limits
Resource Pools let you allocate these resources as a percentage of your account’s total available amount.
Your resource pools can either be strictly partitioned, or can overlap the allocation of your total available resources.
For example:
- You might enable a “scheduled” pool with access to up to 70% of your account’s total resources, and an “ad hoc” pool with access to up to 70% of the account’s resources. In this way, 40% of the account’s resources are shared between the two pools, and each pool has 30% of the account’s resources dedicated to it.
- You might want to set up your pools with 40% to a “scheduled” pool for a stricter SLA environment, and 60% to your “ad hoc” pool for development purposes.
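The arithmetic behind the two examples above can be sketched as follows. This is a minimal illustration, not part of the product: the function name `overlap_breakdown` and its return keys are hypothetical, and it assumes exactly two pools whose ceilings are given as whole-number percentages of the account total.

```python
def overlap_breakdown(pool_a_pct: int, pool_b_pct: int) -> dict:
    """Split two pool ceilings into shared and dedicated shares of the account.

    Hypothetical helper: percentages are whole numbers from 0 to 100.
    """
    for pct in (pool_a_pct, pool_b_pct):
        if not 0 <= pct <= 100:
            raise ValueError("pool percentage must be between 0 and 100")
    # Anything the two ceilings claim beyond 100% of the account is shared.
    shared = max(0, pool_a_pct + pool_b_pct - 100)
    return {
        "shared": shared,
        "dedicated_a": pool_a_pct - shared,
        "dedicated_b": pool_b_pct - shared,
    }


print(overlap_breakdown(70, 70))  # overlapping pools
print(overlap_breakdown(40, 60))  # strictly partitioned pools
```

With 70% and 70%, the sketch reports 40% shared and 30% dedicated to each pool. With 40% and 60%, nothing is shared.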
For the first example given, queries are prioritized between the pools based on the following logic:
- Highest priority queued queries are issued first
- First come, first served
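One way to picture this ordering is a queue keyed first on priority and then on arrival order. The sketch below is only an illustration under stated assumptions, not the scheduler's actual implementation: the class `QueryQueue` and its methods are hypothetical, a larger number is assumed to mean higher priority, and first come, first served is assumed to break ties between queries of equal priority.

```python
import heapq
import itertools


class QueryQueue:
    """Hypothetical queue: highest priority first, then arrival order."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # monotonically increasing ticket

    def submit(self, name: str, priority: int = 0) -> None:
        # heapq is a min-heap, so the priority is negated to pop the largest
        # value first; the arrival ticket keeps equal priorities in FIFO order.
        heapq.heappush(self._heap, (-priority, next(self._arrival), name))

    def next_query(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


queue = QueryQueue()
queue.submit("scheduled_report", priority=0)
queue.submit("sla_dashboard", priority=2)
queue.submit("adhoc_exploration", priority=0)
queue.submit("sla_export", priority=2)

issued = [queue.next_query() for _ in range(len(queue))]
print(issued)
```

In this run the two priority-2 queries are issued in the order they arrived, followed by the two priority-0 queries in the order they arrived.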
Resource Limits Applied to Pools
Resource Pools divide the resources in an account as percentages of the account total. The percentages are applied to the following limits:
- Max-Anytime Splits
- Query & Account Max Memory
- Concurrent Queries (total allowed * Pool %, rounded up)
Options for Presto Resource Pool Allocation
Customers can create up to 3 resource pools, allocated with percentages of their choosing. Most customers choose one of two typical configurations.
- A complete separation of resource usage
In this configuration, customers choose to directly allocate resources to each pool – so that there are no shared resources between them.
Example: 30%, 70% resource split
A split is useful for:
- SLA-critical workloads, such as separating “development” and “production” environments
- Dividing up resources between multiple teams
- A partially shared environment
In this configuration, customers choose to have some overlap between multiple pools. Some resources are shared between pools, while some resources are reserved for each pool.
Example: 70%, 70% resource split.
A shared split is useful for:
- Analyst teams that want to maximize the resources available for scheduled queries, especially during non-work hours, but also want to keep part of the resources always available for their ad hoc work.
Selecting Which Query Pool Your Query Will Run On
By default, resource pools can be enabled for scheduled saved queries and ad hoc queries. If you use the default configuration, you do not need to use the following methods to select query pools. If you use a custom setup, you must select the resource pool explicitly.
Command Line Option
If you are using the CLI to issue queries, you can select a pool by its custom name, as follows:
td query --database <database_name> --pool-name <resource_pool_name>
TD Console Option
Using the TD Console, you can select a specific resource pool for use by adding the following comment at the top of your query:
-- set property resource_group = '<resource_pool_name>'
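If you generate query text programmatically, the comment can be added before the query is submitted. The following is a minimal sketch: the helper `with_resource_pool` and the pool name `adhoc` are hypothetical, and the only thing taken from this page is the comment format itself.

```python
def with_resource_pool(sql: str, pool_name: str) -> str:
    """Prepend the resource pool comment to a query (hypothetical helper)."""
    # Reject names that would break the single-line, single-quoted comment.
    if not pool_name or "'" in pool_name or "\n" in pool_name:
        raise ValueError("invalid resource pool name")
    return f"-- set property resource_group = '{pool_name}'\n{sql}"


# 'adhoc' is an assumed pool name used only for illustration.
print(with_resource_pool("SELECT COUNT(1) FROM www_access", "adhoc"))
```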
Frequently Asked Questions
What about resource pools for Hive?
Resource pools in Hive are available. Read more here.
What is a Split Compute Unit?
Split Compute Units are Presto’s way of allocating a set amount of machine resources to a computation task. The number of splits available to your account is proportional to the total machine processing resources made available to it.
The total number of splits a query requires is proportional to the amount of data scanned and the complexity of the query.
My account limits Concurrent Queries (CQ). How do CQs get allocated across resource pools?
The allocation of concurrent queries is based on the specified resource pool percentages. The following examples show how concurrent queries are allocated across resource pools with various percentages. In these examples, the account has an overall CQ of 8:
- If you specify an allocation of 70% ad hoc and 70% scheduled, then the CQ for each pool is 6 (8 CQ * 0.70 = 5.6, rounded up to 6).
- If you specify an allocation as 60% ad hoc, 40% scheduled, the CQ on the ad hoc pool is 5 (8 CQ * 0.6 = 4.8, rounded up to 5) and the CQ on the scheduled pool is 4 (8 CQ * 0.4 = 3.2, rounded up to 4).
The CQ for the overall account still applies. So in the second example, if you use all 5 queries allowed in the ad hoc pool, the scheduled pool is limited to 3 queries until one of the ad hoc queries finishes, because the account as a whole allows only 8 concurrent queries.
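The rounding and the account-wide cap described above can be reproduced with a short sketch. The function names `pool_cq_limit` and `can_start` are hypothetical, and pool percentages are assumed to be whole numbers so that the rounding can use integer arithmetic rather than floating point.

```python
def pool_cq_limit(account_cq: int, pool_pct: int) -> int:
    """Pool limit = account CQ * pool %, rounded up (integer ceiling division)."""
    return -(-account_cq * pool_pct // 100)


def can_start(pool_running: int, pool_limit: int,
              account_running: int, account_cq: int) -> bool:
    """A query may start only if both the pool and the account have a free slot."""
    return pool_running < pool_limit and account_running < account_cq


ACCOUNT_CQ = 8
adhoc_limit = pool_cq_limit(ACCOUNT_CQ, 60)      # 8 * 0.6 = 4.8, rounded up
scheduled_limit = pool_cq_limit(ACCOUNT_CQ, 40)  # 8 * 0.4 = 3.2, rounded up
print(adhoc_limit, scheduled_limit)

# Ad hoc pool is full (5 running) and 3 scheduled queries are running:
# the account total is 8, so a 4th scheduled query has to wait.
print(can_start(pool_running=3, pool_limit=scheduled_limit,
                account_running=8, account_cq=ACCOUNT_CQ))
```

Here the pool limits come out as 5 and 4, and the final check returns `False` because the account-wide limit of 8 is already reached even though the scheduled pool has a nominal slot free.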