The Presto Utilization dashboard shows your recent and historical use of Presto resources. You can use the dashboard to determine whether you risk exceeding your monthly plan limits.
The Presto Utilization page is available upon request for most customers. However, availability for customers in Japan is still pending and will occur soon.
Arm Treasure Data owners and administrators can access the dashboard from the console by clicking Admin > Presto Utilization.
Presto Terms and Concepts
Understanding Presto concepts and terminology is useful when assessing metrics that are presented in the dashboard.
The following terms are used in more than one metric:
- Splits
- Units of Presto query processing on data sets, including scanning data from tables and processing intermediate data during query processing. For example, reading and filtering data uses a certain number of splits, and joining, sorting, and aggregating the data in that table uses additional splits. In general, processing more input data or performing more complex operations uses more splits.
- Anytime Splits
- The total number of splits that can be processed simultaneously at any moment, given your current subscription plan. This is a function of your current assigned Presto Units and your subscription plan.
- Split Hours
- Refers to the time required to process splits.
- Presto Units
- A unit of pricing referenced in our subscription plans. Adding Presto Units increases the number of available anytime splits, the total number of available split-hours, and the amount of memory available for processing each query.
- Presto Query Signatures
- For queries that don’t have a name, the query signature is a concise representation that summarizes the operations performed by the query. The notation used for query signatures is described in Presto Query Signatures.
The dashboard includes the following charts:
- Your Current Presto Units
- The total number of Presto Units assigned to your account, based on your minimum price plan.
- Max Anytime Split Limit
- The total number of split resources you can use at a given moment. This is a function of your current assigned Presto Units and your subscription plan.
- Monthly Split Hour Limit
- The total number of split-hours that your account can use within one month, based on your current subscription plan. This is a function of your current assigned Presto Units and your subscription plan.
- Max Query Memory Limit
- The maximum total memory that can be used on a per-query basis, based on your current subscription plan.
- Additional Presto Units Needed
- The number of additional Presto Units needed to support either the number of split-hours you’re using beyond your limit, or the total memory per query that your top memory-consuming queries need.
- This Month’s and Last Month’s Split Hours
- Gray Line: Your cumulative split-hours usage within the month. Blue Line: Your daily split-hour usage within the month. Green Line: Your account’s maximum allowed split-hour usage, based on your contract.
- Monthly Total Presto Jobs Ran
- Number of Presto Jobs run, by month, over the past 12 months.
- Total Presto Hours
- Number of Presto Split-Hours consumed, by month, over the past 12 months.
- Monthly Over Memory Jobs
- Number of jobs that have exceeded your account’s Max Query Memory Limit.
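The relationship between daily usage, cumulative usage, and the monthly limit in the split-hours charts can be illustrated with a small sketch. This is not how the dashboard computes its values. The daily figures, the limit of 3000 split-hours, and the function names are all hypothetical, and the month-end projection is a naive linear estimate added only for illustration.

```python
from itertools import accumulate

# Hypothetical daily split-hour usage for the first 10 days of a 30-day month.
daily_split_hours = [120.0, 95.5, 130.0, 110.0, 0.0, 0.0, 140.5, 125.0, 118.0, 101.0]
monthly_limit = 3000.0  # hypothetical Monthly Split Hour Limit
days_in_month = 30


def cumulative_usage(daily):
    """Running total of daily usage, analogous to the cumulative line."""
    return list(accumulate(daily))


def projected_month_end(daily, total_days):
    """Naive linear projection: average daily usage so far times days in the month."""
    if not daily:
        return 0.0
    return sum(daily) / len(daily) * total_days


cumulative = cumulative_usage(daily_split_hours)
projection = projected_month_end(daily_split_hours, days_in_month)

print(f"Used so far: {cumulative[-1]:.1f} split-hours")
print(f"Projected month-end usage: {projection:.1f} split-hours")
print("At risk of exceeding limit" if projection > monthly_limit else "Within limit")
```

A linear projection ignores weekly patterns and scheduled batch jobs, so treat it as a rough early warning rather than a forecast.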
Presto Query Details
The Presto Query Details table, at the bottom of the dashboard, shows per-query usage data. Use the table to identify your top resource-consuming queries, in terms of memory or processing. Queries are referenced either by name or by query signature.
After you identify resource-intensive queries, you can work to optimize query resource consumption:
- Process less data by filtering out unnecessary rows and omitting unnecessary columns.
- Minimize compute-heavy operations like sorting.
- Check the frequency of scheduled jobs, and consider converting some Presto jobs to Hive or using Data Tank in certain instances. See Presto Performance Tuning.
The Presto Query Details table includes the following columns:
- Month and Year that the query ran.
- Query Name or Signature
- If the query is a saved query, this field shows the query name. Otherwise, it shows the query signature.
- Last Month Total Split Hours
- Total number of split-hours this query consumed last month.
- Current Month Split Hours
- Total number of split-hours this query consumed up to the present day this month.
- Current Month Total Executions
- Total number of times that this query or query signature ran up to the present day this month.
- Average Split Hours Per Run
- Average number of split-hours consumed, per run of that query.
- Median Query Duration in Minutes
- Median duration, in minutes, the query takes to complete.
- Num of Jobs over Allotted Memory
- Number of times this query hit the account’s maximum per-query memory limit.
- Max Memory Used in MB
- This field shows how much memory a query needs to succeed. It is only shown for queries that are exceeding the account’s per-query max memory limit.
The memory requirement is determined by re-running queries that fail due to memory limits. The queries are rerun in a pool that gives the queries much higher memory limits.
- Last Job ID
- The most recent Treasure Data Job ID run from this query. This is helpful if you want to find out more about the query in question and the query is from an external system. There is no Treasure Data Saved Query ID, but you can use the most recent (last executed) Job ID.
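If you copy rows from the Presto Query Details table into a script, the columns above can be used to rank queries for optimization. The following is a minimal sketch under stated assumptions: the `QueryRow` type, its field names, and the sample values are hypothetical, and the per-run average is computed here as a simple ratio, which may differ from how the dashboard derives its Average Split Hours Per Run column.

```python
from dataclasses import dataclass


@dataclass
class QueryRow:
    # Hypothetical field names loosely mirroring the table columns.
    name_or_signature: str
    current_month_split_hours: float
    current_month_executions: int
    jobs_over_memory: int


def avg_split_hours_per_run(row):
    """Simple ratio of split-hours to executions (an assumed formula)."""
    if row.current_month_executions == 0:
        return 0.0
    return row.current_month_split_hours / row.current_month_executions


def top_by_split_hours(rows, n):
    """Return the n rows that consumed the most split-hours this month."""
    return sorted(rows, key=lambda r: r.current_month_split_hours, reverse=True)[:n]


def over_memory(rows):
    """Return rows that hit the per-query memory limit at least once."""
    return [r for r in rows if r.jobs_over_memory > 0]


# Hypothetical sample data; names are placeholders, not real query signatures.
rows = [
    QueryRow("daily_sessions_report", 420.0, 30, 0),
    QueryRow("unnamed_query_a", 900.0, 3, 2),
    QueryRow("weekly_export", 60.0, 4, 0),
]

for row in top_by_split_hours(rows, 2):
    print(row.name_or_signature, avg_split_hours_per_run(row))

print([r.name_or_signature for r in over_memory(rows)])
```

Ranking by total split-hours surfaces frequently scheduled queries, while the per-run average surfaces individually expensive ones; both views are useful when deciding what to tune first.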