Arm Treasure Data supports Presto as a low-latency query engine.
What is Presto?
Presto is an open-source parallel SQL execution engine. Unlike Hive, Presto doesn’t use the MapReduce framework for its execution. Instead, Presto directly accesses the data through a specialized distributed query engine that is very similar to those found in commercial parallel RDBMSs.
Treasure Data has customized Presto to talk directly with our distributed columnar storage layer. As a result, the end user experience is nearly identical to querying Hive.
Does Presto Replace Hive?
No. Hive is designed for batch processing, while Presto is designed for short interactive queries useful for data exploration.
Presto currently has limited fault tolerance. If a process fails while a query is running, the whole query must be re-run. On the other hand, Presto executes queries 10-30x faster than Hive, so even when a process failure forces a restart, the total runtime will often still beat Hive’s significantly.
Another caveat is that Presto has an in-memory-only architecture. If a query works on a particularly large data set that exceeds the total memory capacity available to Presto, query execution will fail.
Even with Presto as part of our ecosystem, MapReduce and Hive will continue to have many viable use cases (for example: long-running data transformation workloads).
How to Use Presto?
Select “Presto” as the query type when using the web console’s query editor.
Presto JDBC/ODBC Driver
We also provide a Presto JDBC/ODBC gateway. You can issue queries to Treasure Data by using the Presto JDBC/ODBC drivers.
Using the CLI, specify -T presto in the td query command. A v0.10.99 or newer client is required.
$ td query -w -T presto -d testdb \
    "SELECT code, COUNT(1) FROM www_access GROUP BY code"
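Because a failed Presto query has to be re-run in full, it can be convenient to wrap the CLI call in a small retry loop. The following is a minimal sketch, not an official Treasure Data tool: the function name retry_query and the RETRY_DELAY_SECONDS variable are hypothetical, and the sketch assumes that td query -w exits with a non-zero status when the job fails, which is worth verifying against your client version.

```shell
# Re-run a command until it succeeds or the attempt budget is spent.
# Usage: retry_query MAX_ATTEMPTS COMMAND [ARGS...]
# (retry_query and RETRY_DELAY_SECONDS are illustrative names, not part of td.)
retry_query() {
  local max_attempts="$1"
  shift
  local attempt=1
  while true; do
    # Run the command exactly as given; a zero exit status ends the loop.
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after ${attempt} attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    # Fixed pause between attempts; defaults to 5 seconds.
    sleep "${RETRY_DELAY_SECONDS:-5}"
  done
}

# Example, reusing the query shown above (assumes td is installed and configured):
# retry_query 3 td query -w -T presto -d testdb \
#   "SELECT code, COUNT(1) FROM www_access GROUP BY code"
```

A retry only helps with transient process failures. A query that fails because its data exceeds the memory available to Presto will fail again on every attempt, so such workloads are better suited to Hive.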
For REST API, the endpoint is
Presto Example Query Catalog
If you’re looking for dozens of Presto SQL templates, visit Treasure Data’s example query catalog page.
Presto Query Language Reference
Presto supports industry-standard SQL-92 syntax.
Current Presto version is v0.188.
Presto DELETE Statement
- Use of DELETE in Presto, available as Private Alpha
Querying DataTank from Presto