You can use the Treasure Data Connector to write job results to an Arm Treasure Data database. Usually, this connector is used to export data from one Treasure Data account to another Treasure Data account.
|We strongly recommend that you do not use this connector to write data from one table in your account to another table in the same account.|
Instead, use an "INSERT INTO" or "CREATE TABLE AS SELECT" query for the best performance:
- Presto: "CREATE TABLE AS SELECT" or "INSERT INTO"
- Hive: "INSERT INTO/OVERWRITE"
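As a sketch, an in-account copy using Presto "CREATE TABLE AS SELECT" might look like the following. The database and table names are illustrative, and the snippet assumes the td toolbelt is installed (it falls back to printing the command if not):

```shell
# Illustrative in-account copy using Presto CTAS instead of the connector.
# Database and table names here are hypothetical.
QUERY='CREATE TABLE code_counts AS SELECT code, COUNT(1) AS cnt FROM www_access GROUP BY code'
if command -v td >/dev/null 2>&1; then
  # -T selects the query engine; -d selects the database.
  td query -T presto -d testdb "$QUERY"
else
  # Toolbelt not installed here; show the command that would run.
  echo "td query -T presto -d testdb \"$QUERY\""
fi
```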
Prerequisites
- Basic knowledge of Treasure Data, including the toolbelt.
You can configure a connection from the Treasure Data Console or from the command line interface.
How to complete an account-to-account transfer using the Treasure Data Console
Begin from the target Treasure Data account. You must copy the API Key information from the account profile.
Retrieve the API Key information
From the console, click the Profile icon of the administrator user.
Go to the Credentials tab.
Copy the Write-only key.
|If your data transfer involves updating existing data, e.g. with mode=replace, you must use the master key instead of the write-only key.|
If Master key and Write-only key are not listed, click Create to verify the account. When verification completes and you return to the profile, you see both a master key and a write-only key. Copy the write-only key.
Create the Connection
Go to Connections -> Sources Catalog and click the Treasure Data tile.
Enter the write-only API Key from the target account.
In API Hostname, enter the server to use for export. You can leave the field blank to use the default server for your region, or enter one of the following values:
- AWS US: api.treasuredata.com
- AWS JP: api.treasuredata.co.jp
- IDCF: api.ybi.idcfcloud.net
|If you are writing data from one region to another, e.g. from AWS US to AWS JP, the API Hostname field is required.|
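For illustration, choosing the hostname for a target region could be scripted like this (a minimal sketch; the region labels and hostnames are the ones listed above):

```shell
# Map a target region to its API hostname (regions as listed above).
region="AWS JP"   # hypothetical target region
case "$region" in
  "AWS US") api_hostname="api.treasuredata.com" ;;
  "AWS JP") api_hostname="api.treasuredata.co.jp" ;;
  "IDCF")   api_hostname="api.ybi.idcfcloud.net" ;;
  *) echo "unknown region: $region" >&2; exit 1 ;;
esac
echo "$api_hostname"   # -> api.treasuredata.co.jp
```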
Give your connection a descriptive name and select Create Connection.
For On-demand Jobs from the CLI
For on-demand jobs, add the --result option to the td query command. When the job finishes, the results are written into output_table within the testdb database:
$ td query --result 'td://@/testdb/output_table' \
    -w -d testdb \
    "SELECT code, COUNT(1) FROM www_access GROUP BY code"
For Scheduled Jobs
For scheduled jobs, add the --result option when scheduling the job. Every time the job runs, the results are written into output_table:
$ td result:create mydb td://@/testdb
$ td sched:create hourly_count_example "0 * * * *" \
    -d testdb "select count(*) from www_access" \
    --result mydb:output_table
Two Ways to Modify Data (Append/Replace)
You can write the data in two ways: append or replace.
td://@/testdb/output_table              # append (default)
td://@/testdb/output_table?mode=append  # append
td://@/testdb/output_table?mode=replace # replace
Append
This is the default mode. The query results are appended to the table. If the table does not exist yet, a new table is created. This operation is not idempotent: running the same job repeatedly does not produce the same result, because each run appends another copy of the results.
Replace
If the table already exists, the rows of the existing table are replaced with the query results. If the table does not exist yet, a new table is created.
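As a small sketch, the mode parameter is simply appended to the base result URL (the database and table names are illustrative):

```shell
# Build a result URL and select a write mode (append is the default).
base_url="td://@/testdb/output_table"   # illustrative database/table
mode="replace"
result_url="${base_url}?mode=${mode}"
echo "$result_url"   # -> td://@/testdb/output_table?mode=replace
```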
An atomic transaction is an indivisible and irreducible series of database operations such that either all of them occur, or none of them do. A guarantee of atomicity prevents updates to the database from occurring only partially.
For example, a monetary transfer from bank account A to account B consists of a debit from A and a credit to B. Performing both operations in a single atomic transaction ensures that the database remains in a consistent state.
We achieve atomicity (so that a consumer of the table always has consistent data) by performing the following three steps in a single transaction.
- Create a temporary table.
- Write to the temporary table.
- Replace the existing table with the temporary table atomically.
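The three steps above can be sketched with files, where an atomic rename plays the role of the table swap. This is an analogy only, not the connector's implementation:

```shell
# Analogy for the three steps: write to a temporary location, then
# atomically replace the visible copy. mv within one filesystem is atomic,
# so a reader of data.txt always sees either the old or the new contents.
workdir="$(mktemp -d)"
echo "old contents" > "$workdir/data.txt"       # the existing table
echo "new contents" > "$workdir/data.txt.tmp"   # steps 1-2: create and write the temporary copy
mv "$workdir/data.txt.tmp" "$workdir/data.txt"  # step 3: atomic replace
contents="$(cat "$workdir/data.txt")"
echo "$contents"   # -> new contents
rm -r "$workdir"
```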