Arm Treasure Data uses the same conventions as Relational Database Management Systems (RDBMSs) for managing data sets: Databases and Tables. Each database can contain several tables. However, each table can belong to only one database.
- Basic knowledge of Treasure Data
Database and Table – explained
A Database is a collection of tables, which is similar to other database systems. Tables contain your actual data. A given table can belong to only a single database.
Management from TD Console
Databases page allows you to navigate/create/delete databases and tables from the browser. This is the preferred way to interact with databases and tables.
Management from Command Line Interface (CLI)
By using the TD toolbelt, you can manage databases and tables from the command line.
Creating a Database
The ‘db:create’ sub-command creates a database. For example, the following command creates a table named ‘test_db’ in the cloud.
$ td db:create test_db
|The database name supports only lowercase alphanumeric characters and underscores. Hyphens cannot be used (an exception will be thrown). Also, the database name must be 3 to 128 characters long.|
Your local environment remains unchanged; the database resides solely in the cloud.
Showing a List of Your Databases
The ‘dbs’ sub-command lists all the databases in your account.
$ td dbs +---------------+ | Name | +---------------+ | twitter | | follow_events | | test_db | +---------------+ 3 rows in set
Removing a Database
The ‘db:delete’ sub-command removes a database. This sub-command removes all tables and all data stored within the database as well.
$ td db:delete test_db Database 'test_db' is deleted.
|This operation cannot be rolled back. The data will be unrecoverable.|
Creating a Table
The ‘table:create’ sub-command creates a table. For example, the following command creates a table named ‘test_table’ within the database ‘test_db’.
$ td table:create test_db test_table Table 'test_db.test_table' is created.
|The table name supports only lowercase alphanumeric characters and underscores. Hyphens cannot be used (an exception will be thrown). Also, the table name must be 3 to 128 characters long.|
Much like your database, this table exists only in the cloud. Your local environment remains unchanged.
Showing a List of Your Tables
The ‘tables’ sub-command lists all the tables within the specified database. The number of records in each table is shown as well (currently zero).
$ td tables test_db +----------+------------+------+-------+ | Database | Table | Type | Count | +----------+------------+------+-------+ | test_db | test_table | log | 0 | +----------+------------+------+-------+ 1 row in set
Issuing the ‘tables’ command without an argument lists all the databases and tables that you have created.
$ td tables +----------+-------------+------+-------------+ | Database | Table | Type | Count | +----------+-------------+------+-------------+ | kzk | www_access | log | 14 | | sf | ab | log | 212900645 | | sf | msgpack_www | log | 390734 | | sf | wikistats3 | log | 12745020396 | | sf | x | log | 3 | | test_db | test_table | log | 0 | +----------+-------------+------+-------------+ 8 rows in set
The ‘table:delete’ sub-command removes a table from a database. This sub-command removes all data stored within the table as well.
$ td table:delete test_db test_table Table 'test_db.test_table' is deleted.
|This operation cannot be rolled back. The data is unrecoverable.|
Expiring Data From A Table
Data expiration is a mechanism that allows you to set an expiration period (in days) for the data in a table that receives streaming ingest. When an expiration period is set for a specific table, Treasure Data drops all data partitions in that table whose timestamp is older than the amount of days specified in the expiration period.
Assuming relatively predictable data ingest volumes per day, planned data expiration is a useful strategy for managing the amount of storage consumed by a table.
Data expiration is primarily designed for use with tables that are receiving streaming ingest. When used with other tables, it is possible that partitions containing data close to the expiration time may not be dropped immediately. To ensure query results only contain rows from within a fixed time window, include an explicit filter based on time ranges using TD_INTERVAL() or TD_TIME_RANGE().
To set the expiration period for a table, you can use:
- TD Toolbelt commands
- TD Console
Using the TD Toolbelt from the command line to enable and set data expiration
To enable and set the expiration use td table:expire. For example:
$ td table:expire test_db test_table <expiration_days>
where <expiration_days> is an integer number greater than 0 indicating the maximum age of any record in the table.
If the command is successful you will see a message similar to the following:
Table set to expire data older than 30 days.
To disable the table data expiration set the expiration days to 0:
$ td table:expire test_db test_table 0
When the disable succeeds you will see a message similar to:
Data expiration disabled for this table.
Using the Console to enable and set data expiration
Access the database page.
Access the schema or dataset page.
Access the table for which you want to set data expiration.
Select the ... button and Table Settings.
Select Expire data?
Select or type the number of days after which you want data to be deleted from this table.
Table Delete Protection
You can prevent deletion of tables that contain critical data that your company requires to be maintained over time. The strict implementation of this feature prevents any user from running DELETE queries, deleting a table outright, or replacing the data within a table in any capacity.
You can enable this feature on a per table basis by submitting a support request. Contact support or your account representatives with a list of tables you’d like to protect.