You can write job results to your Salesforce.com organization.
For sample workflows of SFDC, view Treasure Boxes.
Prerequisites
- Basic knowledge of Arm Treasure Data, including the toolbelt
- Salesforce.com organization and username, password, and security token for API integration
- User has “API Enabled” permission
- Target Salesforce.com Object should exist with read/write permissions for the User
Architecture
A front-end application streams data to be collected in Treasure Data via Log/data collector daemon or Mobile SDKs. You can also bulk import your data using Bulk Import from the CLI. A scheduled query is setup in Treasure Data to run periodically on the data and write the result of each query execution into your Salesforce.com Object.

The above is a fairly common architecture; for example:
Example 1: Ranking: What are the “Top N of X?”
Every social/mobile application calculates the “top N of X” (ex: top 5 movies watched today). Treasure Data already handles the raw data warehousing; the “write-to-Salesforce.com” feature enables Treasure Data to find the “top N” data as well.
Example 2: Dashboard Application
If you’re a data scientist, you need to keep track of a range of metrics every hour/day/month and make them accessible via visualizations. Using this “write-to-Salesforce.com” feature, you can streamline the process and focus on building visualizations of your query results via Reports and Dashboards on the Salesforce.com organization.
Result Output URL Format
Format
The result output target is represented by a URL with the following format:
sfdc://<username>:<password><security_token>@<hostname>/<object_name>
where:
- sfdc is an identifier for result output to Salesforce.com;
- username and password are the credential to your Salesforce.com organization;
- security_token is the additional credential for API access;
- hostname is the host name of the Salesforce.com organization. Usually this is ‘login.salesforce.com’ for production environments and ‘test.salesforce.com’ for sandbox environments. In case where you configure custom domain for your organization specify the hostname you’re using for login;
- object_name is the target Salesforce.com Object API name (e.g. ResultOutput__c). The Object and columns for data integration must be defined beforehand;
For example with:
- username: user@treasure-data.com
- password: PASSWORD
- security_token: 7SMvicR9ojdPz0XLtlWi3Rtw
The URL will look like:
sfdc://user%40treasure-data.com:PASSWORD7SMvicR9ojdPz0XLtlWi3Rtw@login.salesforce.com/Account
Make sure that you escape the '@' in the username with '%40'. |
Options
Result output to Salesforce.com supports various options that can be specified as optional URL parameters. The options are compatible with each other and can be combined.
Where applicable, the default behavior is indicated.
Update mode option
Controls the various ways of modifying the database data.
- Append
- Truncate
- Update
mode=append (default)
The append mode is the default that is used when no mode option is provided in the URL. In this mode, the query results are appended to the object.
Because mode=append is the default behavior, these two URLs are equivalent:
sfdc://.../Contact sfdc://.../Contact?mode=append
mode=truncate
With the truncate mode the system first truncates the existing records in the Salesforce.com Object and moves them into the Trashbin, then inserts the query results.
Example:
sfdc://.../CustomObject__c?mode=truncate
You can specify the hard_delete=true option for mode=truncate to delete records instead of moving it to the Trashbin. To use this option, the user must have the 'Bulk API Hard Delete' permission. |
sfdc://.../CustomObject__c?mode=truncate&hard_delete=true
mode=update
With the update mode, a row is inserted unless it would cause a duplicate value in the external key columns specified in the “unique” parameter. In such a case, an update is performed instead. The “unique” parameter is required with this mode and must be defined as an external key when used with the update mode.
Example:
sfdc://.../Contact?mode=update&unique=CustomerId__c
The default behavior for the ‘update’ mode is actually ‘upsert’. If you do not want to “upsert” but only “update”, you can add the upsert=false option. Then it updates existing records based on “unique” parameter match and does not insert new records.
sfdc://.../Contact?mode=update&unique=CustomerId__c&upsert=false
Upload concurrency_mode option
The concurrency_mode option controls how the data is uploaded to the Salesforce.com organization. The default mode is parallel; it is the recommended method for most situations.
concurrency_mode=parallel (default)
With the parallel method, data is uploaded in parallel. This is the most reliable and effective method and it is recommended for most situations.
Because concurrency_mode=parallel is the default behavior, these two URLs are equivalent:
sfdc://.../CustomObject__c sfdc://.../CustomObject__c?concurrency_mode=parallel
concurrency_mode=serial
Uploading records in parallel is recommended. However, if you see “UNABLE_TO_LOCK_ROW” in an error message, try concurrency_mode=serial instead.
sfdc://.../CustomObject__c?concurrency_mode=serial
Updating A Salesforce.com Object acquires a lock for the Object and parent Object referenced by columns. If you upload objects in parallel and multiple objects have reference to the same parent object, Salesforce.com is not able to acquire the lock for insert/update and returns an ‘UNABLE_TO_LOCK_ROW’ error. In such cases, specify the concurrency_mode=serial option.
Authentication session_id option
If you have a Salesforce.com Session ID, you can authenticate with the session_id option instead of username, password, and security token (that is, username, password, and security token can be omitted from the URL).
sfdc://login.salesforce.com/Contact?session_id=3deT2aQjYQbIRN0M...jB1tHBb7UW0K!M
Retry option
This options sets the number of attempts the Treasure Data export worker makes to write the result to the configured Salesforce.com destination, if errors occur. If the export fails more than the set number of retries, the query fails.
The default number of retries is retry=2 but one can virtually set it to any number. The number of retries affect the overall duration of a query.
sfdc://.../CustomObject__c?retry=5
Split Records options
The Treasure Data result export splits the records in the result of a query in chunks of 10000 records by default and bulk upload one chunk at a time. The split_records option configures the size of this chunk, if required.
sfdc://.../CustomObject__c?split_records=100
Usage
CLI
To output the result of a single query to Salesforce.com organization add the -r / —result option to the td query command. After the job is finished, the results are written into your Salesforce.com organization Object:
$ td query -w -d testdb \ --result 'sfdc://login.salesforce.com/CustomObject__c?session_id=.....' \ "SELECT code as Code__c, COUNT(1) as Count__c FROM www_access GROUP BY code"
To create a scheduled query whose output is systematically written to Salesforce.com organization, add the -r / —result option when creating the schedule through the td sched:create command:
$ td sched:create hourly_count_example "0 * * * *" -d testdb \ --result 'sfdc://user%40treasure-data.com:PASSWORDsecuritytoken@login.salesforce.com/CustomObject__c' \ "SELECT COUNT(*) as Count__c FROM www_access"
Using the TD Console
In the Queries
tab, click New Query.
Select a database and table from the popup menus. Enter your query and select Output results.
To avoid any issues with result export, define column aliases in your query such that resulting column names from the query match the Salesforce field names for default fields and API names (usually ending with __c) for custom fields. |
To export the result of a query or schedule to Salesforce.com, specify Result Export information for a new query or an existing job/query:
Troubleshooting
If you get the following type of error due to the Result Output to SFDC job, you can see the error of the job on SFDC. As shown in the following example, check “XXXXXXXXXXX” on SFDC. You can figure out a detail of the error.
```
17/05/01 03:35:05 INFO sfdc.BulkAPIJob: Job XXXXXXXXXXX finished: Total 1, Completed 0, Failed 1 17/05/01 03:35:05 INFO sfdc.BulkAPIClient: Batch jobs failed (1/1)
```
Ref. View Bulk Data Load Job Details.
Comments
0 comments
Please sign in to leave a comment.