Hive 0.13 Syntax to Update for Hive 2
Here is a summary of the Hive 0.13 constructs that must be updated for Hive 2 compatibility.
- v column handling
- Expanded list of reserved words in Hive 2
- Numeric literals must be CAST to a number data type before storage in a TD table
- Numeric literals in an Array must be CAST to a consistent data type
- CROSS JOIN requires a query hint to use
v column is not supported in Hive 2.
The following changes are required for any query using v column syntax:
v column values are always returned as STRING type.
In Hive 2, rather than use the v column syntax, reference the actual column and explicitly CAST it to the necessary data type. For example:
v['country_code'] --> cast(country_code as STRING)
If you use v column in Hive 2, you’ll see the following error message:
Invalid table alias or column reference 'v'
Columns selected through v column where names start with underscore need to be escaped with back quotes (`)
For example, change your syntax to something similar to:
v['_country_code'] -> CAST(`_country_code` AS STRING)
Reserved words and UDF names used as column names must be escaped with backticks. The most common case of this is `date`. For example:
date -> `date`
Example parse errors given by Hive 2:
cannot recognize input near 'date' '>=' '1548403200' in expression specification
Numeric literals that are not integers, default to decimal. However, TD storage only supports DOUBLE. You must CAST any numeric value to DOUBLE before storing in a table.
51.51202 -> CAST(51.51202 AS DOUBLE)
Without the casting, you may see an error similar to this:
FAILED: SemanticExceptionjava.lang.IllegalArgumentException: Failed to cast hive type : decimal(6,4) (field : col1dc) to TD type, only following hive types can be casted to TD type : tinyint, smallint, int, bigint, float, double, string, binary, array, map, struct, uniontype
All numeric literals in an array must be CAST to a consistent data type. For storage into a table CAST all values to a DOUBLE data type, but, for other purposes such as computation, CAST them to data types recognized within Hive. For example, DECIMAL.
Array(0, 0.0, 1.0)
Error you may see an error similar to this:
FAILED: SemanticException [Error 10016]: Line 10:0 Argument type mismatch '0.980324074': Argument type "array<decimal(19,9)>" is different from preceding arguments. Previous type was "array<decimal(10,0)>"
Hive 2 restricts the use of CROSS JOINs because they are often coding errors and are expensive to execute. If you intend to perform a cross join, apply the following query hint to bypass this protection:
-- @TD enable_cartesian_product: true
Explicit cross join example:
SELECT * FROM employee CROSSJOIN department
Implicit cross join example:
SELECT * FROM employee, department
If you run a query with an explicit or implicit CROSS JOIN and no special “-- @TD enable_cartesian_product: true” hint to allow it to run, you’ll see an error like this:
FAILED: SemanticException Cartesian products are disabled for safety reasons. If you know what you are doing, please sethive.strict.checks.cartesian.producttofalseand that hive.mapred.modeisnotsetto'strict'to proceed. Note that if you may geterrorsor incorrect results if you make a mistake whileusingsomeof the unsafe features.