Hive CLI Connection
The Hive CLI connection type enables the Hive CLI Integrations.
Authenticating to Hive CLI
There are two ways to connect to Hive using Airflow:
- Use Hive Beeline, i.e. make a JDBC connection string with host, port, and schema. Optionally, you can connect with a proxy user and specify a login and password.
- Use the Hive CLI, i.e. specify Hive CLI params in the extras field.
Only one authorization method can be used at a time. If you need to manage multiple credentials or keys then you should configure multiple connections.
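As a rough illustration of the Beeline route, the sketch below assembles a JDBC connection string from host, port, and schema, and then a Beeline command line from it. The helper names `beeline_jdbc_url` and `beeline_command` are hypothetical and are not part of Airflow; the sketch assumes the usual HiveServer2 URL form `jdbc:hive2://host:port/schema` and the Beeline flags `-u`, `-n`, and `-p`. The hook itself may build its command differently.

```python
from typing import List, Optional


def beeline_jdbc_url(host: str, port: int, schema: str = "default") -> str:
    """Build a HiveServer2 JDBC URL (hypothetical helper, not an Airflow API)."""
    return f"jdbc:hive2://{host}:{port}/{schema}"


def beeline_command(
    host: str,
    port: int,
    schema: str,
    login: Optional[str] = None,
    password: Optional[str] = None,
) -> List[str]:
    """Assemble a Beeline argument list; login and password are optional."""
    cmd = ["beeline", "-u", beeline_jdbc_url(host, port, schema)]
    if login:
        cmd += ["-n", login]  # -n: user name for the connection
    if password:
        cmd += ["-p", password]  # -p: password for the connection
    return cmd


# Placeholder values for illustration only.
print(beeline_command("jdbc-hive-host", 10000, "hive-database", "beeline-username"))
```

Passing the arguments as a list rather than a single string avoids shell-quoting problems if a value contains spaces or special characters.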
Default Connection IDs
All hooks and operators related to Hive CLI use hive_cli_default by default.
Configuring the Connection
- Login (optional)
Specify your username for a proxy user or for the Beeline CLI.
- Password (optional)
Specify your Beeline CLI password.
- Host (optional)
Specify your JDBC Hive host that is used for Hive Beeline.
- Port (optional)
Specify your JDBC Hive port that is used for Hive Beeline.
- Schema (optional)
Specify your JDBC Hive database that you want to connect to with Beeline or specify a schema for an HQL statement to run with the Hive CLI.
- Extra (optional)
Specify the extra parameters (as a JSON dictionary) that can be used in the Hive CLI connection. The following parameters are all optional:
  - use_beeline
    Specify as True if using the Beeline CLI. Default is False.
  - proxy_user
    Specify a proxy user as an owner or login, or keep blank if using a custom proxy user.
  - principal
    Specify the JDBC Hive principal to be used with Hive Beeline.
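A minimal sketch of what the Extra field might contain, built as a Python dictionary and serialized to JSON. The values are placeholders chosen for illustration: the principal is borrowed from the environment variable example on this page, and `login` is one of the two proxy user settings described above.

```python
import json

# Illustrative values only; every key is optional.
extra = {
    "use_beeline": True,  # use the Beeline CLI instead of the Hive CLI
    "proxy_user": "login",  # "owner", "login", or blank for a custom proxy user
    "principal": "hive/_HOST@EXAMPLE.COM",  # JDBC Hive principal for Beeline
}

# The Extra field expects a JSON dictionary, so serialize the dict.
extra_json = json.dumps(extra)
print(extra_json)
```

Generating the JSON with `json.dumps` rather than writing it by hand ensures that booleans and quoting are valid JSON (`true`, not `True`).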
When specifying the connection in an environment variable, you should specify it using URI syntax.
Note that all components of the URI should be URL-encoded.
For example:
export AIRFLOW_CONN_HIVE_CLI_DEFAULT='hive-cli://beeline-username:beeline-password@jdbc-hive-host:80/hive-database?hive_cli_params=params&use_beeline=True&auth=noSasl&principal=hive%2F_HOST%40EXAMPLE.COM'
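Because every URI component has to be URL-encoded, it can be less error-prone to generate the value than to write it by hand. The sketch below uses only the Python standard library; `build_hive_cli_uri` is a hypothetical helper, not an Airflow function, and the inputs are the placeholder values from the example above.

```python
from urllib.parse import quote, urlencode


def build_hive_cli_uri(login, password, host, port, schema, extras):
    """Assemble a hive-cli:// connection URI with URL-encoded components."""
    # quote_via=quote percent-encodes reserved characters such as "/" and "@"
    # in the extras (urlencode passes safe="" by default).
    query = urlencode(extras, quote_via=quote)
    return (
        f"hive-cli://{quote(login, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(schema, safe='')}?{query}"
    )


uri = build_hive_cli_uri(
    "beeline-username",
    "beeline-password",
    "jdbc-hive-host",
    80,
    "hive-database",
    {
        "hive_cli_params": "params",
        "use_beeline": True,
        "auth": "noSasl",
        "principal": "hive/_HOST@EXAMPLE.COM",
    },
)
print(f"export AIRFLOW_CONN_HIVE_CLI_DEFAULT='{uri}'")
```

With these inputs the principal is encoded as `hive%2F_HOST%40EXAMPLE.COM`, matching the export line shown above.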