Introduction
- Data warehousing tool built on top of Hadoop
- Provides an SQL-like interface
- Offers an SQL-like language (HiveQL) to analyze data stored on HDFS
- Can be used by anyone who knows SQL
- Not all traditional SQL capabilities are supported
- Under the hood, Hive queries are executed as MapReduce jobs
- No extra work is required from the user for this (see the example below)
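As a quick illustration (the table and column names below are made up for this example), a HiveQL query reads like ordinary SQL, and the aggregation is compiled into a MapReduce job automatically:
-- hypothetical table, used only to illustrate the syntax
CREATE TABLE page_views (user_id INT, url STRING, view_time STRING);
-- this aggregation is compiled and executed as a MapReduce job under the hood
SELECT url, COUNT(*) AS hits FROM page_views GROUP BY url;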
Hive Components
- MetaStore
- It is a database consisting of table definitions and other metadata
- By default it is stored on the local machine in an embedded Derby database
- It can be kept in a relational database on a shared machine if multiple users are using Hive
- Query Engine
- Hive provides HiveQL, which gives SQL-like queries
- Internally, Hive queries are run as MapReduce jobs (see the EXPLAIN example below)
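To see this translation for yourself, HiveQL's EXPLAIN statement prints the execution plan for a query, including the MapReduce stages Hive will submit (page_views is the hypothetical table from the earlier example):
hive> EXPLAIN SELECT url, COUNT(*) AS hits FROM page_views GROUP BY url;
-- the printed plan lists the stages, including a Map Reduce stage, that Hive will run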
Hive Data Models
- Hive forms, or layers, table definitions on top of data residing on HDFS (see the sketch after this list)
Databases
- A namespace that separates tables and other units, avoiding naming conflicts
Table
- A homogeneous unit of data in which all rows share the same schema
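As a rough sketch of this layering (the database name, table name, columns, and HDFS path are all hypothetical), an external table simply attaches a schema to files that already sit in an HDFS directory:
CREATE DATABASE IF NOT EXISTS sales_db;
USE sales_db;
-- the table definition below is only metadata kept in the metastore;
-- the data itself stays in the HDFS directory given by LOCATION
CREATE EXTERNAL TABLE orders (order_id INT, amount DOUBLE, order_date STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/hduser/orders';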
Hive Installation
Before you start installing Hive, you should already have Hadoop installed.
Step 1: Download Hive from the Apache website (check compatibility with your Hadoop version)
Step 2: Go to Downloads and extract hive.tar.gz
Step 3: Copy the extracted folder to /home/hduser/hive.
Step 4: Edit /etc/bash.bashrc
$sudo gedit /etc/bash.bashrc
(or, using a different editor)
$sudo leafpad /etc/bash.bashrc
Step 5: Set Home Path
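# append the following lines to the end of /etc/bash.bashrc (the paths assume Hive was copied to /home/hduser/hive in Step 3)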
export HADOOP_HOME=/home/hduser/hadoop
export HIVE_HOME=/home/hduser/hive
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HIVE_HOME/bin
Step 6: Save and Close
Step 7: Open a new terminal and run hive
You should now be inside the Hive shell.
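As a quick sanity check (output will vary with your setup), a few simple HiveQL statements confirm that the shell and the metastore are working:
hive> SHOW DATABASES;
-- a fresh installation lists only the "default" database
hive> CREATE TABLE test_hive (id INT, name STRING);
hive> SHOW TABLES;
-- test_hive should now appear in the list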