Configure Hive Metastore from derby to MySQL

adsensecode1

1. Objective

This tutorial describes how to change the metastore layer of Hive from Derby to MySQL. Hive stores its metadata (schema related information, partitioning information, etc.) into database, Hive is shipped with Derby database. Derby is an embedded database backed by local disk. Derby is single threaded database which doesn’t allow multiple connections, it is not production ready. In this tutorial we will change the metastore of Hive to MySQL

2. Apache Hive

Hive is a data-warehouse on the top of Hadoop. Using Hive we can run ad-hoc queries for the analysis on data. Hive saves us from writing complex Map-Reduce jobs, rather than that we can submit merely SQL queries. Hive converts SQL queries into MapReduce job and submit the same to the cluster.


3. Steps to configure Metastore

3.1. Step-1: Install MySQL

1
$sudo apt-get install mysql-server

3.2. Step-2: Copy MySQL connector to lib directory

Download MySQL connector (mysql-connector-java-5.1.35-bin.jar) and copy it into the $HIVE_HOME/lib directory
Note: $HIVE_HOME refers hive installation directory

3.3. Step-3: Edit / Create configuration file hive-site.xml

Add following entries in the hive-site.xml (present in $HIVE_HOME/conf)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost/hcatalog?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>your_username</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>your_password</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
Now start hive terminal, it will be connected to MySQL. Now you can open multiple hive connections, which was not possible with Derby database.

adsensecode2