Installing, Deploying, and Configuring Hive in a Hadoop 2.3, HBase 0.98, Hive 0.13 Architecture, with a Data Test


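Introduction:

Hive is a data warehouse tool built on Hadoop. It maps structured data files to database tables and provides a simple SQL query capability by translating SQL statements into MapReduce jobs. Its main advantage is a low learning curve: simple MapReduce statistics can be produced quickly with SQL-like statements, without writing dedicated MapReduce applications, which makes it well suited to statistical analysis in a data warehouse.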

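1, Use cases

Hive is built on Hadoop, which is a static batch-processing system. Hadoop generally has high latency and incurs considerable overhead when jobs are submitted and scheduled. As a result, Hive cannot deliver low-latency queries on large data sets; for example, a Hive query over a data set of a few hundred MB typically takes minutes. Hive is therefore not suited to applications that need low latency, such as online transaction processing (OLTP).

Hive queries strictly follow the Hadoop MapReduce job execution model: Hive translates the user's HiveQL statement through its interpreter into MapReduce jobs and submits them to the Hadoop cluster, Hadoop monitors the job execution, and the results are then returned to the user. Hive was not designed for OLTP and provides neither real-time queries nor row-level updates. Its best fit is batch processing over large data sets, such as web log analysis.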

2, Download and install

For the Hadoop installation that must be in place first, see: http://blog.itpub.net/26230597/viewspace-1257609/

Download:

wget  http://mirror.bit.edu.cn/apache/hive/hive-0.13.1/apache-hive-0.13.1-bin.tar.gz

Extract and install

tar zxvf apache-hive-0.13.1-bin.tar.gz  -C /home/hadoop/src/

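Note: Hive only needs to be installed on one node. In this example it is installed on the virtual machine that hosts the Hadoop name node, so the two share one machine. The archive unpacks to a directory named apache-hive-0.13.1-bin; the paths in the rest of this post assume it has been renamed to hive-0.13.1.

3, Configure the Hive environment variables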

vim hive-env.sh

export HIVE_HOME=/home/hadoop/src/hive-0.13.1

export PATH=$PATH:$HIVE_HOME/bin
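Once the file holding these exports has been sourced in the current shell, a quick check can confirm that the Hive bin directory really ended up on PATH. The following is a minimal sketch; `path_contains` is a hypothetical helper name, not part of Hive, and the directory is the one used in this post.

```shell
# Hypothetical helper: succeeds if directory $1 is an entry in the
# colon-separated list $2 (defaults to the current PATH).
path_contains() {
    local dir
    local list
    dir="${1%/}"            # ignore one trailing slash on the directory
    list="${2:-$PATH}"
    case ":${list}:" in
        *":${dir}:"*|*":${dir}/:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Install location taken from the exports above.
HIVE_HOME=/home/hadoop/src/hive-0.13.1
if path_contains "$HIVE_HOME/bin"; then
    echo "hive bin directory is on PATH"
else
    echo "hive bin directory is NOT on PATH; re-source the file with the exports"
fi
```

Wrapping the list in colons before matching avoids false positives from directories that merely share a prefix, such as a sibling directory whose name starts with bin.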

4, Configure the Hadoop and HBase parameters

vim hive-env.sh

# Set HADOOP_HOME to point to a specific hadoop install directory

HADOOP_HOME=/home/hadoop/src/hadoop-2.3.0/

# Hive Configuration Directory can be controlled by:

export HIVE_CONF_DIR=/home/hadoop/src/hive-0.13.1/conf

# Folder containing extra libraries required for hive compilation/execution can be controlled by:

export HIVE_AUX_JARS_PATH=/home/hadoop/src/hive-0.13.1/lib

5, Verify the installation:

Start the Hive CLI. If the hive> prompt appears, the installation succeeded.

[hadoop@name01 lib]$ hive --service cli

15/01/09 00:20:32 WARN conf.HiveConf: DEPRECATED: hive.metastore.ds.retry.* no longer has any effect.  Use hive.hmshandler.retry.* instead

Logging initialized using configuration in jar:file:/home/hadoop/src/hive-0.13.1/lib/hive-common-0.13.1.jar!/hive-log4j.properties

Create a table with a create statement. An OK response means the statement ran and confirms that Hive is installed correctly.

hive> create table test(key string);

OK

Time taken: 8.749 seconds

hive>

6, Verify usability

Start the Hive metastore service in the background

[hadoop@name01 root]$ hive --service metastore &

Check the Hive process running in the background

[hadoop@name01 root]$ ps -eaf|grep hive

hadoop    4025  2460  1 22:52 pts/0    00:00:19 /usr/lib/jvm/jdk1.7.0_60/bin/java -Xmx256m -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/home/hadoop/src/hadoop-2.3.0/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/home/hadoop/src/hadoop-2.3.0 -Dhadoop.id.str=hadoop -Dhadoop.root.logger=INFO,console -Djava.library.path=/home/hadoop/src/hadoop-2.3.0/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx512m -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /home/hadoop/src/hive-0.13.1/lib/hive-service-0.13.1.jar org.apache.hadoop.hive.metastore.HiveMetaStore

hadoop    4575  4547  0 23:14 pts/1    00:00:00 grep hive

[hadoop@name01 root]$

6.1 In Hive, create a table with two columns whose fields are separated by ',':

hive> create table tim_test(id int,name string) row format delimited fields terminated by ',';

OK

Time taken: 0.145 seconds

hive>

6.2 Prepare a text file to load into the table, with the following contents:

[hadoop@name01 hive-0.13.1]$ more tim_hive_test.txt

123,xinhua

456,dingxilu

789,fanyulu

903,fahuazhengroad

[hadoop@name01 hive-0.13.1]$
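Because the table was declared with fields terminated by ',', every row of the file should have exactly two comma-separated fields, with an integer in the first. The sketch below recreates the four sample rows in a scratch file and counts rows that break that shape. `DATA_FILE` and `count_bad_rows` are hypothetical names chosen for illustration, and the scratch location differs from the path used in the post.

```shell
# Write the four sample rows shown above to a scratch file.
# DATA_FILE is a hypothetical location; the post keeps the file under the
# Hive install directory instead.
DATA_FILE="$(mktemp)"
cat > "$DATA_FILE" <<'EOF'
123,xinhua
456,dingxilu
789,fanyulu
903,fahuazhengroad
EOF

# Hypothetical helper: print the number of rows in file $1 that do NOT look
# like "<integer>,<non-empty name without commas>".
count_bad_rows() {
    # grep -c exits non-zero when the count is 0, so mask the exit status.
    grep -c -v -E '^[0-9]+,[^,]+$' "$1" || true
}

bad="$(count_bad_rows "$DATA_FILE")"
echo "malformed rows: $bad"
```

Checking the file before loading is cheap insurance: Hive does not reject rows that fail to match the column types at load time, it simply returns NULL for the affected columns when the table is queried.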

6.4 Open another xshell session and start the Hive metastore on the server

[hadoop@name01 root]$ hive --service metastore

Starting Hive Metastore Server

6.5 Open another xshell session, start the Hive client, and load the data:

[hadoop@name01 hive-0.13.1]$ hive

Logging initialized using configuration in jar:file:/home/hadoop/src/hive-0.13.1/lib/hive-common-0.13.1.jar!/hive-log4j.properties

hive> load data local inpath  '/home/hadoop/src/hive-0.13.1/tim_hive_test.txt'   into table tim_test;

Copying data from file:/home/hadoop/src/hive-0.13.1/tim_hive_test.txt

Copying file: file:/home/hadoop/src/hive-0.13.1/tim_hive_test.txt

Loading data to table default.tim_test

[Warning] could not update stats.

OK

Time taken: 7.208 seconds

hive>

6.6 Verify that the load succeeded: tim_test appears in the dfs listing

hive> dfs -ls /home/hadoop/hive/warehouse;

Found 2 items

drwxr-xr-x   - hadoop supergroup          0 2015-01-12 01:47 /home/hadoop/hive/warehouse/hive_hbase_mapping_table_1

drwxr-xr-x   - hadoop supergroup          0 2015-01-12 02:11 /home/hadoop/hive/warehouse/tim_test

hive>

7, Errors encountered during installation and deployment:

Error 1

[hadoop@name01 conf]$ hive --service metastore

Starting Hive Metastore Server

javax.jdo.JDOFatalInternalException: Error creating transactional connection factory

Caused by: org.datanucleus.exceptions.NucleusException: Attempt to invoke the "BONECP" plugin to create a ConnectionPool gave an error : The specified datastore driver ("com.mysql.jdbc.Driver") was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.

The MySQL JDBC driver jar is missing. Copying it into the Hive lib directory resolves the error.

Error 2
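A small check before starting the metastore can catch the missing driver early. This is a sketch only: `has_mysql_driver` is a hypothetical helper, and the file name pattern mysql-connector-java-*.jar is an assumption based on how Connector/J 5.x jars are usually named, so adjust it if your jar is named differently.

```shell
# Hypothetical helper: succeeds if directory $1 contains a MySQL Connector/J jar.
# ASSUMPTION: the jar follows the usual mysql-connector-java-<version>*.jar naming.
has_mysql_driver() {
    local lib_dir
    local jar
    lib_dir="$1"
    for jar in "$lib_dir"/mysql-connector-java-*.jar; do
        # When nothing matches, the unexpanded pattern is not a real file.
        [ -f "$jar" ] && return 0
    done
    return 1
}

HIVE_LIB=/home/hadoop/src/hive-0.13.1/lib   # lib directory used in this post
if has_mysql_driver "$HIVE_LIB"; then
    echo "MySQL JDBC driver found in $HIVE_LIB"
else
    echo "MySQL JDBC driver missing; copy the connector jar into $HIVE_LIB"
fi
```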

[hadoop@name01 conf]$ hive --service metastore

Starting Hive Metastore Server

javax.jdo.JDOFatalDataStoreException: Unable to open a test connection to the given database. JDBC url = jdbc:mysql://192.168.52.130:3306/hive_remote?createDatabaseIfNotExist=true, username = root. Terminating connection pool (set lazyInit to true if you expect to start your database after your app). Original Exception: ------

java.sql.SQLException: null,  message from server: "Host '192.168.52.128' is not allowed to connect to this MySQL server"

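First attempt: add the hadoop OS user to the mysql group (as the telnet test that follows shows, the connection is still rejected):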

[root@data02 mysql]# gpasswd -a hadoop mysql

Adding user hadoop to group mysql

[root@data02 mysql]#

^C[hadoop@name01 conf]$ telnet 192.168.52.130 3306

Trying 192.168.52.130...

Connected to 192.168.52.130.

Escape character is '^]'.

G

Host '192.168.52.128' is not allowed to connect to this MySQL serverConnection closed by foreign host.

[hadoop@name01 conf]$

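Fix: change the MySQL account. The statement below renames the account that may connect from any host ('%') from root to hadoop: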

mysql> update user set user = 'hadoop' where user = 'root' and host='%';

Query OK, 1 row affected (0.04 sec)

Rows matched: 1  Changed: 1  Warnings: 0

mysql> flush privileges;

Query OK, 0 rows affected (0.09 sec)

mysql>

Error 3

[hadoop@name01 conf]$ hive --service metastore

Starting Hive Metastore Server

javax.jdo.JDOException: Exception thrown calling table.exists() for hive_remote.`SEQUENCE_TABLE`

at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:596)

at org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:732)

at org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:752)

……

NestedThrowablesStackTrace:

com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key was too long; max key length is 767 bytes

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)

at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)

Fix: on the remote MySQL server, change the database character set from utf8mb4 to utf8

mysql> alter database hive_remote /*!40100 DEFAULT CHARACTER SET utf8 */;

Query OK, 1 row affected (0.03 sec)

mysql>

Then set up the Hive client on data01

scp -r hive-0.13.1/ data01:/home/hadoop/src/

Error 3 (continued)

Start the metastore again and check the log:

[hadoop@name01 conf]$ hive --service metastore

Starting Hive Metastore Server

It appears to hang here, so check the log:

[hadoop@name01 hadoop]$ tail -f hive.log

2015-01-09 03:46:27,692 INFO  [main]: metastore.ObjectStore (ObjectStore.java:setConf(229)) - Initialized ObjectStore

2015-01-09 03:46:27,892 WARN  [main]: metastore.ObjectStore (ObjectStore.java:checkSchema(6295)) - Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 0.13.0

2015-01-09 03:46:30,574 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:createDefaultRoles(551)) - Added admin role in metastore

2015-01-09 03:46:30,582 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:createDefaultRoles(560)) - Added public role in metastore

2015-01-09 03:46:31,168 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:addAdminUsers(588)) - No user is added in admin role, since config is empty

2015-01-09 03:46:31,473 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5178)) - Starting DB backed MetaStore Server

2015-01-09 03:46:31,481 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5190)) - Started the new metaserver on port [9083]...

2015-01-09 03:46:31,481 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5192)) - Options.minWorkerThreads = 200

2015-01-09 03:46:31,482 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5194)) - Options.maxWorkerThreads = 100000

2015-01-09 03:46:31,482 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5196)) - TCP keepalive = true

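The log shows that the metastore has started and is listening on port 9083; the command simply stays in the foreground. Add the following to hive-site.xml so that clients connect to this metastore: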

<property>

<name>hive.metastore.uris</name>

<value>thrift://192.168.52.128:9083</value>

</property>
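To confirm that a client machine can actually reach the address configured in hive.metastore.uris, the host and port can be split out of the URI and probed. This is a rough sketch: `parse_thrift_uri` and `port_open` are hypothetical helpers, and `port_open` relies on bash's /dev/tcp redirection, so it is an assumption that your shell is a bash build with that feature enabled.

```shell
# Hypothetical helper: print "host port" for a thrift://host:port URI.
parse_thrift_uri() {
    local rest
    rest="${1#thrift://}"              # drop the scheme
    echo "${rest%%:*} ${rest##*:}"     # text before the colon, text after it
}

# Hypothetical helper: succeeds if a TCP connection to host $1, port $2 opens.
# ASSUMPTION: requires bash with /dev/tcp support. It may block for a while
# if the host silently drops packets.
port_open() {
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Value copied from the hive-site.xml property above.
set -- $(parse_thrift_uri 'thrift://192.168.52.128:9083')
host="$1"
port="$2"
echo "metastore host=$host port=$port"
# To probe the server from a client machine:
#   port_open "$host" "$port" && echo "metastore reachable"
```

The probe is left commented out so the snippet can be run anywhere; enable it only on a machine that is expected to reach the metastore.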

Error 4

2015-01-09 04:01:43,053 INFO  [main]: metastore.ObjectStore (ObjectStore.java:setConf(229)) - Initialized ObjectStore

2015-01-09 04:01:43,540 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:createDefaultRoles(551)) - Added admin role in metastore

2015-01-09 04:01:43,546 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:createDefaultRoles(560)) - Added public role in metastore

2015-01-09 04:01:43,684 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:addAdminUsers(588)) - No user is added in admin role, since config is empty

2015-01-09 04:01:44,041 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5178)) - Starting DB backed MetaStore Server

2015-01-09 04:01:44,054 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5190)) - Started the new metaserver on port [9083]...

2015-01-09 04:01:44,054 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5192)) - Options.minWorkerThreads = 200

2015-01-09 04:01:44,054 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5194)) - Options.maxWorkerThreads = 100000

2015-01-09 04:01:44,054 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(5196)) - TCP keepalive = true

2015-01-09 04:24:13,917 INFO  [Thread-3]: metastore.HiveMetaStore (HiveMetaStore.java:run(5073)) - Shutting down hive metastore.

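Resolution:

After a long search on "No user is added in admin role, since config is empty", I could not find the cause. If you have run into the same situation, please leave a comment so we can compare notes.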


Original blog post: http://blog.itpub.net/26230597/viewspace-1400379/
Original author: 黃杉 (mchdba)
