Setting up a Hadoop distributed cluster
Hadoop version: hadoop-0.20.205.0-1.i386.rpm
Download: http://www.fayea.com/apache-mirror/hadoop/common/hadoop-0.20.205.0/
JDK version: jdk-6u35-linux-i586-rpm.bin
Download: http://www.oracle.com/technetwork/java/javase/downloads/jdk6u35-downloads-1836443.html
Environment: Red Hat Enterprise Linux 6.2, 32-bit
master: 192.169.1.133
slave1: 192.169.1.134
slave2: 192.169.1.135
Overall steps:
1. Map hostnames to IPs in /etc/hosts (if the clones' addresses differ after copying the VM, edit the file again and redistribute it).
2. Create an ordinary account (hadoop); Hadoop will run under this account.
3. As root, install the JDK.
4. Set the environment variables.
5. Install Hadoop and edit its configuration files.
6. Copy the VM twice, to serve as slave1 and slave2.
7. Configure SSH so that every pair of machines, including each machine to itself, can log in without a password.
8. Format the namenode as the ordinary account.
9. Start the cluster and check that it is running normally.
Note two common errors:
1. Turning off the warning "Warning: $HADOOP_HOME is deprecated."
Solution: add export HADOOP_HOME_WARN_SUPPRESS=TRUE to the /etc/hadoop/hadoop-env.sh configuration file on every node (see the sketch after these notes).
2. The "Could not create the Java virtual machine" error:
[root@master ~]# /usr/bin/start-all.sh
namenode running as process 26878. Stop it first.
slave2: starting datanode, logging to /var/log/hadoop/root/hadoop-root-datanode-slave2.out
slave1: starting datanode, logging to /var/log/hadoop/root/hadoop-root-datanode-slave1.out
slave2: Unrecognized option: -jvm
slave2: Could not create the Java virtual machine.
slave1: Unrecognized option: -jvm
slave1: Could not create the Java virtual machine.
master: secondarynamenode running as process 26009. Stop it first.
jobtracker running as process 25461. Stop it first.
slave2: starting tasktracker, logging to /var/log/hadoop/root/hadoop-root-tasktracker-slave2.out
slave1: starting tasktracker, logging to /var/log/hadoop/root/hadoop-root-tasktracker-slave1.out
Solution: Hadoop cannot be started as root; start it as the ordinary account. (In this release the datanode launcher passes a -jvm option when run as root, intended for the jsvc-based secure datanode, and a plain JVM rejects it.)
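A minimal sketch of both fixes (commands assumed, not from the original transcript; run the first as root on every node and the second as the hadoop user):

# 1. suppress the deprecation warning on each node
echo 'export HADOOP_HOME_WARN_SUPPRESS=TRUE' >> /etc/hadoop/hadoop-env.sh
# 2. start the daemons as the unprivileged account, never as root
su - hadoop
/usr/bin/start-all.sh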
------------------------1. Map hostnames to IPs in /etc/hosts-----------------------------
------------------------2. Create a hadoop user to run Hadoop---------------------
------------------------3. Install the JDK and set environment variables------------------------------------
[chen@master 桌面]$ su - root
Password:
[root@master ~]# useradd hadoop
[root@master ~]# passwd hadoop
Changing password for user hadoop.
New password:
BAD PASSWORD: it is too short
BAD PASSWORD: is too simple
Retype new password:
passwd: all authentication tokens updated successfully.
[root@master ~]# vim /etc/hosts
[root@master ~]# cat /etc/hosts
192.169.1.133 master
192.169.1.134 slave1
192.169.1.135 slave2
[root@master ~]# cd /home/chen/
[root@master chen]# ls
hadoop-0.20.205.0-1.i386.rpm 公共的 視頻 文檔 音樂
jdk-6u35-linux-i586-rpm.bin 模板 圖片 下載 桌面
[root@master chen]# chmod 744 jdk-6u35-linux-i586-rpm.bin #make the .bin installer executable
[root@master chen]# ./jdk-6u35-linux-i586-rpm.bin
Unpacking...
Checksumming...
Extracting...
UnZipSFX 5.50 of 17 February 2002, by Info-ZIP (Zip-Bugs@lists.wku.edu).
inflating: jdk-6u35-linux-i586.rpm
inflating: sun-javadb-common-10.6.2-1.1.i386.rpm
inflating: sun-javadb-core-10.6.2-1.1.i386.rpm
inflating: sun-javadb-client-10.6.2-1.1.i386.rpm
inflating: sun-javadb-demo-10.6.2-1.1.i386.rpm
inflating: sun-javadb-docs-10.6.2-1.1.i386.rpm
inflating: sun-javadb-javadoc-10.6.2-1.1.i386.rpm
Preparing... ########################################### [100%]
1:jdk ########################################### [100%]
Unpacking JAR files...
rt.jar...
jsse.jar...
charsets.jar...
tools.jar...
localedata.jar...
plugin.jar...
javaws.jar...
deploy.jar...
Installing JavaDB
Preparing... ########################################### [100%]
1:sun-javadb-common ########################################### [ 17%]
2:sun-javadb-core ########################################### [ 33%]
3:sun-javadb-client ########################################### [ 50%]
4:sun-javadb-demo ########################################### [ 67%]
5:sun-javadb-docs ########################################### [ 83%]
6:sun-javadb-javadoc ########################################### [100%]
Java(TM) SE Development Kit 6 successfully installed.
Product Registration is FREE and includes many benefits:
* Notification of new versions, patches, and updates
* Special offers on Oracle products, services and training
* Access to early releases and documentation
Product and system data will be collected. If your configuration
supports a browser, the JDK Product Registration form will
be presented. If you do not register, none of this information
will be saved. You may also register your JDK later by
opening the register.html file (located in the JDK installation
directory) in a browser.
For more information on what data Registration collects and
how it is managed and used, see:
http://java.sun.com/javase/registration/JDKRegistrationPrivacy.html
Press Enter to continue.....
Done.
[root@master chen]# vim /etc/profile
[root@master chen]# ls /usr/java/jdk1.6.0_35/
bin lib register.html THIRDPARTYLICENSEREADME.txt
COPYRIGHT LICENSE register_ja.html
include man register_zh_CN.html
jre README.html src.zip
[root@master chen]# tail -3 /etc/profile #the environment variables appended to /etc/profile
export JAVA_HOME=/usr/java/jdk1.6.0_35
export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH
[root@master chen]#
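To make the new variables take effect in the current shell and confirm the JDK, something like the following can be run (these commands are not in the original transcript):

source /etc/profile
java -version    # should report java version "1.6.0_35"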
-----------------------4. Install Hadoop and edit the configuration files-----------------------
#After this step completes, copy the VM twice, to serve as slave1 and slave2.
#If the clones boot with different IPs or /etc/hosts entries, change them to the actual addresses (see the sketch at the end of this section).
[root@master chen]# ls
hadoop-0.20.205.0-1.i386.rpm 公共的
jdk-6u35-linux-i586.rpm 模板
jdk-6u35-linux-i586-rpm.bin 視頻
sun-javadb-client-10.6.2-1.1.i386.rpm 圖片
sun-javadb-common-10.6.2-1.1.i386.rpm 文檔
sun-javadb-core-10.6.2-1.1.i386.rpm 下載
sun-javadb-demo-10.6.2-1.1.i386.rpm 音樂
sun-javadb-docs-10.6.2-1.1.i386.rpm 桌面
sun-javadb-javadoc-10.6.2-1.1.i386.rpm
[root@master chen]# rpm -ivh hadoop-0.20.205.0-1.i386.rpm
Preparing... ########################################### [100%]
1:hadoop ########################################### [100%]
[root@master chen]# cd /etc/hadoop/
[root@master hadoop]# ls
capacity-scheduler.xml hadoop-policy.xml slaves
configuration.xsl hdfs-site.xml ssl-client.xml.example
core-site.xml log4j.properties ssl-server.xml.example
fair-scheduler.xml mapred-queue-acls.xml taskcontroller.cfg
hadoop-env.sh mapred-site.xml
hadoop-metrics2.properties masters
[root@master hadoop]# vim hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.6.0_35
[root@master hadoop]# vim core-site.xml
[root@master hadoop]# cat core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
</property>
</configuration>
[root@master hadoop]# vim hdfs-site.xml
[root@master hadoop]# cat hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
[root@master hadoop]# vim mapred-site.xml
[root@master hadoop]# cat mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>hdfs://master:9001</value>
</property>
</configuration>
[root@master hadoop]# cat masters
master
[root@master hadoop]# cat slaves
slave1
slave2
[root@master hadoop]#
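For the cloned VMs mentioned at the top of this section, the per-clone fixes on RHEL 6 would look roughly like this (the file paths are the distribution defaults; the interface name eth0 is an assumption):

# on each clone, as root:
vim /etc/sysconfig/network                        # set HOSTNAME=slave1 (or slave2)
vim /etc/sysconfig/network-scripts/ifcfg-eth0     # set the IP to match /etc/hosts
rm -f /etc/udev/rules.d/70-persistent-net.rules   # drop stale MAC-to-interface bindings left by cloning
reboot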
------------------------Switch to the hadoop user and set up passwordless SSH------------------------------
[hadoop@master ~]$ ssh-keygen -t dsa #this step must also be done on both slave nodes
Generating public/private dsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_dsa):
Created directory '/home/hadoop/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_dsa.
Your public key has been saved in /home/hadoop/.ssh/id_dsa.pub.
The key fingerprint is:
6f:88:68:8a:d6:c7:b0:c7:e2:8b:b7:fa:7b:b4:a1:56 hadoop@master
The key's randomart image is:
+--[ DSA 1024]----+
| |
| |
| |
| |
| S |
| . E . o |
| . @ + . o |
| o.X B . |
|ooB*O |
+-----------------+
[hadoop@master ~]$ cd .ssh/
[hadoop@master .ssh]$ ls
id_dsa id_dsa.pub
[hadoop@master .ssh]$ cp id_dsa.pub authorized_keys #the public key must go into a file named authorized_keys
#Edit authorized_keys and append the contents of the id_dsa.pub generated on each of the two slave nodes.
[hadoop@master .ssh]$ vim authorized_keys
[hadoop@master .ssh]$ exit
logout
[chen@master .ssh]$ su - root
Password:
[root@master ~]# cd /home/hadoop/
[root@master hadoop]# ls
[root@master hadoop]# cd .ssh/
[root@master .ssh]# ls
authorized_keys id_dsa id_dsa.pub
#Switch to root and copy authorized_keys to /home/hadoop/.ssh on each of the two slave nodes.
#No password is needed for these root scp commands, because root-to-root passwordless login was set up earlier as well.
[root@master .ssh]# scp authorized_keys slave1:/home/hadoop/.ssh/
authorized_keys 100% 1602 1.6KB/s 00:00
[root@master .ssh]# scp authorized_keys slave2:/home/hadoop/.ssh/
authorized_keys 100% 1602 1.6KB/s 00:00
[root@master .ssh]#
#After this copy, the hadoop user can ssh among the three machines without a password.
#Note: the very first login to each host still prompts (to accept the host key); after that it does not.
#Walk through every pair, including each machine logging in to itself (see the sketch below).
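A quick way to walk all nine host pairs (a suggested loop, not from the original transcript; run it as hadoop on each of the three machines, and note that sshd also requires ~/.ssh to be mode 700 and authorized_keys mode 600):

for h in master slave1 slave2; do ssh $h hostname; done   # each should print a hostname with no password prompt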
-------------------------------Format the namenode and start Hadoop---------------------------
#Note: turn the firewalls off on all three machines for this test (a sketch follows).
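On RHEL 6 that could look like the following (suggested commands; the transcript below only flushes the rules with iptables -F):

# as root on all three nodes:
service iptables stop
chkconfig iptables off    # keep the firewall off across reboots
setenforce 0              # put SELinux in permissive mode for the test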
[hadoop@master ~]$ /usr/bin/hadoop namenode -format
12/09/01 16:52:24 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = master/192.169.1.133
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 0.20.205.0
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security-205 -r 1179940; compiled by 'hortonfo' on Fri Oct 7 06:19:16 UTC 2011
************************************************************/
12/09/01 16:52:24 INFO util.GSet: VM type = 32-bit
12/09/01 16:52:24 INFO util.GSet: 2% max memory = 2.475 MB
12/09/01 16:52:24 INFO util.GSet: capacity = 2^19 = 524288 entries
12/09/01 16:52:24 INFO util.GSet: recommended=524288, actual=524288
12/09/01 16:52:24 INFO namenode.FSNamesystem: fsOwner=hadoop
12/09/01 16:52:24 INFO namenode.FSNamesystem: supergroup=supergroup
12/09/01 16:52:24 INFO namenode.FSNamesystem: isPermissionEnabled=true
12/09/01 16:52:24 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
12/09/01 16:52:24 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
12/09/01 16:52:24 INFO namenode.NameNode: Caching file names occuring more than 10 times
12/09/01 16:52:24 INFO common.Storage: Image file of size 112 saved in 0 seconds.
12/09/01 16:52:25 INFO common.Storage: Storage directory /tmp/hadoop-hadoop/dfs/name has been successfully formatted.
12/09/01 16:52:25 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/192.169.1.133
************************************************************/
[hadoop@master ~]$ /usr/bin/start-all.sh #startup output like the following indicates a normal start
starting namenode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-namenode-master.out
slave2: starting datanode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-datanode-slave2.out
slave1: starting datanode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-datanode-slave1.out
master: starting secondarynamenode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-secondarynamenode-master.out
starting jobtracker, logging to /var/log/hadoop/hadoop/hadoop-hadoop-jobtracker-master.out
slave2: starting tasktracker, logging to /var/log/hadoop/hadoop/hadoop-hadoop-tasktracker-slave2.out
slave1: starting tasktracker, logging to /var/log/hadoop/hadoop/hadoop-hadoop-tasktracker-slave1.out
#No Hadoop processes show up in jps, though. There are two likely causes:
#1. The firewall has not been turned off.
#2. If it still happens with the firewall off, reboot.
[hadoop@master ~]$ /usr/java/jdk1.6.0_35/bin/jps
28499 Jps
[root@master ~]# iptables -F
[root@master ~]# exit
logout
[hadoop@master ~]$ /usr/bin/start-all.sh
starting namenode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-namenode-master.out
slave2: starting datanode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-datanode-slave2.out
slave1: starting datanode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-datanode-slave1.out
master: starting secondarynamenode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-secondarynamenode-master.out
starting jobtracker, logging to /var/log/hadoop/hadoop/hadoop-hadoop-jobtracker-master.out
slave2: starting tasktracker, logging to /var/log/hadoop/hadoop/hadoop-hadoop-tasktracker-slave2.out
slave1: starting tasktracker, logging to /var/log/hadoop/hadoop/hadoop-hadoop-tasktracker-slave1.out
[hadoop@master ~]$ /usr/java/jdk1.6.0_35/bin/jps
30630 Jps
---------------------------After a reboot everything is normal----------------------
------------------------master node---------------------------
[hadoop@master ~]$ /usr/bin/start-all.sh
starting namenode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-namenode-master.out
slave2: starting datanode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-datanode-slave2.out
slave1: starting datanode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-datanode-slave1.out
master: starting secondarynamenode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-secondarynamenode-master.out
starting jobtracker, logging to /var/log/hadoop/hadoop/hadoop-hadoop-jobtracker-master.out
slave2: starting tasktracker, logging to /var/log/hadoop/hadoop/hadoop-hadoop-tasktracker-slave2.out
slave1: starting tasktracker, logging to /var/log/hadoop/hadoop/hadoop-hadoop-tasktracker-slave1.out
[hadoop@master ~]$ /usr/java/jdk1.6.0_35/bin/jps
3388 JobTracker
3312 SecondaryNameNode
3159 NameNode
3533 Jps
------------------------slave1---------------------------------
[hadoop@master ~]$ ssh slave1
Last login: Sat Sep 1 16:51:48 2012 from slave2
[hadoop@slave1 ~]$ su - root
Password:
[root@slave1 ~]# iptables -F
[root@slave1 ~]# setenforce 0
[root@slave1 ~]# exit
logout
[hadoop@slave1 ~]$ /usr/java/jdk1.6.0_35/bin/jps
3181 TaskTracker
3107 DataNode
3227 Jps
--------------------------slave2------------------------------
[hadoop@master ~]$ ssh slave2
Last login: Sat Sep 1 16:52:02 2012 from slave2
[hadoop@slave2 ~]$ su - root
Password:
[root@slave2 ~]# iptables -F
[root@slave2 ~]# setenforce 0
[root@slave2 ~]# exit
logout
[hadoop@slave2 ~]$ /usr/java/jdk1.6.0_35/bin/jps
3165 DataNode
3297 Jps
3241 TaskTracker
[hadoop@slave2 ~]$
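As a final sanity check (not part of the original transcript; the ports given are the stock web UI ports for this Hadoop release):

# as the hadoop user on master:
/usr/bin/hadoop dfsadmin -report    # both datanodes should be listed as live
/usr/bin/hadoop fs -ls /            # HDFS should answer without errors
# web UIs: http://master:50070/ (namenode) and http://master:50030/ (jobtracker)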