Setting Up a Hadoop Distributed Cluster

Posted by jopen, 12 years ago

Hadoop version: hadoop-0.20.205.0-1.i386.rpm
Download: http://www.fayea.com/apache-mirror/hadoop/common/hadoop-0.20.205.0/
JDK version: jdk-6u35-linux-i586-rpm.bin

Download: http://www.oracle.com/technetwork/java/javase/downloads/jdk6u35-downloads-1836443.html

 

Environment: Red Hat Enterprise Linux 6.2, 32-bit
master: 192.169.1.133
slave1: 192.169.1.134
slave2: 192.169.1.135

Overall steps:
1. Set the hostname/IP mappings in /etc/hosts (if a cloned VM's entries differ, edit and redistribute the file).
2. Create an ordinary account (hadoop); Hadoop will run under this account.
3. As root, install the JDK.
4. Set the environment variables.
5. Install Hadoop and edit its configuration files.
6. Clone the VM twice, as slave1 and slave2.
7. Configure SSH so that every node can log in to every other node, and to itself, without a password.
8. Format the NameNode as the ordinary account.
9. Start the cluster and check that everything is running normally.
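The hostname mapping in step 1 can be sketched as a small script (a sketch, not part of the original post; the IPs and hostnames are the ones used in this cluster):

```shell
#!/bin/sh
# Sketch: emit the /etc/hosts entries for the three nodes, so the exact
# same mapping can be appended on master, slave1 and slave2.
hosts_entries() {
    printf '192.169.1.133 master\n'
    printf '192.169.1.134 slave1\n'
    printf '192.169.1.135 slave2\n'
}

# On each node the output would be appended once, e.g.:
#   hosts_entries >> /etc/hosts
hosts_entries
```

Generating the entries from one place and copying them out avoids the clone-drift problem mentioned in step 1.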


Two errors to watch for:
1. Suppressing "Warning: $HADOOP_HOME is deprecated."
Fix: add export HADOOP_HOME_WARN_SUPPRESS=TRUE to the /etc/hadoop/hadoop-env.sh configuration file on every node.
2. "Could not create the Java virtual machine" at startup:
[root@master ~]# /usr/bin/start-all.sh 
namenode running as process 26878. Stop it first.
slave2: starting datanode, logging to /var/log/hadoop/root/hadoop-root-datanode-slave2.out
slave1: starting datanode, logging to /var/log/hadoop/root/hadoop-root-datanode-slave1.out
slave2: Unrecognized option: -jvm
slave2: Could not create the Java virtual machine.
slave1: Unrecognized option: -jvm
slave1: Could not create the Java virtual machine.
master: secondarynamenode running as process 26009. Stop it first.
jobtracker running as process 25461. Stop it first.
slave2: starting tasktracker, logging to /var/log/hadoop/root/hadoop-root-tasktracker-slave2.out
slave1: starting tasktracker, logging to /var/log/hadoop/root/hadoop-root-tasktracker-slave1.out

Fix: Hadoop must not be started as root; start it as the ordinary (hadoop) account. (When run as root, the 0.20.205 startup scripts pass a -jvm option intended for the secure-DataNode launcher, which a plain JVM rejects, hence "Unrecognized option: -jvm".)

------------------------1. Map hostnames to IP addresses in /etc/hosts-----------------------------
------------------------2. Add a hadoop user to run Hadoop---------------------
------------------------3. Install the JDK and set environment variables------------------------------------

[chen@master Desktop]$ su - root
Password:
[root@master ~]# useradd hadoop
[root@master ~]# passwd hadoop
Changing password for user hadoop.
New password:
BAD PASSWORD: it is too short
BAD PASSWORD: it is too simple
Retype new password:
passwd: all authentication tokens updated successfully.
[root@master ~]# vim /etc/hosts
[root@master ~]# cat /etc/hosts
192.169.1.133 master
192.169.1.134 slave1
192.169.1.135 slave2
[root@master ~]# cd /home/chen/
[root@master chen]# ls
hadoop-0.20.205.0-1.i386.rpm  Public     Videos    Documents  Music
jdk-6u35-linux-i586-rpm.bin   Templates  Pictures  Downloads  Desktop
[root@master chen]# chmod 744 jdk-6u35-linux-i586-rpm.bin  # make the .bin installer executable
[root@master chen]# ./jdk-6u35-linux-i586-rpm.bin 
Unpacking...
Checksumming...
Extracting...
UnZipSFX 5.50 of 17 February 2002, by Info-ZIP (Zip-Bugs@lists.wku.edu).
  inflating: jdk-6u35-linux-i586.rpm  
  inflating: sun-javadb-common-10.6.2-1.1.i386.rpm  
  inflating: sun-javadb-core-10.6.2-1.1.i386.rpm  
  inflating: sun-javadb-client-10.6.2-1.1.i386.rpm  
  inflating: sun-javadb-demo-10.6.2-1.1.i386.rpm  
  inflating: sun-javadb-docs-10.6.2-1.1.i386.rpm  
  inflating: sun-javadb-javadoc-10.6.2-1.1.i386.rpm  
Preparing...                ########################################### [100%]
   1:jdk                    ########################################### [100%]
Unpacking JAR files...
    rt.jar...
    jsse.jar...
    charsets.jar...
    tools.jar...
    localedata.jar...
    plugin.jar...
    javaws.jar...
    deploy.jar...
Installing JavaDB
Preparing...                ########################################### [100%]
   1:sun-javadb-common      ########################################### [ 17%]
   2:sun-javadb-core        ########################################### [ 33%]
   3:sun-javadb-client      ########################################### [ 50%]
   4:sun-javadb-demo        ########################################### [ 67%]
   5:sun-javadb-docs        ########################################### [ 83%]
   6:sun-javadb-javadoc     ########################################### [100%]

Java(TM) SE Development Kit 6 successfully installed.

Product Registration is FREE and includes many benefits:
* Notification of new versions, patches, and updates
* Special offers on Oracle products, services and training
* Access to early releases and documentation

Product and system data will be collected. If your configuration
supports a browser, the JDK Product Registration form will
be presented. If you do not register, none of this information
will be saved. You may also register your JDK later by
opening the register.html file (located in the JDK installation
directory) in a browser.

For more information on what data Registration collects and 
how it is managed and used, see:
http://java.sun.com/javase/registration/JDKRegistrationPrivacy.html

Press Enter to continue.....


Done.
[root@master chen]# vim /etc/profile
[root@master chen]# ls /usr/java/jdk1.6.0_35/
bin        lib          register.html        THIRDPARTYLICENSEREADME.txt
COPYRIGHT  LICENSE      register_ja.html
include    man          register_zh_CN.html
jre        README.html  src.zip
[root@master chen]# tail -3 /etc/profile    # the environment variables appended
export JAVA_HOME=/usr/java/jdk1.6.0_35
export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH
[root@master chen]# 
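The three exported lines can be sanity-checked in a fresh shell; a minimal sketch (paths as installed above, applied here directly rather than via `source /etc/profile`):

```shell
#!/bin/sh
# The three lines appended to /etc/profile above, applied to the current
# shell. After sourcing the profile, the JDK's bin directory should be
# the first component of PATH.
export JAVA_HOME=/usr/java/jdk1.6.0_35
export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH

# First PATH component should now be the JDK bin directory:
echo "$PATH" | cut -d: -f1
```

On a node, `java -version` afterwards should report 1.6.0_35; if it reports something else, another JDK is shadowing this one earlier in PATH.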

-----------------------4. Install Hadoop and edit the configuration files-----------------------
# After this step, clone the VM twice to create slave1 and slave2.
# If a clone comes up with a different IP or hosts file, change them to the actual values.

[root@master chen]# ls
hadoop-0.20.205.0-1.i386.rpm            Public
jdk-6u35-linux-i586.rpm                 Templates
jdk-6u35-linux-i586-rpm.bin             Videos
sun-javadb-client-10.6.2-1.1.i386.rpm   Pictures
sun-javadb-common-10.6.2-1.1.i386.rpm   Documents
sun-javadb-core-10.6.2-1.1.i386.rpm     Downloads
sun-javadb-demo-10.6.2-1.1.i386.rpm     Music
sun-javadb-docs-10.6.2-1.1.i386.rpm     Desktop
sun-javadb-javadoc-10.6.2-1.1.i386.rpm
[root@master chen]# rpm -ivh hadoop-0.20.205.0-1.i386.rpm 
Preparing...                ########################################### [100%]
   1:hadoop                 ########################################### [100%]
[root@master chen]# cd /etc/hadoop/
[root@master hadoop]# ls
capacity-scheduler.xml      hadoop-policy.xml      slaves
configuration.xsl           hdfs-site.xml          ssl-client.xml.example
core-site.xml               log4j.properties       ssl-server.xml.example
fair-scheduler.xml          mapred-queue-acls.xml  taskcontroller.cfg
hadoop-env.sh               mapred-site.xml
hadoop-metrics2.properties  masters
[root@master hadoop]# vim hadoop-env.sh 

export JAVA_HOME=/usr/java/jdk1.6.0_35

[root@master hadoop]# vim core-site.xml 
[root@master hadoop]# cat core-site.xml 
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
</property>
</configuration>

[root@master hadoop]# vim hdfs-site.xml 
[root@master hadoop]# cat hdfs-site.xml 
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>

[root@master hadoop]# vim mapred-site.xml 
[root@master hadoop]# cat mapred-site.xml 
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
<name>mapred.job.tracker</name>
<value>hdfs://master:9001</value>
</property>
</configuration>
[root@master hadoop]# cat masters 
master
[root@master hadoop]# cat slaves 
slave1
slave2
[root@master hadoop]# 
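The values just written can be double-checked mechanically; a grep/sed sketch (not from the original post), run here against an inline copy of core-site.xml rather than the live file:

```shell
#!/bin/sh
# Sketch: extract the <value> of a property from a Hadoop *-site.xml.
# Assumes one <value>…</value> per line, as in the files above. On a node
# you would call:  extract_value /etc/hadoop/core-site.xml
extract_value() {
    sed -n 's/.*<value>\(.*\)<\/value>.*/\1/p' "$1"
}

# Inline copy of the core-site.xml written above, for demonstration:
cat > /tmp/core-site-check.xml <<'EOF'
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
</property>
</configuration>
EOF

extract_value /tmp/core-site-check.xml
```

Running the same check on all three nodes after cloning catches a stale fs.default.name or mapred.job.tracker early, before the format step.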


------------------------5. Switch to the hadoop user and set up passwordless SSH------------------------------

[hadoop@master ~]$ ssh-keygen -t dsa    # repeat this step on both slave nodes as well
Generating public/private dsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_dsa): 
Created directory '/home/hadoop/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/hadoop/.ssh/id_dsa.
Your public key has been saved in /home/hadoop/.ssh/id_dsa.pub.
The key fingerprint is:
6f:88:68:8a:d6:c7:b0:c7:e2:8b:b7:fa:7b:b4:a1:56 hadoop@master
The key's randomart image is:
+--[ DSA 1024]----+
|                 |
|                 |
|                 |
|                 |
|        S        |
|   . E . o       |
|  . @ + . o      |
| o.X B   .       |
|ooB*O            |
+-----------------+
[hadoop@master ~]$ cd .ssh/
[hadoop@master .ssh]$ ls
id_dsa  id_dsa.pub
[hadoop@master .ssh]$ cp id_dsa.pub authorized_keys # the public key file must be named authorized_keys

# Edit authorized_keys and append the contents of the id_dsa.pub generated on
# each of the two slave nodes. (sshd also requires that this file not be group-
# or world-writable; chmod 600 authorized_keys is the safe setting.)
[hadoop@master .ssh]$ vim authorized_keys       
[hadoop@master .ssh]$ exit
logout
[chen@master .ssh]$ su - root
Password:
[root@master ~]# cd /home/hadoop/
[root@master hadoop]# ls
[root@master hadoop]# cd .ssh/
[root@master .ssh]# ls
authorized_keys  id_dsa  id_dsa.pub 

# As root, copy authorized_keys into /home/hadoop/.ssh/ on both slave nodes.
# No password is needed here because passwordless SSH was set up for root earlier as well.
[root@master .ssh]# scp authorized_keys slave1:/home/hadoop/.ssh/
authorized_keys                                                100% 1602     1.6KB/s   00:00    
[root@master .ssh]# scp authorized_keys slave2:/home/hadoop/.ssh/
authorized_keys                                                100% 1602     1.6KB/s   00:00    
[root@master .ssh]#

# Once the file is in place, the three machines can SSH to each other as the
# hadoop user without a password. Note that the very first login still asks
# for host-key confirmation, so log in once in every direction, including
# from each machine to itself.
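The pairwise check can be scripted; a sketch (the real run would execute the commented ssh line, as the hadoop user, on each of the three nodes):

```shell
#!/bin/sh
# Sketch: from one node, try every node in the cluster, including itself.
# BatchMode=yes makes ssh fail instead of prompting, so a pair that still
# wants a password is reported rather than hanging at a prompt.
NODES="master slave1 slave2"

check_all() {
    for host in $NODES; do
        # Real check (requires the cluster, so commented out here):
        #   ssh -o BatchMode=yes "$host" hostname || echo "FAILED: $host"
        echo "would check: $host"
    done
}

check_all
```

If any pair prints FAILED, re-examine that node's ~/.ssh/authorized_keys contents and permissions before moving on to the format step.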


-------------------------------6. Format the NameNode and start Hadoop---------------------------
# Note: turn off the firewall on all three machines while testing.
[hadoop@master ~]$ /usr/bin/hadoop namenode -format
12/09/01 16:52:24 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = master/192.169.1.133
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.205.0
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security-205 -r 1179940; compiled by 'hortonfo' on Fri Oct  7 06:19:16 UTC 2011
************************************************************/
12/09/01 16:52:24 INFO util.GSet: VM type       = 32-bit
12/09/01 16:52:24 INFO util.GSet: 2% max memory = 2.475 MB
12/09/01 16:52:24 INFO util.GSet: capacity      = 2^19 = 524288 entries
12/09/01 16:52:24 INFO util.GSet: recommended=524288, actual=524288
12/09/01 16:52:24 INFO namenode.FSNamesystem: fsOwner=hadoop
12/09/01 16:52:24 INFO namenode.FSNamesystem: supergroup=supergroup
12/09/01 16:52:24 INFO namenode.FSNamesystem: isPermissionEnabled=true
12/09/01 16:52:24 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
12/09/01 16:52:24 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
12/09/01 16:52:24 INFO namenode.NameNode: Caching file names occuring more than 10 times 
12/09/01 16:52:24 INFO common.Storage: Image file of size 112 saved in 0 seconds.
12/09/01 16:52:25 INFO common.Storage: Storage directory /tmp/hadoop-hadoop/dfs/name has been successfully formatted.
12/09/01 16:52:25 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/192.169.1.133
************************************************************/
[hadoop@master ~]$ /usr/bin/start-all.sh    # output like the following means a normal start
starting namenode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-namenode-master.out
slave2: starting datanode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-datanode-slave2.out
slave1: starting datanode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-datanode-slave1.out
master: starting secondarynamenode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-secondarynamenode-master.out
starting jobtracker, logging to /var/log/hadoop/hadoop/hadoop-hadoop-jobtracker-master.out
slave2: starting tasktracker, logging to /var/log/hadoop/hadoop/hadoop-hadoop-tasktracker-slave2.out
slave1: starting tasktracker, logging to /var/log/hadoop/hadoop/hadoop-hadoop-tasktracker-slave1.out
# But jps shows no Hadoop processes. Two likely causes:
# 1. the firewall is still on;
# 2. if the firewall is already off, reboot the machine.
[hadoop@master ~]$ /usr/java/jdk1.6.0_35/bin/jps    
28499 Jps
[root@master ~]# iptables -F
[root@master ~]# exit
logout
[hadoop@master ~]$ /usr/bin/start-all.sh 
starting namenode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-namenode-master.out
slave2: starting datanode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-datanode-slave2.out
slave1: starting datanode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-datanode-slave1.out
master: starting secondarynamenode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-secondarynamenode-master.out
starting jobtracker, logging to /var/log/hadoop/hadoop/hadoop-hadoop-jobtracker-master.out
slave2: starting tasktracker, logging to /var/log/hadoop/hadoop/hadoop-hadoop-tasktracker-slave2.out
slave1: starting tasktracker, logging to /var/log/hadoop/hadoop/hadoop-hadoop-tasktracker-slave1.out
[hadoop@master ~]$ /usr/java/jdk1.6.0_35/bin/jps 
30630 Jps
---------------------------After the reboot everything is normal----------------------
------------------------master node---------------------------
[hadoop@master ~]$ /usr/bin/start-all.sh 
starting namenode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-namenode-master.out
slave2: starting datanode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-datanode-slave2.out
slave1: starting datanode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-datanode-slave1.out
master: starting secondarynamenode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-secondarynamenode-master.out
starting jobtracker, logging to /var/log/hadoop/hadoop/hadoop-hadoop-jobtracker-master.out
slave2: starting tasktracker, logging to /var/log/hadoop/hadoop/hadoop-hadoop-tasktracker-slave2.out
slave1: starting tasktracker, logging to /var/log/hadoop/hadoop/hadoop-hadoop-tasktracker-slave1.out
[hadoop@master ~]$ /usr/java/jdk1.6.0_35/bin/jps 
3388 JobTracker
3312 SecondaryNameNode
3159 NameNode
3533 Jps
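The jps output can be checked mechanically; a sketch (not from the original post) run here against the master's output copied from the transcript above — on a live node you would pipe jps itself into the same loop:

```shell
#!/bin/sh
# Sketch: confirm the expected master daemons appear in jps output.
# Sample output copied verbatim from the transcript above.
jps_out='3388 JobTracker
3312 SecondaryNameNode
3159 NameNode
3533 Jps'

missing=""
for daemon in NameNode SecondaryNameNode JobTracker; do
    echo "$jps_out" | grep -qw "$daemon" || missing="$missing $daemon"
done

if [ -z "$missing" ]; then
    echo "master daemons OK"    # prints: master daemons OK
else
    echo "missing:$missing"
fi
```

The same loop with DataNode and TaskTracker as the expected names covers the slave nodes shown below.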

--------------------------slave1------------------------------

[hadoop@master ~]$ ssh slave1
Last login: Sat Sep  1 16:51:48 2012 from slave2
[hadoop@slave1 ~]$ su - root
Password:
[root@slave1 ~]# iptables -F
[root@slave1 ~]# setenforce 0
[root@slave1 ~]# exit
logout
[hadoop@slave1 ~]$ /usr/java/jdk1.6.0_35/bin/jps 
3181 TaskTracker
3107 DataNode
3227 Jps

--------------------------slave2------------------------------
[hadoop@master ~]$ ssh slave2
Last login: Sat Sep  1 16:52:02 2012 from slave2
[hadoop@slave2 ~]$ su - root
Password:
[root@slave2 ~]# iptables -F
[root@slave2 ~]# setenforce 0
[root@slave2 ~]# exit
logout
[hadoop@slave2 ~]$ /usr/java/jdk1.6.0_35/bin/jps 
3165 DataNode
3297 Jps
3241 TaskTracker
[hadoop@slave2 ~]$ 
