Setting up a Hadoop distributed cluster
Hadoop version: hadoop-0.20.205.0-1.i386.rpm
Download: http://www.fayea.com/apache-mirror/hadoop/common/hadoop-0.20.205.0/
JDK version: jdk-6u35-linux-i586-rpm.bin
Download: http://www.oracle.com/technetwork/java/javase/downloads/jdk6u35-downloads-1836443.html
Environment: Red Hat 6.2, 32-bit
master: 192.169.1.133
slave1: 192.169.1.134
slave2: 192.169.1.135
Overall steps:
1. Map hostnames to IPs in /etc/hosts (after cloning the VMs, fix and redistribute the file if the entries no longer match the real addresses; a quick sanity check follows this list).
2. Create a regular user (hadoop); Hadoop will run under this account.
3. Install the JDK as root.
4. Set the environment variables.
5. Install Hadoop and edit its configuration files.
6. Clone the VM twice to create slave1 and slave2.
7. Configure SSH so that every node can log in to every other node, and to itself, without a password.
8. Format the namenode as the regular user.
9. Start the cluster and check that everything is running.
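A quick way to confirm the /etc/hosts mapping took effect on a node (a hypothetical check, not part of the original transcript):

# every node should resolve and reach all three hostnames
for h in master slave1 slave2; do ping -c 1 $h; done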
Two errors to watch for:
1. Suppressing the "Warning: $HADOOP_HOME is deprecated." message.
Fix: add export HADOOP_HOME_WARN_SUPPRESS=TRUE to the /etc/hadoop/hadoop-env.sh configuration file on every node.
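A one-liner that appends the setting, using the RPM's /etc/hadoop config path from this article:

# run as root on each node (master, slave1, slave2)
echo 'export HADOOP_HOME_WARN_SUPPRESS=TRUE' >> /etc/hadoop/hadoop-env.sh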
2. "Could not create the Java virtual machine" when starting the daemons:
[root@master ~]# /usr/bin/start-all.sh
namenode running as process 26878. Stop it first.
slave2: starting datanode, logging to /var/log/hadoop/root/hadoop-root-datanode-slave2.out
slave1: starting datanode, logging to /var/log/hadoop/root/hadoop-root-datanode-slave1.out
slave2: Unrecognized option: -jvm
slave2: Could not create the Java virtual machine.
slave1: Unrecognized option: -jvm
slave1: Could not create the Java virtual machine.
master: secondarynamenode running as process 26009. Stop it first.
jobtracker running as process 25461. Stop it first.
slave2: starting tasktracker, logging to /var/log/hadoop/root/hadoop-root-tasktracker-slave2.out
slave1: starting tasktracker, logging to /var/log/hadoop/root/hadoop-root-tasktracker-slave1.out
Fix: Hadoop must not be started as root; start it as the regular (hadoop) user. (When the 0.20.205 scripts are run as root, they pass the datanode an extra -jvm option that this JVM does not recognize, which is what produces the output above.)
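If the daemons were already started as root, stop them first and start again as hadoop; a minimal sketch, assuming the RPM installed stop-all.sh next to start-all.sh in /usr/bin:

[root@master ~]# /usr/bin/stop-all.sh       # stop anything root started
[root@master ~]# su - hadoop
[hadoop@master ~]$ /usr/bin/start-all.sh    # start as the regular user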
------------------------1. Map hostnames to IPs in /etc/hosts-----------------------------
------------------------2. Add a hadoop user to run Hadoop as-----------------------------
------------------------3. Install the JDK and set the environment variables--------------
[chen@master 桌面]$ su - root
Password:
[root@master ~]# useradd hadoop
[root@master ~]# passwd hadoop
Changing password for user hadoop.
New password:
BAD PASSWORD: too short
BAD PASSWORD: too simple
Retype new password:
passwd: all authentication tokens updated successfully.
[root@master ~]# vim /etc/hosts
[root@master ~]# cat /etc/hosts
192.169.1.133 master
192.169.1.134 slave1
192.169.1.135 slave2
[root@master ~]# cd /home/chen/
[root@master chen]# ls
hadoop-0.20.205.0-1.i386.rpm  公共的  視頻  文檔  音樂
jdk-6u35-linux-i586-rpm.bin   模板    圖片  下載  桌面
[root@master chen]# chmod 744 jdk-6u35-linux-i586-rpm.bin    # make the installer executable
[root@master chen]# ./jdk-6u35-linux-i586-rpm.bin
Unpacking...
Checksumming...
Extracting...
UnZipSFX 5.50 of 17 February 2002, by Info-ZIP (Zip-Bugs@lists.wku.edu).
  inflating: jdk-6u35-linux-i586.rpm
  inflating: sun-javadb-common-10.6.2-1.1.i386.rpm
  inflating: sun-javadb-core-10.6.2-1.1.i386.rpm
  inflating: sun-javadb-client-10.6.2-1.1.i386.rpm
  inflating: sun-javadb-demo-10.6.2-1.1.i386.rpm
  inflating: sun-javadb-docs-10.6.2-1.1.i386.rpm
  inflating: sun-javadb-javadoc-10.6.2-1.1.i386.rpm
Preparing...                ########################################### [100%]
   1:jdk                    ########################################### [100%]
Unpacking JAR files...
        rt.jar...
        jsse.jar...
        charsets.jar...
        tools.jar...
        localedata.jar...
        plugin.jar...
        javaws.jar...
        deploy.jar...
Installing JavaDB
Preparing...                ########################################### [100%]
   1:sun-javadb-common      ########################################### [ 17%]
   2:sun-javadb-core        ########################################### [ 33%]
   3:sun-javadb-client      ########################################### [ 50%]
   4:sun-javadb-demo        ########################################### [ 67%]
   5:sun-javadb-docs        ########################################### [ 83%]
   6:sun-javadb-javadoc     ########################################### [100%]

Java(TM) SE Development Kit 6 successfully installed.

Product Registration is FREE and includes many benefits:
* Notification of new versions, patches, and updates
* Special offers on Oracle products, services and training
* Access to early releases and documentation

Product and system data will be collected. If your configuration
supports a browser, the JDK Product Registration form will
be presented. If you do not register, none of this information
will be saved. You may also register your JDK later by
opening the register.html file (located in the JDK installation
directory) in a browser.

For more information on what data Registration collects and
how it is managed and used, see:
http://java.sun.com/javase/registration/JDKRegistrationPrivacy.html

Press Enter to continue.....

Done.
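Before moving on, it is worth confirming the JDK landed where the rest of this article assumes (a hypothetical check, using the full path because PATH is only set in the next step):

[root@master chen]# /usr/java/jdk1.6.0_35/bin/java -version    # should report version 1.6.0_35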
[root@master chen]# vim /etc/profile
[root@master chen]# ls /usr/java/jdk1.6.0_35/
bin        lib          register.html        THIRDPARTYLICENSEREADME.txt
COPYRIGHT  LICENSE      register_ja.html
include    man          register_zh_CN.html
jre        README.html  src.zip
[root@master chen]# tail -3 /etc/profile    # the environment variables that were added
export JAVA_HOME=/usr/java/jdk1.6.0_35
export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH
[root@master chen]#
-----------------------Install Hadoop and edit the configuration files-----------------------
# Once this step is done, clone the VM twice to create slave1 and slave2.
# If, after booting the clones, their IPs no longer match /etc/hosts, change them to the real addresses.
[root@master chen]# ls
hadoop-0.20.205.0-1.i386.rpm           公共的
jdk-6u35-linux-i586.rpm                模板
jdk-6u35-linux-i586-rpm.bin            視頻
sun-javadb-client-10.6.2-1.1.i386.rpm  圖片
sun-javadb-common-10.6.2-1.1.i386.rpm  文檔
sun-javadb-core-10.6.2-1.1.i386.rpm    下載
sun-javadb-demo-10.6.2-1.1.i386.rpm    音樂
sun-javadb-docs-10.6.2-1.1.i386.rpm    桌面
sun-javadb-javadoc-10.6.2-1.1.i386.rpm
[root@master chen]# rpm -ivh hadoop-0.20.205.0-1.i386.rpm
Preparing...                ########################################### [100%]
   1:hadoop                 ########################################### [100%]
[root@master chen]# cd /etc/hadoop/
[root@master hadoop]# ls
capacity-scheduler.xml      hadoop-policy.xml      slaves
configuration.xsl           hdfs-site.xml          ssl-client.xml.example
core-site.xml               log4j.properties       ssl-server.xml.example
fair-scheduler.xml          mapred-queue-acls.xml  taskcontroller.cfg
hadoop-env.sh               mapred-site.xml
hadoop-metrics2.properties  masters
[root@master hadoop]# vim hadoop-env.sh    # set JAVA_HOME for Hadoop
export JAVA_HOME=/usr/java/jdk1.6.0_35
[root@master hadoop]# vim core-site.xml
[root@master hadoop]# cat core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>
[root@master hadoop]# vim hdfs-site.xml
[root@master hadoop]# cat hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
[root@master hadoop]# vim mapred-site.xml
[root@master hadoop]# cat mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hdfs://master:9001</value>
  </property>
</configuration>
[root@master hadoop]# cat masters
master
[root@master hadoop]# cat slaves
slave1
slave2
[root@master hadoop]#
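This walkthrough clones the whole VM, so the configuration travels with it. If you instead install the slaves by hand, the same /etc/hadoop files must exist on every node; a hedged sketch of pushing them out:

# hypothetical, only needed when the slaves are not VM clones of the master
[root@master hadoop]# scp core-site.xml hdfs-site.xml mapred-site.xml hadoop-env.sh masters slaves slave1:/etc/hadoop/
[root@master hadoop]# scp core-site.xml hdfs-site.xml mapred-site.xml hadoop-env.sh masters slaves slave2:/etc/hadoop/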
------------------------Switch to the hadoop user and set up passwordless SSH------------------------------
[hadoop@master ~]$ ssh-keygen -t dsa    # do this step on both slave nodes as well
Generating public/private dsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_dsa):
Created directory '/home/hadoop/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_dsa.
Your public key has been saved in /home/hadoop/.ssh/id_dsa.pub.
The key fingerprint is:
6f:88:68:8a:d6:c7:b0:c7:e2:8b:b7:fa:7b:b4:a1:56 hadoop@master
(key randomart image omitted; it carries no setup-relevant information)
[hadoop@master ~]$ cd .ssh/
[hadoop@master .ssh]$ ls
id_dsa  id_dsa.pub
[hadoop@master .ssh]$ cp id_dsa.pub authorized_keys    # the public key must be renamed authorized_keys
# Edit authorized_keys and append the contents of the id_dsa.pub generated on each of the two slave nodes.
[hadoop@master .ssh]$ vim authorized_keys
[hadoop@master .ssh]$ exit
logout
[chen@master .ssh]$ su - root
Password:
[root@master ~]# cd /home/hadoop/
[root@master hadoop]# ls
[root@master hadoop]# cd .ssh/
[root@master .ssh]# ls
authorized_keys  id_dsa  id_dsa.pub
# Switch to root and copy authorized_keys to /home/hadoop/.ssh on both slave nodes.
# These scp commands need no password because root was already set up for passwordless login earlier.
[root@master .ssh]# scp authorized_keys slave1:/home/hadoop/.ssh/
authorized_keys                               100% 1602     1.6KB/s   00:00
[root@master .ssh]# scp authorized_keys slave2:/home/hadoop/.ssh/
authorized_keys                               100% 1602     1.6KB/s   00:00
[root@master .ssh]#
# After this, the hadoop user can ssh between all three machines without a password.
# Note: the very first login to each host still prompts (to accept the host key); subsequent logins do not.
# Be sure to walk through every pair, including each machine logging in to itself.
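Once the keys are in place, a loop like this (hypothetical, not in the original transcript) exercises every connection in one pass; run it as hadoop on each of the three machines so all the host keys get accepted up front:

[hadoop@master ~]$ for h in master slave1 slave2; do ssh $h hostname; done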
-------------------------------Format the namenode and start Hadoop---------------------------
# Note: turn off the firewall on all three machines while testing.
[hadoop@master ~]$ /usr/bin/hadoop namenode -format
12/09/01 16:52:24 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = master/192.169.1.133
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.205.0
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security-205 -r 1179940; compiled by 'hortonfo' on Fri Oct  7 06:19:16 UTC 2011
************************************************************/
12/09/01 16:52:24 INFO util.GSet: VM type       = 32-bit
12/09/01 16:52:24 INFO util.GSet: 2% max memory = 2.475 MB
12/09/01 16:52:24 INFO util.GSet: capacity      = 2^19 = 524288 entries
12/09/01 16:52:24 INFO util.GSet: recommended=524288, actual=524288
12/09/01 16:52:24 INFO namenode.FSNamesystem: fsOwner=hadoop
12/09/01 16:52:24 INFO namenode.FSNamesystem: supergroup=supergroup
12/09/01 16:52:24 INFO namenode.FSNamesystem: isPermissionEnabled=true
12/09/01 16:52:24 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
12/09/01 16:52:24 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
12/09/01 16:52:24 INFO namenode.NameNode: Caching file names occuring more than 10 times
12/09/01 16:52:24 INFO common.Storage: Image file of size 112 saved in 0 seconds.
12/09/01 16:52:25 INFO common.Storage: Storage directory /tmp/hadoop-hadoop/dfs/name has been successfully formatted.
12/09/01 16:52:25 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/192.169.1.133
************************************************************/
[hadoop@master ~]$ /usr/bin/start-all.sh    # startup output like the following means the scripts ran normally
starting namenode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-namenode-master.out
slave2: starting datanode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-datanode-slave2.out
slave1: starting datanode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-datanode-slave1.out
master: starting secondarynamenode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-secondarynamenode-master.out
starting jobtracker, logging to /var/log/hadoop/hadoop/hadoop-hadoop-jobtracker-master.out
slave2: starting tasktracker, logging to /var/log/hadoop/hadoop/hadoop-hadoop-tasktracker-slave2.out
slave1: starting tasktracker, logging to /var/log/hadoop/hadoop/hadoop-hadoop-tasktracker-slave1.out
# If jps then shows no Hadoop processes, there are two likely causes:
# 1. The firewall is still up.
# 2. If the firewall is already off and the processes still do not appear, reboot.
[hadoop@master ~]$ /usr/java/jdk1.6.0_35/bin/jps
28499 Jps
[root@master ~]# iptables -F
[root@master ~]# exit
logout
[hadoop@master ~]$ /usr/bin/start-all.sh
starting namenode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-namenode-master.out
slave2: starting datanode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-datanode-slave2.out
slave1: starting datanode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-datanode-slave1.out
master: starting secondarynamenode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-secondarynamenode-master.out
starting jobtracker, logging to /var/log/hadoop/hadoop/hadoop-hadoop-jobtracker-master.out
slave2: starting tasktracker, logging to /var/log/hadoop/hadoop/hadoop-hadoop-tasktracker-slave2.out
slave1: starting tasktracker, logging to /var/log/hadoop/hadoop/hadoop-hadoop-tasktracker-slave1.out
[hadoop@master ~]$ /usr/java/jdk1.6.0_35/bin/jps
30630 Jps
---------------------------Normal after a reboot----------------------
------------------------master node---------------------------
[hadoop@master ~]$ /usr/bin/start-all.sh
starting namenode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-namenode-master.out
slave2: starting datanode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-datanode-slave2.out
slave1: starting datanode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-datanode-slave1.out
master: starting secondarynamenode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-secondarynamenode-master.out
starting jobtracker, logging to /var/log/hadoop/hadoop/hadoop-hadoop-jobtracker-master.out
slave2: starting tasktracker, logging to /var/log/hadoop/hadoop/hadoop-hadoop-tasktracker-slave2.out
slave1: starting tasktracker, logging to /var/log/hadoop/hadoop/hadoop-hadoop-tasktracker-slave1.out
[hadoop@master ~]$ /usr/java/jdk1.6.0_35/bin/jps
3388 JobTracker
3312 SecondaryNameNode
3159 NameNode
3533 Jps
------------------------slave1---------------------------------
[hadoop@master ~]$ ssh slave1
Last login: Sat Sep  1 16:51:48 2012 from slave2
[hadoop@slave1 ~]$ su - root
Password:
[root@slave1 ~]# iptables -F
[root@slave1 ~]# setenforce 0
[root@slave1 ~]# exit
logout
[hadoop@slave1 ~]$ /usr/java/jdk1.6.0_35/bin/jps
3181 TaskTracker
3107 DataNode
3227 Jps
--------------------------slave2------------------------------
[hadoop@master ~]$ ssh slave2
Last login: Sat Sep  1 16:52:02 2012 from slave2
[hadoop@slave2 ~]$ su - root
Password:
[root@slave2 ~]# iptables -F
[root@slave2 ~]# setenforce 0
[root@slave2 ~]# exit
logout
[hadoop@slave2 ~]$ /usr/java/jdk1.6.0_35/bin/jps
3165 DataNode
3297 Jps
3241 TaskTracker
[hadoop@slave2 ~]$
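With all five daemon types up, a couple of standard 0.20-era checks confirm the cluster is actually serving (the commands are stock Hadoop; the expected counts are specific to this two-slave cluster):

[hadoop@master ~]$ /usr/bin/hadoop dfsadmin -report    # should report 2 live datanodes
[hadoop@master ~]$ /usr/bin/hadoop fs -ls /            # HDFS should answer without errors
# Web UIs: http://master:50070 (NameNode) and http://master:50030 (JobTracker)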