
Set up a Hadoop 2.7.5 cluster on RHEL 7.4


OMG, I haven’t posted any rubbish for over a month, and now I have a new topic: a Hadoop cluster on RHEL 7.4, built on a server with four nodes in one chassis. I ran into some problems while building this setup, so I want to record the tips in this article. Here is how to build a Hadoop cluster on RHEL 7.4; let’s keep reading:
1) Install the full package set during the OS installation
#rpm -qa | wc -l
2033

2) Install the packages required to compile the Hadoop source (workaround: do this step on CentOS 7.4)
#yum clean all
#yum list && yum install -y epel*
#yum list && yum install -y java* maven cmake pkgconfig rpm-build* protobuf* protobuf-compiler

3) Download the Hadoop source from the official FTP mirror
#wget ftp://ftp.twaren.net/Unix/Web/apache/hadoop/common/hadoop-2.7.5/hadoop-2.7.5-src.tar.gz
#tar zxvf hadoop-2.7.5-src.tar.gz
#cd hadoop-2.7.5-src
#mvn package -e -X -Pdist,native -DskipTests -Dtar
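Once the build finishes, it is worth confirming that the native libraries actually load before copying anything around. A small hedged check (the guard is mine; it simply skips the check when the freshly built `hadoop` binary is not yet on PATH):

```shell
# Verify native library support after the -Pnative build; skip
# gracefully when the hadoop binary is not installed/on PATH yet.
if command -v hadoop >/dev/null 2>&1; then
  hadoop checknative -a   # reports hadoop/zlib/snappy/lz4/openssl status
  status="checked"
else
  status="hadoop not on PATH yet - rerun after installing the build output"
fi
echo "native-lib check: $status"
```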

4) Copy the built Hadoop tree to its dedicated path and set up the environment
#cp -rf hadoop-dist/target/hadoop-2.7.5/lib/native/* hadoop-dist/target/hadoop-2.7.5/lib/
#cat /etc/bashrc
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.151-5.b12.el7_4.x86_64
export PATH=$PATH:$JAVA_HOME/bin
export HADOOP_HOME=/opt/hadoop-2.7.5
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
export CLASSPATH=$CLASSPATH:/opt/hadoop-2.7.5/lib/*:.
export HADOOP_OPTS="$HADOOP_OPTS -Djava.security.egd=file:/dev/../dev/urandom"
#source /etc/bashrc
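After sourcing /etc/bashrc, a quick sanity loop (my own addition, not part of the original setup) confirms that the key variables point at directories that really exist:

```shell
# Confirm JAVA_HOME and HADOOP_HOME resolve to real directories
# (the paths come from the export lines in /etc/bashrc above).
for var in JAVA_HOME HADOOP_HOME; do
  val=$(eval echo "\$$var")
  if [ -d "$val" ]; then
    echo "$var=$val (exists)"
  else
    echo "$var=$val (missing - check the export lines)"
  fi
done
```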

5) Set up the basic configuration (SSH key-pair exchange)
#vi /etc/hosts
192.168.254.179 NameNode.sit.com NameNode
192.168.253.203 DataNode1.sit.com DataNode1
192.168.253.226 DataNode2.sit.com DataNode2
192.168.254.81 DataNode3.sit.com DataNode3
#ssh-keygen -t rsa
#cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
#ssh-copy-id -i $HOME/.ssh/id_rsa.pub DataNode1
#ssh-copy-id -i $HOME/.ssh/id_rsa.pub DataNode2
#ssh-copy-id -i $HOME/.ssh/id_rsa.pub DataNode3
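A hypothetical verification loop (hostnames taken from the /etc/hosts entries above): every DataNode should now accept passwordless SSH, since the Hadoop start scripts rely on it.

```shell
# BatchMode=yes makes ssh fail instead of prompting, so a password
# prompt shows up as a clear FAILED line rather than hanging.
nodes="DataNode1 DataNode2 DataNode3"
for h in $nodes; do
  if ssh -o BatchMode=yes -o ConnectTimeout=5 "$h" true 2>/dev/null; then
    echo "$h: passwordless SSH OK"
  else
    echo "$h: still prompts for a password (or unreachable)"
  fi
done
```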

6) Set up the Hadoop cluster configuration
#cat /opt/hadoop-2.7.5/etc/hadoop/core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://NameNode:8020/</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
</configuration>
#cat /opt/hadoop-2.7.5/etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_work/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop_work/hdfs/datanode</value>
</property>
<property>
<name>dfs.namenode.checkpoint.dir</name>
<value>file:/usr/local/hadoop_work/hdfs/namesecondary</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.block.size</name>
<value>134217728</value>
</property>
</configuration>
#cat /opt/hadoop-2.7.5/etc/hadoop/mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>NameNode:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>NameNode:19888</value>
</property>
<property>
<name>yarn.app.mapreduce.am.staging-dir</name>
<value>/user/app</value>
</property>
<property>
<name>mapred.child.java.opts</name>
<value>-Djava.security.egd=file:/dev/../dev/urandom</value>
</property>
</configuration>
#cat /opt/hadoop-2.7.5/etc/hadoop/yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>NameNode</value>
</property>
<property>
<name>yarn.resourcemanager.bind-host</name>
<value>0.0.0.0</value>
</property>
<property>
<name>yarn.nodemanager.bind-host</name>
<value>0.0.0.0</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>file:/usr/local/hadoop_work/yarn/local</value>
</property>
<property>
<name>yarn.nodemanager.log-dirs</name>
<value>file:/usr/local/hadoop_work/yarn/log</value>
</property>
<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>hdfs://NameNode:8020/var/log/hadoop-yarn/apps</value>
</property>
</configuration>
#cat /opt/hadoop-2.7.5/etc/hadoop/masters
NameNode
#cat /opt/hadoop-2.7.5/etc/hadoop/slaves
DataNode1
DataNode2
DataNode3
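Before formatting, the local directories referenced in hdfs-site.xml and yarn-site.xml have to exist. This helper (my own sketch; run it on every node, base path taken from the configs above) creates them all:

```shell
# Create the storage directories declared in hdfs-site.xml / yarn-site.xml.
# Errors are reported instead of aborting, since /usr/local needs root.
base=/usr/local/hadoop_work
for d in hdfs/namenode hdfs/datanode hdfs/namesecondary yarn/local yarn/log; do
  mkdir -p "$base/$d" 2>/dev/null \
    && echo "created $base/$d" \
    || echo "could not create $base/$d (need root?)"
done
```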
#hadoop namenode -format
#start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [NameNode]
NameNode: starting namenode, logging to /opt/hadoop-2.7.5_WZ/logs/hadoop-root-namenode-NameNode.sit.com.out
DataNode1: starting datanode, logging to /opt/hadoop-2.7.5_WZ/logs/hadoop-root-datanode-DataNode1.sit.com.out
DataNode2: starting datanode, logging to /opt/hadoop-2.7.5_WZ/logs/hadoop-root-datanode-DataNode2.sit.com.out
DataNode3: starting datanode, logging to /opt/hadoop-2.7.5_WZ/logs/hadoop-root-datanode-DataNode3.sit.com.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /opt/hadoop-2.7.5_WZ/logs/hadoop-root-secondarynamenode-NameNode.sit.com.out
starting yarn daemons
starting resourcemanager, logging to /opt/hadoop-2.7.5_WZ/logs/yarn-root-resourcemanager-NameNode.sit.com.out
DataNode1: starting nodemanager, logging to /opt/hadoop-2.7.5_WZ/logs/yarn-root-nodemanager-DataNode1.sit.com.out
DataNode3: starting nodemanager, logging to /opt/hadoop-2.7.5_WZ/logs/yarn-root-nodemanager-DataNode3.sit.com.out
DataNode2: starting nodemanager, logging to /opt/hadoop-2.7.5_WZ/logs/yarn-root-nodemanager-DataNode2.sit.com.out
#jps
2113 NameNode
2625 ResourceManager
2378 SecondaryNameNode
2910 Jps
#java -version
openjdk version "1.8.0_151"
OpenJDK Runtime Environment (build 1.8.0_151-b12)
OpenJDK 64-Bit Server VM (build 25.151-b12, mixed mode)
#hadoop dfsadmin -report

7) Trial run: estimate Pi with the bundled example job
#hadoop jar /opt/hadoop-2.7.5/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.5.jar pi 1 1
Job Finished in 1.29 seconds
Estimated value of Pi is 4.00000000000000000000
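With only one map and one sample the Monte Carlo estimate is very coarse, which is why it came out as exactly 4.0. More maps and samples converge toward pi at the cost of a longer job; a sketch assuming the same jar path as above (the guard just keeps it from failing where no cluster exists):

```shell
# Rerun the Pi example with 16 maps x 100000 samples for a usable estimate.
jar=/opt/hadoop-2.7.5/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.5.jar
if command -v hadoop >/dev/null 2>&1 && [ -f "$jar" ]; then
  hadoop jar "$jar" pi 16 100000
else
  echo "hadoop cluster not available here - run this on the NameNode"
fi
```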

8) Information for monitoring the Hadoop cluster
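For monitoring, Hadoop 2.x serves web UIs on well-known default ports (19888 matches the mapreduce.jobhistory.webapp.address configured above; the probe loop below is my own addition and tolerates a missing curl):

```shell
# Default Hadoop 2.x monitoring endpoints (hostname from /etc/hosts above):
#   HDFS NameNode UI:     http://NameNode:50070
#   YARN ResourceManager: http://NameNode:8088
#   MapReduce JobHistory: http://NameNode:19888
for url in http://NameNode:50070 http://NameNode:8088 http://NameNode:19888; do
  if command -v curl >/dev/null 2>&1; then
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url") || code=000
    echo "$url -> HTTP $code"
  else
    echo "curl not installed; open $url in a browser"
  fi
done
```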


◎、The above covered setting up a Hadoop 2.7.5 cluster on RHEL 7.4. I referred to the links below, and you can also google them if you want more information. That’s all for today, and I hope this article is helpful for you. See you next time!
* DFSIO Testing Examples – Hadoop DFSIO mapreduce Benchmark Test
* Analysis of the TeraSort Algorithm in Hadoop
* Testing TeraSort on Hadoop
* HDFS Benchmark Testing
* Compiling Hadoop 2.7.4 on CentOS 7.3
* CentOS 7.3 under the Hadoop 2.7.2 cluster
* Common Hadoop Test Programs and Benchmarks (2.7.1)
* Official Download Website
* Running TeraSort MapReduce Benchmark
* Benchmarking and Stress Testing an Hadoop Cluster With TeraSort, TestDFSIO & Co.
* Hadoop & Big Data benchmarking
* How to Setup Hadoop Multi Node Cluster – Step By Step
* HADOOP 2.6.5 INSTALLING ON UBUNTU 16.04 (SINGLE-NODE CLUSTER)
* Setup Hadoop on Ubuntu (Multi-Node Cluster)
