Setting up a Fully Distributed Hadoop Cluster
In this post I will discuss how to set up a fully distributed Hadoop cluster with one master and two slaves, where the three nodes run on three different machines.
Updating Hostnames
To start things off, let's first give hostnames to the three nodes. Edit the /etc/hosts file with the following command:
sudo gedit /etc/hosts
Add the following hostname-to-IP mappings for the three nodes. Do this on all three nodes.
192.168.2.14 hadoop.master
192.168.2.15 hadoop.slave.1
192.168.2.16 hadoop.slave.2
Once you have done that, update the /etc/hostname file on each machine so that it contains hadoop.master, hadoop.slave.1, or hadoop.slave.2 respectively as the hostname.
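For example, on the master node you could edit the file the same way (using gedit as above):
sudo gedit /etc/hostname
and make its only line read:
hadoop.master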
Optional:
For security reasons, you might prefer to run Hadoop as a separate user. To create a separate user, execute the following commands in the terminal:
sudo addgroup hadoop
sudo adduser --ingroup hadoop hduser
Give it a password of your choice.
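If hduser also needs to run the administrative commands in the rest of this guide (an assumption; only required if you do the whole setup as hduser), add it to the sudo group:
sudo adduser hduser sudo  # optional: grants sudo rights to hduser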
Then restart the machine.
sudo reboot
Install SSH
Hadoop needs to copy files between the nodes. For that, it should be able to access each node over SSH without having to provide a username/password. Therefore, first install the SSH client and server:
sudo apt install openssh-client
sudo apt install openssh-server
Generate a key (accept the default file location and leave the passphrase empty, so that SSH can log in without prompting):
ssh-keygen -t rsa -b 4096
Copy the key to each node:
ssh-copy-id -i $HOME/.ssh/id_rsa.pub hduser@hadoop.master
ssh-copy-id -i $HOME/.ssh/id_rsa.pub hduser@hadoop.slave.1
ssh-copy-id -i $HOME/.ssh/id_rsa.pub hduser@hadoop.slave.2
Try SSHing to all the nodes, e.g.:
ssh hadoop.slave.1
You should be able to SSH to all the nodes without providing user credentials. Repeat this step on all three nodes.
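As a quick check (a minimal sketch, assuming the hostnames defined earlier), the following prints each remote hostname without prompting for a password:
for h in hadoop.master hadoop.slave.1 hadoop.slave.2; do ssh "$h" hostname; done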
Configuring Hadoop
To configure Hadoop, make the following changes. First, define the Hadoop master URL in <hadoop_home>/etc/hadoop/core-site.xml, on all nodes:
<property>
  <name>fs.default.name</name>
  <value>hdfs://hadoop.master:9000</value>
</property>
Create the two directories /home/wso2/Desktop/hadoop/localDirs/name and /home/wso2/Desktop/hadoop/localDirs/data (and make hduser the owner, if you created a separate user for Hadoop). Give read/write permissions on those folders.
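For example (a sketch, assuming the paths above and the optional hduser user in the hadoop group):
sudo mkdir -p /home/wso2/Desktop/hadoop/localDirs/name /home/wso2/Desktop/hadoop/localDirs/data
sudo chown -R hduser:hadoop /home/wso2/Desktop/hadoop/localDirs  # only if you created the hduser account
sudo chmod -R 750 /home/wso2/Desktop/hadoop/localDirs  # owner gets read/write; adjust as needed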
Modify <hadoop_home>/etc/hadoop/hdfs-site.xml as follows, on all nodes:
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>
<property>
  <name>dfs.name.dir</name>
  <value>/home/wso2/Desktop/hadoop/localDirs/name</value>
</property>
<property>
  <name>dfs.data.dir</name>
  <value>/home/wso2/Desktop/hadoop/localDirs/data</value>
</property>
Then edit <hadoop_home>/etc/hadoop/mapred-site.xml (all nodes):
<property>
  <name>mapreduce.job.tracker</name>
  <value>hadoop.master:5431</value>
</property>
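Note that many Hadoop distributions only ship this file as a template (an assumption about your version); in that case, create it first:
cp <hadoop_home>/etc/hadoop/mapred-site.xml.template <hadoop_home>/etc/hadoop/mapred-site.xml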
Add the hostname of the master node to the <hadoop_home>/etc/hadoop/masters file, on all nodes:
hadoop.master
Add the hostnames of the slave nodes to the <hadoop_home>/etc/hadoop/slaves file, on all nodes:
hadoop.slave.1
hadoop.slave.2
(Only on the master) We need to format the namenode before we start Hadoop. For that, on the master node, navigate to the <hadoop_home>/bin/ directory and execute the following:
./hdfs namenode -format
Finally, start Hadoop by navigating to the <hadoop_home>/sbin/ directory and executing the following:
./start-dfs.sh
If everything goes well, HDFS should be started, and you can browse the web UI of the namenode at http://localhost:50070/dfshealth.jsp.
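To confirm the daemons are running, you can also run jps on each node; the master should list NameNode and SecondaryNameNode, and the slaves should list DataNode.
jps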