Setting up a Hadoop File System on a Local Machine
In this article I describe how to set up a Hadoop file system on a local machine running in pseudo-distributed mode. First, download Hadoop from here.
Then extract it to any preferred location. Next, point the environment variables at this extracted location: open the ~/.bashrc file and add the following two lines.
export HADOOP_HOME=/home/supun/Supun/Softwares/hadoop-2.2.0
export PATH=/home/supun/Supun/Softwares/hadoop-2.2.0/bin:$PATH
Here /home/supun/Supun/Softwares/hadoop-2.2.0 is the location where my Hadoop archive was extracted. I will refer to this location as HADOOP_HOME from here onwards.
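To confirm that the two export lines took effect, a small check along the following lines can help. This is only a sketch: the function name check_hadoop_env is made up for illustration, and it assumes HADOOP_HOME/bin appears in PATH exactly as written, without a trailing slash.

```shell
# Report whether HADOOP_HOME is set and whether its bin directory is on PATH.
# check_hadoop_env is a hypothetical helper name, not part of Hadoop.
check_hadoop_env() {
    if [ -z "${HADOOP_HOME:-}" ]; then
        echo "HADOOP_HOME is not set"
        return 1
    fi
    # Wrap PATH in colons so the first and last entries match the same pattern.
    case ":$PATH:" in
        *":$HADOOP_HOME/bin:"*)
            echo "HADOOP_HOME/bin is on PATH"
            ;;
        *)
            echo "HADOOP_HOME/bin is missing from PATH"
            return 1
            ;;
    esac
}

check_hadoop_env || echo "open a new terminal or run: source ~/.bashrc"
```

Changes to ~/.bashrc only apply to shells started afterwards, which is why the fallback message suggests reloading the file.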
Configuring:
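Now we need to make a few small configuration changes. Open each of the following files and add the content shown. (In Hadoop 2.x releases such as 2.2.0, these files are found under HADOOP_HOME/etc/hadoop rather than HADOOP_HOME/conf.)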
HADOOP_HOME/conf/core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
HADOOP_HOME/conf/hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
HADOOP_HOME/conf/mapred-site.xml
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
</configuration>
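After editing the three files, it can be handy to read a value back and confirm it is what you expect. The sketch below is one way to do that with sed. The helper name get_site_property is an assumption of this example, and it only works when <name> and <value> sit on separate, consecutive lines as in the snippets above. The demo writes a throwaway copy of core-site.xml to a temporary directory rather than touching the real configuration.

```shell
# Print the <value> that follows a given <name> in a Hadoop *-site.xml file.
# get_site_property is a hypothetical helper; it assumes <name> and <value>
# are on consecutive lines. Dots in the name are treated as regex wildcards.
get_site_property() {
    file="$1"
    name="$2"
    sed -n "/<name>$name<\/name>/{n;s/.*<value>\(.*\)<\/value>.*/\1/p;}" "$file"
}

# Demo on a temporary copy of the core-site.xml content shown above.
demo_dir=$(mktemp -d)
cat > "$demo_dir/core-site.xml" <<'EOF'
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
EOF

get_site_property "$demo_dir/core-site.xml" fs.default.name
rm -rf "$demo_dir"
```

To check a real file, pass its path instead, for example the hdfs-site.xml in your configuration directory together with dfs.replication.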
Optional: install ssh if it is not already installed (on Ubuntu):
sudo apt-get install ssh
Once ssh is installed, run the following command to check whether ssh can reach localhost without a password.
ssh localhost
If this asks for a password (the password of the local machine's user), then execute the following.
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
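To re-check the result without being prompted, something like the sketch below may be useful. It relies on the standard OpenSSH option BatchMode=yes, which makes ssh fail instead of asking for a password. The function name can_ssh_without_password is made up for this example. Note that a failure can also mean the ssh server is not running or that the host key for localhost has not been accepted yet.

```shell
# Return 0 if ssh to the given host succeeds without any interactive prompt.
# can_ssh_without_password is a hypothetical helper name.
can_ssh_without_password() {
    host="${1:-localhost}"
    # BatchMode=yes disables password prompts; ConnectTimeout avoids long hangs.
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true >/dev/null 2>&1
}

if can_ssh_without_password localhost; then
    echo "passwordless ssh to localhost works"
else
    echo "passwordless ssh to localhost is not working yet"
fi
```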
Start in Pseudo-Distributed mode
Before starting HDFS, you need to format the NameNode. To do that, navigate to the HADOOP_HOME/bin directory and execute the following.
hdfs namenode -format
Then start HDFS by navigating to HADOOP_HOME/sbin and executing the following.
./start-dfs.sh
If that does not work, try executing "./start-dfs.sh -upgrade" instead.
If everything goes well, HDFS should now be running, and you can browse the web UI of the NameNode at http://localhost:50070/dfshealth.jsp. Please refer to [1] for further details on setting up HDFS in different modes.
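Besides the web UI, the jps tool that ships with the JDK lists running Java processes, so it can show whether the HDFS daemons came up. The sketch below filters that listing. The function name missing_hdfs_daemons is an assumption of this example, and the three daemon names are the ones start-dfs.sh normally launches in a single-node setup.

```shell
# Read `jps` output on stdin and print each expected HDFS daemon that is absent.
# missing_hdfs_daemons is a hypothetical helper name.
missing_hdfs_daemons() {
    listing=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode; do
        # -w stops "NameNode" from matching inside "SecondaryNameNode".
        printf '%s\n' "$listing" | grep -qw "$daemon" || echo "$daemon"
    done
}

if command -v jps >/dev/null 2>&1; then
    jps | missing_hdfs_daemons
else
    echo "jps not found; is a JDK on PATH?"
fi
```

No output from the function means all three daemons were found. Any name it prints is a daemon worth looking up in the log files under HADOOP_HOME/logs.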
Now you can use Hadoop shell commands to manage files in this HDFS. Refer to [2] and [3] for the Hadoop commands.
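As a quick first try, the sketch below creates a directory in HDFS, copies a small file into it, lists it, and reads it back. The function name hdfs_round_trip, the demo directory, and the file name hello.txt are all made up for illustration. The hdfs dfs subcommands used (-mkdir -p, -put, -ls, -cat) are part of the standard file system shell. Note that -put refuses to overwrite an existing file, so a second run against the same directory will stop at that step.

```shell
# Round-trip a small file through HDFS under the given directory.
# hdfs_round_trip and the file name hello.txt are illustrative choices only.
hdfs_round_trip() {
    dir="$1"
    local_file=$(mktemp)
    echo "hello hdfs" > "$local_file"
    # -mkdir -p creates the directory and any missing parents,
    # -put copies the local file in, -ls lists the directory,
    # -cat prints the file back from HDFS.
    hdfs dfs -mkdir -p "$dir" \
        && hdfs dfs -put "$local_file" "$dir/hello.txt" \
        && hdfs dfs -ls "$dir" \
        && hdfs dfs -cat "$dir/hello.txt"
    status=$?
    rm -f "$local_file"
    return "$status"
}

if command -v hdfs >/dev/null 2>&1; then
    hdfs_round_trip "/user/$(id -un)/demo" || echo "round trip failed"
else
    echo "hdfs not found; is HADOOP_HOME/bin on PATH?"
fi
```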