How To Set Up a Hadoop Cluster On AWS EC2

What is Hadoop?

What is AWS?

Amazon Web Services (AWS) is a subsidiary of Amazon providing on-demand cloud computing platforms and APIs to individuals, companies, and governments, on a metered pay-as-you-go basis. These cloud computing web services provide a variety of basic abstract technical infrastructure and distributed computing building blocks and tools. One of these services is Amazon Elastic Compute Cloud (EC2), which allows users to have at their disposal a virtual cluster of computers, available all the time, through the Internet.

Let’s create the cluster:

✅ Launch one instance for the NameNode and as many as you want for DataNodes.

Step 1: Launch instances for the cluster

Security Group Rules
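
If you prefer the command line to the console, the launch and the security group rules can be sketched with the AWS CLI. This is only a sketch under assumptions: the AMI ID, key name, security group ID, instance type, CIDR ranges, and the HDFS port 9001 are all placeholders that do not come from this article, and the helper names run and launch_cluster are made up for illustration. By default the script only prints the commands (DRY_RUN=1), so nothing is created until you opt in.

```shell
#!/usr/bin/env bash
# Sketch only: all IDs, names, CIDR ranges, and ports are placeholder assumptions.

DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, anything else = really run them

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# launch_cluster AMI KEY_NAME SECURITY_GROUP_ID DATANODE_COUNT
launch_cluster() {
  local ami="$1" key="$2" sg="$3" datanodes="$4"

  # One instance for the NameNode.
  run aws ec2 run-instances --image-id "$ami" --instance-type t2.micro \
    --key-name "$key" --security-group-ids "$sg" --count 1

  # As many DataNodes as requested.
  run aws ec2 run-instances --image-id "$ami" --instance-type t2.micro \
    --key-name "$key" --security-group-ids "$sg" --count "$datanodes"

  # SSH access from an example admin network (placeholder range).
  run aws ec2 authorize-security-group-ingress --group-id "$sg" \
    --protocol tcp --port 22 --cidr 203.0.113.0/24

  # Assumed HDFS port, reachable only from an assumed VPC range.
  run aws ec2 authorize-security-group-ingress --group-id "$sg" \
    --protocol tcp --port 9001 --cidr 172.31.0.0/16
}

# Dry run with placeholder values: prints four aws commands.
launch_cluster ami-EXAMPLE my-key sg-EXAMPLE 3
```

Setting DRY_RUN=0 runs the same commands for real, which requires a configured AWS CLI and real values in place of the placeholders.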

Step 2: Transfer the software to an instance


Step 3: Install the software: Java and Hadoop.
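
Steps 2 and 3 can be sketched together from your own machine with scp and ssh. Several things here are assumptions and not taken from this article: that the installers are RPM packages, the file names jdk.rpm and hadoop.rpm, the login user ec2-user, and the host names. The helper names run and provision_node are hypothetical. As written, the script only prints what it would do.

```shell
#!/usr/bin/env bash
# Sketch only: package format, file names, user, and hosts are placeholder assumptions.

DRY_RUN="${DRY_RUN:-1}"               # 1 = print the commands, anything else = run them
REMOTE_USER="${REMOTE_USER:-ec2-user}" # assumed default login user

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# provision_node KEY_FILE HOST JDK_RPM HADOOP_RPM
provision_node() {
  local key="$1" host="$2" jdk_rpm="$3" hadoop_rpm="$4"
  local target="${REMOTE_USER}@${host}"

  # Step 2: copy both installers to the remote home directory.
  run scp -i "$key" "$jdk_rpm" "$hadoop_rpm" "${target}:"

  # Step 3: install them remotely, then print the versions as a sanity check.
  run ssh -i "$key" "$target" \
    "sudo rpm -ivh $(basename "$jdk_rpm") $(basename "$hadoop_rpm") && java -version && hadoop version"
}

# Dry run for one NameNode and one DataNode (placeholder host names).
for host in namenode.example.com datanode1.example.com; do
  provision_node my-key.pem "$host" ./jdk.rpm ./hadoop.rpm
done
```

Running the same function once per node keeps every instance on identical Java and Hadoop versions.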

Step 4: Configure one instance as the NameNode and the others as DataNodes.

Create a new directory on each node where it will store its data: /nn on the NameNode and /dn on each DataNode.

#On the NameNode
mkdir /nn
#On each DataNode
mkdir /dn
#Edit hdfs-site.xml and core-site.xml on the NameNode
#Run these commands to format the NameNode and start the service
hadoop namenode -format
hadoop-daemon.sh start namenode
#Edit hdfs-site.xml and core-site.xml on each DataNode, then start the service
hadoop-daemon.sh start datanode
#This command shows the number of DataNodes connected to the NameNode, along with more information
hadoop dfsadmin -report
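
The contents of hdfs-site.xml and core-site.xml are not shown in the text above, so here is a minimal sketch of what they might contain. It assumes Hadoop 1.x property names (dfs.name.dir, dfs.data.dir, fs.default.name), which match the command style used in this article. Newer releases call these dfs.namenode.name.dir, dfs.datanode.data.dir, and fs.defaultFS. The port 9001 is an assumed default, and the function name write_hadoop_config is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: property names assume Hadoop 1.x; the port is a placeholder assumption.

# write_hadoop_config CONF_DIR ROLE NAMENODE_HOST [PORT]
# ROLE is "namenode" or "datanode".
write_hadoop_config() {
  local conf_dir="$1" role="$2" namenode_host="$3" port="${4:-9001}"
  local prop dir

  # Pick the storage property and directory that match the node's role.
  case "$role" in
    namenode) prop="dfs.name.dir"; dir="/nn" ;;
    datanode) prop="dfs.data.dir"; dir="/dn" ;;
    *) echo "role must be namenode or datanode" >&2; return 1 ;;
  esac

  mkdir -p "$conf_dir"

  # hdfs-site.xml: where this node keeps its HDFS data.
  cat > "$conf_dir/hdfs-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>${prop}</name>
    <value>${dir}</value>
  </property>
</configuration>
EOF

  # core-site.xml: the address of the NameNode.
  cat > "$conf_dir/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://${namenode_host}:${port}</value>
  </property>
</configuration>
EOF
}
```

For example, write_hadoop_config ./conf-demo datanode 10.0.0.5 writes both files into ./conf-demo so you can inspect them before copying them into Hadoop's configuration directory. On the NameNode itself, the host is commonly set to 0.0.0.0 so the service listens on all interfaces, while each DataNode points at the NameNode's address.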
