Integrate LVM with Hadoop Cluster
In this article, we integrate LVM with a Hadoop cluster and show how to make DataNode storage elastic.
Before we start, we need to understand what LVM is and what a Hadoop cluster is.
Logical Volume Management (LVM) is an abstraction over physical storage. With LVM, we can increase the size of a volume on the fly without losing data, so storage does not have to be sized perfectly in advance. Reducing the size is conditional: if the space being removed is occupied by data, that data is lost.
The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Hadoop is one of the frameworks that help solve big data problems.
There are two terms to know here:
- NameNode: The master node of the Hadoop cluster. It keeps, in its own file system, all metadata about the DataNodes and the content stored on them.
- DataNode: The main storage server. The NameNode tells cluster clients which DataNode to send data to. As the name suggests, all data is stored on DataNodes and managed by the NameNode.
This walkthrough assumes a Hadoop cluster is already set up.
Now let's get started.
Step 1: Attach a hard disk to the virtual machine and list the disks to confirm it is visible
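The listing command itself is not preserved in the text. A minimal sketch, assuming a Linux guest where `fdisk` is installed and run as root; `list_disks` is a hypothetical helper name used only for illustration:

```shell
# Hypothetical helper: print every disk the kernel can see, with its
# partition table. A newly attached disk should appear with its full
# size and no partitions.
list_disks() {
    fdisk -l
}

# Usage (as root): list_disks
```

The device name reported for the new disk (often something like /dev/sdb, but this depends on the machine) is the one to use in the following steps.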
Step 2: Convert the hard disk to a physical volume
The pvcreate command initializes a disk so that it can become part of a volume group.
The pvdisplay command shows the physical volume that was created.
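As a rough sketch of this step, the two commands can be wrapped in one helper. The function name `make_pv` and the device /dev/sdb are assumptions for illustration, not names taken from the article:

```shell
# Hypothetical helper: turn a whole disk into an LVM physical volume.
make_pv() {
    disk="$1"
    # Write LVM metadata to the disk so it can join a volume group,
    # then print the details of the new physical volume.
    pvcreate "$disk" &&
        pvdisplay "$disk"
}

# Usage (as root, device name assumed): make_pv /dev/sdb
```

Running pvcreate on a disk that already holds data destroys access to that data, so it is worth double-checking the device name first.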
Step 3: Create a volume group
Physical volumes are combined into volume groups (VGs). A volume group is a pool of disk space out of which logical volumes can be allocated.
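One way this could look, assuming two physical volumes are available. The group name `hadoop_vg`, the devices, and the helper `make_vg` are all placeholders:

```shell
# Hypothetical helper: pool one or more physical volumes into a volume group.
make_vg() {
    vg_name="$1"
    shift
    # Remaining arguments are the physical volumes to add to the group.
    vgcreate "$vg_name" "$@" &&
        vgdisplay "$vg_name"
}

# Usage (as root, names assumed): make_vg hadoop_vg /dev/sdb /dev/sdc
```

The "VG Size" line in the vgdisplay output shows the pooled capacity, which is roughly the sum of the physical volumes minus a little metadata overhead.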
Step 4: Create a logical volume
The lvcreate command takes space from the volume group and creates a logical volume of the capacity given in the command.
The lvdisplay command shows the logical volume that was created.
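A sketch of this step, using the 4G starting size mentioned later in the article. The names `hadoop_vg` and `dn_lv` and the helper `make_lv` are assumed for illustration:

```shell
# Hypothetical helper: carve a logical volume out of a volume group.
make_lv() {
    vg_name="$1"
    lv_name="$2"
    size="$3"
    # Allocate the requested size from the group, then show the result.
    lvcreate --size "$size" --name "$lv_name" "$vg_name" &&
        lvdisplay "/dev/$vg_name/$lv_name"
}

# Usage (as root, names assumed): make_lv hadoop_vg dn_lv 4G
```

The new volume appears as a block device at /dev/<volume group>/<logical volume>, which is the path used when formatting and mounting it.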
Step 5: Format the logical volume
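The article does not say which file system was used. The sketch below assumes ext4, which can later be grown while mounted; the device path and the helper name `format_lv` are placeholders:

```shell
# Hypothetical helper: create an ext4 file system on a logical volume.
# ext4 is an assumption; the article does not name the file system type.
format_lv() {
    device="$1"
    mkfs -t ext4 "$device"
}

# Usage (as root, path assumed): format_lv /dev/hadoop_vg/dn_lv
```

Formatting erases whatever was on the volume, so this is done once, before the DataNode starts writing to it.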
Step 6: Create a directory for DataNode storage and mount the logical volume on it
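A possible version of this step follows. The directory /dn1, the device path, and the helper `mount_datanode_dir` are assumptions; the directory has to match the one configured as DataNode storage in hdfs-site.xml (`dfs.datanode.data.dir` in Hadoop 2 and later, `dfs.data.dir` in Hadoop 1.x):

```shell
# Hypothetical helper: mount a logical volume on the DataNode storage directory.
mount_datanode_dir() {
    device="$1"
    dir="$2"
    # Create the directory if needed, mount the volume on it,
    # then show the capacity the DataNode will see.
    mkdir -p "$dir" &&
        mount "$device" "$dir" &&
        df -h "$dir"
}

# Usage (as root, names assumed): mount_datanode_dir /dev/hadoop_vg/dn_lv /dn1
```

A mount made this way does not survive a reboot; an /etc/fstab entry is needed to make it permanent.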
Step 7: Check the storage gained by the Hadoop cluster
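The report command is not shown in the text. A sketch that should work on most installations, assuming the DataNode service is running; `cluster_report` is a hypothetical helper that prefers the `hdfs` launcher and falls back to the older `hadoop` one:

```shell
# Hypothetical helper: print the HDFS cluster report.
cluster_report() {
    if command -v hdfs >/dev/null 2>&1; then
        hdfs dfsadmin -report      # Hadoop 2 and later
    else
        hadoop dfsadmin -report    # Hadoop 1.x spelling
    fi
}

# Usage: cluster_report
```

The "Configured Capacity" line in the report should be close to the size of the mounted logical volume, slightly reduced by file system overhead.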
Step 8: Extend the logical volume
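A sketch of the extension, assuming the volume group still has at least that much free space. The device path and the helper `extend_lv` are placeholders, and the 8G figure follows the article's description:

```shell
# Hypothetical helper: grow a logical volume by a given amount.
extend_lv() {
    device="$1"
    extra="$2"
    # The leading "+" means "add this much" rather than "set the size to".
    lvextend --size "+$extra" "$device"
}

# Usage (as root, path assumed): extend_lv /dev/hadoop_vg/dn_lv 8G
```

This can be done while the volume is mounted and the DataNode is running.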
Here the size of the logical volume has increased by 8G, but the file system on it is still the original 4G, so the file system has to be resized as well.
Step 9: Resize the file system
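For an ext2/3/4 file system (an assumption, since the article does not name the type), `resize2fs` grows the file system to fill the volume without unmounting it. The paths and the helper name `grow_fs` are illustrative:

```shell
# Hypothetical helper: grow an ext file system to fill its logical volume.
grow_fs() {
    device="$1"
    mount_point="$2"
    # With no size argument, resize2fs expands to the full device size.
    resize2fs "$device" &&
        df -h "$mount_point"
}

# Usage (as root, names assumed): grow_fs /dev/hadoop_vg/dn_lv /dn1
```

Other file systems use different tools, so this sketch applies only if the volume was formatted as ext2, ext3, or ext4.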
Step 10: Hadoop cluster report
After resizing the logical volume and its file system, we don't have to do anything else: Hadoop automatically picks up the updated size.
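To confirm the change, the capacity can be pulled out of the report before and after the resize. This is a small sketch, assuming the report starts with a cluster-wide line of the form "Configured Capacity: <bytes> (<human-readable size>)"; `configured_capacity` is a hypothetical helper:

```shell
# Hypothetical helper: read a dfsadmin report on stdin and print the
# cluster-wide configured capacity in bytes (the first matching line).
configured_capacity() {
    awk -F': *' '/^Configured Capacity:/ { split($2, f, " "); print f[1]; exit }'
}

# Usage: hdfs dfsadmin -report | configured_capacity
```

Running this once before Step 8 and once after Step 9 gives two byte counts whose difference should be close to the space added to the logical volume.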