Integrate LVM with Hadoop Cluster

Providing Elasticity to DataNode Storage.

Rohit Raut
4 min read · Mar 8, 2021

In this article, we are going to integrate LVM with a Hadoop cluster and show how to provide elasticity to DataNode storage.

Before we start, we need to understand what LVM is and what a Hadoop cluster is.

LVM

Logical Volume Management (LVM) is an abstraction layer over physical storage. With LVM we don't need to worry about fixed partition sizes, because we can increase the size of a partition on the fly without losing data. Reducing the size is conditional: if the space being removed is already occupied by data, that data is lost.

Hadoop Cluster

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Hadoop is one of the frameworks that help solve big data problems.

There are two terms to know here:

  1. NameNode: the master node of the Hadoop cluster. It maintains metadata about every DataNode and about the data stored on them in its file system.
  2. DataNode: the main storage server. The NameNode directs cluster clients to send their data to the DataNodes. As the name suggests, all data is stored on DataNodes and managed by the NameNode.
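In the steps below, the logical volume is mounted on the DataNode's storage directory, which is the directory configured through the dfs.datanode.data.dir property in hdfs-site.xml. A minimal sketch, assuming an illustrative directory /dn (the path is an assumption, not taken from this setup):

<configuration>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/dn</value>
  </property>
</configuration>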

For background, you can refer to my earlier articles on the 5 V's of Big Data and on how to set up a Hadoop cluster.

Now let's get started.

Step 1: Attach a hard disk to the virtual machine. To see the attached disks, run the following command

fdisk -l

Step 2: Convert the hard disk to Physical Volume

The pvcreate command initializes the disk so that it can become part of a volume group.
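A minimal sketch, assuming the newly attached disk appears as /dev/sdb (check the fdisk -l output for the actual device name):

pvcreate /dev/sdb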

To see this created PV

pvdisplay

Step 3: Create a volume group

Physical volumes are combined into volume groups (VGs), which creates a pool of disk space out of which logical volumes can be allocated.
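A sketch, assuming the physical volume created above is /dev/sdb and the volume group is named hadoop_vg:

vgcreate hadoop_vg /dev/sdb

To inspect the created volume group:

vgdisplay hadoop_vg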

Step 4: Create a logical volume

It takes space from the volume group and creates a partition of the capacity specified in the command.
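A sketch, assuming the volume group hadoop_vg from the previous step, a logical volume named dn_lv, and a starting size of 4 GB (the size that Step 8 later grows from):

lvcreate --size 4G --name dn_lv hadoop_vg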

To display the logical volume
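Assuming the names used above:

lvdisplay hadoop_vg/dn_lv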

Step 5: Format the logical partition
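The new logical volume needs a file system before it can be mounted. A minimal sketch using ext4 and the device path LVM creates for the assumed names:

mkfs.ext4 /dev/hadoop_vg/dn_lv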

Step 6: Create a directory for DataNode storage and mount the logical volume
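A sketch, assuming the DataNode storage directory is /dn (the same illustrative path used for dfs.datanode.data.dir above) and the names from the previous steps:

mkdir /dn
mount /dev/hadoop_vg/dn_lv /dn

Running df -h /dn afterwards confirms the mount point and its size.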

Step 7: See the storage gained by the Hadoop cluster
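Assuming the DataNode daemon is already running and pointed at the mounted directory, the cluster report shows the configured capacity it contributes. The classic command is shown below; on newer Hadoop versions, hdfs dfsadmin -report is preferred:

hadoop dfsadmin -report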

Step 8: Extend the logical volume
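A sketch of extending the logical volume online by 8 GB, using the names assumed above:

lvextend --size +8G /dev/hadoop_vg/dn_lv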

Here the size of the logical volume has increased by 8 GB, but the file system is still the old 4 GB, so the file system has to be resized as well.

Step 9: Resize the file system
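For an ext4 file system the grow can be done online, while the volume stays mounted (for XFS, xfs_growfs would be the equivalent). Assuming the names used above:

resize2fs /dev/hadoop_vg/dn_lv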

Step 10: Hadoop cluster report

After resizing the logical volume and its file system, we don't have to do anything on the Hadoop side; the DataNode automatically picks up the updated size.
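Re-running the report (again with the assumed classic command) should now show the increased configured capacity contributed by this DataNode:

hadoop dfsadmin -report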

That's all for this article. Thanks for reading 😊
