Integrating LVM with Hadoop, Providing Elasticity to the Datanode, and Automating LVM with a Python Script

Neha sonone
Nov 17, 2020


  • The task is to integrate LVM with Hadoop and provide elasticity to the datanode.
  • Increase or decrease the size of a static partition in Linux.
  • Automate LVM partitioning using a Python script.

What is LVM?

LVM (Logical Volume Management) works like dynamic partitioning: you can create, resize, and delete LVM "partitions" (called "logical volumes" in LVM-speak) from the command line while your Linux system is running.

Suppose we need 10GB of storage but only have a 4GB pen drive and an 8GB pen drive. With LVM we can combine the two devices and present them as a single volume.

Elasticity

Elasticity is simply the ability to increase or decrease the amount of storage a datanode contributes to the namenode, and it can be achieved with LVM. Let's see how.

Setting up the Hadoop cluster

  • First configure the namenode: go to the /etc/hadoop directory and edit the hdfs-site.xml file with the command
vim hdfs-site.xml
  • After that, configure the core-site.xml file.
  • Once the namenode configuration is done, configure the datanode by editing the same two files (a rough sketch of the namenode files follows this list).
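As a rough sketch, the namenode's hdfs-site.xml points its metadata directory at a local path and core-site.xml binds the HDFS service. The property names below are the classic Hadoop 1.x ones, matching the hadoop-daemon.sh commands used in this article; the directory /nn and port 9001 are hypothetical values, not taken from the article.

<!-- hdfs-site.xml on the namenode; /nn is an assumed directory -->
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/nn</value>
  </property>
</configuration>

<!-- core-site.xml on the namenode; port 9001 is an assumption -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://0.0.0.0:9001</value>
  </property>
</configuration>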

Configure hdfs-site.xml for the datanode.

Configure core-site.xml for the datanode; in this file we add the IP of the namenode.
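Assuming the datanode stores blocks in /dn and the namenode's IP is 192.168.1.10 (both hypothetical values), the datanode files would look roughly like this:

<!-- hdfs-site.xml on the datanode; /dn is an assumed directory -->
<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/dn</value>
  </property>
</configuration>

<!-- core-site.xml on the datanode; the IP and port are assumptions -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.1.10:9001</value>
  </property>
</configuration>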

  • Now start the namenode and datanode services with the commands
hadoop-daemon.sh start namenode
hadoop-daemon.sh start datanode
  • After the datanode service starts, run the command hadoop dfsadmin -report on the namenode VM to check whether the datanode is connected.
  • It is connected, and in my case the datanode is contributing about 8GB of storage to the namenode. The goal is to integrate LVM with Hadoop, so for that we have to create a partition and attach volumes to the datanode.

Here I'm creating a volume of 9GB.

I'm creating another volume of 6GB in the same way.

  • Now, to see the attached volumes in detail, we have the command
fdisk -l
  • We have attached two volumes, one of 9GB and another of 6GB; we have to create physical volumes for both with the commands
pvcreate /dev/sdc
pvdisplay /dev/sdc

Creating the PV for the first volume, /dev/sdc, of 9GB.

Now creating the physical volume for the second one, /dev/sde, in the same way.

  • Now we have to create a volume group spanning both physical volumes.
vgcreate V_group /dev/sdc /dev/sde    -> to create
vgdisplay V_group                     -> to display

Now you can see that a volume group of 14.99GB has been created successfully.

  • Here I'm creating a logical volume of 4GB from the volume group created above.
lvcreate --size 4G --name lv1 V_group

The 4GB logical volume has been created successfully.

  • Now we have to format the newly created logical volume.
mkfs.ext4 /dev/V_group/lv1
  • Now we have to mount this logical volume, so first create a directory to mount it on, then mount it.
mkdir /link
mount /dev/V_group/lv1 /link
  • Run df -h to check whether the logical volume is mounted.
  • Now update the datanode directory in the hdfs-site.xml file to this new directory, /link (see the snippet after this list).
  • Now start the datanode service again.
  • On the namenode, run the command hadoop dfsadmin -report to check how much storage the datanode is contributing.
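The updated property would look something like this (same assumed property name as in the sketch above):

<!-- hdfs-site.xml on the datanode, now pointing at the LVM mount -->
<property>
  <name>dfs.data.dir</name>
  <value>/link</value>
</property>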

Now the datanode is contributing only 4GB to the namenode.

Increasing the Logical Volume

  • To increase the logical volume size:
lvextend --size +1G /dev/V_group/lv1

Here I have increased the size by 1GB.

Now we have to grow the filesystem over the extended 1GB; resize2fs expands the ext4 filesystem to fill the logical volume without reformatting it or touching the existing data:

resize2fs /dev/V_group/lv1

Now you can see the increased size below.

  • Now you can again check how much storage the datanode is contributing to the namenode.
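As a side note, lvextend can grow the filesystem in the same step: its --resizefs (-r) flag runs the filesystem resize for you, so the two commands above collapse into one.

lvextend --resizefs --size +1G /dev/V_group/lv1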

Decreasing the Logical Volume

  • Similarly, we can also reduce the logical volume size:
lvreduce -L 1G /dev/V_group/lv1

We have reduced the size to 1GB (note that -L 1G sets the size to 1GB; -L -1G would reduce it by 1GB).
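A word of caution: ext4 cannot be shrunk while mounted, and reducing the LV below the size of the filesystem on it destroys data. A safer sequence, sketched with the same names as above, is to unmount, check, and shrink the filesystem first, and only then reduce the LV:

umount /link
e2fsck -f /dev/V_group/lv1
resize2fs /dev/V_group/lv1 1G
lvreduce -L 1G /dev/V_group/lv1
mount /dev/V_group/lv1 /link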

Automating the LVM Partition Using a Python Script

Here I have automated the LVM partitioning using Python; you can find the GitHub link below.
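For reference, here is a minimal sketch of what such an automation script might look like. This is an illustration, not the code from the linked repository; the device names, volume group name, mount point, and sizes are hypothetical and should be adjusted to match your fdisk -l output.

import subprocess

def run(cmd):
    """Run a shell command, echoing it first and raising on failure."""
    print("+ " + cmd)
    subprocess.run(cmd, shell=True, check=True)

def create_lv(devices, vg, lv, size):
    """Turn raw devices into PVs, group them into a VG,
    carve out an LV of the given size, and format it with ext4."""
    for dev in devices:
        run("pvcreate " + dev)
    run("vgcreate {} {}".format(vg, " ".join(devices)))
    run("lvcreate --size {} --name {} {}".format(size, lv, vg))
    run("mkfs.ext4 /dev/{}/{}".format(vg, lv))

def extend_lv(vg, lv, extra):
    """Grow the LV and the ext4 filesystem on it in one step."""
    run("lvextend --resizefs --size +{} /dev/{}/{}".format(extra, vg, lv))

if __name__ == "__main__":
    # Hypothetical devices and names; change these to match your setup.
    create_lv(["/dev/sdc", "/dev/sde"], "V_group", "lv1", "4G")
    run("mkdir -p /link")
    run("mount /dev/V_group/lv1 /link")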

Thank you for reading :)
