Integration of LVM with Hadoop Cluster

TAMANNA VERMA
5 min read · Mar 14, 2021

Our ultimate goal is to extend a logical volume so that the size of the hard disk grows on the fly.

First of all, we should know what LVM is, where it is used, and why it is so useful from an industry point of view.

What Is LVM?

  • LVM stands for Logical Volume Management. It is a system for managing logical volumes, or file systems, that is more advanced and flexible than the traditional method of partitioning a disk into one or more segments.
  • LVM is a tool for logical volume management, covering tasks such as allocating disks, striping, and resizing logical volumes.
  • It is an abstraction layer between your operating system and your physical hard drives, which means that if you use LVM, your physical hard drives and partitions are no longer tied together.

Where Is It Used?

Whenever a company needs to change the size of a hard disk in its server on the fly/online (that is, without stopping services), it uses the concept of LVM.

Why Is It So Useful?

Because it makes partitions dynamic: their size can be changed at any time by adding physical volumes. It gives the disks more flexibility and extended capabilities.

Steps we have to perform:

⏩ Add Virtual Hard Disks

⏩ Create Physical Volumes(PV)

⏩ Create Volume Group(VG)

⏩ Create Logical Volume(LV)

⏩ Mount & Format the Logical Volume

⏩ Start Datanode & Namenode

⏩ Extend Size of Logical Volume on the Fly

https://www.sysonion.de/centos-logical-volume-management-lvm/

Let’s move to the practical part. To perform this practical, I am going to use Red Hat Enterprise Linux 8 on top of Oracle VirtualBox.

Add Virtual Hard Disks

  • I have attached two virtual hard disks to my VM, i.e. rhel_slave_1_3.vdi & rhel_slave_1_2.vdi.
  • Use the command lsblk to see the attached hard disks and their names.
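As a sketch, assuming the two new disks show up as /dev/sdb (5 GB) and /dev/sdc (8 GB) (the names on your VM may differ):

```shell
# List the block devices attached to the VM; newly added, empty
# disks appear as type "disk" with no partitions and no mount point.
lsblk
# Example layout (assumed device names, not actual output):
#   sda    8:0    0  20G  0 disk   <- OS disk
#   sdb    8:16   0   5G  0 disk   <- new 5 GB virtual disk
#   sdc    8:32   0   8G  0 disk   <- new 8 GB virtual disk
```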

Create Physical Volume(PV)

  • Use the command pvcreate <disk_name> to create the physical volumes.
  • To see a physical volume you created, use the command pvdisplay <disk_name>. The sizes of the physical volumes are 5 GB and 8 GB, and they are not allocated yet.
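Filling in the placeholders with the assumed device names /dev/sdb and /dev/sdc from above (substitute your own names from lsblk), the step looks like this and must be run as root:

```shell
# Initialise each raw disk as an LVM physical volume (requires root).
pvcreate /dev/sdb
pvcreate /dev/sdc

# Inspect a PV; "Allocatable" stays NO until it joins a volume group.
pvdisplay /dev/sdb
```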

Create Volume Group(VG)

  • Use the command vgcreate <vg_name> <disk_name_1> <disk_name_2> to create the volume group.
  • Use the command vgdisplay <vg_name> to see the volume group you created; its total size is 12.99 GB and none of it is allocated yet.
  • We can also see that both physical volumes are now allocated to the volume group.
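Continuing the sketch with an assumed volume-group name myvg (run as root):

```shell
# Pool both physical volumes into one volume group (requires root).
vgcreate myvg /dev/sdb /dev/sdc

# "VG Size" shows the pooled ~13 GB; "Alloc PE / Size" is still 0.
vgdisplay myvg
```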

Create Logical Volume(LV)

  • Use the command lvcreate --size 7G --name <lv_name> <vg_name> to create the logical volume (partition).
  • Now we can see that 7 GB is allocated from the volume group and 5.99 GB is still free.
  • To see the logical volume you created, use the command lvdisplay <vg_name>/<lv_name>.
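With the assumed names myvg and mylv filled in, the step might look like this (as root):

```shell
# Carve a 7 GB logical volume out of the volume group (requires root).
lvcreate --size 7G --name mylv myvg

# The LV now exists as a device node under /dev/<vg_name>/<lv_name>.
lvdisplay /dev/myvg/mylv
```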

Mount & Format the Logical Volume(LV)

  • To format the logical volume, use the command mkfs.ext4 /dev/<vg_name>/<lv_name>.
  • To mount the logical volume, first create a directory using the mkdir command, then use the command mount /dev/<vg_name>/<lv_name> /directory.
  • Check whether the logical volume is mounted using the lsblk command. In my case it mounted successfully.
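Using the assumed names from above and the /l1 mount point this article uses, the step might run as follows (as root):

```shell
# Put an ext4 file system on the logical volume (requires root).
mkfs.ext4 /dev/myvg/mylv

# Create a mount point and mount the LV on it.
mkdir /l1
mount /dev/myvg/mylv /l1

# lsblk should now list the LV with /l1 as its mount point.
lsblk
```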

Start Datanode & Namenode

  • Configure hdfs-site.xml with your mount point (i.e. /l1) and start the DataNode using the command hadoop-daemon.sh start datanode. Use the jps command to check whether the DataNode is running properly.
  • Similarly, start your NameNode using the command hadoop-daemon.sh start namenode.
  • Check the storage contributed by the DataNode (i.e. 6.83 GB) using hadoop dfsadmin -report.
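A minimal hdfs-site.xml fragment for the DataNode, assuming the /l1 mount point above, might look like this (the property name dfs.data.dir is the Hadoop 1.x form; newer releases use dfs.datanode.data.dir):

```xml
<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/l1</value>
  </property>
</configuration>
```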

Extend Size of Logical Volume on the Fly

  • Use the command lvextend --size +4G /dev/<vg_name>/<lv_name> to increase the size of the logical volume by 4 GB on the fly.
  • Using the vgdisplay <vg_name> command, you can check that the allocated size is now 11 GB, whereas it was 7 GB before.
  • Likewise, with the lvdisplay /dev/<vg_name>/<lv_name> command we can check that the size of the logical volume has become 11 GB.
  • But the df -h command still shows a size of approximately 7 GB. Why?

Because the extra 4 GB we added has not been formatted yet: the file system has not been resized to cover it, and df -h reports only the file system's size, not the logical volume's. So we have to extend the file system over that space too.

  • To extend the file system over the new space, we use the command:

resize2fs /dev/<vg_name>/<lv_name>

  • Now we can check that df -h also shows a size of 11 GB.
  • Thus, we have finally increased the amount of storage contributed by the DataNode to the NameNode: earlier it was approximately 7 GB, and now it is almost 11 GB.
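The lvextend-then-resize2fs behaviour above can be sketched without root or a real disk by using a throwaway file-backed ext4 image (demo.img is a hypothetical scratch file, not part of the cluster setup):

```shell
# Create and format a 64 MB file-backed image (stand-in for the LV).
truncate -s 64M demo.img
mkfs.ext4 -q -F demo.img

# Grow the underlying "device", as lvextend does for a real LV;
# the file system inside still spans only the original 64 MB.
truncate -s 128M demo.img

# Grow the file system into the new space, as resize2fs does on the LV.
resize2fs demo.img

# The file system now covers the full 128 MB.
dumpe2fs -h demo.img 2>/dev/null | grep -E 'Block (count|size)'
```

On the real cluster the same two-step pattern applies: lvextend grows the device, resize2fs grows the file system, and only then does df -h report the new size.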
