mirror of https://github.com/tLDP/LDP
new entry
<!doctype linuxdoc system>
|
||||
|
||||
<!--
|
||||
$Id$
|
||||
-->
|
||||
|
||||
<article>
|
||||
<title>Logical Volume Manager HOWTO</title>
|
||||
<author>bert hubert <ahu@ds9a.nl>&nl;
|
||||
Richard Allen <ra@ra.is></author>
|
||||
<date>Version 0.0.2 $Date$</date>
|
||||
<abstract>
|
||||
A very hands-on HOWTO for Linux LVM
|
||||
</abstract>
|
||||
|
||||
|
||||
<!-- Table of contents -->
|
||||
<toc>
|
||||
|
||||
<!-- Begin the document -->
|
||||
|
||||
<sect>Introduction
|
||||
<p>
|
||||
Welcome, gentle reader.
|
||||
|
||||
This document is written to help enlighten you on what LVM is, how it works,
|
||||
and how you can use it to make your life easier. While there is an LVM
|
||||
FAQ, and even a German HOWTO, this document is written from a different
|
||||
perspective. It is a true 'HOWTO' in that it is very hands-on, while also
|
||||
imparting understanding (hopefully).
|
||||
|
||||
I should make it clear that I am not an author of the Linux Logical Volume
|
||||
Manager. I have great respect for the people who are, and hope to be able to
|
||||
cooperate with them.
|
||||
|
||||
It's even weirder: I don't even know the developers of LVM. I hope this will
change soon. I apologise in advance for stepping on people's toes.
|
||||
<sect1>Disclaimer & License
|
||||
<p>
|
||||
This document is distributed in the hope that it will be useful,
|
||||
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
|
||||
|
||||
If your disks melt and your company fires you - it's never our fault. Sorry.
|
||||
Make frequent backups and do your experiments on non-mission-critical
|
||||
systems.
|
||||
|
||||
Furthermore, Richard Allen does not speak for his employer.
|
||||
|
||||
Linux is a registered trademark of Linus Torvalds.
|
||||
<sect1>Prior knowledge
|
||||
<p>
|
||||
Not much. If you have ever installed Linux and made a filesystem
|
||||
(fdisk/mkfs), you should be all set. As always when operating as root,
|
||||
caution is advised. Incorrect commands or any operation on device files
|
||||
may damage your existing data.
|
||||
|
||||
If you know how to configure HP-UX LVM you are almost done: Linux LVM works
|
||||
almost exactly like the HP implementation.
|
||||
<sect1>Housekeeping notes
|
||||
<p>
|
||||
There are several things which should be noted about this document. While I
|
||||
wrote most of it, I really don't want it to stay that way. I am a strong
|
||||
believer in Open Source, so I encourage you to send feedback, updates,
|
||||
patches etcetera. Do not hesitate to inform us of typos or plain old errors.
|
||||
|
||||
If you feel that you are better qualified to maintain a section, or think that
|
||||
you can author and maintain new sections, you are welcome to do so. The SGML
|
||||
of this HOWTO is available via CVS. I envision this being a collaborative
|
||||
project.
|
||||
|
||||
In aid of this, you will find lots of FIXME notices. Patches are always
|
||||
welcome! Wherever you find a FIXME, you should know that you are treading
|
||||
unknown territory. This is not to say that there are no errors elsewhere,
|
||||
but be extra careful. If you have validated something, please let us know so
|
||||
I can remove the FIXME notice.
|
||||
|
||||
<sect1>Access, CVS & submitting updates
|
||||
<p>
|
||||
The canonical location for the HOWTO is
|
||||
<url url="http://www.ds9a.nl/lvm-howto/"
|
||||
name="http://www.ds9a.nl/lvm-howto/">.
|
||||
|
||||
We now have anonymous CVS access available for the world at large. This
|
||||
allows you to easily obtain the latest version of this HOWTO and to
|
||||
provide your changes and enhancements.
|
||||
|
||||
If you want to grab a copy of the HOWTO via CVS, here is how to do so:
|
||||
<tscreen><verb>
|
||||
$ export CVSROOT=:pserver:anon@outpost.ds9a.nl:/var/cvsroot
|
||||
$ cvs login
|
||||
CVS password: [enter 'cvs' (without 's)]
|
||||
$ cvs co lvm-howto
|
||||
cvs server: Updating lvm-howto
|
||||
U lvm-howto/lvm-howto.sgml
|
||||
</verb></tscreen>
|
||||
|
||||
If you spot an error, or want to add something, just fix it locally, and run
|
||||
"cvs diff -u", and send the result off to us.
|
||||
|
||||
A Makefile is supplied which should help you create postscript, dvi, pdf,
|
||||
html and plain text. You may need to install sgml-tools, ghostscript and
|
||||
tetex to get all formats.
|
||||
|
||||
<sect1>Layout of this document
|
||||
<p>
|
||||
We start by explaining the basic concepts you need before you can do
anything with LVM. Wherever it aids comprehension, we include examples.
|
||||
|
||||
<sect>What is LVM?
|
||||
<p>
|
||||
Historically, a partition size is static. This forces whoever installs a
system to consider not the question of "how much data will I store
|
||||
on this partition", but rather "how much data will I *EVER* store on
|
||||
this partition". When a user runs out of space on a partition, they
|
||||
either have to re-partition (which may involve an entire operating
|
||||
system reload) or use kludges such as symbolic links.
|
||||
|
||||
The notion that a partition was a sequential series of blocks on a
|
||||
physical disc has since evolved. Most Unix-like systems now have
|
||||
the ability to break up physical discs into some number of units.
|
||||
Storage units from multiple drives can be pooled into a "logical
|
||||
volume", where they can be allocated to partitions. Additionally,
|
||||
units can be added or removed from partitions as space requirements
|
||||
change.
|
||||
|
||||
This is the basis of a Logical Volume Manager (LVM).
|
||||
|
||||
For example, say that you have a 1GB disc and you create the "/home"
|
||||
partition using 600MB. Imagine that you run out of space and decide
|
||||
that you need 1GB in "/home". Using the old notion of partitions,
|
||||
you'd have to have another drive at least 1GB in size. You could then
|
||||
add the disc, create a new /home, and copy the existing data over.
|
||||
|
||||
However, with an LVM setup, you could simply add a 400MB (or larger)
|
||||
disc, and add its storage units to the "/home" partition. Other
|
||||
tools allow you to resize an existing file-system, so you simply
|
||||
resize it to take advantage of the larger partition size and you're
|
||||
back in business.
|
||||
|
||||
As a very special treat, LVM can even make 'snapshots' of itself which
|
||||
enable you to make backups of a non-moving target. We return to this
|
||||
exciting possibility, which has lots of other real-world applications, later
|
||||
on.
|
||||
|
||||
In the next section we explain the basics of LVM, and the multitude of
|
||||
abstractions it uses.
|
||||
|
||||
<sect>Basic principles
|
||||
<p>
|
||||
Ok, don't let this scare you off, but LVM comes with a lot of jargon which
|
||||
you should understand lest you endanger your filesystems.
|
||||
|
||||
We start at the bottom, more or less.
|
||||
|
||||
<descrip>
|
||||
<tag>The physical media</tag>
|
||||
You should take the word 'physical' with a grain of salt, though we will
|
||||
initially assume it to be a simple hard disk, or a partition. Examples:
|
||||
/dev/hda, /dev/hda6, /dev/sda. You can turn any consecutive number of blocks
|
||||
on a block device into a ...
|
||||
<tag>Physical Volume (PV)</tag>
|
||||
A PV is nothing more than a physical medium with some administrative data
|
||||
added to it - once you have added this, LVM will recognise it as a holder
|
||||
of ...
|
||||
<tag>Physical Extents (PE)</tag>
|
||||
Physical Extents are like really big blocks, often with a size of megabytes.
|
||||
PEs can be assigned to a...
|
||||
<tag>Volume Group (VG)</tag>
|
||||
A VG is made up of a number of Physical Extents (which may have come from
|
||||
multiple Physical Volumes or hard drives). While it may be tempting to
|
||||
think of a VG as being made up of several hard drives (/dev/hda and /dev/sda
|
||||
for example), it's more accurate to say that it contains PEs which are provided
|
||||
by these hard drives.
|
||||
|
||||
From this Volume Group, PEs can be assigned to a ...
|
||||
<tag>Logical Volume (LV)</tag>
|
||||
Yes, we're finally getting somewhere. A Logical Volume is the end result of
|
||||
our work, and it's there that we store our information. This is equivalent to
|
||||
the historic idea of partitions.
|
||||
|
||||
As with a regular partition, on this Logical Volume you would typically build
|
||||
a ...
|
||||
<tag>Filesystem</tag>
|
||||
This filesystem is whatever you want it to be: the standard ext2,
|
||||
ReiserFS, NWFS, XFS, JFS, NTFS, etc... To the Linux kernel, there is
|
||||
no difference between a regular partition and a Logical Volume.
|
||||
</descrip>
|
||||
|
||||
I've attempted some ASCII art which may help you visualise this.
|
||||
|
||||
<verb>
|
||||
A Physical Volume, containing Physical Extents:
|
||||
|
||||
+-----[ Physical Volume ]------+
|
||||
| PE | PE | PE | PE | PE | PE |
|
||||
+------------------------------+
|
||||
|
||||
A Volume Group, containing 2 Physical Volumes (PVs) with 6 Physical Extents:
|
||||
|
||||
+------[ Volume Group ]-----------------+
|
||||
| +--[PV]--------+ +--[PV]---------+ |
|
||||
| | PE | PE | PE | | PE | PE | PE | |
|
||||
| +--------------+ +---------------+ |
|
||||
+---------------------------------------+
|
||||
|
||||
We now further expand this:
|
||||
|
||||
+------[ Volume Group ]-----------------+
|
||||
| +--[PV]--------+ +--[PV]---------+ |
|
||||
| | PE | PE | PE | | PE | PE | PE | |
|
||||
| +--+---+---+---+ +-+----+----+---+ |
|
||||
| | | | +-----/ | | |
|
||||
| | | | | | | |
|
||||
| +-+---+---+-+ +----+----+--+ |
|
||||
| | Logical | | Logical | |
|
||||
| | Volume | | Volume | |
|
||||
| | | | | |
|
||||
| | /home | | /var | |
|
||||
| +-----------+ +------------+ |
|
||||
+---------------------------------------+
|
||||
</verb>
|
||||
|
||||
This shows us two filesystems, spanning two disks. The /home filesystem
|
||||
contains 4 Physical Extents, the /var filesystem 2.
|
||||
|
||||
bert hubert is writing <url name="a tool" url="http://ds9a.nl/lvm-viewer"> to
represent LVM more visually; a <url name="screenshot"
url="http://ds9a.nl/lvm-howto/screenshot.gif"> is provided. It looks better
than the ASCII art.
|
||||
|
||||
<sect1>Show & Tell
|
||||
<p>
|
||||
Ok, this stuff is hard to assimilate ('We are LVM of Borg...'), so here is a
|
||||
very annotated example of creating a Logical Volume. Do NOT paste this
|
||||
example onto your console because you WILL destroy data, unless it happens
|
||||
that on your computer /dev/hda3 and /dev/hdb2 aren't used.
|
||||
|
||||
When in doubt, view the ASCIIgram above.
|
||||
|
||||
You should first set the partition types of /dev/hda3 and /dev/hdb2 to 0x8e,
|
||||
which is 'Linux LVM'. Please note that your version of fdisk may not yet know
|
||||
this type, so it will be listed as 'Unknown':
|
||||
|
||||
|
||||
<tscreen><verb>
|
||||
# fdisk /dev/hda
|
||||
|
||||
Command (m for help): p
|
||||
|
||||
Disk /dev/hda: 255 heads, 63 sectors, 623 cylinders
|
||||
Units = cylinders of 16065 * 512 bytes
|
||||
|
||||
Device Boot Start End Blocks Id System
|
||||
/dev/hda1 1 2 16033+ 83 Linux
|
||||
/dev/hda2 3 600 4803435 83 Linux
|
||||
/dev/hda3 601 607 56227+ 83 Linux
|
||||
/dev/hda4 608 614 56227+ 83 Linux
|
||||
|
||||
Command (m for help): t
|
||||
Partition number (1-4): 3
|
||||
Hex code (type L to list codes): 8e
|
||||
|
||||
Command (m for help): p
|
||||
|
||||
Disk /dev/hda: 255 heads, 63 sectors, 623 cylinders
|
||||
Units = cylinders of 16065 * 512 bytes
|
||||
|
||||
Device Boot Start End Blocks Id System
|
||||
/dev/hda1 1 2 16033+ 83 Linux
|
||||
/dev/hda2 3 600 4803435 83 Linux
|
||||
/dev/hda3 601 607 56227+ 8e Unknown
|
||||
/dev/hda4 608 614 56227+ 83 Linux
|
||||
|
||||
Command (m for help): w
|
||||
</verb></tscreen>
|
||||
|
||||
We do the same for /dev/hdb2, but we don't display it here. This is needed
|
||||
so that LVM is able to reconstruct things should you lose your
|
||||
configuration.
|
||||
|
||||
Now, this shouldn't be necessary, but some computers require a reboot at
|
||||
this point. So if the following examples don't work, try rebooting.
|
||||
|
||||
We then create our Physical Volumes, like this:
|
||||
<tscreen><verb>
|
||||
# pvcreate /dev/hda3
|
||||
pvcreate -- physical volume "/dev/hda3" successfully created
|
||||
# pvcreate /dev/hdb2
|
||||
pvcreate -- physical volume "/dev/hdb2" successfully created
|
||||
</verb></tscreen>
|
||||
|
||||
We then add these two PVs to a Volume Group called 'test':
|
||||
<tscreen><verb>
|
||||
# vgcreate test /dev/hdb2 /dev/hda3
|
||||
vgcreate -- INFO: using default physical extent size 4 MB
|
||||
vgcreate -- INFO: maximum logical volume size is 255.99 Gigabyte
|
||||
vgcreate -- doing automatic backup of volume group "test"
|
||||
vgcreate -- volume group "test" successfully created and activated
|
||||
</verb></tscreen>
|
||||
|
||||
So we now have an empty Volume Group, let's examine it a bit:
|
||||
|
||||
<tscreen><verb>
|
||||
# vgdisplay -v test
|
||||
--- Volume group ---
|
||||
VG Name test
|
||||
VG Access read/write
|
||||
VG Status available/resizable
|
||||
VG # 0
|
||||
MAX LV 256
|
||||
Cur LV 0
|
||||
Open LV 0
|
||||
MAX LV Size 255.99 GB
|
||||
Max PV 256
|
||||
Cur PV 2
|
||||
Act PV 2
|
||||
VG Size 184 MB
|
||||
PE Size 4 MB
|
||||
Total PE 46
|
||||
Alloc PE / Size 0 / 0
|
||||
Free PE / Size 46 / 184 MB
|
||||
|
||||
--- No logical volumes defined in test ---
|
||||
|
||||
|
||||
--- Physical volumes ---
|
||||
PV Name (#) /dev/hda3 (2)
|
||||
PV Status available / allocatable
|
||||
Total PE / Free PE 13 / 13
|
||||
|
||||
PV Name (#) /dev/hdb2 (1)
|
||||
PV Status available / allocatable
|
||||
Total PE / Free PE 33 / 33
|
||||
</verb></tscreen>
|
||||
Lots of data here - most of it should be understandable by now. We see that
|
||||
there are no Logical Volumes defined, so we should work to remedy that. We
|
||||
try to generate a 50 megabyte volume called 'HOWTO' in the Volume
|
||||
Group 'test':
|
||||
|
||||
<tscreen><verb>
|
||||
# lvcreate -L 50M -n HOWTO test
|
||||
lvcreate -- rounding up size to physical extent boundary "52 MB"
|
||||
lvcreate -- doing automatic backup of "test"
|
||||
lvcreate -- logical volume "/dev/test/HOWTO" successfully created
|
||||
</verb></tscreen>
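The rounding that lvcreate reports here is plain integer arithmetic. The
following is only a sketch to illustrate it, not part of the LVM tools: the
function name round_up_to_pe is made up for this example, and the 4 MB
extent size is the default that vgcreate reported above.

```shell
# round_up_to_pe SIZE_MB PE_MB
# Hypothetical helper (not an LVM command): round a requested size up to
# the next multiple of the physical extent size, both given in megabytes.
round_up_to_pe() {
    size_mb=$1
    pe_mb=$2
    # Number of extents needed, rounded up using integer division.
    extents=$(( (size_mb + pe_mb - 1) / pe_mb ))
    echo $(( extents * pe_mb ))
}

round_up_to_pe 50 4    # the 50 MB request from the example; prints 52
```

A request for 50 MB therefore occupies 13 extents of 4 MB each.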
|
||||
|
||||
Ok, we're nearly there, let's make a filesystem:
|
||||
|
||||
<tscreen><verb>
|
||||
# mke2fs /dev/test/HOWTO
|
||||
mke2fs 1.18, 11-Nov-1999 for EXT2 FS 0.5b, 95/08/09
|
||||
Filesystem label=
|
||||
OS type: Linux
|
||||
Block size=1024 (log=0)
|
||||
Fragment size=1024 (log=0)
|
||||
13328 inodes, 53248 blocks
|
||||
2662 blocks (5.00%) reserved for the super user
|
||||
First data block=1
|
||||
7 block groups
|
||||
8192 blocks per group, 8192 fragments per group
|
||||
1904 inodes per group
|
||||
Superblock backups stored on blocks:
|
||||
8193, 24577, 40961
|
||||
|
||||
Writing inode tables: done
|
||||
Writing superblocks and filesystem accounting information: done
|
||||
# mount /dev/test/HOWTO /mnt
|
||||
# ls /mnt
|
||||
lost+found
|
||||
</verb></tscreen>
|
||||
|
||||
And we're done! Let's review our Volume Group, because it should be filled
|
||||
up a bit by now:
|
||||
|
||||
<tscreen><verb>
|
||||
# vgdisplay test -v
|
||||
--- Volume group ---
|
||||
VG Name test
|
||||
VG Access read/write
|
||||
VG Status available/resizable
|
||||
VG # 0
|
||||
MAX LV 256
|
||||
Cur LV 1
|
||||
Open LV 1
|
||||
MAX LV Size 255.99 GB
|
||||
Max PV 256
|
||||
Cur PV 2
|
||||
Act PV 2
|
||||
VG Size 184 MB
|
||||
PE Size 4 MB
|
||||
Total PE 46
|
||||
Alloc PE / Size 13 / 52 MB
|
||||
Free PE / Size 33 / 132 MB
|
||||
|
||||
--- Logical volume ---
|
||||
LV Name /dev/test/HOWTO
|
||||
VG Name test
|
||||
LV Write Access read/write
|
||||
LV Status available
|
||||
LV # 1
|
||||
# open 1
|
||||
LV Size 52 MB
|
||||
Current LE 13
|
||||
Allocated LE 13
|
||||
Allocation next free
|
||||
Read ahead sectors 120
|
||||
Block device 58:0
|
||||
|
||||
|
||||
--- Physical volumes ---
|
||||
PV Name (#) /dev/hda3 (2)
|
||||
PV Status available / allocatable
|
||||
Total PE / Free PE 13 / 13
|
||||
|
||||
PV Name (#) /dev/hdb2 (1)
|
||||
PV Status available / allocatable
|
||||
Total PE / Free PE 33 / 20
|
||||
</verb></tscreen>
|
||||
|
||||
Well, it is. /dev/hda3 is completely unused, but /dev/hdb2 has 13 Physical
|
||||
Extents in use.
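The sizes in the vgdisplay output are extent counts multiplied by the extent
size. A small sketch to check the numbers, assuming the 4 MB extent size
shown above (pe_to_mb is a name invented for this example):

```shell
# pe_to_mb EXTENTS PE_MB
# Hypothetical helper: convert a number of physical extents to megabytes.
pe_to_mb() {
    echo $(( $1 * $2 ))
}

echo "VG size:   $(pe_to_mb 46 4) MB"   # Total PE
echo "Allocated: $(pe_to_mb 13 4) MB"   # Alloc PE
echo "Free:      $(pe_to_mb 33 4) MB"   # Free PE
```

This gives 184 MB, 52 MB and 132 MB, the same figures vgdisplay printed.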
|
||||
<sect1>Active and Inactive: kernel space and user space
|
||||
<p>
|
||||
As with all decent operating systems, Linux is divided into two parts: kernel
|
||||
space and user space. Userspace is sometimes called userland, which would
|
||||
also be a good name for a theme park, 'Userland'.
|
||||
|
||||
Discovery, creation and modification of things pertaining to Logical Volume
|
||||
Management is done in user space, and then communicated to the kernel. Once
|
||||
a volume group or logical volume is reported to the kernel, it is
|
||||
called 'Active'. Certain changes can only be performed when an entity is
|
||||
active, others only when it is not.
|
||||
|
||||
<sect>Prerequisites
|
||||
<p>
|
||||
LVM is available for a wide range of kernels. In Linux 2.4,
|
||||
LVM will be fully integrated. From kernel 2.3.47 and onwards, LVM is in the
|
||||
process of being merged into the main kernel.
|
||||
|
||||
<sect1>Kernel
|
||||
<sect2>Linux 2.4
|
||||
<p>
|
||||
Will contain everything you need. It is expected that most distributions
|
||||
will release with LVM included as a module. If you need to compile, just
|
||||
tick off the LVM option when selecting your block devices.
|
||||
<sect2>Linux 2.3.99.*
|
||||
<p>
|
||||
Once things have calmed down on the kernel development front, this section
|
||||
will vanish. For now, the gory details.
|
||||
|
||||
As we write this, Linux 2.3.99pre5 is current and it still needs a very tiny
|
||||
patch to get LVM working.
|
||||
|
||||
For Linux 2.3.99pre3, two patches were released:
|
||||
|
||||
The first patch was posted on linux-kernel, and is available <url name="here"
|
||||
url="http://ds9a.nl/lvm-howto/2.3.99pre3">.
|
||||
|
||||
Andrea Arcangeli improved on that patch, and supplied
|
||||
<url name="an incremental patch" url="http://ds9a.nl/lvm-howto/andrea.patch">,
|
||||
which should be applied on top of the 2.3.99pre3 LVM patch above.
|
||||
|
||||
For Linux 2.3.99pre5, bert hubert rolled these two patches into one and
|
||||
ported it to 2.3.99pre5. <url name="Patch"
|
||||
url="http://ds9a.nl/lvm-howto/2.3.99-pre5.lvm.patch">. Use with care.
|
||||
|
||||
2.3.99pre6-1, yes, a prerelease of a prepatch, features for the first time
|
||||
complete LVM support! It still lacks Andrea's patch but we have been assured
|
||||
that it is in the queue to be released real soon.
|
||||
|
||||
2.3.99pre4-ac1 has the tiny LVM patch in by default, and working. It does
|
||||
not contain Andrea's patch though.
|
||||
|
||||
<sect2>Linux 2.2
|
||||
<p>FIXME: write this
|
||||
<sect2>Linux 2.3
|
||||
<p>
|
||||
FIXME: write this
|
||||
<sect1>Userspace
|
||||
<p>
|
||||
You need the tools available from the <url name="LVM site"
|
||||
url="http://lvm.msede.com/lvm">. Compiling them on glibc2.1 systems requires
|
||||
a tiny patch, and even then gives errors on Debian 2.2.
|
||||
|
||||
|
||||
<sect>Growing your filesystem
|
||||
<p>
|
||||
You can do this with a provided script which does a lot of work for you, or
|
||||
you can do it by hand if needed.
|
||||
<sect1>With e2fsadm
|
||||
<p>
|
||||
If there is room within your volume group, and you use the ext2 filesystem
|
||||
(most people do), you can use this handy tool.
|
||||
|
||||
The <tt>e2fsadm</tt> command uses the commercial resize2fs tool. While
|
||||
people feel that this is good software, it is not very widely installed.
|
||||
|
||||
If you want to use the FSF's <tt>ext2resize</tt> command, you need to inform
|
||||
<tt>e2fsadm</tt> of this:
|
||||
|
||||
<tscreen><verb>
|
||||
# export E2FSADM_RESIZE_CMD=ext2resize
|
||||
# export E2FSADM_RESIZE_OPTS=""
|
||||
</verb></tscreen>
|
||||
|
||||
The rest is easy, <tt>e2fsadm</tt> is a lot like the other LVM commands:
|
||||
|
||||
<tscreen><verb>
|
||||
# e2fsadm /dev/test/HOWTO -L+50M
|
||||
e2fsadm -- correcting size 102 MB to physical extent boundary 104 MB
|
||||
e2fsck 1.18, 11-Nov-1999 for EXT2 FS 0.5b, 95/08/09
|
||||
Pass 1: Checking inodes, blocks, and sizes
|
||||
Pass 2: Checking directory structure
|
||||
Pass 3: Checking directory connectivity
|
||||
Pass 4: Checking reference counts
|
||||
Pass 5: Checking group summary information
|
||||
/dev/test/HOWTO: 11/25688 files (0.0% non-contiguous), 3263/102400 blocks
|
||||
lvextend -- extending logical volume "/dev/test/howto" to 104 MB
|
||||
lvextend -- doing automatic backup of volume group "test"
|
||||
lvextend -- logical volume "/dev/test/HOWTO" successfully extended
|
||||
|
||||
ext2_resize_fs
|
||||
ext2_grow_fs
|
||||
ext2_block_relocate
|
||||
ext2_block_relocate_grow
|
||||
ext2_grow_group
|
||||
ext2_add_group
|
||||
ext2_add_group
|
||||
ext2_add_group
|
||||
ext2_add_group
|
||||
ext2_add_group
|
||||
ext2_add_group
|
||||
direct hits 4096 indirect hits 0 misses 1
|
||||
e2fsadm -- ext2fs in logical volume "/dev/test/HOWTO" successfully extended to 104 MB
|
||||
</verb></tscreen>
|
||||
|
||||
<sect1>Growing your Logical Volume
|
||||
<p>
|
||||
The <tt>e2fsadm</tt> command takes care of this for you. However, it may be
|
||||
useful to understand how to do this manually:
|
||||
|
||||
If you have room within your Volume Group, this is a one liner:
|
||||
<tscreen><verb>
|
||||
# lvextend -L+12M /dev/test/HOWTO
|
||||
lvextend -- rounding size to physical extent boundary
|
||||
lvextend -- extending logical volume "/dev/test/HOWTO" to 116 MB
|
||||
lvextend -- doing automatic backup of volume group "test"
|
||||
lvextend -- logical volume "/dev/test/HOWTO" successfully extended
|
||||
</verb></tscreen>
|
||||
|
||||
<sect1>Growing your Volume Group
|
||||
<p>
|
||||
This is done with the vgextend utility, and is easy as pie. You first need
|
||||
to create a physical volume. This is done with the <tt>pvcreate</tt>
|
||||
utility. With this tool, you convert any block device into a physical
|
||||
volume.
|
||||
|
||||
After that is done, <tt>vgextend</tt> does the rest:
|
||||
<tscreen><verb>
|
||||
# pvcreate /dev/sda1
|
||||
pvcreate -- physical volume "/dev/sda1" successfully created
|
||||
# vgextend webgroup /dev/sda1
|
||||
vgextend -- INFO: maximum logical volume size is 255.99 Gigabyte
|
||||
vgextend -- doing automatic backup of volume group "webgroup"
|
||||
vgextend -- volume group "webgroup" successfully extended
|
||||
</verb></tscreen>
|
||||
|
||||
Please note that in order to do this, your Volume Group needs to be
|
||||
active. You can activate it by executing 'vgchange -a y webgroup'.
|
||||
|
||||
<sect1>Growing your filesystem
|
||||
<p>
|
||||
If you want to do this manually, there are a couple of ways to do this.
|
||||
<sect2>ext2 off-line with ext2resize
|
||||
<p>
|
||||
By off-line, we mean that you have to unmount the file-system to make
|
||||
these changes. The file-system and its data will be unavailable while
|
||||
doing this. Note this means you must use other boot media if extending
|
||||
the size of the root or other important partitions.
|
||||
<p>
|
||||
The ext2resize tool is available on the GNU ftp site, but most distributions
|
||||
carry it as a package. The syntax is very straightforward:
|
||||
<tscreen><verb>
|
||||
# ext2resize /dev/HOWTO/small 40000
|
||||
</verb></tscreen>
|
||||
Where 40000 is the number of blocks the filesystem should have after growing
|
||||
or shrinking.
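Since the size is given as a block count rather than in megabytes, a small
conversion helps. This is a rough sketch: mb_to_blocks is a hypothetical
helper, and the 1024-byte block size is an assumption taken from the mke2fs
output shown earlier, so check the block size of your own filesystem first.

```shell
# mb_to_blocks SIZE_MB BLOCK_SIZE_BYTES
# Hypothetical helper: how many filesystem blocks fit in SIZE_MB megabytes.
mb_to_blocks() {
    echo $(( $1 * 1024 * 1024 / $2 ))
}

mb_to_blocks 52 1024    # prints 53248
```

The result matches the block count mke2fs reported for the 52 MB Logical
Volume in the Show and Tell example.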
|
||||
|
||||
<sect2>ext2 on-line
|
||||
<p>
|
||||
FIXME: write this
|
||||
|
||||
<sect>Replacing disks
|
||||
<p>
|
||||
This is one of the benefits of LVM. Once you start seeing errors
|
||||
on a disk, it is high time to move your data. With LVM this is easy as pie.
|
||||
We first do the obvious replacement example where you add a disk to the
|
||||
system that's at least as large as the one you want to replace.
|
||||
|
||||
To move data, we move Physical Extents of a Volume Group to another disk, or
|
||||
more precisely, to another Physical Volume. For this, LVM offers us the
|
||||
<tt>pvmove</tt> utility.
|
||||
|
||||
Let's say that our suspicious disk is called /dev/hda1 and we want to replace
|
||||
it with /dev/sdb3. We first add /dev/sdb3 to the Volume Group that contains
|
||||
/dev/hda1.
|
||||
|
||||
It appears advisable to unmount any filesystems on this Volume Group before
|
||||
doing this. Having a full backup might not hurt either.
|
||||
|
||||
FIXME: is this necessary?
|
||||
|
||||
We then execute <tt>pvmove</tt>. In its simplest invocation, we
|
||||
just mention the disk we want to remove, like this:
|
||||
|
||||
<tscreen><verb>
|
||||
# pvmove /dev/hda1
|
||||
pvmove -- moving physical extents in active volume group "test1"
|
||||
pvmove -- WARNING: moving of active logical volumes may cause data loss!
|
||||
pvmove -- do you want to continue? [y/n] y
|
||||
pvmove -- doing automatic backup of volume group "test1"
|
||||
pvmove -- 12 extents of physical volume "/dev/hda1" successfully moved
|
||||
</verb></tscreen>
|
||||
|
||||
Please heed this warning. Also, it appears that at least some kernels or LVM
|
||||
versions have trouble with this command. I tested it with 2.3.99pre6-2, and
|
||||
it works, but be warned.
|
||||
|
||||
Now that /dev/hda1 no longer contains any Physical Extents, we can remove it
|
||||
from the Volume Group:
|
||||
|
||||
<tscreen><verb>
|
||||
# vgreduce test1 /dev/hda1
|
||||
vgreduce -- doing automatic backup of volume group "test1"
|
||||
vgreduce -- volume group "test1" successfully reduced by physical volume:
|
||||
vgreduce -- /dev/hda1
|
||||
</verb></tscreen>
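The whole replacement, including the step of adding the new disk with
pvcreate and vgextend as described under 'Growing your Volume Group', can be
summarised as a short plan. The helper below is hypothetical and deliberately
only prints the commands, so you can review them before typing anything as
root; the device and group names are the ones used in this section.

```shell
# plan_disk_replacement VG OLD_PV NEW_PV
# Hypothetical helper: PRINTS the commands in order, it does not run them.
plan_disk_replacement() {
    vg=$1
    old_pv=$2
    new_pv=$3
    echo "pvcreate $new_pv"        # turn the new disk into a Physical Volume
    echo "vgextend $vg $new_pv"    # add it to the Volume Group
    echo "pvmove $old_pv"          # move all extents off the old disk
    echo "vgreduce $vg $old_pv"    # drop the old disk from the Volume Group
}

plan_disk_replacement test1 /dev/hda1 /dev/sdb3
```

Printing instead of executing is a deliberate choice here, given the open
questions about pvmove on active volumes.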
|
||||
|
||||
FIXME: we need clarity on a few things. Should the volume group be active?
|
||||
When do we get data loss?
|
||||
|
||||
<sect1>When it's too late
|
||||
<p>
|
||||
If a disk fails without warning and you are unable to move the Physical Extents
|
||||
off it to a different Physical Volume you will have lost data unless the
|
||||
Logical Volumes on the PV that failed were mirrored. The correct course of
|
||||
action is to replace the failed PV with an identical one or at least a
|
||||
partition of the same size.
|
||||
|
||||
The directory /etc/lvmconf contains backups of the LVM metadata: the
structures that turn disks into Physical Volumes, which Volume Group each
PV belongs to, and which Logical Volumes are in that Volume Group.
|
||||
|
||||
After replacing the faulty disk you can use the
|
||||
<tt>vgcfgrestore</tt> command to recover the LVM data to the new PV. This
|
||||
restores the Volume Group and all its info, but it does not restore the
|
||||
data that was in the Logical Volumes. This is why most LVM commands make
|
||||
automatic backups of the LVM data when making changes.
|
||||
|
||||
<sect>Making snapshots for consistent backups
|
||||
<p>
|
||||
This is one of the more incredible possibilities. Let's say you have a busy
|
||||
server, with lots of things happening. For a useful backup, you need to shut
|
||||
down a large number of programs because otherwise you end up with inconsistencies.
|
||||
|
||||
The canonical example is moving a file from /tmp to /root, where /root was
|
||||
being backed up first. When /root was read, the file wasn't there yet. By
|
||||
the time /tmp was backed up, the file was gone.
|
||||
|
||||
Another story goes for saving databases or directories. We have no clue if a
|
||||
file is in any usable state unless we give the application time to do a
|
||||
clean shutdown.
|
||||
|
||||
Which is where another problem pops up. We shut down our applications, make
|
||||
our backup, and restart them again. This is all fine as long as the backup
|
||||
only takes a few minutes, but gets to be real painful if it takes hours, or
|
||||
if you're not even sure how long it takes.
|
||||
|
||||
LVM to the rescue.
|
||||
|
||||
With LVM we can make a snapshot picture of a Logical Volume which is
|
||||
instantaneous, and then mount that and make a backup of it.
|
||||
|
||||
Let's try this out:
|
||||
|
||||
<tscreen><verb>
|
||||
# mount /dev/test/HOWTO /mnt
|
||||
# echo > /mnt/a.test.file
|
||||
# ls /mnt/
|
||||
a.test.file lost+found
|
||||
# ls -l /mnt/
|
||||
total 13
|
||||
-rw-r--r-- 1 root root 1 Apr 2 00:28 a.test.file
|
||||
drwxr-xr-x 2 root root 12288 Apr 2 00:28 lost+found
|
||||
</verb></tscreen>
|
||||
|
||||
Ok, we now have something to work with. Let's make the snapshot:
|
||||
|
||||
<tscreen><verb>
|
||||
# lvcreate --size 16m --snapshot --name snap /dev/test/HOWTO
|
||||
lvcreate -- WARNING: all snapshots will be disabled if more than 16 MB are changed
|
||||
lvcreate -- INFO: using default snapshot chunk size of 64 KB
|
||||
lvcreate -- doing automatic backup of "test"
|
||||
lvcreate -- logical volume "/dev/test/HOWTO" successfully created
|
||||
</verb></tscreen>
|
||||
|
||||
More on the '--size' parameter later. Let's mount the snapshot:
|
||||
|
||||
<tscreen><verb>
|
||||
# mount /dev/test/snap /snap
|
||||
# ls -l /snap
|
||||
total 13
|
||||
-rw-r--r-- 1 root root 1 Apr 2 00:28 a.test.file
|
||||
drwxr-xr-x 2 root root 12288 Apr 2 00:28 lost+found
|
||||
</verb></tscreen>
|
||||
Now we erase a.test.file from the original, and check if it's still there in
|
||||
the snapshot:
|
||||
<tscreen><verb>
|
||||
# rm /mnt/a.test.file
|
||||
# ls -l /snap
|
||||
total 13
|
||||
-rw-r--r-- 1 root root 1 Apr 2 00:28 a.test.file
|
||||
drwxr-xr-x 2 root root 12288 Apr 2 00:28 lost+found
|
||||
</verb></tscreen>
|
||||
|
||||
Amazing Mike!
|
||||
<sect1>How does it work?
|
||||
<p>Remember that we had to set the '--size' parameter? What really happens
|
||||
is that the 'snap' volume needs to have a copy of all blocks or 'chunks' as
|
||||
LVM calls them, which are changed in the original.
|
||||
|
||||
When we erased a.test.file, its inode was removed. This caused 64 KB to be
|
||||
marked as 'dirty' - and a copy of the original data was written to the
|
||||
'snap' volume. In this case we allocated 16MB for the snapshot, so if more
|
||||
than 16MB of "chunks" are modified, the snapshot will be deactivated.
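To get a feeling for these numbers, you can work out how many chunks fit in
a snapshot. This is a sketch only: snapshot_chunks is a made-up helper, and
it ignores any space LVM itself may need for bookkeeping, so treat the
result as an upper bound.

```shell
# snapshot_chunks SNAPSHOT_MB CHUNK_KB
# Hypothetical helper: upper bound on the number of changed chunks a
# snapshot of SNAPSHOT_MB megabytes can hold, for a chunk size in kilobytes.
snapshot_chunks() {
    echo $(( $1 * 1024 / $2 ))
}

snapshot_chunks 16 64    # the 16 MB snapshot with 64 KB chunks; prints 256
```

So in the example at most 256 distinct chunks of the original may change
before the snapshot is disabled.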
|
||||
|
||||
To determine the correct size for a snapshot partition, you will have to
|
||||
guess based on usage patterns of the primary LV, and the amount of time
|
||||
the snapshot will be active. For example, an hour-long backup in the
|
||||
middle of the night when nobody is using the system may require
|
||||
very little space.
|
||||
|
||||
Please note that snapshots are not persistent. If you unload LVM or reboot,
|
||||
they are gone, and need to be recreated.
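Putting the pieces of this section together, a backup run might look like
the sequence below. This is a hedged sketch: plan_snapshot_backup is a
hypothetical helper that only prints the commands, the archive path
/backup/howto.tar is made up, and using tar for the backup itself is an
assumption, since any backup program can read the mounted snapshot.

```shell
# plan_snapshot_backup ORIGIN_LV SNAP_NAME SNAP_SIZE MOUNT_POINT ARCHIVE
# Hypothetical helper: PRINTS the commands so you can review them first.
plan_snapshot_backup() {
    origin=$1        # e.g. /dev/test/HOWTO
    snap_name=$2     # e.g. snap
    snap_size=$3     # e.g. 16m
    mount_point=$4   # e.g. /snap
    archive=$5       # e.g. /backup/howto.tar (made-up path)
    vg_dir=$(dirname "$origin")    # /dev/test for /dev/test/HOWTO
    echo "lvcreate --size $snap_size --snapshot --name $snap_name $origin"
    echo "mount $vg_dir/$snap_name $mount_point"
    echo "tar -cf $archive -C $mount_point ."
    echo "umount $mount_point"
    echo "lvremove $vg_dir/$snap_name"
}

plan_snapshot_backup /dev/test/HOWTO snap 16m /snap /backup/howto.tar
```

Removing the snapshot once the backup is finished frees its extents for the
next run.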
|
||||
|
||||
<sect>Redundancy & Performance
|
||||
<p>
|
||||
For performance reasons, it is possible to spread data in a 'stripe' over
|
||||
multiple disks. This means that block 1 is on Physical Volume A, and block 2
|
||||
is on PV B, while block 3 may be on PV A again. You can also stripe over
|
||||
more than 2 disks.
|
||||
|
||||
This arrangement means that you have more disk bandwidth available. It also
|
||||
means that more 'spindles' are involved. More on this later.
|
||||
|
||||
Besides increasing performance, it is also possible to have your data in
|
||||
copies on multiple disks. This is called mirroring. Currently, LVM does not
|
||||
support this natively but there are ways to achieve this.
|
||||
|
||||
<sect1>Why stripe?
<p>
Disk performance is influenced by at least three things. The most obvious
is the speed at which data on a disk can be read or written sequentially.
This is the limiting factor when reading or writing a large file on a
SCSI/IDE bus with only a single disk on it.

Then there is the bandwidth available TO the disk. If you have 7 disks on a
SCSI bus, the share available to each disk may well be less than the write
speed of the disk itself. If you spend enough money, you can prevent this
bottleneck from being a problem.

Then there is the latency. As the saying goes, latency is always bad news.
And even worse, you can't spend more money to get lower latency! Most disks
these days appear to have a latency somewhere around 7ms. Then there is the
SCSI latency, which used to be something like 25ms.

FIXME: need recent numbers!

What does this mean? The combined latency would be around 30ms in a typical
case. You can therefore perform only around 33 disk operations per second
(1000ms / 30ms). If you want to be able to do many thousands of queries per
second, and you don't have a massive cache, you are very much out of luck.

If you have multiple disks, or 'spindles', working in parallel, you can have
multiple commands being performed concurrently, which nicely circumvents
your latency problem. Some applications, like a huge news server, don't even
work anymore without striping or other I/O smartness.

This is what striping does. And, if your bus is up to it, even sequential
reading and writing may go faster.
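
The arithmetic behind the latency figures above is simple enough to sketch.
The model below is deliberately naive: it assumes each request pays the
full latency and that requests are spread perfectly over the spindles,
which real workloads rarely manage. The function name is ours.

```python
def ops_per_second(latency_ms, spindles=1):
    """Upper bound on random disk operations per second.

    Assumes each operation costs the full latency and that the load is
    spread evenly over all spindles (an idealisation).
    """
    return spindles * 1000.0 / latency_ms


print(round(ops_per_second(30)))              # a single disk at 30ms
print(round(ops_per_second(30, spindles=4)))  # four spindles in parallel
```
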
<sect1>Why not?
<p>
Striping without further measures raises the chance of losing any given
bit of your data. If any one of your disks dies, your entire Logical Volume
is gone. If you just concatenate disks, only part of your filesystem is gone.

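To put a number on that, assume each disk fails independently with the same
probability over some period. This is a simplification, since real disk
failures are often correlated, and the 5% figure below is invented for the
example. A striped volume is lost as soon as any one of its disks fails:

```python
def stripe_loss_probability(p_disk, disks):
    """Chance of losing a striped LV when any single disk failure is fatal.

    p_disk is the assumed, independent failure probability of one disk
    over the period of interest.
    """
    return 1.0 - (1.0 - p_disk) ** disks


# Hypothetical 5% failure chance per disk.
for n in (1, 2, 4):
    print(n, round(stripe_loss_probability(0.05, n), 4))
```
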
The ultimate option is the mirrored stripe.

FIXME: make a mirrored stripe with LVM and md
<sect1>LVM native striping
<p>
The stripe configuration is specified when creating the Logical Volume
with lvcreate. There are two relevant parameters. With -i we tell LVM how
many Physical Volumes it should scatter the data over. Striping is not really
done on a bit-by-bit basis, but on blocks. With -I we can specify the
granularity in kilobytes. Note that this must be a power of 2, and that the
coarsest granularity is 128 KB.

Example:
<tscreen><verb>
# lvcreate -n stripedlv -i 2 -I 64 mygroup -L 20M
lvcreate -- rounding 20480 KB to stripe boundary size 24576 KB / 6 PE
lvcreate -- doing automatic backup of "mygroup"
lvcreate -- logical volume "/dev/mygroup/stripedlv" successfully created
</verb></tscreen>

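The 'rounding' message can be reproduced by hand. The output implies a
Physical Extent size of 4096 KB (24576 KB / 6 PE). Every stripe must get
the same number of extents, so the 5 extents needed for 20 MB are rounded
up to 6. The sketch below mirrors that arithmetic; it is our reading of the
message, not code taken from LVM, and the function name is made up.

```python
import math


def round_to_stripe_boundary(size_kb, pe_kb, stripes):
    """Round a requested LV size up to a whole number of PEs per stripe.

    Returns (rounded size in KB, number of PEs).
    """
    pes = math.ceil(size_kb / pe_kb)          # whole extents needed
    pes = math.ceil(pes / stripes) * stripes  # same number on every PV
    return pes * pe_kb, pes


# 20 MB requested, 4096 KB extents, 2 stripes, as in the example above.
print(round_to_stripe_boundary(20480, 4096, 2))
```
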
<sect2>Performance notes
<p>
The performance 'gain' may well be very negative if you stripe over 2 partitions
of the same disk - take care to prevent that. Striping with two disks on a
single IDE bus also appears useless - unless IDE has improved beyond what I
remember.

FIXME: is this still true?

Older motherboards may have two IDE buses, but the second one used to be
crippled, dedicated to serving a slow CD-ROM drive. You can perform
benchmarks with several tools, the most noteworthy being 'Bonnie'. Russell
Coker has released <url name="Bonnie++"
url="http://www.coker.com.au/bonnie++/">, which may be used to measure
performance data.

<sect1>Hardware RAID
<p>
Many high-end Intel x86 servers have hardware RAID controllers. Most of
them have at least 2 independent SCSI channels. Fortunately, this has very
little bearing on LVM. Before Linux can see anything on such a controller,
the administrator must define a logical drive within the RAID controller
itself. As an example, [s]he could choose to stripe together two disks on
SCSI channel A and then mirror them onto two disks on channel B. This
is a typical RAID 0/1 configuration that maximises performance and
data security. When Linux boots on a machine with this configuration,
it can only 'see' one disk on the RAID controller: the logical drive
that contains the four disks in the RAID 0/1 stripeset.
This means that, as far as LVM is concerned, there is just one disk
in the machine and it is to be used as such. If one of the disks
fails, LVM won't even know. When the administrator replaces the disk
(even on the fly with hot-swap hardware), LVM won't know about that
either; the controller will resync the mirrored data and all will
be well.

This is where most people take a step back and ask "Then what good
does LVM do for me with this RAID controller?"

The easy answer is that, in most cases, once you have defined a logical
drive in the RAID controller you can't add more disks to that drive
later. So if you miscalculate the space requirements, or you
simply need more space, you can't add a new disk or set of disks
to a pre-existing stripeset. Instead you must create a new
RAID stripeset in the controller, and then with LVM you can simply
extend the LVM Logical Volume so that it seamlessly spans both
stripesets in the RAID controller.

FIXME: Is there more needed on this subject?

<sect1>Linux software RAID
<p>
Linux 2.4 comes with very good RAID in place. Linux 2.2 by default, as
released by Alan Cox, features an earlier RAID version that's not well
regarded. The reason that 2.2 still features this earlier release is that the
kernel people don't want to make changes within a stable version that
require userland updates.

Most people, including Red Hat, Mandrake and SuSE, chose to replace it
with the 0.90 version, which appears to be excellent.

We will only treat the 0.90 version here.

FIXME: write more of this

<sect>Cookbook
<p>
<sect1>Moving LVM disks between computers
<p>
With all this new technology, simple tasks like moving disks from one machine
to another can get a bit tricky. Before LVM, users only had to put the disk
into the new machine and mount the filesystems. With LVM there is a bit more
to it. The LVM structures are saved both on the disks and in the /etc/lvmconf
directory, so the only thing that has to be done to move a disk or a set of
disks that contain a Volume Group is to make sure the machine that the
VG belonged to will not miss it. That is accomplished with the <tt>vgexport</tt>
command. <tt>vgexport</tt> simply removes the structures for the VG from
/etc/lvmconf, but does not change anything on the disks. Once the disks are
in the new machine (they don't have to have the same IDs) the only thing
that has to be done is to update /etc/lvmconf. That's done with <tt>vgimport</tt>.

Example:

On machine #1, after unmounting any filesystems that live on the Volume Group:
<tscreen><verb>
vgchange -a n vg01
vgexport vg01
</verb></tscreen>
On machine #2:
<tscreen><verb>
vgimport vg01 /dev/sda1 /dev/sdb1
vgchange -a y vg01
</verb></tscreen>

Notice that you don't have to use the same name for the Volume Group. If the
vgimport command did not save a configuration backup, use <tt>vgcfgbackup</tt>
to do it.

<sect1>Rebuilding /etc/lvmtab and /etc/lvmtab.d
<p>

FIXME: write about more neat stuff

<sect>Further reading
<p>
<descrip>
<tag><url name="LVM site" url="http://lvm.msede.com/lvm/"></tag>
The main LVM resource available
<tag><url name="German LVM HOWTO" url="http://litefaden.com/lite00/lvm/"></tag>
If you can read German, this already contains a lot of information
<tag><url name="Translation of the German HOWTO" url=
"ftp://linux.msede.com/howto/"></tag>
Peter.Wuestefeld@resnova.de is translating the German HOWTO into English. It
appears that they will soon be investing lots of time in it. If you doubt
our HOWTO or miss something, please try their effort.
<tag><url name="HP/UX Managing Disks Guide"
url="http://docs.hp.com/cgi-bin/omcgi/omdoc?action=getcon&ID=7425"></tag>
Since the Linux LVM is almost an exact workalike of the HP/UX
implementation, their documentation is very useful to us as well. Very good
stuff.
</descrip>

<sect>Acknowledgements & Thanks to
<p>
We try to list everybody here who helped make this HOWTO. This includes
people who send in updates, fixes or contributions, but also people who
have aided our understanding of the subject.
<itemize>
<item> Axel Boldt <axel@uni-paderborn.de></item>
<item> Sean Reifschneider <jafo@tummy.com>
<item> Alexander Talos <at@atat.at>
<item> Eric Maryniak <e.maryniak@pobox.com>
</itemize>
</article>