Partitions-Mass-Storage-Definitions-Naming-HOWTO

Jean-Daniel Dodin

jdd@dodin.org

Partitions-Mass-Storage-Definitions-Naming-HOWTO V0.1 2009-05-09 jdd

Copyright and Licence

The copyright of this document belongs to the author, Jean-Daniel Dodin, under the following licence: GNU Free Documentation License.

Mass Storage Involved Here

The mass storage devices covered in this HOWTO are the rewritable, random access ones. Most of them are rotating magnetic disks (floppies, hard drives) or flash memory (USB keys or any kind of memory card). For example, CD-ROMs and DVDs are not covered by this HOWTO (see Wikipedia), and neither are tapes.

Mass storage devices are used by the kernel, so the basic documentation can be found on the kernel Web site. The reference site should be that of the International Disk Drive Equipment and Materials Association (IDEMA). I say "should" because this Web site is not very friendly.

Definitions

Warning

Many definitions about drives are only virtual. That is, they are still used, but the hardware is often quite different from the expected description. Usually this has no ill effect: any mass storage device has to be seen as a black box.

Bytes

Computers count in binary, with 1 and 0: 1111100001110... To make this easier to read, humans use nibbles (4 bits), often shown as hexadecimal digits from 0 to f (0123456789abcdef). Nibbles are usually grouped by two, which gives a byte. The most used memory unit is the byte and its multiples: KiB (kibibytes), MiB (mebibytes), GiB (gibibytes). The "i" denotes the binary multiple (one Ki is 1024, not 1000); the uppercase "B" denotes bytes, not bits.
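The nibble and binary-multiple arithmetic can be tried out with a few lines of Python. This is only a small sketch; the function names human_size and byte_to_hex are made up for this illustration.

```python
# Binary multiples: each unit is 1024 times the previous one.
UNITS = ["B", "KiB", "MiB", "GiB", "TiB"]


def human_size(n_bytes):
    """Format a byte count using the largest binary unit that fits."""
    value = float(n_bytes)
    for unit in UNITS[:-1]:
        if value < 1024:
            return f"{value:.2f} {unit}"
        value /= 1024
    return f"{value:.2f} {UNITS[-1]}"


def byte_to_hex(byte):
    """Show one byte (0..255) as its two nibbles in hexadecimal."""
    if not 0 <= byte <= 255:
        raise ValueError("a byte holds values from 0 to 255")
    high, low = byte >> 4, byte & 0x0F  # upper and lower 4 bits
    return f"{high:x}{low:x}"
```

For example, byte_to_hex(0b11111000) gives "f8", and human_size(1024) gives "1.00 KiB".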

Sectors

Sometimes the word "block" is used in place of "sector". Mass storage devices (at least the ones we are dealing with here) store bytes in sectors of 512 bytes. This is awkward, because any sector count has to be divided by two to get the size in KiB, so most partitioning software accepts the letters k (KiB), m (MiB), g (GiB)... as size suffixes. Wise ones ignore the case of the letter.

The sector size is the count of bytes available to the user. The true sector is bigger, as it has to include housekeeping data. You don't have to worry about that.

Notice that on 2006-03-22, IDEMA announced a new sector size of 4 KiB (4096 bytes). The announcement is a .doc file, which can be opened with OpenOffice.org.
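The conversion between sector counts and sizes can be sketched as follows, assuming 512-byte sectors. The suffix handling only imitates what partitioning tools do; it is not the parser of any real tool, and treating a plain number as a byte count is an assumption of this sketch.

```python
SECTOR_SIZE = 512  # bytes per sector, as described in the text

# Size suffixes, matched without regard to case.
SUFFIXES = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def sectors_to_kib(sectors):
    """Two 512-byte sectors make one KiB."""
    return sectors * SECTOR_SIZE / 1024


def size_to_sectors(text):
    """Convert a size such as '100m' or '2G' to a sector count.

    A plain number is taken as a byte count (assumption of this sketch).
    The result is rounded up to whole sectors.
    """
    text = text.strip().lower()
    if text and text[-1] in SUFFIXES:
        n_bytes = int(text[:-1]) * SUFFIXES[text[-1]]
    else:
        n_bytes = int(text)
    return -(-n_bytes // SECTOR_SIZE)  # ceiling division
```

With these definitions, sectors_to_kib(2880) gives 1440.0 (a floppy), and size_to_sectors("1M") gives 2048.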

Heads

Rotating mass storage devices use heads. True heads are the physical electromechanical devices that write and read the magnetic tracks. Drives are made of rotating plates, and each plate has two sides, so a disk can have two heads per plate. With two plates (a frequent case), you have four heads. Heads write through a very complex system.

Tracks

The plates rotate. When a head stays still, the rotation of the plate and the width of the head define a track. Heads move from the outer part of the plate to the inner part, step by step. Each step defines a new track.

Cylinders

Heads move together, all at the same time. They may rotate (around their own pivot, not the plate center, of course). They may also move linearly. You can see an example of a linearly moving head in any CD reader, by looking at the movement of the laser head. Most disks are as shown in the Wikipedia image.

When you think of all the tracks defined by each head at the same time, you have a cylinder. So on a rotating drive, all the tracks of the same cylinder are read or written at the same time. The actual data is spread over all the plates. The way the data is actually written is up to the drive manufacturer, not the user.

Disks

Small disks are used directly as a whole bunch of sectors. Basic programs can access data directly on sectors, and many do (like dd or any partitioning program).

But we live in a world of extremely high capacity mass storage. Terabytes are normal nowadays (2009), while a complete Linux system can live on a floppy (1440 KiB). So there is a need to divide a mass storage device into several parts: the partitions.

Partitions

Partitioning is a means to divide a single drive into many logical drives. A partition is a contiguous set of sectors. To reduce head travel, partitions can be "aligned" on the cylinder size, that is, use an integer number of cylinders. This is not always done, but it should be, as it has many other advantages for recovery.

Partition Table

As you can have many partitions, you need a partition table. This partition table is stored at the very beginning of the drive. It's very unlikely that you will have to change this table by writing bytes directly with a hexadecimal editor, so we won't say more about the position of the table.

There are many operating systems around that share similar hardware, and as many partition systems. We will look only at what one can find in a PC, even if that is not easy to define nowadays. Say that, for us, a PC is any computer able to run Linux (I know, it's not always true).

Each kind of partition is noted in the table by a special flag called "type" ("t" in fdisk). The best known are type 83 for Linux partitions and 82 for Linux swap (hex numbers).

Notice that most operating systems can share partition tables. At least, if a disk is hardware compatible with several systems, these systems should be able to see what the others have done, so as not to erase a drive by accident. I can't say for sure that this is true in real life.

File Systems

Partitions can be accessed directly as sectors, like any part of the disk, but they are usually filled with a file system. File systems and partitions are related only because a file system lives in a partition, and that's all. You can have a disk without partitions but with a file system, or partitions without a file system (the swap partition being the best known). For details on file systems, see Wikipedia.

In summary, file systems allow storing data in files with human readable names and sorting the files in a friendly way, for example as directories, subdirectories, text, images...

Files and Nodes

Nearly everything you can find on a mass storage partition, besides sectors, is a file from a user's point of view. But computers are curious geeks, and you can treat files like disks if you want. Using the "loop" system, enabled by default in most Linux kernels, one can partition the inside of a file, create file systems in it and mount it. This is especially handy for experiments.

Some of these files are devices or nodes. Partitions are not files and are accessed via special nodes that we will see later. These nodes are not created by touch but by mknod. Use with caution. Nodes need a type ("c" for character or "b" for block) and major and minor numbers. For what we need, major numbers are disk numbers and minor numbers are partition numbers. The list is visible in /proc/partitions.

Creating a /dev/sda9 node with mknod is of no use by itself, given that this doesn't create the partition, only the node. In a usual Linux distribution, nodes are created dynamically at boot time, so nobody should have to do so. However, sometimes the automatic system fails.
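To see how the major and minor numbers in /proc/partitions relate to a node, here is a small sketch that parses text in the format of that file and prints the matching mknod command line without running it. The SAMPLE content is made up for the illustration; on a real system you would read /proc/partitions instead.

```python
# Hypothetical content in the format of /proc/partitions (not a real machine).
SAMPLE = """\
major minor  #blocks  name

   8        0  488386584 sda
   8        1   20971520 sda1
   8        2  467415064 sda2
"""


def parse_partitions(text):
    """Return a list of (major, minor, blocks, name) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header line and blank lines.
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        major, minor, blocks, name = fields
        entries.append((int(major), int(minor), int(blocks), name))
    return entries


def mknod_command(name, major, minor):
    """Build (but do not run) the mknod call for a block device node."""
    return f"mknod /dev/{name} b {major} {minor}"
```

For the second sample entry, mknod_command("sda1", 8, 1) gives "mknod /dev/sda1 b 8 1". The sketch only builds the string, since running mknod needs root and, as said above, caution.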

Drive Naming in Linux

Linux uses a special nomenclature to refer to mass storage devices, and it must be understood.

Naming Convention

Linux used to deal with two kinds of drives, depending on the electronic interface (controller): IDE and SCSI. Old-timers remember the days when CD writers were accessed through "SCSI emulation". In fact IDE and SCSI use mostly the same low level commands, and from 2007 on, with the new "SATA" interface, the naming was unified: in new distributions, all drives follow the same naming. For this part, CD or DVD readers/writers are seen like hard drives.

Old IDE Names

By convention, IDE drives were given the device names /dev/hda to /dev/hdd. Hard Drive A (/dev/hda) is the first drive and Hard Drive C (/dev/hdc) is the third.

A typical PC has two IDE controllers, each of which can have two drives connected to it. For example, /dev/hda is the first drive (master) on the first IDE controller and /dev/hdd is the second (slave) drive on the second controller (the fourth IDE drive in the computer). So, typically, a computer with IDE controllers can accommodate 4 drives: /dev/hda (primary master), /dev/hdb (primary slave), /dev/hdc (secondary master), /dev/hdd (secondary slave).

Some (rare) motherboards have more than two controllers, and some add-on cards can also have controllers. Their drives are named by following the alphabet, but you have to figure out what real names are given for your particular hardware.

You can connect drives wherever you want; it's not mandatory to fill the gaps. You may want to read about which drive or CD-ROM to connect where, but that is outside the scope of this document.
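The master/slave rule for the two standard controllers can be written as a tiny function. This is a sketch of the convention described above; the function name ide_device is made up, and for extra controllers the real names may differ on your hardware.

```python
import string


def ide_device(controller, position):
    """Map an IDE controller and position to the old /dev/hdX name.

    controller: 0 for the primary controller, 1 for the secondary one.
    position:   0 for master, 1 for slave.
    """
    if controller < 0 or position not in (0, 1):
        raise ValueError("controller must be >= 0, position 0 or 1")
    index = controller * 2 + position  # two drives per controller
    if index >= len(string.ascii_lowercase):
        raise ValueError("no letter left for this drive")
    return "/dev/hd" + string.ascii_lowercase[index]
```

So ide_device(0, 0) is "/dev/hda" (primary master) and ide_device(1, 1) is "/dev/hdd" (secondary slave).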

New Hard Drives Names

Now all rotating hard drives use the same names as the old SCSI drives, that is, "s" in place of "h": /dev/sda, and so on. The number of drives depends on the number of controllers on the motherboard or on the extension boards. Usually 4 are available. Which letter a drive gets is up to the controller card and the way it's read by the kernel, so it is difficult to say in advance.

Flash Drives Names

Flash drives are usually not connected through IDE or SATA interfaces, and so don't use the same names. Several interfaces are used, each with different names. The kernel documentation gives the names.

Low level Devices and Extra naming

You will find in some applications references to low-level SCSI devices and various naming conventions, for example in wodim (the command line CD burner). You may have to use some sort of SCSI:1,1,0 option to access the CD-ROM. Try to avoid this as much as possible, as it's very error prone and should be left to programmers only. I only mention it because you can't always avoid it.

If you do "ls -l /dev/ | more", you can see:

    sr0
    (...)
    crw-r----- 1 root disk 21, 0 mars 9 07:56 sg0
    crw-rw----+ 1 root disk 21, 1 mars 9 07:56 sg1

These scd, sr and sg devices are low-level interfaces (notice the "c" for "character" on the sg lines). Try not to use them.

dmesg and "more /var/log/boot.msg" should give you the usable sdX device, like this (short summary):

    sd 0:0:0:0: [sda] 976773168 512-byte hardware sectors: (500GB/465GiB)
    <5>sd 0:0:0:0: [sda] Write Protect is off
    <7>sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00

This means the drive is /dev/sda.

However, these outputs (given by dmesg and "more /var/log/boot.msg") used to be easy to read but are not any more. Now the kernel starts several drivers in parallel, so the messages are mixed, and you can get:

    sda:<6>USB Universal Host Controller Interface driver v3.0

This doesn't mean that your sda drive is a USB one, only that the USB module was started at the same time as the disk driver and sent its messages simultaneously. You still have a /dev/sda drive.
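The "hardware sectors" message above contains enough to check the sizes by hand: 976773168 sectors of 512 bytes. Here is a sketch that extracts the device name and recomputes both figures; the regular expression only targets this exact message wording, which varies between kernel versions.

```python
import re

# Matches e.g. "[sda] 976773168 512-byte hardware sectors".
PATTERN = re.compile(r"\[(\w+)\] (\d+) (\d+)-byte hardware sectors")


def parse_sd_line(line):
    """Return device name and sizes from one kernel message, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    name = match.group(1)
    total = int(match.group(2)) * int(match.group(3))  # sectors * sector size
    return {
        "device": "/dev/" + name,
        "bytes": total,
        "GB": total // 10 ** 9,   # decimal gigabytes, as on the box
        "GiB": total // 2 ** 30,  # binary gibibytes
    }
```

Applied to the first message, this gives 500107862016 bytes, that is 500 GB or 465 GiB, the same pair of numbers the kernel prints. A mixed-up line such as the USB one simply returns None.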

New Media Names

The dmesg content for inserting a USB key shows everything we were just speaking about: SCSI emulation, the scsi, sd and sg names, but also the sdb name, which is the most important for us.

The messages for a high speed SDHC card are different. While the two devices are probably built on the same flash memory chips, the USB key uses the USB interface and SCSI emulation, whereas the SDHC card uses the PCMCIA slot of the laptop, with a special device naming (/dev/mmcblk0). As far as partitioning is concerned, the use is the same.

Disk ID

In a world where disks are many and removable, it's impossible to track which device name is used by which disk. So there are now many other ways of naming a disk: "disk labels", "disk UUIDs" and also "partition labels". This makes it extremely difficult to work with basic tools. See the fstab man page for details.

Partition Naming in Linux

Numbers

Partition naming is thankfully simpler than drive naming. Partitions are simply given a number, from 0 up (decimal). Sometimes a "p" is placed in front of the number; the SDHC card, for example, gets a "p1" partition name.

Partition devices are listed in /proc/partitions. This file... is not a real file but is created on the fly. Don't worry: for what we need, it's a file.

The maximum number of partitions is 15 for SCSI and for all the drives using the new SATA driver, and 63 for IDE drives (0 is the full drive; 0 to 15 is four bits, 0 to 63 is six bits).
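The "four bits" and "six bits" remark can be made concrete: the low bits of the minor number hold the partition number, and the remaining bits select the disk. The sketch below assumes the usual layout of the sd driver (16 minors per disk) and of the old IDE driver (64 minors per disk); the helper names are invented, and the sd helper only handles sda to sdp.

```python
SD_BITS = 4   # 16 minors per SCSI/SATA disk: whole disk + 15 partitions
IDE_BITS = 6  # 64 minors per IDE disk: whole disk + 63 partitions


def split_minor(minor, bits):
    """Split a minor number into (disk index, partition number)."""
    per_disk = 1 << bits
    return minor // per_disk, minor % per_disk


def sd_minor(disk_letter, partition):
    """Minor number of /dev/sd<letter><partition>; partition 0 is the disk."""
    index = ord(disk_letter) - ord("a")
    if not 0 <= index <= 15:
        raise ValueError("this sketch only handles sda to sdp")
    if not 0 <= partition < (1 << SD_BITS):
        raise ValueError("sd disks have partitions 0 to 15")
    return (index << SD_BITS) + partition


def partition_name(disk, number):
    """A 'p' goes before the number when the disk name ends in a digit."""
    separator = "p" if disk[-1].isdigit() else ""
    return f"{disk}{separator}{number}"
```

For instance, sd_minor("b", 3) is 19, split_minor(19, SD_BITS) gives back (1, 3), and partition_name("mmcblk0", 1) is "mmcblk0p1".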

Meaning of the Numbers

Not all the numbers have the same meaning. This mess comes from PC history. One can divide floppies into partitions, and for them 4 seemed sufficient. But then came hard drives :-).

So the partition numbers 1, 2, 3 and 4 are primary partitions. One drive can only have 4 primaries. To go further, we have to use one of these primaries as a big container and sub-partition it, to get logical partitions. This big extended partition can be any of the 4. So, remember: the primary partitions are inside the drive, and the logical partitions are inside one of the primaries, called the extended partition.

Once the logical partitions are created, it's no longer advisable to write directly to the extended one. Writing to an extended partition would erase the logical ones, just as writing directly to a hard drive erases the partitions. Beware, it's possible!!

If, after creating 4 primary partitions, not all the disk space is used, the remaining space is lost (unusable). So most of the time, create the desired primaries first, then the extended one last, with all the remaining room. It's not necessary to create 4 primaries. You could use only one extended partition (Linux only), but there are some advantages to using primaries.

As there are 4 primaries, the first logical partition is always number 5. So any partition with a number of five and up is a logical one.
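The numbering rule fits in a few lines. This sketch only looks at the number; it cannot tell which of the four primaries is the extended one, since that depends on the partition type stored in the table.

```python
MAX_PRIMARY = 4  # numbers 1 to 4 are primaries (one of them may be extended)


def partition_kind(number):
    """Classify a partition by its number alone."""
    if number < 1:
        raise ValueError("partition numbers start at 1 (0 is the whole drive)")
    if number <= MAX_PRIMARY:
        return "primary or extended"
    return "logical"
```

So partition_kind(2) is "primary or extended" and partition_kind(5) is "logical".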

Device Major and Minor Numbers

The only important things about a device file are its major and minor device numbers, which ls -l shows instead of the file size. Such a listing shows the permissions (brw-rw----), owner (root), group (disk), major device number (8), minor device number (0), date (mars 9, in French, no year), hour (07:56) and device name (guess :-).

When a device file is accessed, the major number selects which device driver is called to perform the input/output operation. This call is made with the minor number as a parameter, and it is entirely up to the driver how the minor number is interpreted. The driver documentation usually describes how the driver uses minor numbers.
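As an illustration, the fields described above can be pulled out of an "ls -l" style line with a short function. The SAMPLE line is a reconstruction from the description (the link count and the name sda are assumptions), and the parser assumes the exact column order shown here.

```python
# Reconstructed example line; link count "1" and name "sda" are assumptions.
SAMPLE = "brw-rw---- 1 root disk 8, 0 mars 9 07:56 sda"


def parse_device_listing(line):
    """Extract type, owner, group, major, minor and name from one line."""
    fields = line.split()
    permissions, _links, owner, group, major, minor = fields[:6]
    kinds = {"b": "block", "c": "character"}
    return {
        "kind": kinds.get(permissions[0], "other"),
        "owner": owner,
        "group": group,
        "major": int(major.rstrip(",")),  # "8," -> 8
        "minor": int(minor),
        "name": fields[-1],
    }
```

On SAMPLE this yields a block device with major 8 and minor 0. A character device line such as "crw-rw----+ 1 root disk 21, 1 mars 9 07:56 sg1" yields kind "character", major 21, minor 1.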

Partition Types

Linux Partition Types

A partition is labeled to host a certain kind of file system (not to be confused with a volume label). Such a file system could be the Linux standard ext3 file system or Linux swap space, or even foreign file systems like (Microsoft) NTFS or (Sun) UFS. There is a numerical code associated with each partition type. For example, the code for ext2 is 0x83 and the code for Linux swap is 0x82 (0x means hexadecimal).

Foreign Partition Types

The partition type codes have been chosen arbitrarily (you can't figure out what they should be) and they are particular to a given operating system. Therefore, it is theoretically possible that, if you use two operating systems with the same hard drive, the same code might be used to designate two different partition types. OS/2 marks its partitions with a 0x07 type, and so does Windows NT's NTFS. MS-DOS allocates several type codes for its various flavors of FAT file systems: 0x01, 0x04 and 0x06 are known. DR-DOS used 0x81 to indicate protected FAT partitions, creating a type clash with Linux/Minix at that time, but neither Linux/Minix nor DR-DOS is widely used any more.
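Since the codes are arbitrary, the only way to decode them is a lookup table. The sketch below holds just the codes mentioned in this HOWTO; it is far from the complete list, and the describe_type name is made up.

```python
# Only the type codes named in this document; the real list is much longer.
PARTITION_TYPES = {
    0x01: "MS-DOS FAT",
    0x04: "MS-DOS FAT",
    0x06: "MS-DOS FAT",
    0x07: "OS/2 or Windows NT NTFS",
    0x81: "DR-DOS protected FAT, or old Linux/Minix (historical clash)",
    0x82: "Linux swap",
    0x83: "Linux",
}


def describe_type(code_text):
    """Look up a hexadecimal type code given as '83' or '0x83'."""
    code = int(code_text, 16)  # base 16 accepts an optional 0x prefix
    return PARTITION_TYPES.get(code, "not in this short table")
```

For example, describe_type("82") and describe_type("0x82") both give "Linux swap", while 0x07 shows the kind of shared code discussed above.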

Swap Partitions

Every process running on your computer is allocated a number of blocks of RAM. These blocks are called pages. The set of in-memory pages which will be referenced by the processor in the very near future is called a "working set". Linux tries to predict these memory accesses (assuming that recently used pages will be used again in the near future) and keeps these pages in RAM if possible.

If you have too many processes running on a machine, the kernel will try to free up RAM by writing pages to disk. This is what swap space is for. It effectively increases the amount of memory you have available. However, disk I/O is about a hundred times slower than reading from and writing to RAM. Consider this emergency memory and not extra memory.

If memory becomes so scarce that the kernel pages out from the working set of one process in order to page in for another, the machine is said to be thrashing. Some readers might have inadvertently experienced this: the hard drive is grinding away like crazy, but the computer is slow to the point of being unusable. Swap space is something you need to have, but it is no substitute for sufficient RAM.

Complete List

From the fdisk help:

How Many Partitions

The exact number of partitions allowed on a drive is fixed by the kernel, so you can find it in the kernel documentation, whose latest version is maintained online. If you have the kernel source installed, you can find your version on your computer at /usr/src/linux/Documentation/devices.txt. Look for "limit on partition" and find the entry for your driver. The common SATA number is 31, SCSI is 15, and some are less.