<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML
><HEAD
><TITLE
>How does my computer store things on disk?</TITLE
><META
NAME="GENERATOR"
CONTENT="Modular DocBook HTML Stylesheet Version 1.7"><LINK
REL="HOME"
TITLE="The Unix and Internet Fundamentals HOWTO"
HREF="index.html"><LINK
REL="PREVIOUS"
TITLE="How does my computer store things in memory?"
HREF="core-formats.html"><LINK
REL="NEXT"
TITLE="How do computer languages work?"
HREF="languages.html"></HEAD
><BODY
CLASS="sect1"
BGCOLOR="#FFFFFF"
TEXT="#000000"
LINK="#0000FF"
VLINK="#840084"
ALINK="#0000FF"
><DIV
CLASS="NAVHEADER"
><TABLE
SUMMARY="Header navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TH
COLSPAN="3"
ALIGN="center"
>The Unix and Internet Fundamentals HOWTO</TH
></TR
><TR
><TD
WIDTH="10%"
ALIGN="left"
VALIGN="bottom"
><A
HREF="core-formats.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="80%"
ALIGN="center"
VALIGN="bottom"
></TD
><TD
WIDTH="10%"
ALIGN="right"
VALIGN="bottom"
><A
HREF="languages.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
></TABLE
><HR
ALIGN="LEFT"
WIDTH="100%"></DIV
><DIV
CLASS="sect1"
><H1
CLASS="sect1"
><A
NAME="disk-layout"
></A
>10. How does my computer store things on disk?</H1
><P
>When you look at a hard disk under Unix, you see a tree of named
directories and files.  Normally you won't need to look any deeper than
that, but it does become useful to know what's going on underneath if you
have a disk crash and need to try to salvage files.  Unfortunately, there's
no good way to describe disk organization from the file level downwards, so
I'll have to describe it from the hardware up.</P
><DIV
CLASS="sect2"
><H2
CLASS="sect2"
><A
NAME="disk-lowlevel"
></A
>10.1. Low-level disk and file system structure</H2
><P
>The surface area of your disk, where it stores data, is divided up
something like a dartboard &#8212; into circular tracks which are then
pie-sliced into sectors.  Because tracks near the outer edge have more area
than those close to the spindle at the center of the disk, the outer tracks
have more sector slices in them than the inner ones.  Each sector (or
<I
CLASS="firstterm"
>disk block</I
>) has the same size, which under modern Unixes
is generally 1 binary K (1024 8-bit bytes).  Each disk block has a unique
address or <I
CLASS="firstterm"
>disk block number</I
>.</P
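><P
>The arithmetic behind disk block numbers can be sketched in a few lines
of Python.  This is only an illustration: it assumes the 1024-byte block
size quoted above (real file systems often use other sizes), and the names
BLOCK_SIZE, locate and blocks_needed are invented for the example.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
>
```python
BLOCK_SIZE = 1024  # bytes per disk block; an assumption taken from the text


def locate(byte_offset: int) -> tuple[int, int]:
    """Return (block number, offset inside that block) for a byte offset."""
    return divmod(byte_offset, BLOCK_SIZE)


def blocks_needed(size_in_bytes: int) -> int:
    """Number of whole blocks needed to hold size_in_bytes bytes."""
    # Ceiling division: a partly filled last block still occupies a block.
    return -(-size_in_bytes // BLOCK_SIZE)


print(locate(5000))         # (4, 904): byte 5000 is 904 bytes into block 4
print(blocks_needed(2993))  # 3: two full blocks plus one partly filled
```
</PRE
></FONT
></TD
></TR
></TABLE
><P
>Because every block has the same size, finding the block that holds a
given byte takes a single division.</P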
><P
>Unix divides the disk into <I
CLASS="firstterm"
>disk
partitions</I
>.  Each partition is a contiguous span of
blocks that's used separately from any other partition, either as a file
system or as swap space.  The original reasons for partitions had to do
with crash recovery in a world of much slower and more error-prone disks;
the boundaries between them reduce the fraction of your disk likely to
become inaccessible or corrupted by a random bad spot on the disk.
Nowadays, it's more important that partitions can be declared read-only
(preventing an intruder from modifying critical system files) or shared
over a network through various means we won't discuss here.  The
lowest-numbered partition on a disk is often treated specially, as a
<I
CLASS="firstterm"
>boot partition</I
> where you can put a kernel to be
booted.</P
><P
>Each partition is either <I
CLASS="firstterm"
>swap
space</I
> (used
to implement <A
HREF="memory-management.html#vm"
>virtual memory</A
>) or a <A
NAME="filesystems"
></A
><I
CLASS="firstterm"
>file system</I
> used to hold files.  Swap-space partitions are
just treated as a linear sequence of blocks.  File systems, on the other
hand, need a way to map file names to sequences of disk blocks.  Because
files grow, shrink, and change over time, a file's data blocks will not be
a linear sequence but may be scattered all over its partition (from
wherever the operating system can find a free block when it needs
one).  This scattering effect is called
<I
CLASS="firstterm"
>fragmentation</I
>.</P
></DIV
><DIV
CLASS="sect2"
><H2
CLASS="sect2"
><A
NAME="filestructure"
></A
>10.2. File names and directories</H2
><P
>Within each file system, the mapping from names to blocks is handled
through a structure called an
<I
CLASS="firstterm"
>i-node</I
>.
There's a pool of these things near the <SPAN
CLASS="QUOTE"
>"bottom"</SPAN
>
(lowest-numbered blocks) of each file system (the very lowest ones are used
for housekeeping and labeling purposes we won't describe here).  Each
i-node describes one file.  File data blocks (including directories) live
above the i-nodes (in higher-numbered blocks). </P
><P
>Every i-node contains a list of the disk block numbers in the file it
describes.  (Actually this is a half-truth, only correct for small files,
but the rest of the details aren't important here.)  Note that the i-node
does <I
CLASS="firstterm"
>not</I
> contain the name of the file.</P
><P
>Names of files live in <I
CLASS="firstterm"
>directory
structures</I
>.  A directory structure just maps names to
i-node numbers.  This is why, in Unix, a file can have multiple true names
(or <I
CLASS="firstterm"
>hard links</I
>); they're just multiple directory entries that
happen to point to the same i-node.</P
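><P
>Here is a minimal Python sketch of hard links in action, assuming a
Unix-like system and a file system that supports them.  The file names are
hypothetical and everything happens in a temporary directory.  It uses the
standard os.link and os.stat calls to show two names sharing one
i-node.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
>
```python
import os
import tempfile

# Work in a throwaway directory so nothing real is touched.
with tempfile.TemporaryDirectory() as d:
    first = os.path.join(d, "first-name")
    second = os.path.join(d, "second-name")

    with open(first, "w") as f:
        f.write("hello\n")

    # Add a second directory entry that points at the same i-node.
    os.link(first, second)

    a, b = os.stat(first), os.stat(second)
    same_inode = a.st_ino == b.st_ino   # both names resolve to one i-node
    link_count = a.st_nlink             # the i-node counts its own names

    # Removing one name leaves the file reachable through the other.
    os.remove(first)
    with open(second) as f:
        survived = f.read()

print(same_inode, link_count, repr(survived))
```
</PRE
></FONT
></TD
></TR
></TABLE
><P
>Both names report the same i-node number, the link count is 2, and
the data is still readable after the first name has been removed.</P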
></DIV
><DIV
CLASS="sect2"
><H2
CLASS="sect2"
><A
NAME="mount-points"
></A
>10.3. Mount points</H2
><P
>In the simplest case, your entire Unix file system lives in just one
disk partition.  While you'll see this arrangement on some small personal
Unix systems, it's unusual.  More typical is for it to be spread across
several disk partitions, possibly on different physical disks.  So, for
example, your system may have one small partition where the kernel lives, a
slightly larger one where OS utilities live, and a much bigger one where
user home directories live.</P
><P
>The only partition you'll have access to immediately after system
boot is your <I
CLASS="firstterm"
>root partition</I
>,
which is (almost always) the one you booted from.  It holds the root
directory of the file system, the top node from which everything else
hangs.</P
><P
>The other partitions in the system have to be attached to this root
in order for your entire, multiple-partition file system to be accessible.
About midway through the boot process, your Unix will make these non-root
partitions accessible.  It will
<I
CLASS="firstterm"
>mount</I
>
each one onto a directory on the root partition.</P
><P
>For example, if you have a Unix directory called
<TT
CLASS="filename"
>/usr</TT
>, it is probably a mount point to a partition that
contains many programs installed with your Unix but not required during
initial boot.</P
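><P
>To see which directories on your own system are mount points, a small
Python sketch like the following may help.  It relies on the standard
os.path.ismount call; the helper name mount_point_of and the sample paths
are assumptions made for the example, and the answers will differ from
machine to machine.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
>
```python
import os


def mount_point_of(path: str) -> str:
    """Walk upward from path until we reach a directory that is a mount point."""
    path = os.path.realpath(path)
    while not os.path.ismount(path):
        path = os.path.dirname(path)
    return path


# Sample paths only; substitute directories that exist on your system.
for p in ("/", "/usr", os.path.expanduser("~")):
    print(p, "is on the file system mounted at", mount_point_of(p))
```
</PRE
></FONT
></TD
></TR
></TABLE
><P
>On a single-partition system every path will report the root
directory; on a system laid out as described above, different paths will
report different mount points.</P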
></DIV
><DIV
CLASS="sect2"
><H2
CLASS="sect2"
><A
NAME="iname"
></A
>10.4. How a file gets looked up</H2
><P
>Now we can look at the file system from the top down.  When you open
a file (such as, say,
<TT
CLASS="filename"
>/home/esr/WWW/ldp/fundamentals.xml</TT
>) here is what
happens:</P
><P
>Your kernel starts at the root of your Unix file system (in the root
partition).  It looks for a directory there called &#8216;home&#8217;.
Usually &#8216;home&#8217; is a mount point to a large user partition
elsewhere, so it will go there.  In the top-level directory structure of
that user partition, it will look for an entry called &#8216;esr&#8217; and
extract an i-node number.  It will go to that i-node, notice that its
associated file data blocks are a directory structure, and look up
&#8216;WWW&#8217;.  Extracting <EM
>that</EM
> i-node, it will go
to the corresponding subdirectory and look up &#8216;ldp&#8217;.  That will
take it to yet another directory i-node.  Opening that one, it will find an
i-node number for &#8216;fundamentals.xml&#8217;.  That i-node is not a
directory, but instead holds the list of disk blocks associated with the
file.</P
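><P
>The walk just described can be imitated from user space.  The
following Python sketch resolves a path one component at a time and
reports the i-node number found at each step.  It is an illustration of
the idea, not of the kernel's actual code; the function name trace_lookup
is invented here, and it simply asks the kernel (through os.stat) for each
intermediate result.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
>
```python
import os
import stat


def trace_lookup(path: str) -> list[tuple[str, int, bool]]:
    """Resolve path one component at a time.

    Returns (path so far, i-node number, is-directory) for each step.
    """
    steps = []
    current = os.sep                      # start at the root directory
    info = os.stat(current)
    steps.append((current, info.st_ino, stat.S_ISDIR(info.st_mode)))
    for name in os.path.abspath(path).split(os.sep):
        if not name:
            continue                      # skip the empty piece before the leading '/'
        current = os.path.join(current, name)
        info = os.stat(current)           # look the name up in its parent directory
        steps.append((current, info.st_ino, stat.S_ISDIR(info.st_mode)))
    return steps


for where, inode, is_dir in trace_lookup(os.getcwd()):
    print(f"{inode:>12}  {'dir ' if is_dir else 'file'}  {where}")
```
</PRE
></FONT
></TD
></TR
></TABLE
><P
>I-node numbers are only unique within one file system, so the same
number may legitimately appear twice in the output when the path crosses a
mount point.</P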
></DIV
><DIV
CLASS="sect2"
><H2
CLASS="sect2"
><A
NAME="permissions"
></A
>10.5. File ownership, permissions and security</H2
><P
>To keep programs from accidentally or
maliciously stepping on data they shouldn't, Unix has
<I
CLASS="firstterm"
>permission</I
>
features.  These were originally designed to support timesharing by
protecting multiple users on the same machine from each other, back in the
days when Unix ran mainly on expensive shared minicomputers.</P
><P
>In order to understand file permissions, you need to recall the
description of users and groups in the section <A
HREF="login.html"
>What happens when you log in?</A
>.  Each file has an owning user and an
owning group.  These are initially those of the file's creator; they can be
changed with the programs
chown(1) and
chgrp(1).</P
><P
>The basic permissions that can be associated with a file are
&#8216;read&#8217; (permission to read data from it), &#8216;write&#8217;
(permission to modify it) and &#8216;execute&#8217; (permission to run it
as a program).  Each file has three sets of permissions: one for its owning
user, one for any user in its owning group, and one for everyone else.  The
&#8216;privileges&#8217; you get when you log in are just the ability to do
read, write, and execute on those files for which the permission bits match
your user ID or one of the groups you are in, or files that have been made
accessible to the world.</P
><P
>To see how these may interact and how Unix displays them, let's look
at some file listings on a hypothetical Unix system.  Here's one:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="screen"
>snark:~$ ls -l notes
-rw-r--r--   1 esr      users         2993 Jun 17 11:00 notes
</PRE
></FONT
></TD
></TR
></TABLE
><P
>This is an ordinary data file.  The listing tells us that it's owned
by the user &#8216;esr&#8217; and was created with the owning group
&#8216;users&#8217;.  Probably the machine we're on puts every ordinary user in
this group by default; other groups you commonly see on timesharing
machines are &#8216;staff&#8217;, &#8216;admin&#8217;, or
&#8216;wheel&#8217; (for obvious reasons, groups are not very important on
single-user workstations or PCs).  Your Unix may use a different default
group, perhaps one named after your user ID.</P
><P
>The string &#8216;-rw-r--r--&#8217; represents the permission bits
for the file.  The very first dash is the position for the directory bit;
it would show &#8216;d&#8217; if the file were a directory, or would show
&#8216;l&#8217; if the file were a symbolic link.  After that, the first
three places are user permissions, the second three group permissions, and
the third three are permissions for others (often called &#8216;world&#8217;
permissions).  On this file, the owning user &#8216;esr&#8217; may read or
write the file, other people in the &#8216;users&#8217; group may read it,
and everybody else in the world may read it.  This is a pretty typical set
of permissions for an ordinary data file.</P
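><P
>The nine permission characters are a rendering of nine bits, usually
written as three octal digits (644 for the file above).  The Python sketch
below shows the correspondence.  The helper names rwx and
permission_string are invented for the example; stat.filemode is the
standard library routine that produces the full ten-character form.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
>
```python
import stat


def rwx(digit: int) -> str:
    """Turn one octal digit (0-7) into its 'rwx' triplet."""
    bits = format(digit, "03b")                  # 6 becomes "110"
    return "".join(ch if bit == "1" else "-" for ch, bit in zip("rwx", bits))


def permission_string(octal: str) -> str:
    """Expand a three-digit octal mode such as "644" into nine characters."""
    return "".join(rwx(int(d)) for d in octal)


print(permission_string("644"))                  # rw-r--r--
print(permission_string("755"))                  # rwxr-xr-x

# The standard library adds the leading file-type slot as well.
print(stat.filemode(stat.S_IFREG | 0o644))       # -rw-r--r--
print(stat.filemode(stat.S_IFDIR | 0o755))       # drwxr-xr-x
```
</PRE
></FONT
></TD
></TR
></TABLE
><P
>Each octal digit is the sum of 4 for read, 2 for write and 1 for
execute, which is why 6 means read plus write and 4 means read
only.</P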
><P
>Now let's look at a file with very different permissions.  This file
is GCC, the GNU C compiler. </P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="screen"
>snark:~$ ls -l /usr/bin/gcc
-rwxr-xr-x   3 root     bin         64796 Mar 21 16:41 /usr/bin/gcc
</PRE
></FONT
></TD
></TR
></TABLE
><P
>This file belongs to a user called &#8216;root&#8217; and a group
called &#8216;bin&#8217;; it can be written (modified) only by root, but
read or executed by anyone.  This is a typical ownership and set of
permissions for a pre-installed system command.  The &#8216;bin&#8217;
group exists on some Unixes to group together system commands (the name is
a historical relic, short for &#8216;binary&#8217;).  Your Unix might use a
&#8216;root&#8217; group instead (not quite the same as the &#8216;root&#8217;
user!).</P
><P
>The &#8216;root&#8217; user is the conventional name for numeric user
ID 0, a special, privileged account that can override all privileges.  Root
access is useful but dangerous; a typing mistake while you're logged in as
root can clobber critical system files that the same command executed from
an ordinary user account could not touch.</P
><P
>Because the root account is so powerful, access to it should be guarded
very carefully.  Your root password is the single most critical piece of
security information on your system, and it is what any crackers and
intruders who ever come after you will be trying to get.</P
><P
>About passwords: Don't write them down &#8212; and don't pick a
password that can easily be guessed, like the first name of your
girlfriend/boyfriend/spouse.  This is an astonishingly common bad practice
that helps crackers no end.  In general, don't pick any word in the
dictionary; there are programs called <I
CLASS="firstterm"
>dictionary
crackers</I
> that look for likely passwords by running through word
lists of common choices.  A good technique is to pick a combination
consisting of a word, a digit, and another word, such as
&#8216;shark6cider&#8217; or &#8216;jump3joy&#8217;; that will make the search
space too large for a dictionary cracker.  Don't use these examples, though
&#8212; crackers might expect that after reading this document and put them
in their dictionaries.</P
><P
>Now let's look at a third case:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="screen"
>snark:~$ ls -ld ~
drwxr-xr-x  89 esr      users          9216 Jun 27 11:29 /home2/esr
snark:~$
</PRE
></FONT
></TD
></TR
></TABLE
><P
>This file is a directory (note the &#8216;d&#8217; in the first
permissions slot).  We see that it can be written only by esr, but read and
executed by anybody.</P
><P
>Read permission gives you the ability to list the directory &#8212; that
is, to see the names of files and directories it contains. Write permission
gives you the ability to create and delete files in the directory.  If you
remember that the directory includes a list of the names of the files and
subdirectories it contains, these rules will make sense.</P
><P
>Execute permission on a directory means you can get through the
directory to open the files and directories below it.  In effect, it gives
you permission to access the i-nodes in the directory.  A directory with
execute completely turned off would be useless.</P
><P
>Occasionally you'll see a directory that is world-executable but not
world-readable; this means a random user can get to files and directories
beneath it, but only by knowing their exact names (the directory cannot be
listed).</P
><P
>It's important to remember that read, write, or execute permission on a
directory is independent of the permissions on the files and directories
beneath.  In particular, write access on a directory means you can
create new files or delete existing files there, but does not
automatically give you write access to existing files.</P
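><P
>The independence of directory and file permissions is easy to
demonstrate.  The Python sketch below, which assumes a Unix-like system and
uses a hypothetical file name inside a temporary directory, deletes a file
that nobody is allowed to write, because deletion only needs write
permission on the directory.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
>
```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "read-only-file")
    with open(target, "w") as f:
        f.write("data\n")

    os.chmod(target, 0o444)    # r--r--r--: nobody may write the file itself
    os.chmod(d, 0o755)         # rwxr-xr-x: the owner may write the directory

    # Deleting only edits the directory's list of names, so it succeeds
    # even though the file's own permission bits forbid writing to it.
    os.remove(target)
    gone = not os.path.exists(target)

print("deleted:", gone)
```
</PRE
></FONT
></TD
></TR
></TABLE
><P
>The reverse case also holds: a writable file in a directory you cannot
write may be modified in place, but not removed or renamed.</P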
><P
>Finally, let's look at the permissions of the login program
itself.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="screen"
>snark:~$ ls -l /bin/login
-rwsr-xr-x   1 root     bin         20164 Apr 17 12:57 /bin/login
</PRE
></FONT
></TD
></TR
></TABLE
><P
>This has the permissions we'd expect for a system command &#8212;
except for that &#8216;s&#8217; where the owner-execute bit ought to be.
This is the visible manifestation of a special permission called the
&#8216;set-user-id&#8217; or <I
CLASS="firstterm"
>setuid
bit</I
>.</P
><P
>The setuid bit is normally attached to programs that need to give
ordinary users the privileges of root, but in a controlled way.  When it is
set on an executable program, you get the privileges of the owner of that
program file while the program is running on your behalf, whether or not
they match your own.</P
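><P
>The setuid bit is one more bit in the same mode word that holds the
ordinary permissions.  The Python sketch below builds the mode shown in
the listing above and renders it with the standard stat.filemode routine;
the helper is_setuid is invented for the example and simply inspects the
owner-execute slot of that rendering.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
>
```python
import stat

# The mode /bin/login has in the listing: regular file, setuid, rwxr-xr-x.
login_mode = stat.S_IFREG | stat.S_ISUID | 0o755
plain_mode = stat.S_IFREG | 0o755

print(stat.filemode(login_mode))    # -rwsr-xr-x
print(stat.filemode(plain_mode))    # -rwxr-xr-x


def is_setuid(mode: int) -> bool:
    """True when 'ls -l' would show 's' or 'S' in the owner-execute slot."""
    return stat.filemode(mode)[3] in "sS"


print(is_setuid(login_mode), is_setuid(plain_mode))
```
</PRE
></FONT
></TD
></TR
></TABLE
><P
>To check a real file, pass os.stat(path).st_mode to the same helper.
A capital &#8216;S&#8217; in the listing means the setuid bit is on but the
owner-execute bit is off.</P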
><P
>Like the root account itself, setuid programs are useful but
dangerous.  Anyone who can subvert or modify a setuid program owned by root
can use it to spawn a shell with root privileges.  For this reason, writing
to a file automatically turns off its setuid bit on most Unixes.
Many attacks on Unix security try to exploit bugs in setuid programs in
order to subvert them.  Security-conscious system administrators are
therefore extra-careful about these programs and reluctant to install new
ones.</P
><P
>There are a couple of important details we glossed over when
discussing permissions above; namely, how the owning group and permissions
are assigned when a file or directory is first created.  The group is an
issue because users can be members of multiple groups, but one of them
(specified in the user's <TT
CLASS="filename"
>/etc/passwd</TT
> entry) is the
user's <I
CLASS="firstterm"
>default group</I
> and will normally own files created by the
user.</P
><P
>The story with initial permission bits is a little more complicated.
A program that creates a file will normally specify the permissions it is
to start with.  But these will be modified by a per-process setting,
normally inherited from the user's login shell, called the
<I
CLASS="firstterm"
>umask</I
>.
The umask specifies which permission bits to <EM
>turn off</EM
>
when creating a file; a common value is -------w- or 002, which turns off
the world-write bit, and many systems default to the stricter ----w--w- or
022, which turns off group-write as well.  See the documentation of the
umask command on your shell's manual page for details.</P
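><P
>The effect of the umask can be observed directly.  The Python sketch
below, assuming a Unix-like system, creates a file while asking for
rw-rw-rw- (octal 666) under several masks and reports what the file
actually received.  The function name create_with_umask and the mask
values tried are choices made for the example.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="programlisting"
>
```python
import os
import stat
import tempfile


def create_with_umask(mask: int, requested: int = 0o666) -> str:
    """Create a file asking for 'requested' bits under the given umask and
    return the permission string it actually ended up with."""
    old = os.umask(mask)                  # set the new mask, remember the old one
    try:
        with tempfile.TemporaryDirectory() as d:
            path = os.path.join(d, "new-file")
            fd = os.open(path, os.O_CREAT | os.O_WRONLY, requested)
            os.close(fd)
            return stat.filemode(os.stat(path).st_mode)
    finally:
        os.umask(old)                     # always restore the caller's mask


print(create_with_umask(0o002))   # -rw-rw-r--
print(create_with_umask(0o022))   # -rw-r--r--
print(create_with_umask(0o077))   # -rw-------
```
</PRE
></FONT
></TD
></TR
></TABLE
><P
>In each case the bits named in the mask are the ones missing from the
result; the umask can only remove permissions a program asked for, never
add any.</P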
><P
>Initial directory group is also a bit complicated.  On some Unixes a new
directory gets the default group of the creating user (this is the System V
convention); on others, it gets the owning group of the parent directory
in which it's created (this is the BSD convention).  On some modern Unixes,
including Linux, the latter behavior can be selected by setting the
set-group-ID bit on the parent directory (chmod g+s).</P
></DIV
><DIV
CLASS="sect2"
><H2
CLASS="sect2"
><A
NAME="fsck"
></A
>10.6. How things can go wrong</H2
><P
>Earlier it was hinted that file systems can be fragile things.
Now we know that to get to a file you have to hopscotch through what may be
an arbitrarily long chain of directory and i-node references.  What happens
if your hard disk develops a bad spot?</P
><P
>If you're lucky, it will only trash some file data.  If you're
unlucky, it could corrupt a directory structure or i-node number and leave
an entire subtree of your system hanging in limbo &#8212; or, worse, result
in a corrupted structure that points multiple ways at the same disk block
or i-node.  Such corruption can be spread by normal file operations,
trashing data that was not in the original bad spot.</P
><P
>Fortunately, this kind of contingency has become quite uncommon as disk
hardware has become more reliable.  Still, it means that your Unix will
want to integrity-check the file system periodically to make sure nothing
is amiss.  Modern Unixes do a fast integrity check on each partition at
boot time, just before mounting it.  Every few reboots they'll do a much
more thorough check that takes a few minutes longer.</P
><P
>If all of this sounds like Unix is terribly complex and
failure-prone, it may be reassuring to know that these boot-time checks
typically catch and correct normal problems <EM
>before</EM
>
they become really disastrous.  Other operating systems don't have these
facilities, which speeds up booting a bit but can leave you much more
seriously screwed when attempting to recover by hand (and that's assuming
you have a copy of Norton Utilities or whatever in the first
place...).</P
><P
>One of the trends in current Unix designs is <I
CLASS="firstterm"
>journalling
file systems</I
>.  These arrange traffic to the disk so that
it's guaranteed to be in a consistent state that can be recovered when the
system comes back up. This will speed up the boot-time integrity check a
lot.</P
></DIV
></DIV
><DIV
CLASS="NAVFOOTER"
><HR
ALIGN="LEFT"
WIDTH="100%"><TABLE
SUMMARY="Footer navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
><A
HREF="core-formats.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
><A
HREF="index.html"
ACCESSKEY="H"
>Home</A
></TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
><A
HREF="languages.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
>How does my computer store things in memory?</TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
>&nbsp;</TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
>How do computer languages work?</TD
></TR
></TABLE
></DIV
></BODY
></HTML
>