<!--startcut ==============================================-->
<!-- *** BEGIN HTML header *** -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML><HEAD>
<META NAME="generator" CONTENT="lgazmail v1.4G.j">
<TITLE>The Answer Gang 93: hard links</TITLE>
</HEAD><BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#0000AF"
ALINK="#FF0000">
<!-- *** END HTML header *** -->
<!--endcut ==============================================-->
<!-- begin 2 -->
<H3 align="left"><img src="../../gx/dennis/qbubble.gif"
height="50" width="60" alt="(?) " border="0"
>hard links</H3>
<p><strong>From Kathy
</strong></p>
<p align="right"><strong>Answered By: Jason Creighton, Faber Fedor, Neil Youngman, Jim Dennis, Jay R. Ashworth, Ben Okopnik, Thomas Adam
</strong></p>
<P><STRONG>
<IMG SRC="../../gx/dennis/qbub.gif" ALT="(?)"
HEIGHT="28" WIDTH="50" BORDER="0"
>
I'm confused: if Linux doesn't allow directory hard links, then why does
every Linux directory have at least two hard links?
</STRONG></P>
<P><STRONG>
Thanks,
Kathy
</STRONG></P>
<blockQuote>
<IMG SRC="../../gx/dennis/bbub.gif" ALT="(!)"
HEIGHT="28" WIDTH="50" BORDER="0"
> [Jason]
Not exactly sure... but <A HREF="../../issue35/tag/links.html"
>http://www.linuxgazette.com/issue35/tag/links.html</A> makes for good reading.
</blockQuote>
<blockQuote>
<IMG SRC="../../gx/dennis/bbub.gif" ALT="(!)"
HEIGHT="28" WIDTH="50" BORDER="0"
> [Faber]
You know, I'm confused too! Looking into it a little bit, it seems that
whether or not directory hard links are allowed depends on the
underlying file system. File systems that are of type vxfs (no, I don't
know what that means either
<IMG SRC="../../gx/dennis/smily.gif" ALT=":-)"
height="24" width="20" align="middle">) don't allow the creation of directory
hard links. I've yet to discover why.
</blockQuote>
<blockQuote>
The reason we have them in Linux (. and ..) is, I always assumed, so we
have a way to travel up the directory tree (cd ..), otherwise the system
would need to know the name of the parent directory (as opposed to just
its inode).
</blockQuote>
<blockQuote>
Why is . a hard link to the current directory then? &lt;shrug&gt; Because un*x
people are lazy typists?
</blockQuote>
<blockQuote>
A very interesting question, BTW. I'm interested in finding the answer
to it myself.
</blockQuote>
<blockQuote>
<IMG SRC="../../gx/dennis/bbub.gif" ALT="(!)"
HEIGHT="28" WIDTH="50" BORDER="0"
> [Jim]
</blockQuote>
<blockQuote>
The system uses hard links to manage the link from the parent to the
directory's inode, the . link in that directory and all of the ..
entries in all of its subdirectories.
</blockQuote>
<blockQuote>
Users (including 'root') are forbidden to create additional hard
links to directories because this would make fsck much more difficult to
implement, and because it might allow one to create hard link loops and
dangling subtrees.
</blockQuote>
<blockQuote>
Basically the directory linkages are maintained by the filesystem to
glue the whole tree together, to ensure that it is truly an acyclic
tree structure with a single root.
</blockQuote>
<blockQuote>
In other words it's a policy enforced by the kernel. Some other UNIX
systems have allowed root to create hardlinked directories, and it
could be done with a disk editor like LDE under Linux. I'd expect
fsck to complain the next time it was run, though, and if you did something
degenerate you might cause some interesting problems: the kernel might
treat the fs as corrupt and invoke its error handler
(remount-ro, panic, or continue), and some kernel
threads might even run amok or panic the system.
</blockQuote>
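<blockQuote>
Jim's three kinds of link (the name in the parent, the . entry, and the ..
entry in each subdirectory) can be watched through a directory's link
count. A minimal sketch, assuming a Linux box with GNU stat (the -c %h
format prints the link count) and a scratch directory from mktemp; all of
the names are made up for illustration. On an ext2-style filesystem one
would expect 2 and then 4, but not every filesystem keeps directory link
counts this way, so no output is shown:
</blockQuote>
<blockquote><pre>
```shell
# Scratch directory; every name below is illustrative.
top=$(mktemp -d)
mkdir "$top/parent"

# A freshly made directory: its name in $top plus its own "." entry.
before=$(stat -c %h "$top/parent")

# Each subdirectory contributes a ".." entry pointing back at parent.
mkdir "$top/parent/a" "$top/parent/b"
after=$(stat -c %h "$top/parent")

echo "link count before: $before, after two subdirectories: $after"
rm -r "$top"
```
</pre></blockquote>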
<blockQuote>
<IMG SRC="../../gx/dennis/bbub.gif" ALT="(!)"
HEIGHT="28" WIDTH="50" BORDER="0"
> [Neil]
</blockQuote>
<blockQuote>
Traditionally, in Unix systems a file or directory is physically deleted from
the disk when there are no hard links to it. The rm and rmdir commands
remove a directory entry (link). If there is more than one hard
link to a file or directory, the file remains, so although we regard the rm
command as deleting a file, it only deletes the link to the file. When there
are no hard links left to a file or directory, the file system frees
the actual space used by the file. There have to be hard links to directories
or they would be deleted by the filesystem.
</blockQuote>
<blockQuote>
ISTR that hard links to directories can only be created by mkdir, to ensure
that we can't build up cyclic directory structures; otherwise programs such
as find, which traverse the directory tree, could loop over the same directory
structure forever.
</blockQuote>
<blockQuote>
In conclusion, Linux does allow hard links to directories, but it only allows
hard links to a directory from itself and its parent directory. These are
the two hard links to which you refer.
</blockQuote>
<blockQuote>
<IMG SRC="../../gx/dennis/bbub.gif" ALT="(!)"
HEIGHT="28" WIDTH="50" BORDER="0"
> [Neil]
Some ambiguity there. If there is more than one link before the rm command,
there will be at least one after the rm command, so the file space isn't
freed. Of course rmdir deletes both links to a directory.
</blockQuote>
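<blockQuote>
Neil's point about rm removing a link rather than the file itself can be
tried on an ordinary file. A minimal sketch, assuming GNU stat (-c %h
prints the link count) and a scratch directory from mktemp; the file names
are made up for illustration:
</blockQuote>
<blockquote><pre>
```shell
# Scratch directory; file names are illustrative.
top=$(mktemp -d)
cd "$top" || exit 1

echo "some data" > original        # first link to a new inode
ln original second                 # second hard link to the same inode
links_before=$(stat -c %h second)

rm original                        # removes one directory entry only
links_after=$(stat -c %h second)
content=$(cat second)              # the data is still reachable

echo "links before rm: $links_before, after rm: $links_after"
echo "content through the remaining link: $content"

cd /
rm -r "$top"
```
</pre></blockquote>
<blockQuote>
The space is only freed once the last remaining name is removed as well.
</blockQuote>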
<blockQuote>
<IMG SRC="../../gx/dennis/bbub.gif" ALT="(!)"
HEIGHT="28" WIDTH="50" BORDER="0"
> [Jason]
</blockQuote>
<blockQuote>
Okay, I've looked into this more: It appears that, for some reason or another
(Another Gang member will no doubt know why) it's a Bad Idea to make hard
links with directories. Look here:
</blockQuote>
<blockquote><pre>root:~# ln lala foo
ln: `lala': hard link not allowed for directory
root:~# strace ln lala foo
execve("/bin/ln", ["ln", "lala", "foo"], [/* 17 vars */]) = 0
uname({sys="Linux", node="jpc.example.com", ...}) = 0
brk(0) = 0x804db0c
open("/etc/ld.so.preload", O_RDONLY) = -1 ENOENT (No such file or
directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=19148, ...}) = 0
mmap2(NULL, 19148, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40014000
close(3) = 0
open("/lib/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\20\\\1"..., 1024) =
1024
fstat64(3, {st_mode=S_IFREG|0755, st_size=1494904, ...}) = 0
mmap2(NULL, 1256324, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x40019000
mprotect(0x40144000, 31620, PROT_NONE) = 0
mmap2(0x40144000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3,
0x12a) = 0x40144000
mmap2(0x4014a000, 7044, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x4014a000
close(3) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
0x4014c000
munmap(0x40014000, 19148) = 0
stat64("foo", 0xbffffc10) = -1 ENOENT (No such file or
directory)
lstat64("lala", {st_mode=S_IFDIR|0755, st_size=48, ...}) = 0
write(2, "ln: ", 4ln: ) = 4
write(2, "`lala\': hard link not allowed fo"..., 43`lala': hard link not
allowed for directory) = 43
write(2, "\n", 1
) = 1
exit_group(1) = ?
</pre></blockquote>
<blockQuote>
Notice that, in the strace output, <TT>link()</TT> doesn't actually get called. So is
this 'ln' just trying to save us from ourselves, or is it the kernel or glibc? I
wrote this quick C program:
</blockQuote>
<p align="center">See attached <tt><a href="../misc/tag/creighton.c-link.c.txt">creighton.c-link.c.txt</a></tt></p>
<blockquote><pre>root:~# link=/home/jason/prog/c/link
root:~# $link lala foo
Error while linking: Operation not permitted
root:~# strace $link lala foo
&lt;snipped syscall trace that looks very similar to ln's strace output... the
important line is:&gt;
link("lala", "foo") = -1 EPERM (Operation not permitted)
&lt;snipped more&gt;
root:~#
</pre></blockquote>
<blockQuote>
So, 'ln' sees that you're trying to hardlink directories and doesn't even
attempt it, instead giving a useful error message. And since we see the <TT> link()</TT>
syscall being performed, it means that the kernel doesn't allow hard linking
of directories, and it's not the glibc wrapper that refuses to hardlink
directories. (If it had been glibc, we would not have even seen <TT> link()</TT> being
called: the <TT> link()</TT> in glibc would have returned without calling the <TT> link()</TT>
syscall.)
</blockQuote>
<blockQuote>
Now, back to your original question: I have no idea why creating hard links to
directories is a bad idea. (It must be, or Linux would allow root to do it.)
LG #35 Answer Guy column has something about this: (Quoting from the article I
linked to in my other email)
</blockQuote>
<blockQuote>
&lt;quote&gt;
</blockQuote>
<blockQuote>
Some versions of Unix have historically allowed root (superuser) to create
hard links to directories --- but the GNU utilities under Linux won't allow it
--- so you'd have to write your own code or you'd have to directly modify the
fs with a hex editor
</blockQuote>
<blockQuote>
&lt;end quote&gt;
</blockQuote>
<blockQuote>
Well, obviously, it's the kernel disallowing it, not GNU utilities. However,
LG #35 was some time ago, so things might have been different then.
</blockQuote>
<blockQuote>
<IMG SRC="../../gx/dennis/bbub.gif" ALT="(!)"
HEIGHT="28" WIDTH="50" BORDER="0"
> [jra]
</blockQuote>
<blockQuote>
User-added hardlinks to directories are forbidden because they break the
directed acyclic graph structure of the filesystem (which is an ASSERT in
Unixiana, roughly), and because they confuse the <EM>hell</EM> out of
file-tree-walkers (a term Multicians will recognize at sight, but Unix
geeks can probably figure out without problems too).
</blockQuote>
<blockQuote>
(Did I get that right, Ben?
<IMG SRC="../../gx/dennis/smily.gif" ALT=":-)"
height="24" width="20" align="middle">)
</blockQuote>
<blockQuote>
And indeed, anyone who's ever done
</blockQuote>
<blockQuote><CODE>
# rm -rf .*
</CODE></blockQuote>
<blockQuote>
in a user's home directory to clear out all the dotfiles prior to
dropping the user will no doubt understand why even the system's three links
to a directory (. in ., .. in children, and the named link in the
parent) are often 2 too many.
</blockQuote>
<blockQuote>
<IMG SRC="../../gx/dennis/bbub.gif" ALT="(!)"
HEIGHT="28" WIDTH="50" BORDER="0"
> [Jason]
</blockQuote>
<blockQuote>
Ouch! Never thought about that, I'll have to remember that....
</blockQuote>
<blockQuote>
<IMG SRC="../../gx/dennis/bbub.gif" ALT="(!)"
HEIGHT="28" WIDTH="50" BORDER="0"
> [Jason]
</blockQuote>
<blockQuote>
Yes, I wrote that before I got to read the rest of the thread. With symlinks,
at least it's easy to tell when there's a loop. (BTW, I seem to recall an
option in Wine to ignore symlinks because they cause some Windows programs to
get very, very confused.)
</blockQuote>
<blockquote><pre>~/tmp$ ln -s file1 file2
~/tmp$ ln -s file2 file1
~/tmp$ ls -l file*
lrwxrwxrwx 1 jason users 5 Jul 20 16:59 file1 -&gt; file2
lrwxrwxrwx 1 jason users 5 Jul 20 16:59 file2 -&gt; file1
~/tmp$ cat file1
cat: file1: Too many levels of symbolic links
~/tmp$ strace -e trace=open cat file1
open("/etc/ld.so.preload", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
open("/lib/libc.so.6", O_RDONLY) = 3
open("file1", O_RDONLY|O_LARGEFILE) = -1 ELOOP (Too many levels of symbolic links)
cat: file1: Too many levels of symbolic links
</pre></blockquote>
<blockQuote>
<IMG SRC="../../gx/dennis/bbub.gif" ALT="(!)"
HEIGHT="28" WIDTH="50" BORDER="0"
> [Jason]
I just thought of something else:
</blockQuote>
<blockQuote><BLOCKQuote>
Disk space management and memory management are the same thing.
</BLOCKQuote></blockQuote>
<blockQuote>
UNIX has chosen reference counting for disk space management. Reference
counting can't deal with cyclic (Right word? I mean data structures that refer
to themselves.) data structures, and thus hardlinking directories is
disallowed. If Linux used garbage collection, it would be okay to hardlink
directories, if very confusing.
</blockQuote>
<blockQuote>
But using GC on filesystems would be slow and offer no real advantages, so
reference counting is okay.
</blockQuote>
<P><STRONG>
<IMG SRC="../../gx/dennis/qbub.gif" ALT="(?)"
HEIGHT="28" WIDTH="50" BORDER="0"
>
Well, root must be able to create hard links to directories,
because of the option ln --directory (-d, -F).
</STRONG></P>
<blockQuote>
<IMG SRC="../../gx/dennis/bbub.gif" ALT="(!)"
HEIGHT="28" WIDTH="50" BORDER="0"
> [Jason]
Try it:
</blockQuote>
<blockquote><pre>root:~# mkdir dir1
root:~# ln -d dir1 dir2
ln: creating hard link `dir2' to `dir1': Operation not permitted
</pre></blockquote>
<HR width="10%" align="left"><P><STRONG>
<IMG SRC="../../gx/dennis/qbub.gif" ALT="(?)"
HEIGHT="28" WIDTH="50" BORDER="0"
> [Thomas]
Then in the same thread....
</STRONG></P>
<P><STRONG>
<IMG SRC="../../gx/dennis/qbub.gif" ALT="(?)"
HEIGHT="28" WIDTH="50" BORDER="0"
> [jra]
And indeed, anyone who's ever done:
</STRONG></P>
<pre><strong># rm -rf .*
</strong></pre>
<blockQuote>
<IMG SRC="../../gx/dennis/bbub.gif" ALT="(!)"
HEIGHT="28" WIDTH="50" BORDER="0"
> [Jim]
</blockQuote>
<blockQuote>
The GNU version of 'rm' will refuse to attempt recursion into or
unlinking of . and/or .. entries.
</blockQuote>
<blockQuote>
However this is still a classic sysadmin tech question. It's almost
as common as: "How do I remove a file named -fr?"
</blockQuote>
<blockQuote>
<IMG SRC="../../gx/dennis/bbub.gif" ALT="(!)"
HEIGHT="28" WIDTH="50" BORDER="0"
> [Ben]
</blockQuote>
<blockQuote>
<TT>rm -- -fr</TT>
<BR><TT>rm ./-fr</TT>
<BR><TT>perl -we'unlink "-fr"'</TT>
<BR>"F8" in Midnight Commander
<IMG SRC="../../gx/dennis/smily.gif" ALT=":)"
height="24" width="20" align="middle">
</blockQuote>
<blockQuote>
<IMG SRC="../../gx/dennis/bbub.gif" ALT="(!)"
HEIGHT="28" WIDTH="50" BORDER="0"
> [Thomas]
</blockQuote>
<blockQuote>
You forgot to mention using Emacs.... You also didn't mention jettisoning
the disk into space...
</blockQuote>
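<blockQuote>
The first two of Ben's answers can be tried without risk in a scratch
directory. A minimal sketch, assuming mktemp is available; the variable
names are made up for illustration:
</blockQuote>
<blockquote><pre>
```shell
# Scratch directory so the experiment cannot hurt anything real.
top=$(mktemp -d)
cd "$top" || exit 1

touch ./-fr       # the ./ prefix keeps touch from reading it as options
rm -- -fr         # "--" ends option parsing, so -fr is taken as a file name
if [ -e ./-fr ]; then first="still there"; else first="removed"; fi

touch ./-fr
rm ./-fr          # a path that does not begin with "-" works as well
if [ -e ./-fr ]; then second="still there"; else second="removed"; fi

echo "rm -- -fr: $first; rm ./-fr: $second"

cd /
rm -r "$top"
```
</pre></blockquote>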
<blockQuote>
<IMG SRC="../../gx/dennis/bbub.gif" ALT="(!)"
HEIGHT="28" WIDTH="50" BORDER="0"
> [Jim]
</blockQuote>
<blockQuote>
How do you SAFELY remove all the dot files and dot directories under
the current directory?
</blockQuote>
<blockQuote>
My best answer under Bash is:
</blockQuote>
<blockquote><pre> rm -fr .??* .[^.]
</pre></blockquote>
<blockQuote>
<IMG SRC="../../gx/dennis/bbub.gif" ALT="(!)"
HEIGHT="28" WIDTH="50" BORDER="0"
> [Ben]
</blockQuote>
<blockquote><pre>rm -rf .[^.]*
</pre></blockquote>
<blockQuote>
is what I've always used; this would, of course, ignore
files named "..." and so on, but that's not much of an issue in
practice.
</blockQuote>
<blockQuote>
<IMG SRC="../../gx/dennis/bbub.gif" ALT="(!)"
HEIGHT="28" WIDTH="50" BORDER="0"
> [Jim]
</blockQuote>
<blockQuote>
... this gets anything starting with a dot and followed by at least
two more characters (thus . and .. won't match) and also it gets
anything starting with a dot and followed by any single character
<EM>other</EM> than a dot (thus getting such unlikely entries as .a .\?
etc). This pair of glob patterns should match every possible dot
entry except . and ..
</blockQuote>
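<blockQuote>
Before handing patterns like these to rm, it is cheap to see what they
expand to. A minimal sketch, assuming a scratch directory from mktemp and
five made-up file names; it spells the negated class as [!.], which is the
POSIX form of the [^.] used above, so that it also runs under plain sh:
</blockQuote>
<blockquote><pre>
```shell
# Scratch directory holding a few illustrative names.
top=$(mktemp -d)
cd "$top" || exit 1
touch .a .ab .abc ... visible      # "..." is a legal, if odd, file name

# Jim's pair: dot plus two or more characters, or dot plus one non-dot.
set -- .??* .[!.]
jim_count=$#
echo "Jim's pair matches $jim_count names: $*"

# Ben's single pattern: dot, one non-dot character, then anything.
set -- .[!.]*
ben_count=$#
echo "Ben's pattern matches $ben_count names: $*"

cd /
rm -r "$top"
```
</pre></blockquote>
<blockQuote>
With those five files the pair should match four names and the single
pattern three, since it skips the file named "...". Neither expansion
includes . or .. themselves.
</blockQuote>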
<blockQuote>
However, I preface it with <EM>under bash</EM>. I happen to know it will
work under ash, zsh, tcsh, and most other modern shells. However,
if I was on a particularly old shell I'd have to check. The negated
character class might not have been in the glob libraries of the earliest
Bourne and C shells.
</blockQuote>
<blockQuote>
If I really had to write a script for maximum portability:
</blockQuote>
<blockquote><pre> rm -fr .??*; rm -fr `echo .? | grep -v '\.\.' `
</pre></blockquote>
<blockQuote>
... remove all the longer dot entries in the obvious way, then let
echo match all the two character dot entries and grep out the ..
entry explicitly.
</blockQuote>
<blockQuote>
<IMG SRC="../../gx/dennis/bbub.gif" ALT="(!)"
HEIGHT="28" WIDTH="50" BORDER="0"
> [Ben]
Other interesting situations come up when you want to delete a file
named in a foreign language. I've run into a Russian song name that even
Midnight Commander couldn't handle. Cutting and pasting to "rm" didn't
help either (clearly, some of the characters were the "escaped" types,
but I had no idea which ones - long story short, the machine didn't have
any Russian fonts on it.) Even "ls -b" failed for the above, for whatever
reason. I ended up doing something like
</blockQuote>
<blockQuote><CODE>
perl -wle'/match/&amp;&amp;print for &lt;*&gt;'
</CODE></blockQuote>
<blockQuote>
where "match" was a small substring of the characters in the name.
Needless to say, "print" became "unlink" when I saw that only the
appropriate file matched.
</blockQuote>
<!-- end 2 -->
<!-- *** BEGIN copyright *** -->
<hr>
<CENTER><SMALL><STRONG>
<h5>
<br>Copyright &copy; 2003
<br>Copying license <A HREF="http://www.linuxgazette.com/copying.html">http://www.linuxgazette.com/copying.html</A>
<BR>Published in Issue 93 of <i>Linux Gazette</i>, August 2003</H5>
</STRONG></SMALL></CENTER>
<!-- *** END copyright *** -->
<SMALL><CENTER><H6 ALIGN="center">HTML script maintained by
<A HREF="mailto:star@starshine.org">Heather Stern</a> of
Starshine Technical Services,
<A HREF="http://www.starshine.org/">http://www.starshine.org/</A>
</H6></CENTER></SMALL>
<HR>
<!--startcut ======================================================= -->
</BODY></HTML>
<!--endcut ========================================================= -->