old-www/LDP/intro-linux/html/sect_09_01.html

925 lines
16 KiB
HTML

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML
><HEAD
><TITLE
>Introduction</TITLE
><META
NAME="GENERATOR"
CONTENT="Modular DocBook HTML Stylesheet Version 1.7"><LINK
REL="HOME"
TITLE="Introduction to Linux"
HREF="index.html"><LINK
REL="UP"
TITLE="Fundamental Backup Techniques"
HREF="chap_09.html"><LINK
REL="PREVIOUS"
TITLE="Fundamental Backup Techniques"
HREF="chap_09.html"><LINK
REL="NEXT"
TITLE="Moving your data to a backup device"
HREF="sect_09_02.html"></HEAD
><BODY
CLASS="sect1"
BGCOLOR="#FFFFFF"
TEXT="#000000"
LINK="#0000FF"
VLINK="#840084"
ALINK="#0000FF"
><DIV
CLASS="NAVHEADER"
><TABLE
SUMMARY="Header navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TH
COLSPAN="3"
ALIGN="center"
>Introduction to Linux: </TH
></TR
><TR
><TD
WIDTH="10%"
ALIGN="left"
VALIGN="bottom"
><A
HREF="chap_09.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="80%"
ALIGN="center"
VALIGN="bottom"
>Chapter 9. Fundamental Backup Techniques</TD
><TD
WIDTH="10%"
ALIGN="right"
VALIGN="bottom"
><A
HREF="sect_09_02.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
></TABLE
><HR
ALIGN="LEFT"
WIDTH="100%"></DIV
><DIV
CLASS="sect1"
><H1
CLASS="sect1"
><A
NAME="sect_09_01"
></A
>9.1. Introduction</H1
><P
>Although Linux is one of the safest operating systems in existence, and even if it is designed to keep on going, data can get lost. Data loss is most often the consequence of user errors, but occasionally a system fault, such as a power or disk failure, is the cause, so it's always a good idea to keep an extra copy of sensitive and/or important data.</P
><DIV
CLASS="sect2"
><H2
CLASS="sect2"
><A
NAME="sect_09_01_01"
></A
>9.1.1. Preparing your data</H2
><DIV
CLASS="sect3"
><H3
CLASS="sect3"
><A
NAME="sect_09_01_01_01"
></A
>9.1.1.1. Archiving with tar</H3
><P
>In most cases, we will first collect all the data to back up in a single archive file, which we will compress later on. The process of archiving involves concatenating all listed files and taking out unnecessary blanks. In Linux, this is commonly done with the <B
CLASS="command"
>tar</B
> command. <B
CLASS="command"
>tar</B
> was originally designed to archive data on tapes, but it can also make archives, known as <EM
>tarballs</EM
>.</P
><P
><B
CLASS="command"
>tar</B
> has many options, the most important ones are cited below:</P
><P
></P
><UL
><LI
><P
><TT
CLASS="option"
>-v</TT
>: verbose</P
></LI
><LI
><P
><TT
CLASS="option"
>-t</TT
>: test, shows content of a tarball</P
></LI
><LI
><P
><TT
CLASS="option"
>-x</TT
>: extract archive</P
></LI
><LI
><P
><TT
CLASS="option"
>-c</TT
>: create archive</P
></LI
><LI
><P
><TT
CLASS="option"
>-f</TT
> <TT
CLASS="filename"
>archivedevice</TT
>: use <TT
CLASS="filename"
>archivedevice</TT
> as source/destination for the tarball, the device defaults to the first tape device (usually <TT
CLASS="filename"
>/dev/st0</TT
> or something similar)</P
></LI
><LI
><P
><TT
CLASS="option"
>-j</TT
>: filter through <B
CLASS="command"
>bzip2</B
>, see <A
HREF="sect_09_01.html#sect_09_01_01_02"
>Section 9.1.1.2</A
></P
></LI
></UL
><P
>It is common to leave out the dash-prefix with <B
CLASS="command"
>tar</B
> options, as you can see from the examples below.</P
><DIV
CLASS="note"
><P
></P
><TABLE
CLASS="note"
WIDTH="100%"
BORDER="0"
><TR
><TD
WIDTH="25"
ALIGN="CENTER"
VALIGN="TOP"
><IMG
SRC="../images/note.gif"
HSPACE="5"
ALT="Note"></TD
><TH
ALIGN="LEFT"
VALIGN="CENTER"
><B
>Use GNU tar for compatibility</B
></TH
></TR
><TR
><TD
>&nbsp;</TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
><P
>The archives made with a proprietary <B
CLASS="command"
>tar</B
> version on one system, may be incompatible with <B
CLASS="command"
>tar</B
> on another proprietary system. This may cause much headaches, such as if the archive needs to be recovered on a system that doesn't exist anymore. Use the GNU <B
CLASS="command"
>tar</B
> version on all systems to prevent your system admin from bursting into tears. Linux always uses GNU tar. When working on other UNIX machines, enter <B
CLASS="command"
>tar <TT
CLASS="option"
>--help</TT
></B
> to find out which version you are using. Contact your system admin if you don't see the word GNU somewhere.</P
></TD
></TR
></TABLE
></DIV
><P
>In the example below, an archive is created and unpacked.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="screen"
>&#13;<TT
CLASS="prompt"
>gaby:~&#62;</TT
> <B
CLASS="command"
>ls images/</B
>
me+tux.jpg nimf.jpg
<TT
CLASS="prompt"
>gaby:~&#62;</TT
> <B
CLASS="command"
>tar cvf images-in-a-dir.tar images/</B
>
images/
images/nimf.jpg
images/me+tux.jpg
<TT
CLASS="prompt"
>gaby:~&#62;</TT
> <B
CLASS="command"
>cd images</B
>
<TT
CLASS="prompt"
>gaby:~/images&#62;</TT
> <B
CLASS="command"
>tar cvf images-without-a-dir.tar *.jpg</B
>
me+tux.jpg
nimf.jpg
<TT
CLASS="prompt"
>gaby:~/images&#62;</TT
> <B
CLASS="command"
>cd</B
>
<TT
CLASS="prompt"
>gaby:~&#62;</TT
> <B
CLASS="command"
>ls */*.tar</B
>
images/images-without-a-dir.tar
<TT
CLASS="prompt"
>gaby:~&#62;</TT
> <B
CLASS="command"
>ls *.tar</B
>
images-in-a-dir.tar
<TT
CLASS="prompt"
>gaby:~&#62;</TT
> <B
CLASS="command"
>tar xvf images-in-a-dir.tar </B
>
images/
images/nimf.jpg
images/me+tux.jpg
<TT
CLASS="prompt"
>gaby:~&#62;</TT
> <B
CLASS="command"
>tar tvf images/images-without-dir.tar </B
>
-rw-r--r-- gaby/gaby 42888 1999-06-30 20:52:25 me+tux.jpg
-rw-r--r-- gaby/gaby 7578 2000-01-26 12:58:46 nimf.jpg
<TT
CLASS="prompt"
>gaby:~&#62;</TT
> <B
CLASS="command"
>tar xvf images/images-without-a-dir.tar </B
>
me+tux.jpg
nimf.jpg
<TT
CLASS="prompt"
>gaby:~&#62;</TT
> <B
CLASS="command"
>ls *.jpg</B
>
me+tux.jpg nimf.jpg
</PRE
></FONT
></TD
></TR
></TABLE
><P
>This example also illustrates the difference between a tarred directory and a bunch of tarred files. It is advisable to only compress directories, so files don't get spread all over when unpacking the tarball (which may be on another system, where you may not know which files were already there and which are the ones from the archive).</P
><P
>When a tape drive is connected to your machine and configured by your system administrator, the file names ending in <TT
CLASS="filename"
>.tar</TT
> are replaced with the tape device name, for example:</P
><P
><B
CLASS="command"
>tar <TT
CLASS="option"
>cvf</TT
> <TT
CLASS="filename"
>/dev/tape</TT
> <TT
CLASS="filename"
>mail/</TT
></B
></P
><P
>The directory <TT
CLASS="filename"
>mail</TT
> and all the files it contains are compressed into a file that is written on the tape immediately. A content listing is displayed because we used the verbose option.</P
></DIV
><DIV
CLASS="sect3"
><H3
CLASS="sect3"
><A
NAME="sect_09_01_01_02"
></A
>9.1.1.2. Incremental backups with tar</H3
><P
>The <B
CLASS="command"
>tar</B
> tool supports the creation of incremental backups, using the <TT
CLASS="option"
>-N</TT
> option. With this option, you can specify a date, and <B
CLASS="command"
>tar</B
> will check modification time of all specified files against this date. If files are changed more recent than date, they will be included in the backup. The example below uses the timestamp on a previous archive as the date value. First, the initial archive is created and the timestamp on the initial backup file is shown. Then a new file is created, upon which we take a new backup, containing only this new file:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="screen"
>&#13;<TT
CLASS="prompt"
>jimmy:~&#62;</TT
> <B
CLASS="command"
>tar cvpf /var/tmp/javaproggies.tar java/*.java</B
>
java/btw.java
java/error.java
java/hello.java
java/income2.java
java/income.java
java/inputdevice.java
java/input.java
java/master.java
java/method1.java
java/mood.java
java/moodywaitress.java
java/test3.java
java/TestOne.java
java/TestTwo.java
java/Vehicle.java
<TT
CLASS="prompt"
>jimmy:~&#62;</TT
> <B
CLASS="command"
>ls -l /var/tmp/javaproggies.tar</B
>
-rw-rw-r-- 1 jimmy jimmy 10240 Jan 21 11:58 /var/tmp/javaproggies.tar
<TT
CLASS="prompt"
>jimmy:~&#62;</TT
> <B
CLASS="command"
>touch java/newprog.java</B
>
<TT
CLASS="prompt"
>jimmy:~&#62;</TT
> <B
CLASS="command"
>tar -N /var/tmp/javaproggies.tar \
-cvp /var/tmp/incremental1-javaproggies.tar java/*.java 2&#62; /dev/null</B
>
java/newprog.java
<TT
CLASS="prompt"
>jimmy:~&#62;</TT
> <B
CLASS="command"
>cd /var/tmp/</B
>
<TT
CLASS="prompt"
>jimmy:~&#62;</TT
> <B
CLASS="command"
>tar xvf incremental1-javaproggies.tar</B
>
java/newprog.java
</PRE
></FONT
></TD
></TR
></TABLE
><P
>Standard errors are redirected to <TT
CLASS="filename"
>/dev/null</TT
>. If you don't do this, <B
CLASS="command"
>tar</B
> will print a message for each unchanged file, telling you it won't be dumped.</P
><P
>This way of working has the disadvantage that it looks at timestamps on files. Say that you download an archive into the directory containing your backups, and the archive contains files that have been created two years ago. When checking the timestamps of those files against the timestamp on the initial archive, the new files will actually seem old to <B
CLASS="command"
>tar</B
>, and will not be included in an incremental backup made using the <TT
CLASS="option"
>-N</TT
> option.</P
><P
>A better choice would be the <TT
CLASS="option"
>-g</TT
> option, which will create a list of files to backup. When making incremental backups, files are checked against this list. This is how it works:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="screen"
>&#13;<TT
CLASS="prompt"
>jimmy:~&#62;</TT
> <B
CLASS="command"
>tar cvpf work-20030121.tar -g snapshot-20030121 work/</B
>
work/
work/file1
work/file2
work/file3
<TT
CLASS="prompt"
>jimmy:~&#62;</TT
> <B
CLASS="command"
>file snapshot-20030121</B
>
snapshot-20030121: ASCII text
</PRE
></FONT
></TD
></TR
></TABLE
><P
>The next day, user <EM
>jimmy</EM
> works on <TT
CLASS="filename"
>file3</TT
> a bit more, and creates <TT
CLASS="filename"
>file4</TT
>. At the end of the day, he makes a new backup:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="screen"
>&#13;<TT
CLASS="prompt"
>jimmy:~&#62;</TT
> <B
CLASS="command"
>tar cvpf work-20030122.tar -g snapshot-20030121 work/</B
>
work/
work/file3
work/file4
</PRE
></FONT
></TD
></TR
></TABLE
><P
>These are some very simple examples, but you could also use this kind of command in a cronjob (see <A
HREF="sect_04_04.html#sect_04_04_04"
>Section 4.4.4</A
>), which specifies for instance a snapshot file for the weekly backup and one for the daily backup. Snapshot files should be replaced when taking full backups, in that case.</P
><P
>More information can be found in the <B
CLASS="command"
>tar</B
> documentation.</P
><DIV
CLASS="tip"
><P
></P
><TABLE
CLASS="tip"
WIDTH="100%"
BORDER="0"
><TR
><TD
WIDTH="25"
ALIGN="CENTER"
VALIGN="TOP"
><IMG
SRC="../images/tip.gif"
HSPACE="5"
ALT="Tip"></TD
><TH
ALIGN="LEFT"
VALIGN="CENTER"
><B
>The real stuff</B
></TH
></TR
><TR
><TD
>&nbsp;</TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
><P
>As you could probably notice, <B
CLASS="command"
>tar</B
> is OK when we are talking about a simple directory, a set of files that belongs together. There are tools that are easier to manage, however, when you want to archive entire partitions or disks or larger projects. We just explain about <B
CLASS="command"
>tar</B
> here because it is a very popular tool for distributing archives. It will happen quite often that you need to install a software that comes in a so-called <SPAN
CLASS="QUOTE"
>"compressed tarball"</SPAN
>. See <A
HREF="sect_09_03.html"
>Section 9.3</A
> for an easier way to perform regular backups.</P
></TD
></TR
></TABLE
></DIV
></DIV
><DIV
CLASS="sect3"
><H3
CLASS="sect3"
><A
NAME="sect_09_01_01_03"
></A
>9.1.1.3. Compressing and unpacking with <B
CLASS="command"
>gzip</B
> or <B
CLASS="command"
>bzip2</B
></H3
><P
>&#13;Data, including tarballs, can be compressed using zip tools. The <B
CLASS="command"
>gzip</B
> command will add the suffix .gz to the file name and remove the original file.
</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="screen"
>&#13;<TT
CLASS="prompt"
>jimmy:~&#62;</TT
> <B
CLASS="command"
>ls -la | grep tar</B
>
-rw-rw-r-- 1 jimmy jimmy 61440 Jun 6 14:08 images-without-dir.tar
<TT
CLASS="prompt"
>jimmy:~&#62;</TT
> <B
CLASS="command"
>gzip images-without-dir.tar</B
>
<TT
CLASS="prompt"
>jimmy:~&#62;</TT
> <B
CLASS="command"
>ls -la images-without-dir.tar.gz </B
>
-rw-rw-r-- 1 jimmy jimmy 50562 Jun 6 14:08 images-without-dir.tar.gz
</PRE
></FONT
></TD
></TR
></TABLE
><P
>Uncompress gzipped files with the <TT
CLASS="option"
>-d</TT
> option.</P
><P
><B
CLASS="command"
>bzip2</B
> works in a similar way, but uses an improved compression algorithm, thus creating smaller files. See the <B
CLASS="command"
>bzip2</B
> info pages for more.</P
><P
>Linux software packages are often distributed in a gzipped tarball. The sensible thing to do after unpacking that kind of archives is find the <TT
CLASS="filename"
>README</TT
> and read it. It will generally contain guidelines to installing the package.</P
><P
>The GNU <B
CLASS="command"
>tar</B
> command is aware of gzipped files. Use the command</P
><P
><B
CLASS="command"
>tar <TT
CLASS="option"
>zxvf</TT
> <TT
CLASS="filename"
>file.tar.gz</TT
></B
> </P
><P
>for unzipping and untarring <TT
CLASS="filename"
>.tar.gz</TT
> or <TT
CLASS="filename"
>.tgz</TT
> files. Use</P
><P
><B
CLASS="command"
>tar <TT
CLASS="option"
>jxvf</TT
> <TT
CLASS="filename"
>file.tar.bz2</TT
></B
> </P
><P
>for unpacking <B
CLASS="command"
>tar</B
> archives that were compressed with <B
CLASS="command"
>bzip2</B
>.</P
></DIV
><DIV
CLASS="sect3"
><H3
CLASS="sect3"
><A
NAME="sect_09_01_01_04"
></A
>9.1.1.4. Java archives</H3
><P
>The GNU project provides us with the <B
CLASS="command"
>jar</B
> tool for creating Java archives. It is a Java application that combines multiple files into a single JAR archive file. While also being a general purpose archiving and compression tool, based on ZIP and the ZLIB compression format, <B
CLASS="command"
>jar</B
> was mainly designed to facilitate the packing of Java code, applets and/or applications in a single file. When combined in a single archive, the components of a Java application, can be downloaded much faster.</P
><P
>Unlike <B
CLASS="command"
>tar</B
>, <B
CLASS="command"
>jar</B
> compresses by default, independent from other tools - because it is basically the Java version of <B
CLASS="command"
>zip</B
>. In addition, it allows individual entries in an archive to be signed by the author, so that origins can be authenticated.</P
><P
>The syntax is almost identical as for the <B
CLASS="command"
>tar</B
> command, we refer to <B
CLASS="command"
>info <TT
CLASS="parameter"
><I
>jar</I
></TT
></B
> for specific differences.</P
><DIV
CLASS="note"
><P
></P
><TABLE
CLASS="note"
WIDTH="100%"
BORDER="0"
><TR
><TD
WIDTH="25"
ALIGN="CENTER"
VALIGN="TOP"
><IMG
SRC="../images/note.gif"
HSPACE="5"
ALT="Note"></TD
><TH
ALIGN="LEFT"
VALIGN="CENTER"
><B
>tar, jar and symbolic links</B
></TH
></TR
><TR
><TD
>&nbsp;</TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
><P
>One noteworthy feature not really mentioned in the standard documentation is that <B
CLASS="command"
>jar</B
> will follow symbolic links. Data to which these links are pointing will be included in the archive. The default in <B
CLASS="command"
>tar</B
> is to only backup the symbolic link, but this behavior can be changed using the <TT
CLASS="option"
>-h</TT
> to <B
CLASS="command"
>tar</B
>.</P
></TD
></TR
></TABLE
></DIV
></DIV
><DIV
CLASS="sect3"
><H3
CLASS="sect3"
><A
NAME="sect_09_01_01_05"
></A
>9.1.1.5. Transporting your data</H3
><P
>Saving copies of your data on another host is a simple but accurate way of making backups. See <A
HREF="chap_10.html"
>Chapter 10</A
> for more information on <B
CLASS="command"
>scp</B
>, <B
CLASS="command"
>ftp</B
> and more.</P
><P
>In the next section we'll discuss local backup devices.</P
></DIV
></DIV
></DIV
><DIV
CLASS="NAVFOOTER"
><HR
ALIGN="LEFT"
WIDTH="100%"><TABLE
SUMMARY="Footer navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
><A
HREF="chap_09.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
><A
HREF="index.html"
ACCESSKEY="H"
>Home</A
></TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
><A
HREF="sect_09_02.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
>Fundamental Backup Techniques</TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
><A
HREF="chap_09.html"
ACCESSKEY="U"
>Up</A
></TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
>Moving your data to a backup device</TD
></TR
></TABLE
></DIV
></BODY
></HTML
>