man-pages/man1p/diff.1p

647 lines
19 KiB
Groff

.\" Copyright (c) 2001-2003 The Open Group, All Rights Reserved
.TH "DIFF" P 2003 "IEEE/The Open Group" "POSIX Programmer's Manual"
.\" diff
.SH NAME
diff \- compare two files
.SH SYNOPSIS
.LP
\fBdiff\fP \fB[\fP\fB-c| -e| -f| -C\fP \fIn\fP\fB][\fP\fB-br\fP\fB]\fP
\fIfile1 file2\fP
.SH DESCRIPTION
.LP
The \fIdiff\fP utility shall compare the contents of \fIfile1\fP and
\fIfile2\fP and write to standard output a list of
changes necessary to convert \fIfile1\fP into \fIfile2\fP. This list
should be minimal. No output shall be produced if the files
are identical.
.SH OPTIONS
.LP
The \fIdiff\fP utility shall conform to the Base Definitions volume
of IEEE\ Std\ 1003.1-2001, Section 12.2, Utility Syntax Guidelines.
.LP
The following options shall be supported:
.TP 7
\fB-b\fP
Cause any amount of white space at the end of a line to be treated
as a single <newline> (that is, the white-space
characters preceding the <newline> are ignored) and other strings
of white-space characters, not including <newline>s,
to compare equal.
.TP 7
\fB-c\fP
Produce output in a form that provides three lines of context.
.TP 7
\fB-C\ n\fP
Produce output in a form that provides \fIn\fP lines of context (where
\fIn\fP shall be interpreted as a positive decimal
integer).
.TP 7
\fB-e\fP
Produce output in a form suitable as input for the \fIed\fP utility,
which can then be used
to convert \fIfile1\fP into \fIfile2\fP.
.TP 7
\fB-f\fP
Produce output in an alternative form, similar in format to \fB-e\fP,
but not intended to be suitable as input for the \fIed\fP utility,
and in the opposite order.
.TP 7
\fB-r\fP
Apply \fIdiff\fP recursively to files and directories of the same
name when \fIfile1\fP and \fIfile2\fP are both
directories.
.sp
.SH OPERANDS
.LP
The following operands shall be supported:
.TP 7
\fIfile1\fP,\ \fIfile2\fP
A pathname of a file to be compared. If either the \fIfile1\fP or
\fIfile2\fP operand is \fB'-'\fP , the standard input
shall be used in its place.
.sp
.LP
If both \fIfile1\fP and \fIfile2\fP are directories, \fIdiff\fP shall
not compare block special files, character special
files, or FIFO special files to any files and shall not compare regular
files to directories. Further details are as specified in
Diff Directory Comparison Format . The behavior of \fIdiff\fP on other
file types is
implementation-defined when found in directories.
.LP
If only one of \fIfile1\fP and \fIfile2\fP is a directory, \fIdiff\fP
shall be applied to the non-directory file and the file
contained in the directory file with a filename that is the same as
the last component of the non-directory file.
.SH STDIN
.LP
The standard input shall be used only if one of the \fIfile1\fP or
\fIfile2\fP operands references standard input. See the
INPUT FILES section.
.SH INPUT FILES
.LP
The input files may be of any type.
.SH ENVIRONMENT VARIABLES
.LP
The following environment variables shall affect the execution of
\fIdiff\fP:
.TP 7
\fILANG\fP
Provide a default value for the internationalization variables that
are unset or null. (See the Base Definitions volume of
IEEE\ Std\ 1003.1-2001, Section 8.2, Internationalization Variables
for
the precedence of internationalization variables used to determine
the values of locale categories.)
.TP 7
\fILC_ALL\fP
If set to a non-empty string value, override the values of all the
other internationalization variables.
.TP 7
\fILC_CTYPE\fP
Determine the locale for the interpretation of sequences of bytes
of text data as characters (for example, single-byte as
opposed to multi-byte characters in arguments and input files).
.TP 7
\fILC_MESSAGES\fP
Determine the locale that should be used to affect the format and
contents of diagnostic messages written to standard error and
informative messages written to standard output.
.TP 7
\fILC_TIME\fP
Determine the locale for affecting the format of file timestamps written
with the \fB-C\fP and \fB-c\fP options.
.TP 7
\fINLSPATH\fP
Determine the location of message catalogs for the processing of \fILC_MESSAGES
\&.\fP
.TP 7
\fITZ\fP
Determine the timezone used for calculating file timestamps written
with the \fB-C\fP and \fB-c\fP options. If \fITZ\fP is
unset or null, an unspecified default timezone shall be used.
.sp
.SH ASYNCHRONOUS EVENTS
.LP
Default.
.SH STDOUT
.SS Diff Directory Comparison Format
.LP
If both \fIfile1\fP and \fIfile2\fP are directories, the following
output formats shall be used.
.LP
In the POSIX locale, each file that is present in only one directory
shall be reported using the following format:
.sp
.RS
.nf
\fB"Only in %s: %s\\n", <\fP\fIdirectory pathname\fP\fB>, <\fP\fIfilename\fP\fB>
\fP
.fi
.RE
.LP
In the POSIX locale, subdirectories that are common to the two directories
may be reported with the following format:
.sp
.RS
.nf
\fB"Common subdirectories: %s and %s\\n", <\fP\fIdirectory1 pathname\fP\fB>,
<\fP\fIdirectory2 pathname\fP\fB>
\fP
.fi
.RE
.LP
For each file common to the two directories if the two files are not
to be compared, the following format shall be used in the
POSIX locale:
.sp
.RS
.nf
\fB"File %s is a %s while file %s is a %s\\n", <\fP\fIdirectory1 pathname\fP\fB>,
<\fP\fIfile type of directory1 pathname\fP\fB>, <\fP\fIdirectory2 pathname\fP\fB>,
<\fP\fIfile type of directory2 pathname\fP\fB>
\fP
.fi
.RE
.LP
For each file common to the two directories, if the files are compared
and are identical, no output shall be written. If the two
files differ, the following format is written:
.sp
.RS
.nf
\fB"diff %s %s %s\\n", <\fP\fIdiff_options\fP\fB>, <\fP\fIfilename1\fP\fB>, <\fP\fIfilename2\fP\fB>
\fP
.fi
.RE
.LP
where <\fIdiff_options\fP> are the options as specified on the command
line.
.LP
All directory pathnames listed in this section shall be relative to
the original command line arguments. All other names of
files listed in this section shall be filenames (pathname components).
.SS Diff Binary Output Format
.LP
In the POSIX locale, if one or both of the files being compared are
not text files, an unspecified format shall be used that
contains the pathnames of two files being compared and the string
\fB"differ"\fP .
.LP
If both files being compared are text files, depending on the options
specified, one of the following formats shall be used to
write the differences.
.SS Diff Default Output Format
.LP
The default (without \fB-e\fP, \fB-f\fP, \fB-c\fP, or \fB-C\fP options)
\fIdiff\fP utility output shall contain lines of
these forms:
.sp
.RS
.nf
\fB"%da%d\\n", <\fP\fInum1\fP\fB>, <\fP\fInum2\fP\fB>
.sp
"%da%d,%d\\n", <\fP\fInum1\fP\fB>, <\fP\fInum2\fP\fB>, <\fP\fInum3\fP\fB>
.sp
"%dd%d\\n", <\fP\fInum1\fP\fB>, <\fP\fInum2\fP\fB>
.sp
"%d,%dd%d\\n", <\fP\fInum1\fP\fB>, <\fP\fInum2\fP\fB>, <\fP\fInum3\fP\fB>
.sp
"%dc%d\\n", <\fP\fInum1\fP\fB>, <\fP\fInum2\fP\fB>
.sp
"%d,%dc%d\\n", <\fP\fInum1\fP\fB>, <\fP\fInum2\fP\fB>, <\fP\fInum3\fP\fB>
.sp
"%dc%d,%d\\n", <\fP\fInum1\fP\fB>, <\fP\fInum2\fP\fB>, <\fP\fInum3\fP\fB>
.sp
"%d,%dc%d,%d\\n", <\fP\fInum1\fP\fB>, <\fP\fInum2\fP\fB>, <\fP\fInum3\fP\fB>, <\fP\fInum4\fP\fB>
\fP
.fi
.RE
.LP
These lines resemble \fIed\fP subcommands to convert \fIfile1\fP into
\fIfile2\fP. The
line numbers before the action letters shall pertain to \fIfile1\fP;
those after shall pertain to \fIfile2\fP. Thus, by
exchanging \fIa\fP for \fId\fP and reading the line in reverse order,
one can also determine how to convert \fIfile2\fP into
\fIfile1\fP. As in \fIed\fP, identical pairs (where \fInum1\fP= \fInum2\fP)
are abbreviated
as a single number.
.LP
Following each of these lines, \fIdiff\fP shall write to standard
output all lines affected in the first file using the
format:
.sp
.RS
.nf
\fB"< %s", <\fP\fIline\fP\fB>
\fP
.fi
.RE
.LP
and all lines affected in the second file using the format:
.sp
.RS
.nf
\fB"> %s", <\fP\fIline\fP\fB>
\fP
.fi
.RE
.LP
If there are lines affected in both \fIfile1\fP and \fIfile2\fP (as
with the \fBc\fP subcommand), the changes are separated
with a line consisting of three hyphens:
.sp
.RS
.nf
\fB"---\\n"
\fP
.fi
.RE
.SS Diff -e Output Format
.LP
With the \fB-e\fP option, a script shall be produced that shall, when
provided as input to \fIed\fP, along with an appended \fBw\fP (write)
command, convert \fIfile1\fP into \fIfile2\fP. Only
the \fBa\fP (append), \fBc\fP (change), \fBd\fP (delete), \fBi\fP
(insert), and \fBs\fP (substitute) commands of \fIed\fP shall be used
in this script. Text lines, except those consisting of the single
character
period ( \fB'.'\fP ), shall be output as they appear in the file.
.SS Diff -f Output Format
.LP
With the \fB-f\fP option, an alternative format of script shall be
produced. It is similar to that produced by \fB-e\fP, with
the following differences:
.IP " 1." 4
It is expressed in reverse sequence; the output of \fB-e\fP orders
changes from the end of the file to the beginning; the
\fB-f\fP from beginning to end.
.LP
.IP " 2." 4
The command form <\fIlines\fP> <\fIcommand-letter\fP> used by \fB-e\fP
is reversed. For example, 10\fIc\fP with
\fB-e\fP would be \fIc\fP10 with \fB-f\fP.
.LP
.IP " 3." 4
The form used for ranges of line numbers is <space>-separated, rather
than comma-separated.
.LP
.SS Diff -c or -C Output Format
.LP
With the \fB-c\fP or \fB-C\fP option, the output format shall consist
of affected lines along with surrounding lines of
context. The affected lines shall show which ones need to be deleted
or changed in \fIfile1\fP, and those added from \fIfile2\fP.
With the \fB-c\fP option, three lines of context, if available, shall
be written before and after the affected lines. With the
\fB-C\fP option, the user can specify how many lines of context are
written. The exact format follows.
.LP
The name and last modification time of each file shall be output in
the following format:
.sp
.RS
.nf
\fB"*** %s %s\\n",\fP \fIfile1\fP\fB, <\fP\fIfile1 timestamp\fP\fB>
"--- %s %s\\n",\fP \fIfile2\fP\fB, <\fP\fIfile2 timestamp\fP\fB>
\fP
.fi
.RE
.LP
Each <\fIfile\fP> field shall be the pathname of the corresponding
file being compared. The pathname written for standard
input is unspecified.
.LP
In the POSIX locale, each <\fItimestamp\fP> field shall be equivalent
to the output from the following command:
.sp
.RS
.nf
\fBdate "+%a %b %e %T %Y"
\fP
.fi
.RE
.LP
without the trailing <newline>, executed at the time of last modification
of the corresponding file (or the current time,
if the file is standard input).
.LP
Then, the following output formats shall be applied for every set
of changes.
.LP
First, a line shall be written in the following format:
.sp
.RS
.nf
\fB"***************\\n"
\fP
.fi
.RE
.LP
Next, the range of lines in \fIfile1\fP shall be written in the following
format if the range contains two or more lines:
.sp
.RS
.nf
\fB"*** %d,%d ****\\n", <\fP\fIbeginning line number\fP\fB>, <\fP\fIending line number\fP\fB>
\fP
.fi
.RE
.LP
and the following format otherwise:
.sp
.RS
.nf
\fB"*** %d ****\\n", <\fP\fIending line number\fP\fB>
\fP
.fi
.RE
.LP
The ending line number of an empty range shall be the number of the
preceding line, or 0 if the range is at the start of the
file.
.LP
Next, the affected lines along with lines of context (unaffected lines)
shall be written. Unaffected lines shall be written in
the following format:
.sp
.RS
.nf
\fB" %s", <\fP\fIunaffected_line\fP\fB>
\fP
.fi
.RE
.LP
Deleted lines shall be written as:
.sp
.RS
.nf
\fB"- %s", <\fP\fIdeleted_line\fP\fB>
\fP
.fi
.RE
.LP
Changed lines shall be written as:
.sp
.RS
.nf
\fB"! %s", <\fP\fIchanged_line\fP\fB>
\fP
.fi
.RE
.LP
Next, the range of lines in \fIfile2\fP shall be written in the following
format if the range contains two or more lines:
.sp
.RS
.nf
\fB"--- %d,%d ----\\n", <\fP\fIbeginning line number\fP\fB>, <\fP\fIending line number\fP\fB>
\fP
.fi
.RE
.LP
and the following format otherwise:
.sp
.RS
.nf
\fB"--- %d ----\\n", <\fP\fIending line number\fP\fB>
\fP
.fi
.RE
.LP
Then, lines of context and changed lines shall be written as described
in the previous formats. Lines added from \fIfile2\fP
shall be written in the following format:
.sp
.RS
.nf
\fB"+ %s", <\fP\fIadded_line\fP\fB>
\fP
.fi
.RE
.SH STDERR
.LP
The standard error shall be used only for diagnostic messages.
.SH OUTPUT FILES
.LP
None.
.SH EXTENDED DESCRIPTION
.LP
None.
.SH EXIT STATUS
.LP
The following exit values shall be returned:
.TP 7
\ 0
No differences were found.
.TP 7
\ 1
Differences were found.
.TP 7
>1
An error occurred.
.sp
.SH CONSEQUENCES OF ERRORS
.LP
Default.
.LP
\fIThe following sections are informative.\fP
.SH APPLICATION USAGE
.LP
If lines at the end of a file are changed and other lines are added,
\fIdiff\fP output may show this as a delete and add, as a
change, or as a change and add; \fIdiff\fP is not expected to know
which happened and users should not care about the difference
in output as long as it clearly shows the differences between the
files.
.SH EXAMPLES
.LP
If \fBdir1\fP is a directory containing a directory named \fBx\fP,
\fBdir2\fP is a directory containing a directory named
\fBx\fP, \fBdir1/x\fP and \fBdir2/x\fP both contain files named \fBdate.out\fP,
and \fBdir2/x\fP contains a file named
\fBy\fP, the command:
.sp
.RS
.nf
\fBdiff -r dir1 dir2
\fP
.fi
.RE
.LP
could produce output similar to:
.sp
.RS
.nf
\fBCommon subdirectories: dir1/x and dir2/x
Only in dir2/x: y
diff -r dir1/x/date.out dir2/x/date.out
1c1
< Mon Jul 2 13:12:16 PDT 1990
---
> Tue Jun 19 21:41:39 PDT 1990
\fP
.fi
.RE
.SH RATIONALE
.LP
The \fB-h\fP option was omitted because it was insufficiently specified
and does not add to applications portability.
.LP
Historical implementations employ algorithms that do not always produce
a minimum list of differences; the current language
about making every effort is the best this volume of IEEE\ Std\ 1003.1-2001
can do, as there is no metric that could be
employed to judge the quality of implementations against any and all
file contents. The statement "This list should be minimal''
clearly implies that implementations are not expected to provide the
following output when comparing two 100-line files that differ
in only one character on a single line:
.sp
.RS
.nf
\fB1,100c1,100
all 100 lines from file1 preceded with "< "
---
all 100 lines from file2 preceded with "> "
\fP
.fi
.RE
.LP
The "Only in" messages required when the \fB-r\fP option is specified
are not used by most historical implementations if the
\fB-e\fP option is also specified. It is required here because it
provides useful information that must be provided to update a
target directory hierarchy to match a source hierarchy. The "Common
subdirectories" messages are written by System V and 4.3 BSD
when the \fB-r\fP option is specified. They are allowed here but are
not required because they are reporting on something that is
the same, not reporting a difference, and are not needed to update
a target hierarchy.
.LP
The \fB-c\fP option, which writes output in a format using lines of
context, has been included. The format is useful for a
variety of reasons, among them being much improved readability and
the ability to understand difference changes when the target
file has line numbers that differ from another similar, but slightly
different, copy. The \fIpatch\fP utility is most valuable when working
with difference listings using the context format.
The BSD version of \fB-c\fP takes an optional argument specifying
the amount of context. Rather than overloading \fB-c\fP and
breaking the Utility Syntax Guidelines for \fIdiff\fP, the standard
developers decided to add a separate option for specifying a
context diff with a specified amount of context ( \fB-C\fP). Also,
the format for context \fIdiff\fPs was extended slightly in
4.3 BSD to allow multiple changes that are within context lines from
each other to be merged together. The output format contains
an additional four asterisks after the range of affected lines in
the first filename. This was to provide a flag for old programs
(like old versions of \fIpatch\fP) that only understand the old context
format. The version
of context described here does not require that multiple changes within
context lines be merged, but it does not prohibit it
either. The extension is upwards-compatible, so any vendors that wish
to retain the old version of \fIdiff\fP can do so by adding
the extra four asterisks (that is, utilities that currently use \fIdiff\fP
and understand the new merged format will also
understand the old unmerged format, but not \fIvice versa\fP).
.LP
The substitute command was added as an additional format for the \fB-e\fP
option. This was added to provide implementations
with a way to fix the classic "dot alone on a line" bug present in
many versions of \fIdiff\fP. Since many implementations have
fixed this bug, the standard developers decided not to standardize
broken behavior, but rather to provide the necessary tool for
fixing the bug. One way to fix this bug is to output two periods whenever
a lone period is needed, then terminate the append
command with a period, and then use the substitute command to convert
the two periods into one period.
.LP
The BSD-derived \fB-r\fP option was added to provide a mechanism for
using \fIdiff\fP to compare two file system trees. This
behavior is useful, is standard practice on all BSD-derived systems,
and is not easily reproducible with the \fIfind\fP utility.
.LP
The requirement that \fIdiff\fP not compare files in some circumstances,
even though they have the same name, is based on the
actual output of historical implementations. The message specified
here is already in use when a directory is being compared to a
non-directory. It is extended here to preclude the problems arising
from running into FIFOs and other files that would cause
\fIdiff\fP to hang waiting for input with no indication to the user
that \fIdiff\fP was hung. In most common usage, \fIdiff\fP
\fB-r\fP should indicate differences in the file hierarchies, not
the difference of contents of devices pointed to by the
hierarchies.
.LP
Many early implementations of \fIdiff\fP require seekable files. Since
the System Interfaces volume of
IEEE\ Std\ 1003.1-2001 supports named pipes, the standard developers
decided that such a restriction was unreasonable. Note
also that the allowed filename \fB-\fP almost always refers to a pipe.
.LP
No directory search order is specified for \fIdiff\fP. The historical
ordering is, in fact, not optimal, in that it prints out
all of the differences at the current level, including the statements
about all common subdirectories before recursing into those
subdirectories.
.LP
The message:
.sp
.RS
.nf
\fB"diff %s %s %s\\n", <\fP\fIdiff_options\fP\fB>, <\fP\fIfilename1\fP\fB>, <\fP\fIfilename2\fP\fB>
\fP
.fi
.RE
.LP
does not vary by locale because it is the representation of a command,
not an English sentence.
.SH FUTURE DIRECTIONS
.LP
None.
.SH SEE ALSO
.LP
\fIcmp\fP , \fIcomm\fP , \fIed\fP , \fIfind\fP
.SH COPYRIGHT
Portions of this text are reprinted and reproduced in electronic form
from IEEE Std 1003.1, 2003 Edition, Standard for Information Technology
-- Portable Operating System Interface (POSIX), The Open Group Base
Specifications Issue 6, Copyright (C) 2001-2003 by the Institute of
Electrical and Electronics Engineers, Inc and The Open Group. In the
event of any discrepancy between this version and the original IEEE and
The Open Group Standard, the original IEEE and The Open Group Standard
is the referee document. The original Standard can be obtained online at
http://www.opengroup.org/unix/online.html .