2004-11-03 13:51:07 +00:00
|
|
|
.\" Copyright (c) 2001-2003 The Open Group, All Rights Reserved
|
2007-06-20 22:33:04 +00:00
|
|
|
.TH "CUT" 1P 2003 "IEEE/The Open Group" "POSIX Programmer's Manual"
|
2004-11-03 13:51:07 +00:00
|
|
|
.\" cut
|
2007-09-20 06:03:25 +00:00
|
|
|
.SH PROLOG
|
|
|
|
This manual page is part of the POSIX Programmer's Manual.
|
|
|
|
The Linux implementation of this interface may differ (consult
|
|
|
|
the corresponding Linux manual page for details of Linux behavior),
|
|
|
|
or the interface may not be implemented on Linux.
|
2004-11-03 13:51:07 +00:00
|
|
|
.SH NAME
|
|
|
|
cut \- cut out selected fields of each line of a file
|
|
|
|
.SH SYNOPSIS
|
|
|
|
.LP
|
|
|
|
\fBcut -b\fP \fIlist\fP \fB[\fP\fB-n\fP\fB] [\fP\fIfile\fP \fB...\fP\fB]\fP\fB
|
|
|
|
.br
|
|
|
|
.sp
|
|
|
|
cut -c\fP \fIlist\fP \fB[\fP\fIfile\fP \fB...\fP\fB]\fP\fB
|
|
|
|
.br
|
|
|
|
.sp
|
|
|
|
cut -f\fP \fIlist\fP \fB[\fP\fB-d\fP \fIdelim\fP\fB][\fP\fB-s\fP\fB][\fP\fIfile\fP
|
|
|
|
\fB\&...\fP\fB]\fP\fB
|
|
|
|
.br
|
|
|
|
\fP
|
|
|
|
.SH DESCRIPTION
|
|
|
|
.LP
|
|
|
|
The \fIcut\fP utility shall cut out bytes ( \fB-b\fP option), characters
|
|
|
|
( \fB-c\fP option), or character-delimited fields (
|
|
|
|
\fB-f\fP option) from each line in one or more files, concatenate
|
|
|
|
them, and write them to standard output.
|
|
|
|
.SH OPTIONS
|
|
|
|
.LP
|
|
|
|
The \fIcut\fP utility shall conform to the Base Definitions volume
|
|
|
|
of IEEE\ Std\ 1003.1-2001, Section 12.2, Utility Syntax Guidelines.
|
|
|
|
.LP
|
|
|
|
The application shall ensure that the option-argument \fIlist\fP (see
|
|
|
|
options \fB-b\fP, \fB-c\fP, and \fB-f\fP below) is a
|
|
|
|
comma-separated list or <blank>-separated list of positive numbers
|
|
|
|
and ranges. Ranges can be in three forms. The first is two
|
|
|
|
positive numbers separated by a hyphen ( \fIlow\fP- \fIhigh\fP), which
|
|
|
|
represents all fields from the first number to the second
|
|
|
|
number. The second is a positive number preceded by a hyphen (- \fIhigh\fP),
|
|
|
|
which represents all fields from field number 1 to
|
|
|
|
that number. The third is a positive number followed by a hyphen (
|
|
|
|
\fIlow\fP-), which represents that number to the last field,
|
|
|
|
inclusive. The elements in \fIlist\fP can be repeated, can overlap,
|
|
|
|
and can be specified in any order, but the bytes, characters,
|
|
|
|
or fields selected shall be written in the order of the input data.
|
|
|
|
If an element appears in the selection list more than once, it
|
|
|
|
shall be written exactly once.
|
|
|
|
.LP
|
|
|
|
The following options shall be supported:
|
|
|
|
.TP 7
|
|
|
|
\fB-b\ \fP \fIlist\fP
|
|
|
|
Cut based on a \fIlist\fP of bytes. Each selected byte shall be output
|
|
|
|
unless the \fB-n\fP option is also specified. It shall
|
|
|
|
not be an error to select bytes not present in the input line.
|
|
|
|
.TP 7
|
|
|
|
\fB-c\ \fP \fIlist\fP
|
|
|
|
Cut based on a \fIlist\fP of characters. Each selected character shall
|
|
|
|
be output. It shall not be an error to select
|
|
|
|
characters not present in the input line.
|
|
|
|
.TP 7
|
|
|
|
\fB-d\ \fP \fIdelim\fP
|
|
|
|
Set the field delimiter to the character \fIdelim\fP. The default
|
|
|
|
is the <tab>.
|
|
|
|
.TP 7
|
|
|
|
\fB-f\ \fP \fIlist\fP
|
|
|
|
Cut based on a \fIlist\fP of fields, assumed to be separated in the
|
|
|
|
file by a delimiter character (see \fB-d\fP). Each
|
|
|
|
selected field shall be output. Output fields shall be separated by
|
|
|
|
a single occurrence of the field delimiter character. Lines
|
|
|
|
with no field delimiters shall be passed through intact, unless \fB-s\fP
|
|
|
|
is specified. It shall not be an error to select fields
|
|
|
|
not present in the input line.
|
|
|
|
.TP 7
|
|
|
|
\fB-n\fP
|
|
|
|
Do not split characters. When specified with the \fB-b\fP option,
|
|
|
|
each element in \fIlist\fP of the form \fIlow\fP-
|
|
|
|
\fIhigh\fP (hyphen-separated numbers) shall be modified as follows:
|
|
|
|
.RS
|
|
|
|
.IP " *" 3
|
|
|
|
If the byte selected by \fIlow\fP is not the first byte of a character,
|
|
|
|
\fIlow\fP shall be decremented to select the first
|
|
|
|
byte of the character originally selected by \fIlow\fP. If the byte
|
|
|
|
selected by \fIhigh\fP is not the last byte of a character,
|
|
|
|
\fIhigh\fP shall be decremented to select the last byte of the character
|
|
|
|
prior to the character originally selected by
|
|
|
|
\fIhigh\fP, or zero if there is no prior character. If the resulting
|
|
|
|
range element has \fIhigh\fP equal to zero or \fIlow\fP
|
|
|
|
greater than \fIhigh\fP, the list element shall be dropped from \fIlist\fP
|
|
|
|
for that input line without causing an error.
|
|
|
|
.LP
|
|
|
|
.RE
|
|
|
|
.LP
|
|
|
|
Each element in \fIlist\fP of the form \fIlow\fP- shall be treated
|
|
|
|
as above with \fIhigh\fP set to the number of bytes in the
|
|
|
|
current line, not including the terminating <newline>. Each element
|
|
|
|
in \fIlist\fP of the form - \fIhigh\fP shall be treated
|
|
|
|
as above with \fIlow\fP set to 1. Each element in \fIlist\fP of the
|
|
|
|
form \fInum\fP (a single number) shall be treated as above
|
|
|
|
with \fIlow\fP set to \fInum\fP and \fIhigh\fP set to \fInum\fP.
|
|
|
|
.TP 7
|
|
|
|
\fB-s\fP
|
|
|
|
Suppress lines with no delimiter characters, when used with the \fB-f\fP
|
|
|
|
option. Unless specified, lines with no delimiters
|
|
|
|
shall be passed through untouched.
|
|
|
|
.sp
|
|
|
|
.SH OPERANDS
|
|
|
|
.LP
|
|
|
|
The following operand shall be supported:
|
|
|
|
.TP 7
|
|
|
|
\fIfile\fP
|
|
|
|
A pathname of an input file. If no \fIfile\fP operands are specified,
|
2007-09-20 06:36:16 +00:00
|
|
|
or if a \fIfile\fP operand is \fB'-'\fP, the
|
2004-11-03 13:51:07 +00:00
|
|
|
standard input shall be used.
|
|
|
|
.sp
|
|
|
|
.SH STDIN
|
|
|
|
.LP
|
|
|
|
The standard input shall be used only if no \fIfile\fP operands are
|
|
|
|
specified, or if a \fIfile\fP operand is \fB'-'\fP .
|
|
|
|
See the INPUT FILES section.
|
|
|
|
.SH INPUT FILES
|
|
|
|
.LP
|
|
|
|
The input files shall be text files, except that line lengths shall
|
|
|
|
be unlimited.
|
|
|
|
.SH ENVIRONMENT VARIABLES
|
|
|
|
.LP
|
|
|
|
The following environment variables shall affect the execution of
|
|
|
|
\fIcut\fP:
|
|
|
|
.TP 7
|
|
|
|
\fILANG\fP
|
|
|
|
Provide a default value for the internationalization variables that
|
|
|
|
are unset or null. (See the Base Definitions volume of
|
|
|
|
IEEE\ Std\ 1003.1-2001, Section 8.2, Internationalization Variables
|
|
|
|
for
|
|
|
|
the precedence of internationalization variables used to determine
|
|
|
|
the values of locale categories.)
|
|
|
|
.TP 7
|
|
|
|
\fILC_ALL\fP
|
|
|
|
If set to a non-empty string value, override the values of all the
|
|
|
|
other internationalization variables.
|
|
|
|
.TP 7
|
|
|
|
\fILC_CTYPE\fP
|
|
|
|
Determine the locale for the interpretation of sequences of bytes
|
|
|
|
of text data as characters (for example, single-byte as
|
|
|
|
opposed to multi-byte characters in arguments and input files).
|
|
|
|
.TP 7
|
|
|
|
\fILC_MESSAGES\fP
|
|
|
|
Determine the locale that should be used to affect the format and
|
|
|
|
contents of diagnostic messages written to standard
|
|
|
|
error.
|
|
|
|
.TP 7
|
|
|
|
\fINLSPATH\fP
|
|
|
|
Determine the location of message catalogs for the processing of \fILC_MESSAGES
|
|
|
|
\&.\fP
|
|
|
|
.sp
|
|
|
|
.SH ASYNCHRONOUS EVENTS
|
|
|
|
.LP
|
|
|
|
Default.
|
|
|
|
.SH STDOUT
|
|
|
|
.LP
|
|
|
|
The \fIcut\fP utility output shall be a concatenation of the selected
|
|
|
|
bytes, characters, or fields (one of the following):
|
|
|
|
.sp
|
|
|
|
.RS
|
|
|
|
.nf
|
|
|
|
|
|
|
|
\fB"%s\\n", <\fP\fIconcatenation of bytes\fP\fB>
|
|
|
|
.sp
|
|
|
|
|
|
|
|
"%s\\n", <\fP\fIconcatenation of characters\fP\fB>
|
|
|
|
.sp
|
|
|
|
|
|
|
|
"%s\\n", <\fP\fIconcatenation of fields and field delimiters\fP\fB>
|
|
|
|
\fP
|
|
|
|
.fi
|
|
|
|
.RE
|
|
|
|
.SH STDERR
|
|
|
|
.LP
|
|
|
|
The standard error shall be used only for diagnostic messages.
|
|
|
|
.SH OUTPUT FILES
|
|
|
|
.LP
|
|
|
|
None.
|
|
|
|
.SH EXTENDED DESCRIPTION
|
|
|
|
.LP
|
|
|
|
None.
|
|
|
|
.SH EXIT STATUS
|
|
|
|
.LP
|
|
|
|
The following exit values shall be returned:
|
|
|
|
.TP 7
|
|
|
|
\ 0
|
|
|
|
All input files were output successfully.
|
|
|
|
.TP 7
|
|
|
|
>0
|
|
|
|
An error occurred.
|
|
|
|
.sp
|
|
|
|
.SH CONSEQUENCES OF ERRORS
|
|
|
|
.LP
|
|
|
|
Default.
|
|
|
|
.LP
|
|
|
|
\fIThe following sections are informative.\fP
|
|
|
|
.SH APPLICATION USAGE
|
|
|
|
.LP
|
|
|
|
Earlier versions of the \fIcut\fP utility worked in an environment
|
|
|
|
where bytes and characters were considered equivalent
|
|
|
|
(modulo <backspace> and <tab> processing in some implementations).
|
|
|
|
In the extended world of multi-byte characters, the
|
|
|
|
new \fB-b\fP option has been added. The \fB-n\fP option (used with
|
|
|
|
\fB-b\fP) allows it to be used to act on bytes rounded to
|
|
|
|
character boundaries. The algorithm specified for \fB-n\fP guarantees
|
|
|
|
that:
|
|
|
|
.sp
|
|
|
|
.RS
|
|
|
|
.nf
|
|
|
|
|
|
|
|
\fBcut -b 1-500 -n file > file1
|
|
|
|
cut -b 501- -n file > file2
|
|
|
|
\fP
|
|
|
|
.fi
|
|
|
|
.RE
|
|
|
|
.LP
|
|
|
|
ends up with all the characters in \fBfile\fP appearing exactly once
|
|
|
|
in \fBfile1\fP or \fBfile2\fP. (There is, however, a
|
|
|
|
<newline> in both \fBfile1\fP and \fBfile2\fP for each <newline> in
|
|
|
|
\fBfile\fP.)
|
|
|
|
.SH EXAMPLES
|
|
|
|
.LP
|
|
|
|
Examples of the option qualifier list:
|
|
|
|
.TP 7
|
|
|
|
1,4,7
|
|
|
|
Select the first, fourth, and seventh bytes, characters, or fields
|
|
|
|
and field delimiters.
|
|
|
|
.TP 7
|
|
|
|
1-3,8
|
|
|
|
Equivalent to 1,2,3,8.
|
|
|
|
.TP 7
|
|
|
|
-5,10
|
|
|
|
Equivalent to 1,2,3,4,5,10.
|
|
|
|
.TP 7
|
|
|
|
3-
|
|
|
|
Equivalent to third to last, inclusive.
|
|
|
|
.sp
|
|
|
|
.LP
|
|
|
|
The \fIlow\fP- \fIhigh\fP forms are not always equivalent when used
|
|
|
|
with \fB-b\fP and \fB-n\fP and multi-byte characters;
|
|
|
|
see the description of \fB-n\fP.
|
|
|
|
.LP
|
|
|
|
The following command:
|
|
|
|
.sp
|
|
|
|
.RS
|
|
|
|
.nf
|
|
|
|
|
|
|
|
\fBcut -d : -f 1,6 /etc/passwd
|
|
|
|
\fP
|
|
|
|
.fi
|
|
|
|
.RE
|
|
|
|
.LP
|
|
|
|
reads the System V password file (user database) and produces lines
|
|
|
|
of the form:
|
|
|
|
.sp
|
|
|
|
.RS
|
|
|
|
.nf
|
|
|
|
|
|
|
|
\fB<\fP\fIuser ID\fP\fB>:<\fP\fIhome directory\fP\fB>
|
|
|
|
\fP
|
|
|
|
.fi
|
|
|
|
.RE
|
|
|
|
.LP
|
|
|
|
Most utilities in this volume of IEEE\ Std\ 1003.1-2001 work on text
|
|
|
|
files. The \fIcut\fP utility can be used to turn
|
|
|
|
files with arbitrary line lengths into a set of text files containing
|
|
|
|
the same data. The \fIpaste\fP utility can be used to create (or recreate)
|
|
|
|
files with arbitrary line lengths. For
|
|
|
|
example, if \fBfile\fP contains long lines:
|
|
|
|
.sp
|
|
|
|
.RS
|
|
|
|
.nf
|
|
|
|
|
|
|
|
\fBcut -b 1-500 -n file > file1
|
|
|
|
cut -b 501- -n file > file2
|
|
|
|
\fP
|
|
|
|
.fi
|
|
|
|
.RE
|
|
|
|
.LP
|
|
|
|
creates \fBfile1\fP (a text file) with lines no longer than 500 bytes
|
|
|
|
(plus the <newline>) and \fBfile2\fP that contains
|
|
|
|
the remainder of the data from \fBfile\fP. (Note that \fBfile2\fP
|
|
|
|
is not a text file if there are lines in \fBfile\fP that are
|
|
|
|
longer than 500 + {LINE_MAX} bytes.) The original file can be recreated
|
|
|
|
from \fBfile1\fP and \fBfile2\fP using the command:
|
|
|
|
.sp
|
|
|
|
.RS
|
|
|
|
.nf
|
|
|
|
|
|
|
|
\fBpaste -d "\\0" file1 file2 > file
|
|
|
|
\fP
|
|
|
|
.fi
|
|
|
|
.RE
|
|
|
|
.SH RATIONALE
|
|
|
|
.LP
|
|
|
|
Some historical implementations do not count <backspace>s in determining
|
|
|
|
character counts with the \fB-c\fP option. This
|
|
|
|
may be useful for using \fIcut\fP for processing \fInroff\fP output.
|
|
|
|
It was deliberately decided not to have the \fB-c\fP option
|
|
|
|
treat either <backspace>s or <tab>s in any special fashion. The \fIfold\fP
|
|
|
|
utility does treat these characters specially.
|
|
|
|
.LP
|
|
|
|
Unlike other utilities, some historical implementations of \fIcut\fP
|
|
|
|
exit after not finding an input file, rather than
|
|
|
|
continuing to process the remaining \fIfile\fP operands. This behavior
|
|
|
|
is prohibited by this volume of
|
|
|
|
IEEE\ Std\ 1003.1-2001, where only the exit status is affected by
|
|
|
|
this problem.
|
|
|
|
.LP
|
|
|
|
The behavior of \fIcut\fP when provided with either mutually-exclusive
|
|
|
|
options or options that do not work logically together
|
|
|
|
has been deliberately left unspecified in favor of global wording
|
|
|
|
in \fIUtility Description
|
|
|
|
Defaults\fP .
|
|
|
|
.LP
|
|
|
|
The OPTIONS section was changed in response to IEEE PASC Interpretation
|
|
|
|
1003.2 #149. The change represents historical practice
|
|
|
|
on all known systems. The original standard was ambiguous on the nature
|
|
|
|
of the output.
|
|
|
|
.LP
|
|
|
|
The \fIlist\fP option-arguments are historically used to select the
|
|
|
|
portions of the line to be written, but do not affect the
|
|
|
|
order of the data. For example:
|
|
|
|
.sp
|
|
|
|
.RS
|
|
|
|
.nf
|
|
|
|
|
|
|
|
\fBecho abcdefghi | cut -c6,2,4-7,1
|
|
|
|
\fP
|
|
|
|
.fi
|
|
|
|
.RE
|
|
|
|
.LP
|
|
|
|
yields \fB"abdefg"\fP .
|
|
|
|
.LP
|
|
|
|
A proposal to enhance \fIcut\fP with the following option:
|
|
|
|
.TP 7
|
|
|
|
\fB-o\fP
|
|
|
|
Preserve the selected field order. When this option is specified,
|
|
|
|
each byte, character, or field (or ranges of such) shall be
|
|
|
|
written in the order specified by the \fIlist\fP option-argument,
|
|
|
|
even if this requires multiple outputs of the same bytes,
|
|
|
|
characters, or fields.
|
|
|
|
.sp
|
|
|
|
.LP
|
|
|
|
was rejected because this type of enhancement is outside the scope
|
|
|
|
of the IEEE\ P1003.2b draft standard.
|
|
|
|
.SH FUTURE DIRECTIONS
|
|
|
|
.LP
|
|
|
|
None.
|
|
|
|
.SH SEE ALSO
|
|
|
|
.LP
|
2007-09-20 06:36:16 +00:00
|
|
|
\fIgrep\fP, \fIpaste\fP, \fIParameters
|
2004-11-03 13:51:07 +00:00
|
|
|
and Variables\fP
|
|
|
|
.SH COPYRIGHT
|
|
|
|
Portions of this text are reprinted and reproduced in electronic form
|
|
|
|
from IEEE Std 1003.1, 2003 Edition, Standard for Information Technology
|
|
|
|
-- Portable Operating System Interface (POSIX), The Open Group Base
|
|
|
|
Specifications Issue 6, Copyright (C) 2001-2003 by the Institute of
|
|
|
|
Electrical and Electronics Engineers, Inc and The Open Group. In the
|
|
|
|
event of any discrepancy between this version and the original IEEE and
|
|
|
|
The Open Group Standard, the original IEEE and The Open Group Standard
|
|
|
|
is the referee document. The original Standard can be obtained online at
|
|
|
|
http://www.opengroup.org/unix/online.html .
|