.\" Copyright (c) 1998 Andries Brouwer
.\"
.\" This is free documentation; you can redistribute it and/or
.\" modify it under the terms of the GNU General Public License as
.\" published by the Free Software Foundation; either version 2 of
.\" the License, or (at your option) any later version.
.\"
.\" The GNU General Public License's references to "object code"
.\" and "executables" are to be interpreted as the output of any
.\" document formatting or typesetting system, including
.\" intermediate and printed output.
.\"
.\" This manual is distributed in the hope that it will be useful,
.\" but WITHOUT ANY WARRANTY; without even the implied warranty of
.\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
.\" GNU General Public License for more details.
.\"
.\" You should have received a copy of the GNU General Public
.\" License along with this manual; if not, write to the Free
.\" Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111,
.\" USA.
.\"
.\" 2003-08-24 fix for / by John Kristoff + joey
.\"
.TH GLOB 7 2003-08-24 "Unix" "Linux Programmer's Manual"
.SH NAME
glob \- Globbing pathnames
.SH DESCRIPTION
Long ago, in Unix V6, there was a program
.I /etc/glob
that would expand wildcard patterns.
Soon afterwards this became a shell built-in.

These days there is also a library routine
.BR glob (3)
that will perform this function for a user program.

The rules are as follows (POSIX.2, 3.13).
.SH "WILDCARD MATCHING"
A string is a wildcard pattern if it contains one of the
characters `?', `*' or `['. Globbing is the operation
that expands a wildcard pattern into the list of pathnames
matching the pattern. Matching is defined by:

A `?' (not between brackets) matches any single character.

A `*' (not between brackets) matches any string,
including the empty string.
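.PP
The two rules above can be tried out with the standard Python
.I fnmatch
module.
This is only a sketch: fnmatchcase() is Python's own matcher, not the shell
or
.BR fnmatch (3),
and the sample names are made up for illustration.
.nf
```python
from fnmatch import fnmatchcase

# Hypothetical sample names, including the empty string.
names = ["a", "ab", "abc", ""]

# A question mark matches exactly one character.
one_char = [n for n in names if fnmatchcase(n, "?")]

# A star matches any string, including the empty one.
any_string = [n for n in names if fnmatchcase(n, "*")]

# A literal a followed by anything, possibly nothing.
starts_with_a = [n for n in names if fnmatchcase(n, "a*")]

print(one_char)
print(any_string)
print(starts_with_a)
```
.fi
.PP
Only the one-character name passes `?', while `*' accepts every name in
the list, the empty string included.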
.SS "Character classes"
An expression `[...]' where the first character after the
leading `[' is not an `!' matches a single character,
namely any of the characters enclosed by the brackets.
The string enclosed by the brackets cannot be empty;
therefore `]' can be allowed between the brackets, provided
that it is the first character. (Thus, `[][!]' matches the
three characters `[', `]' and `!'.)
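.PP
A small Python sketch of the bracket rule, again using the standard
.I fnmatch
module as a stand-in for the shell.
The candidate strings are invented for the example.
.nf
```python
from fnmatch import fnmatchcase

# The closing bracket comes first, so it is an ordinary member of the
# class; the class therefore holds the three characters ] [ and !.
candidates = ["[", "]", "!", "a", "[]"]
in_class = [c for c in candidates if fnmatchcase(c, "[][!]")]

# A plain class matches exactly one of the listed characters.
plain = [c for c in ["a", "b", "d", "ab"] if fnmatchcase(c, "[abc]")]

print(in_class)
print(plain)
```
.fi
.PP
The two-character strings are rejected in both cases, because a bracket
expression always stands for a single character.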
.SS Ranges
There is one special convention:
two characters separated by `\-' denote a range.
(Thus, `[A\-Fa\-f0\-9]' is equivalent to `[ABCDEFabcdef0123456789]'.)
One may include `\-' in its literal meaning by making it the
first or last character between the brackets.
(Thus, `[]\-]' matches just the two characters `]' and `\-',
and `[\-\-0]' matches the three characters `\-', `.', `0', since `/'
cannot be matched.)
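.PP
The range convention can be checked with the same Python module.
Treat this as an illustration under the assumption that Python's matcher
agrees with the shell for these two patterns; unlike the shell, Python's
.I fnmatch
gives `/' no special treatment, so the `[\-\-0]' example is left out.
.nf
```python
from fnmatch import fnmatchcase

# Three ranges in one class: any hexadecimal digit.
hex_digit = "[A-Fa-f0-9]"
hex_hits = [c for c in "Ag7fZ-" if fnmatchcase(c, hex_digit)]

# A closing bracket placed first and a hyphen placed last are literal.
literal_hits = [c for c in "]-a" if fnmatchcase(c, "[]-]")]

print(hex_hits)
print(literal_hits)
```
.fi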
.SS Complementation
An expression `[!...]' matches a single character, namely
any character that is not matched by the expression obtained
by removing the first `!' from it.
(Thus, `[!]a\-]' matches any single character except `]', `a' and `\-'.)

One can remove the special meaning of `?', `*' and `[' by
preceding them with a backslash, or, if they are part of
a shell command line, by enclosing them in quotes.
Between brackets these characters stand for themselves.
Thus, `[[?*\e]' matches the four characters `[', `?', `*' and `\e'.
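.PP
Complementation and the literal meaning of wildcard characters between
brackets can be sketched in Python as follows.
Note the assumption: Python's
.I fnmatch
does not implement backslash quoting, so the example only covers the
bracket form.
.nf
```python
from fnmatch import fnmatchcase

# Complement of the class holding ] a and -.
others = [c for c in "]a-bz" if fnmatchcase(c, "[!]a-]")]

# Between brackets, [ ? and * stand for themselves.
literals = [c for c in "[?*x" if fnmatchcase(c, "[[?*]")]

print(others)
print(literals)
```
.fi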
.SH PATHNAMES
Globbing is applied to each of the components of a pathname
separately. A `/' in a pathname cannot be matched by a `?' or `*'
wildcard, or by a range like `[.\-0]'. A range cannot contain an
explicit `/' character; this would lead to a syntax error.

If a filename starts with a `.', this character must be matched explicitly.
(Thus, `rm *' will not remove .profile, and `tar c *' will not
archive all your files; `tar c .' is better.)
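.PP
The component-wise matching and the leading-dot rule can be observed with
the standard Python
.I glob
module.
This is a sketch, not a description of any shell: the directory layout is
hypothetical, expand() is a helper invented for the example, and Python's
implementation differs from the shell in some details.
.nf
```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as top:
    # Hypothetical layout: two plain files, one dot file, one nested file.
    os.mkdir(os.path.join(top, "sub"))
    for rel in ["a.txt", "b.txt", ".profile", os.path.join("sub", "c.txt")]:
        open(os.path.join(top, rel), "w").close()

    def expand(pattern):
        """Expand pattern inside top and return sorted relative paths."""
        hits = glob.glob(os.path.join(glob.escape(top), pattern))
        return sorted(os.path.relpath(h, top) for h in hits)

    star = expand("*")      # the dot file and the nested file are skipped
    dot = expand(".*")      # the leading dot is matched explicitly
    nested = expand("*/*")  # each component is matched on its own

print(star)
print(dot)
print(nested)
```
.fi
.PP
The single `*' stops at the directory boundary, so the nested file only
shows up once the pattern has a second component.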
.SH "EMPTY LISTS"
The nice and simple rule given above: `expand a wildcard pattern
into the list of matching pathnames' was the original Unix
definition. It allowed one to have patterns that expand into
an empty list, as in
.br
.nf
xv \-wait 0 *.gif *.jpg
.fi
where perhaps no *.gif files are present (and this is not
an error).
However, POSIX requires that a wildcard pattern be left
unchanged when it is syntactically incorrect, or the list of
matching pathnames is empty.
With
.I bash
one can force the classical behaviour with
.IR "shopt \-s nullglob"
(very old versions used the setting
.IR allow_null_glob_expansion=true ).

(Similar problems occur elsewhere. E.g., where old scripts have
.br
.nf
rm `find . \-name "*~"`
.fi
new scripts require
.br
.nf
rm \-f nosuchfile `find . \-name "*~"`
.fi
to avoid error messages from
.I rm
called with an empty argument list.)
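.PP
The difference between the classical and the POSIX behaviour can be
sketched in Python.
Here glob.glob() returns an empty list when nothing matches, like the
original Unix definition, and the helper expand_posix() (a name invented
for this example) mimics the POSIX rule by handing back the pattern itself,
much as the GLOB_NOCHECK flag of
.BR glob (3)
does.
.nf
```python
import glob
import os
import tempfile

def expand_posix(pattern):
    """Return the sorted matches, or the pattern itself if there are none."""
    matches = sorted(glob.glob(pattern))
    return matches if matches else [pattern]

with tempfile.TemporaryDirectory() as top:
    # Hypothetical directory holding one JPEG and no GIF files.
    open(os.path.join(top, "photo.jpg"), "w").close()
    base = glob.escape(top)
    gif_pattern = os.path.join(base, "*.gif")
    jpg_pattern = os.path.join(base, "*.jpg")

    classic = glob.glob(gif_pattern)    # empty list, no error
    posix = expand_posix(gif_pattern)   # pattern left unchanged
    found = expand_posix(jpg_pattern)   # ordinary expansion

print(classic)
print(posix == [gif_pattern])
print(len(found))
```
.fi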
.SH NOTES
.SS "Regular expressions"
Note that wildcard patterns are not regular expressions,
although they are a bit similar. First of all, they match
filenames, rather than text, and secondly, the conventions
are not the same: e.g., in a regular expression `*' means zero or
more copies of the preceding item.

Now that regular expressions have bracket expressions where
the negation is indicated by a `^', POSIX has declared the
effect of a wildcard pattern `[^...]' to be undefined.
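.PP
The different meaning of `*' is easy to see when the same two characters
are read once as a wildcard pattern and once as a regular expression.
The sketch below uses the standard Python modules
.I fnmatch
and
.IR re ;
the sample strings are made up.
.nf
```python
import re
from fnmatch import fnmatchcase

names = ["", "a", "aaa", "abc"]

# Wildcard pattern: a literal a followed by any string.
as_glob = [n for n in names if fnmatchcase(n, "a*")]

# Regular expression: zero or more copies of a, and nothing else.
as_regex = [n for n in names if re.fullmatch("a*", n)]

print(as_glob)
print(as_regex)
```
.fi
.PP
The empty string satisfies only the regular expression, and `abc'
satisfies only the wildcard pattern.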
.SS "Character classes and Internationalization"
Of course ranges were originally meant to be ASCII ranges,
so that `[\ \-%]' stands for `[\ !"#$%]' and `[a\-z]' stands
for "any lowercase letter".
Some Unix implementations generalized this so that a range X\-Y
stands for the set of characters with code between the codes for
X and for Y. However, this requires the user to know the
character coding in use on the local system, and moreover, is
not convenient if the collating sequence for the local alphabet
differs from the ordering of the character codes.
Therefore, POSIX extended the bracket notation greatly,
both for wildcard patterns and for regular expressions.
In the above we saw three types of items that can occur in a bracket
expression: namely (i) the negation, (ii) explicit single characters,
and (iii) ranges. POSIX specifies ranges in an internationally
more useful way and adds three more types:

(iii) Ranges X\-Y comprise all characters that fall between X
and Y (inclusive) in the current collating sequence as defined
by the LC_COLLATE category in the current locale.

(iv) Named character classes, like
.br
.nf
[:alnum:]  [:alpha:]  [:blank:]  [:cntrl:]
[:digit:]  [:graph:]  [:lower:]  [:print:]
[:punct:]  [:space:]  [:upper:]  [:xdigit:]
.fi
so that one can say `[[:lower:]]' instead of `[a\-z]', and have
things work in Denmark, too, where there are three letters past `z'
in the alphabet.
These character classes are defined by the LC_CTYPE category
in the current locale.

(v) Collating symbols, like `[.ch.]' or `[.a-acute.]',
where the string between `[.' and `.]' is a collating
element defined for the current locale. Note that this may
be a multi-character element.

(vi) Equivalence class expressions, like `[=a=]',
where the string between `[=' and `=]' is any collating
element from its equivalence class, as defined for the
current locale. For example, `[[=a=]]' might be equivalent
to `[a\('a\(`a\(:a\(^a]', that is,
to `[a[.a-acute.][.a-grave.][.a-umlaut.][.a-circumflex.]]'.
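.PP
The problem with code-order ranges that motivates the named classes can
be sketched in Python.
Two assumptions apply: Python's
.I fnmatch
does not implement `[:lower:]' or the other POSIX bracket items, so
str.islower() is used here as a rough stand-in, and that method follows
Unicode properties rather than the LC_CTYPE category of the locale.
.nf
```python
from fnmatch import fnmatchcase

# A code-order range: every character whose code lies between the ends.
space_to_percent = [chr(c) for c in range(ord(" "), ord("%") + 1)]

# The three Danish letters past z, built from their code points
# (ae ligature, o with stroke, a with ring above).
danish = [chr(0xE6), chr(0xF8), chr(0xE5)]

# The code-order range a-z misses all three of them ...
in_range = [c for c in danish if fnmatchcase(c, "[a-z]")]

# ... although each one is a lowercase letter.
lowercase = [c for c in danish if c.islower()]

print(space_to_percent)
print(in_range)
print(len(lowercase))
```
.fi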
.SH "SEE ALSO"
.BR sh (1),
.BR fnmatch (3),
.BR glob (3),
.BR locale (7),
.BR regex (7)