<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML
><HEAD
><TITLE
>Wildcards</TITLE
><META
NAME="GENERATOR"
CONTENT="Modular DocBook HTML Stylesheet Version 1.7"><LINK
REL="HOME"
TITLE="GNU/Linux Command-Line Tools Summary"
HREF="index.html"><LINK
REL="UP"
TITLE="Mini-Guides"
HREF="mini-guides.html"><LINK
REL="PREVIOUS"
TITLE="Duplicating disks"
HREF="duplicating-disks.html"><LINK
REL="NEXT"
TITLE="Appendix"
HREF="a12264.html"></HEAD
><BODY
CLASS="SECT1"
BGCOLOR="#FFFFFF"
TEXT="#000000"
LINK="#0000FF"
VLINK="#840084"
ALINK="#0000FF"
><DIV
CLASS="NAVHEADER"
><TABLE
SUMMARY="Header navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TH
COLSPAN="3"
ALIGN="center"
>GNU/Linux Command-Line Tools Summary</TH
></TR
><TR
><TD
WIDTH="10%"
ALIGN="left"
VALIGN="bottom"
><A
HREF="duplicating-disks.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="80%"
ALIGN="center"
VALIGN="bottom"
>Chapter 20. Mini-Guides</TD
><TD
WIDTH="10%"
ALIGN="right"
VALIGN="bottom"
><A
HREF="a12264.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
></TABLE
><HR
ALIGN="LEFT"
WIDTH="100%"></DIV
><DIV
CLASS="SECT1"
><H1
CLASS="SECT1"
><A
NAME="WILDCARDS"
></A
>20.4. Wildcards</H1
><P
>Wildcards let a command act on more than one file at a time, or find part of a phrase in a text file. On a GNU/Linux system they come in two major forms. Globbing patterns (also called standard wildcards) are expanded by the shell to match file names. Regular expressions are used by many other commands for searching and manipulating text.</P
><DIV
CLASS="TIP"
><P
></P
><TABLE
CLASS="TIP"
WIDTH="100%"
BORDER="0"
><TR
><TD
WIDTH="25"
ALIGN="CENTER"
VALIGN="TOP"
><IMG
SRC="../images/tip.gif"
HSPACE="5"
ALT="Tip"></TD
><TH
ALIGN="LEFT"
VALIGN="CENTER"
><B
>Tip</B
></TH
></TR
><TR
><TD
>&nbsp;</TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
><P
>
  If a file name contains wildcard characters, you can stop bash from expanding them by enclosing the name in single quotes, by putting a backslash (escape character) before each special character, or both. </P
><P
>For example, to create a file called 'fo*' (fo followed by an asterisk) you would type the following. Avoid giving real files names like this; it is only an example:
  <TABLE
BORDER="1"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="SCREEN"
>  touch 'fo*'
  </PRE
></FONT
></TD
></TR
></TABLE
>
  </P
></TD
></TR
></TABLE
></DIV
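><P
>The quoting rules in the tip above can be tried out safely with a sketch like the following. It assumes bash and a throwaway directory created with mktemp; the file names foo and fox are made up for illustration:</P
><TABLE
BORDER="1"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="SCREEN"
>dir=$(mktemp -d)      # scratch directory, so no real files are touched
cd "$dir" || exit 1
touch foo fox         # two ordinary files (hypothetical names)
touch 'fo*'           # quoted: creates one file literally named fo*
ls                    # lists three names: fo*, foo and fox
echo fo*              # unquoted: the shell expands it to all three names
echo 'fo*'            # single quotes: prints fo* unchanged
echo fo\*             # backslash: also prints fo* unchanged</PRE
></FONT
></TD
></TR
></TABLE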
><P
>Note that parts of both subsections on wildcards are based on the grep manual and info pages. Please see the <A
HREF="references.html"
><I
>Bibliography</I
></A
> for further information.</P
><DIV
CLASS="SECT2"
><H2
CLASS="SECT2"
><A
NAME="STANDARD-WILDCARDS"
></A
>20.4.1. Standard Wildcards (globbing patterns)</H2
><P
>Standard wildcards (also known as globbing patterns) are used by various command-line utilities to work with multiple files. For more information on standard wildcards (globbing patterns) refer to the manual page by typing:</P
><TABLE
BORDER="1"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="SCREEN"
>man 7 glob</PRE
></FONT
></TD
></TR
></TABLE
><DIV
CLASS="NOTE"
><P
></P
><TABLE
CLASS="NOTE"
WIDTH="100%"
BORDER="0"
><TR
><TD
WIDTH="25"
ALIGN="CENTER"
VALIGN="TOP"
><IMG
SRC="../images/note.gif"
HSPACE="5"
ALT="Note"></TD
><TH
ALIGN="LEFT"
VALIGN="CENTER"
><B
>Can be used by</B
></TH
></TR
><TR
><TD
>&nbsp;</TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
><P
>
  Standard wildcards are expanded by the shell before the command runs, so they work with nearly any command that takes file names (including mv, cp, rm and many others).
  </P
></TD
></TR
></TABLE
></DIV
><P
></P
><DIV
CLASS="VARIABLELIST"
><DL
><DT
>? (question mark)</DT
><DD
><P
>this can represent any <EM
>single </EM
>character. If you typed "hd?" at the command line, the shell would match hda, hdb, hdc and any other existing file name made of "hd" followed by exactly one character.</P
></DD
><DT
>* (asterisk)</DT
><DD
><P
>this can represent any number of characters (including zero, in other words, zero or more characters). If you specified a "cd*" it would use "cda", "cdrom", "cdrecord" and <EM
>anything</EM
> that starts with &#8220;cd&#8221;, including &#8220;cd&#8221; itself. "m*l" could be mill, mull, ml, and anything else that starts with an m and ends with an l.</P
></DD
><DT
>[ ] (square brackets)</DT
><DD
><P
>specifies a set or range of characters, exactly one of which must match. m[aou]m can become mam, mom or mum. Do not separate the characters with commas: inside the brackets a comma is just another character to match. m[a-d]m can become anything that starts and ends with m and has a single character from a to d in between, for example mam, mbm, mcm or mdm. This kind of wildcard specifies an &#8220;or&#8221; relationship (you only need one to match).</P
></DD
><DT
>{ } (curly brackets)</DT
><DD
><P
>terms are separated by commas and each term must be a name or a wildcard pattern. The shell expands the braces into one word per term, so the command acts on anything that matches any of the patterns or exact names (an &#8220;or&#8221; relationship, one or the other).</P
><P
>For example, this would be valid:</P
><TABLE
BORDER="1"
BGCOLOR="#E0E0E0"
WIDTH="90%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="SCREEN"
>cp {*.doc,*.pdf} ~</PRE
></FONT
></TD
></TR
></TABLE
><P
>This will copy anything ending with .doc or .pdf to the user's home directory. Note that spaces are not allowed after the commas (or anywhere else inside the braces).</P
></DD
><DT
>[!]</DT
><DD
><P
>This construct is similar to the [ ] construct, except that rather than matching any of the characters inside the brackets, it matches any single character that is not listed between the [ and ]. This is a logical NOT. For example<EM
> rm myfile[!9]</EM
> will remove every file named myfile followed by exactly one character (myfile1, myfile2 etc.) but won't remove myfile9.</P
></DD
><DT
>\ (backslash)</DT
><DD
><P
>is used as an "escape" character, i.e. to protect a subsequent special character. Thus, "\\" matches a literal backslash. Note that you may need to use quotation marks as well as backslashes.</P
></DD
></DL
></DIV
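><P
>The patterns described above can be compared side by side with a sketch like this one. It assumes bash (brace expansion is not available in every shell) and uses made-up file names in a scratch directory; echo only prints what each pattern expands to, so nothing is copied or removed:</P
><TABLE
BORDER="1"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="SCREEN"
>dir=$(mktemp -d)        # scratch directory
cd "$dir" || exit 1
touch mam mbm mom mum mzm ml mill a.doc b.pdf c.txt   # hypothetical files
echo m?m                # exactly one character: mam mbm mom mum mzm
echo m*l                # zero or more characters: mill ml
echo m[aou]m            # one of a, o or u: mam mom mum
echo m[a-d]m            # one character from a to d: mam mbm
echo m[!a-d]m           # one character NOT from a to d: mom mum mzm
echo {*.doc,*.pdf}      # braces first, then globbing: a.doc b.pdf</PRE
></FONT
></TD
></TR
></TABLE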
></DIV
><DIV
CLASS="SECT2"
><H2
CLASS="SECT2"
><A
NAME="REGULAR-EXPRESSIONS"
></A
>20.4.2. Regular Expressions</H2
><P
>Regular expressions are a pattern language used when working with text; they look similar to globbing patterns but follow different rules. They are used for searching and manipulating text, both by command-line tools and by various programming languages. For more information on regular expressions refer to the manual page or try an online tutorial, for example IBM Developerworks <A
HREF="https://www6.software.ibm.com/developerworks/education/l-regexp/index.html"
TARGET="_top"
>using regular expressions</A
>. For the manual page, type:</P
><TABLE
BORDER="1"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="SCREEN"
>man 7 regex</PRE
></FONT
></TD
></TR
></TABLE
><DIV
CLASS="NOTE"
><P
></P
><TABLE
CLASS="NOTE"
WIDTH="100%"
BORDER="0"
><TR
><TD
WIDTH="25"
ALIGN="CENTER"
VALIGN="TOP"
><IMG
SRC="../images/note.gif"
HSPACE="5"
ALT="Note"></TD
><TH
ALIGN="LEFT"
VALIGN="CENTER"
><B
>Regular expressions can be used by</B
></TH
></TR
><TR
><TD
>&nbsp;</TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
><P
>
  Regular expressions are used by <EM
>grep</EM
>, and can also be used by <EM
>find </EM
>and many other programs.
  </P
></TD
></TR
></TABLE
></DIV
><DIV
CLASS="TIP"
><P
></P
><TABLE
CLASS="TIP"
WIDTH="100%"
BORDER="0"
><TR
><TD
WIDTH="25"
ALIGN="CENTER"
VALIGN="TOP"
><IMG
SRC="../images/tip.gif"
HSPACE="5"
ALT="Tip"></TD
><TH
ALIGN="LEFT"
VALIGN="CENTER"
><B
>Tip</B
></TH
></TR
><TR
><TD
>&nbsp;</TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
><P
>If your regular expressions don't seem to be working, the shell is probably interpreting some of the special characters itself. Enclose the whole expression in single quotation marks, and put a backslash before any special character that should be matched literally.</P
></TD
></TR
></TABLE
></DIV
><P
></P
><DIV
CLASS="VARIABLELIST"
><DL
><DT
>. (dot)</DT
><DD
><P
>will match <EM
>any single character</EM
>, equivalent to ? (question mark) in standard wildcard expressions. Thus, "m.a" matches "mpa" and "mea" but not "ma" or "mppa". </P
></DD
><DT
>\ (backslash)</DT
><DD
><P
>is used as an "escape" character, i.e. to protect a subsequent special character. Thus, "\\" searches for a backslash. Note you may need to use quotation marks and backslash(es).</P
></DD
><DT
>.* (dot and asterisk)</DT
><DD
><P
>is used to match any string, equivalent to * in standard wildcards.</P
></DD
><DT
>* (asterisk)</DT
><DD
><P
>the preceding item is to be matched <EM
> zero or more</EM
> times, e.g. n* will match n, nn, nnnn, nnnnnnn and also the empty string, but it will not match an a or any other character.</P
></DD
><DT
>^ (caret)</DT
><DD
><P
>means "the beginning of the line". So "^a" means find a line starting with an "a".</P
></DD
><DT
>$ (dollar sign)</DT
><DD
><P
>means "the end of the line". So "a$" means find a line ending with an "a".</P
><P
>For example, this command searches the file myfile for lines starting with an "s" and ending with an "n", and prints them to the standard output (screen):</P
><TABLE
BORDER="1"
BGCOLOR="#E0E0E0"
WIDTH="90%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="SCREEN"
>grep '^s.*n$' myfile</PRE
></FONT
></TD
></TR
></TABLE
></DD
><DT
>[ ] (square brackets)</DT
><DD
><P
>specifies a set or range of characters, exactly one of which must match. m[aou]m can match mam, mom or mum. Do not separate the characters with commas: inside the brackets a comma is just another character to match. m[a-d]m can match anything that starts and ends with m and has a single character from a to d in between, for example mam, mbm, mcm or mdm. This kind of wildcard specifies an &#8220;or&#8221; relationship (you only need one to match).</P
></DD
><DT
>|</DT
><DD
><P
>This operator makes a logical OR relationship between two regular expressions, so you can search for something or something else. Quote the expression so that the shell does not interpret the | as a pipe. With grep's default basic syntax the operator is written \| (a GNU extension); with extended syntax (grep -E) it is a plain |.</P
></DD
><DT
>[^]</DT
><DD
><P
>This is the equivalent of [!] in standard wildcards. This performs a logical &#8220;not&#8221;: it matches any single character that is not listed within the square brackets. For example,<EM
> grep 'myfile[^9]'</EM
> will print lines containing myfile followed by some other character (myfile1, myfile2 etc.) but will not match myfile9.</P
></DD
></DL
></DIV
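><P
>As a rough illustration of the operators listed above, the sketch below feeds a handful of made-up words to grep, one per line. The helper function name words and the word list are assumptions chosen for the example; each comment gives the lines that grep prints:</P
><TABLE
BORDER="1"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="SCREEN"
># Hypothetical sample input: one word per line.
words() { printf '%s\n' sun son spoon moon mam mbm mzm nn na; }

words | grep '^s.*n$'            # starts with s, ends with n: sun, son, spoon
words | grep 'm[a-d]m'           # a to d between two m's: mam, mbm
words | grep 'm[^a-d]m'          # anything but a to d in between: mzm
words | grep '^n*$'              # nothing but n on the whole line: nn
words | grep -E '^(sun|moon)$'   # extended syntax, either word: sun, moon</PRE
></FONT
></TD
></TR
></TABLE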
></DIV
><DIV
CLASS="SECT2"
><H2
CLASS="SECT2"
><A
NAME="USEFUL-POSIX-CHARACTERS"
></A
>20.4.3. Useful categories of characters (as defined by the POSIX standard)</H2
><P
>This information has been taken from the grep info page with a tiny amount of editing, see [10] in the <A
HREF="references.html"
><I
>Bibliography</I
></A
> for further information.</P
><P
></P
><UL
><LI
><P
>[:upper:] uppercase letters</P
></LI
><LI
><P
>[:lower:] lowercase letters </P
></LI
><LI
><P
>[:alpha:] alphabetic (letters) meaning upper+lower (both uppercase and lowercase letters)</P
></LI
><LI
><P
>[:digit:] numbers in decimal, 0 to 9 </P
></LI
><LI
><P
>[:alnum:] alphanumeric meaning alpha+digits (any uppercase or lowercase letters or any decimal digits)</P
></LI
><LI
><P
>[:space:] whitespace meaning spaces, tabs, newlines and similar</P
></LI
><LI
><P
>[:graph:] graphically printable characters excluding space</P
></LI
><LI
><P
>[:print:] printable characters including space</P
></LI
><LI
><P
>[:punct:] punctuation characters meaning graphical characters minus alpha and digits</P
></LI
><LI
><P
>[:cntrl:] control characters meaning non-printable characters</P
></LI
><LI
><P
>[:xdigit:] characters that are hexadecimal digits. </P
></LI
></UL
><DIV
CLASS="NOTE"
><P
></P
><TABLE
CLASS="NOTE"
WIDTH="100%"
BORDER="0"
><TR
><TD
WIDTH="25"
ALIGN="CENTER"
VALIGN="TOP"
><IMG
SRC="../images/note.gif"
HSPACE="5"
ALT="Note"></TD
><TH
ALIGN="LEFT"
VALIGN="CENTER"
><B
>These are used with</B
></TH
></TR
><TR
><TD
>&nbsp;</TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
><P
>
The above character classes will work with most tools which work with text (for example: <EM
>tr</EM
>). </P
></TD
></TR
></TABLE
></DIV
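><P
>A small sketch of these classes in use with tr, assuming a made-up input string; the first command converts lowercase letters to uppercase and the second deletes the digits:</P
><TABLE
BORDER="1"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="SCREEN"
>printf 'Hello World 42\n' | tr '[:lower:]' '[:upper:]'   # HELLO WORLD 42
printf 'Hello World 42\n' | tr -d '[:digit:]'            # Hello World (plus a trailing space)</PRE
></FONT
></TD
></TR
></TABLE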
><P
>For example (advanced example), this command scans the output of the ls -l command, and prints lines containing a capital letter followed by a digit: </P
><TABLE
BORDER="1"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="SCREEN"
>ls -l | grep '[[:upper:]][[:digit:]]'</PRE
></FONT
></TD
></TR
></TABLE
><P
>The command greps for [upper_case_letter][any_digit], meaning any uppercase letter followed by any digit. If you remove the "][" in the middle, the expression becomes '[[:upper:][:digit:]]', a single bracket expression [upper_case_letter any_digit] that matches one character which is either an uppercase letter or a digit.</P
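><P
>The difference between the two forms can be seen with a sketch like the following, which uses a few made-up input lines in place of the ls -l output (the function name sample is an assumption for the example):</P
><TABLE
BORDER="1"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="SCREEN"
>sample() { printf '%s\n' A1 B x7 abc; }   # hypothetical input lines
sample | grep '[[:upper:]][[:digit:]]'    # uppercase letter then digit: A1
sample | grep '[[:upper:][:digit:]]'      # uppercase letter or digit: A1, B, x7</PRE
></FONT
></TD
></TR
></TABLE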
></DIV
></DIV
><DIV
CLASS="NAVFOOTER"
><HR
ALIGN="LEFT"
WIDTH="100%"><TABLE
SUMMARY="Footer navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
><A
HREF="duplicating-disks.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
><A
HREF="index.html"
ACCESSKEY="H"
>Home</A
></TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
><A
HREF="a12264.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
>Duplicating disks</TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
><A
HREF="mini-guides.html"
ACCESSKEY="U"
>Up</A
></TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
>Appendix</TD
></TR
></TABLE
></DIV
></BODY
></HTML
>