<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
|
|
<HTML
|
|
><HEAD
|
|
><TITLE
|
|
>Text Processing Commands</TITLE
|
|
><META
|
|
NAME="GENERATOR"
|
|
CONTENT="Modular DocBook HTML Stylesheet Version 1.7"><LINK
|
|
REL="HOME"
|
|
TITLE="Advanced Bash-Scripting Guide"
|
|
HREF="index.html"><LINK
|
|
REL="UP"
|
|
TITLE="External Filters, Programs and Commands"
|
|
HREF="external.html"><LINK
|
|
REL="PREVIOUS"
|
|
TITLE="Time / Date Commands"
|
|
HREF="timedate.html"><LINK
|
|
REL="NEXT"
|
|
TITLE="File and Archiving Commands"
|
|
HREF="filearchiv.html"></HEAD
|
|
><BODY
|
|
CLASS="SECT1"
|
|
BGCOLOR="#FFFFFF"
|
|
TEXT="#000000"
|
|
LINK="#0000FF"
|
|
VLINK="#840084"
|
|
ALINK="#0000FF"
|
|
><DIV
|
|
CLASS="NAVHEADER"
|
|
><TABLE
|
|
SUMMARY="Header navigation table"
|
|
WIDTH="100%"
|
|
BORDER="0"
|
|
CELLPADDING="0"
|
|
CELLSPACING="0"
|
|
><TR
|
|
><TH
|
|
COLSPAN="3"
|
|
ALIGN="center"
|
|
>Advanced Bash-Scripting Guide: </TH
|
|
></TR
|
|
><TR
|
|
><TD
|
|
WIDTH="10%"
|
|
ALIGN="left"
|
|
VALIGN="bottom"
|
|
><A
|
|
HREF="timedate.html"
|
|
ACCESSKEY="P"
|
|
>Prev</A
|
|
></TD
|
|
><TD
|
|
WIDTH="80%"
|
|
ALIGN="center"
|
|
VALIGN="bottom"
|
|
>Chapter 16. External Filters, Programs and Commands</TD
|
|
><TD
|
|
WIDTH="10%"
|
|
ALIGN="right"
|
|
VALIGN="bottom"
|
|
><A
|
|
HREF="filearchiv.html"
|
|
ACCESSKEY="N"
|
|
>Next</A
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
><HR
|
|
ALIGN="LEFT"
|
|
WIDTH="100%"></DIV
|
|
><DIV
|
|
CLASS="SECT1"
|
|
><H1
|
|
CLASS="SECT1"
|
|
><A
|
|
NAME="TEXTPROC"
|
|
></A
|
|
>16.4. Text Processing Commands</H1
|
|
><P
|
|
></P
|
|
><DIV
|
|
CLASS="VARIABLELIST"
|
|
><P
|
|
><B
|
|
><A
|
|
NAME="TPCOMMANDLISTING1"
|
|
></A
|
|
>Commands affecting text and
|
|
text files</B
|
|
></P
|
|
><DL
|
|
><DT
|
|
><A
|
|
NAME="SORTREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>sort</B
|
|
></DT
|
|
><DD
|
|
><P
|
|
>File sort utility, often used as a filter in a pipe. This
|
|
command sorts a <I
|
|
CLASS="FIRSTTERM"
|
|
>text stream</I
|
|
>
|
|
or file forwards or backwards, or according to various
|
|
keys or character positions. Using the <TT
|
|
CLASS="OPTION"
|
|
>-m</TT
|
|
>
|
|
option, it merges presorted input files. The <I
|
|
CLASS="FIRSTTERM"
|
|
>info
|
|
page</I
|
|
> lists its many capabilities and options. See
|
|
<A
|
|
HREF="loops1.html#FINDSTRING"
|
|
>Example 11-10</A
|
|
>, <A
|
|
HREF="loops1.html#SYMLINKS"
|
|
>Example 11-11</A
|
|
>,
|
|
and <A
|
|
HREF="contributed-scripts.html#MAKEDICT"
|
|
>Example A-8</A
|
|
>.</P
|
|
></DD
|
|
><DT
|
|
><A
|
|
NAME="TSORTREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>tsort</B
|
|
></DT
|
|
><DD
|
|
><P
|
|
><I
|
|
CLASS="FIRSTTERM"
|
|
>Topological sort</I
|
|
>, reading in
pairs of whitespace-separated strings, each pair stating that its
first item must precede its second, and printing an ordering
consistent with all the pairs. The original purpose of
|
|
<B
|
|
CLASS="COMMAND"
|
|
>tsort</B
|
|
> was to sort a list of dependencies
|
|
for an obsolete version of the <I
|
|
CLASS="FIRSTTERM"
|
|
>ld</I
|
|
>
|
|
linker in an <SPAN
|
|
CLASS="QUOTE"
|
|
>"ancient"</SPAN
|
|
> version of UNIX.</P
|
|
><P
|
|
>The results of a <I
|
|
CLASS="FIRSTTERM"
|
|
>tsort</I
|
|
> will usually
|
|
differ markedly from those of the standard
|
|
<B
|
|
CLASS="COMMAND"
|
|
>sort</B
|
|
> command, above.</P
|
|
></DD
|
|
><DT
|
|
><A
|
|
NAME="UNIQREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>uniq</B
|
|
></DT
|
|
><DD
|
|
><P
|
|
>This filter removes adjacent duplicate lines, which is why
its input is normally sorted first. It is often seen in a pipe coupled with
|
|
<A
|
|
HREF="textproc.html#SORTREF"
|
|
>sort</A
|
|
>.</P
|
|
><P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>cat list-1 list-2 list-3 | sort | uniq > final.list
|
|
# Concatenates the list files,
|
|
# sorts them,
|
|
# removes duplicate lines,
|
|
# and finally writes the result to an output file.</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></P
|
|
><P
|
|
>The useful <TT
|
|
CLASS="OPTION"
|
|
>-c</TT
|
|
> option prefixes each distinct line of
the output with the number of times it occurred consecutively in the input.</P
|
|
><P
|
|
> <TABLE
|
|
BORDER="1"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="SCREEN"
|
|
><TT
|
|
CLASS="PROMPT"
|
|
>bash$ </TT
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>cat testfile</B
|
|
></TT
|
|
>
|
|
<TT
|
|
CLASS="COMPUTEROUTPUT"
|
|
>This line occurs only once.
|
|
This line occurs twice.
|
|
This line occurs twice.
|
|
This line occurs three times.
|
|
This line occurs three times.
|
|
This line occurs three times.</TT
|
|
>
|
|
|
|
|
|
<TT
|
|
CLASS="PROMPT"
|
|
>bash$ </TT
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>uniq -c testfile</B
|
|
></TT
|
|
>
|
|
<TT
|
|
CLASS="COMPUTEROUTPUT"
|
|
> 1 This line occurs only once.
|
|
2 This line occurs twice.
|
|
3 This line occurs three times.</TT
|
|
>
|
|
|
|
|
|
<TT
|
|
CLASS="PROMPT"
|
|
>bash$ </TT
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>sort testfile | uniq -c | sort -nr</B
|
|
></TT
|
|
>
|
|
<TT
|
|
CLASS="COMPUTEROUTPUT"
|
|
> 3 This line occurs three times.
|
|
2 This line occurs twice.
|
|
1 This line occurs only once.</TT
|
|
>
|
|
</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
>
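The reason for sorting first is that uniq compares only neighboring lines. The following is a minimal sketch of that behavior, using made-up input and a hypothetical helper named dedupe (neither appears elsewhere in this guide).

```shell
#!/bin/bash
# uniq collapses only *adjacent* duplicates, so unsorted input keeps its repeats.

# dedupe: hypothetical helper; sorts stdin, then removes duplicate lines.
dedupe () {
  LC_ALL=C sort | uniq
}

printf '%s\n' apple pear apple | uniq      # apple, pear, apple: nothing is removed.
printf '%s\n' apple pear apple | dedupe    # apple, pear
```

When no counts are needed, sort -u gives the same result as the sort | uniq pair in a single command.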
|
|
</P
|
|
><P
|
|
>The <TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>sort INPUTFILE | uniq -c | sort -nr</B
|
|
></TT
|
|
>
|
|
command string produces a <I
|
|
CLASS="FIRSTTERM"
|
|
>frequency
|
|
of occurrence</I
|
|
> listing on the
|
|
<TT
|
|
CLASS="FILENAME"
|
|
>INPUTFILE</TT
|
|
> file (the
|
|
<TT
|
|
CLASS="OPTION"
|
|
>-nr</TT
|
|
> options to <B
|
|
CLASS="COMMAND"
|
|
>sort</B
|
|
>
|
|
cause a reverse numerical sort). This template finds
|
|
use in analysis of log files and dictionary lists, and
|
|
wherever the lexical structure of a document needs to
|
|
be examined.</P
|
|
><DIV
|
|
CLASS="EXAMPLE"
|
|
><A
|
|
NAME="WF"
|
|
></A
|
|
><P
|
|
><B
|
|
>Example 16-12. Word Frequency Analysis</B
|
|
></P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>#!/bin/bash
|
|
# wf.sh: Crude word frequency analysis on a text file.
|
|
# This is a more efficient version of the "wf2.sh" script.
|
|
|
|
|
|
# Check for input file on command-line.
|
|
ARGS=1
|
|
E_BADARGS=85
|
|
E_NOFILE=86
|
|
|
|
if [ $# -ne "$ARGS" ] # Correct number of arguments passed to script?
|
|
then
|
|
echo "Usage: `basename $0` filename"
|
|
exit $E_BADARGS
|
|
fi
|
|
|
|
if [ ! -f "$1" ] # Check if file exists.
|
|
then
|
|
echo "File \"$1\" does not exist."
|
|
exit $E_NOFILE
|
|
fi
|
|
|
|
|
|
|
|
########################################################
|
|
# main ()
|
|
sed -e 's/\.//g' -e 's/\,//g' -e 's/ /\
|
|
/g' "$1" | tr 'A-Z' 'a-z' | sort | uniq -c | sort -nr
|
|
# =========================
|
|
# Frequency of occurrence
|
|
|
|
# Filter out periods and commas, and
|
|
#+ change space between words to linefeed,
|
|
#+ then shift characters to lowercase, and
|
|
#+ finally prefix occurrence count and sort numerically.
|
|
|
|
# Arun Giridhar suggests modifying the above to:
|
|
# . . . | sort | uniq -c | sort +1 [-f] | sort +0 -nr
|
|
# This adds a secondary sort key, so instances of
|
|
#+ equal occurrence are sorted alphabetically.
|
|
# As he explains it:
|
|
# "This is effectively a radix sort, first on the
|
|
#+ least significant column
|
|
#+ (word or string, optionally case-insensitive)
|
|
#+ and last on the most significant column (frequency)."
|
|
#
|
|
# As Frank Wang explains, the above is equivalent to
|
|
#+ . . . | sort | uniq -c | sort +0 -nr
|
|
#+ and the following also works:
|
|
#+ . . . | sort | uniq -c | sort -k1nr -k
|
|
########################################################
|
|
|
|
exit 0
|
|
|
|
# Exercises:
|
|
# ---------
|
|
# 1) Add 'sed' commands to filter out other punctuation,
|
|
#+ such as semicolons.
|
|
# 2) Modify the script to also filter out multiple spaces and
|
|
#+ other whitespace.</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
|
|
><P
|
|
> <TABLE
|
|
BORDER="1"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="SCREEN"
|
|
><TT
|
|
CLASS="PROMPT"
|
|
>bash$ </TT
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>cat testfile</B
|
|
></TT
|
|
>
|
|
<TT
|
|
CLASS="COMPUTEROUTPUT"
|
|
>This line occurs only once.
|
|
This line occurs twice.
|
|
This line occurs twice.
|
|
This line occurs three times.
|
|
This line occurs three times.
|
|
This line occurs three times.</TT
|
|
>
|
|
|
|
|
|
<TT
|
|
CLASS="PROMPT"
|
|
>bash$ </TT
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>./wf.sh testfile</B
|
|
></TT
|
|
>
|
|
<TT
|
|
CLASS="COMPUTEROUTPUT"
|
|
> 6 this
|
|
6 occurs
|
|
6 line
|
|
3 times
|
|
3 three
|
|
2 twice
|
|
1 only
|
|
1 once</TT
|
|
>
|
|
</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
>
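The secondary sort key discussed in the script comments can also be written with the -k option, which is the portable spelling of the older +N key syntax. The sketch below is one possible variant, not the script author's method: the function name word_freq is hypothetical, and it splits words with tr rather than sed.

```shell
#!/bin/bash
# word_freq: hypothetical helper, not part of wf.sh.
# Prints "count word" pairs, highest count first; ties in alphabetical order.
word_freq () {
  tr -cs '[:alpha:]' '\n' < "$1" |   # Turn every run of non-letters into one newline.
    tr '[:upper:]' '[:lower:]' |     # Fold to lowercase.
    grep -v '^$' |                   # Drop the empty line a leading non-letter would leave.
    LC_ALL=C sort | uniq -c |        # Count identical words.
    LC_ALL=C sort -k1,1nr -k2        # Key 1: count, numeric, descending. Key 2: the word.
}
```

Invoked as word_freq testfile, this should list words with equal counts in alphabetical order instead of in whatever order the sort happens to leave them.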
|
|
</P
|
|
></DD
|
|
><DT
|
|
><A
|
|
NAME="EXPANDREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>expand</B
|
|
>, <B
|
|
CLASS="COMMAND"
|
|
>unexpand</B
|
|
></DT
|
|
><DD
|
|
><P
|
|
>The <B
|
|
CLASS="COMMAND"
|
|
>expand</B
|
|
> filter converts tabs to
|
|
spaces. It is often used in a <A
|
|
HREF="special-chars.html#PIPEREF"
|
|
>pipe</A
|
|
>.</P
|
|
><P
|
|
>The <B
|
|
CLASS="COMMAND"
|
|
>unexpand</B
|
|
> filter
|
|
converts spaces to tabs. This reverses the effect of
|
|
<B
|
|
CLASS="COMMAND"
|
|
>expand</B
|
|
>.</P
|
|
></DD
|
|
><DT
|
|
><A
|
|
NAME="CUTREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>cut</B
|
|
></DT
|
|
><DD
|
|
><P
|
|
>A tool for extracting <A
|
|
HREF="special-chars.html#FIELDREF"
|
|
>fields</A
|
|
> from files. It is similar
|
|
to the <TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>print $N</B
|
|
></TT
|
|
> command set in <A
|
|
HREF="awk.html#AWKREF"
|
|
>awk</A
|
|
>, but more limited. It may be
|
|
simpler to use <I
|
|
CLASS="FIRSTTERM"
|
|
>cut</I
|
|
> in a script than
|
|
<I
|
|
CLASS="FIRSTTERM"
|
|
>awk</I
|
|
>. Particularly important are the
|
|
<TT
|
|
CLASS="OPTION"
|
|
>-d</TT
|
|
> (delimiter) and <TT
|
|
CLASS="OPTION"
|
|
>-f</TT
|
|
>
|
|
(field specifier) options.</P
|
|
><P
|
|
>Using <B
|
|
CLASS="COMMAND"
|
|
>cut</B
|
|
> to obtain a listing of the
|
|
mounted filesystems:
|
|
<TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>cut -d ' ' -f1,2 /etc/mtab</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></P
|
|
><P
|
|
>Using <B
|
|
CLASS="COMMAND"
|
|
>cut</B
|
|
> to list the OS and kernel version:
|
|
<TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>uname -a | cut -d" " -f1,3,11,12</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></P
|
|
><P
|
|
>Using <B
|
|
CLASS="COMMAND"
|
|
>cut</B
|
|
> to extract message headers from
|
|
an e-mail folder:
|
|
|
|
<TABLE
|
|
BORDER="1"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="SCREEN"
|
|
><TT
|
|
CLASS="PROMPT"
|
|
>bash$ </TT
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>grep '^Subject:' read-messages | cut -c10-80</B
|
|
></TT
|
|
>
|
|
<TT
|
|
CLASS="COMPUTEROUTPUT"
|
|
>Re: Linux suitable for mission-critical apps?
|
|
MAKE MILLIONS WORKING AT HOME!!!
|
|
Spam complaint
|
|
Re: Spam complaint</TT
|
|
></PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
>
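One caveat with the -d and -f options: cut treats every single delimiter character as a field boundary, so a run of spaces yields empty fields. The sketch below uses a made-up input line to show the effect, along with two common workarounds.

```shell
#!/bin/bash
line='a  b c'    # Note the two spaces between "a" and "b".

printf '%s\n' "$line" | cut -d' ' -f2               # Prints an empty line: field 2 is empty.
printf '%s\n' "$line" | tr -s ' ' | cut -d' ' -f2   # Prints "b" after squeezing the spaces.
printf '%s\n' "$line" | awk '{ print $2 }'          # Prints "b": awk splits on runs of blanks.
```

This is worth keeping in mind when cutting fields out of column-aligned command output, where padding with multiple spaces is common.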
|
|
</P
|
|
><P
|
|
>Using <B
|
|
CLASS="COMMAND"
|
|
>cut</B
|
|
> to parse a file:
|
|
<TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
># List all the users in /etc/passwd.
|
|
|
|
FILENAME=/etc/passwd
|
|
|
|
for user in $(cut -d: -f1 $FILENAME)
|
|
do
|
|
echo $user
|
|
done
|
|
|
|
# Thanks, Oleg Philon for suggesting this.</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></P
|
|
><P
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>cut -d ' ' -f2,3 filename</B
|
|
></TT
|
|
> is equivalent to
|
|
<TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>awk -F'[ ]' '{ print $2, $3 }' filename</B
|
|
></TT
|
|
></P
|
|
><DIV
|
|
CLASS="NOTE"
|
|
><P
|
|
></P
|
|
><TABLE
|
|
CLASS="NOTE"
|
|
WIDTH="90%"
|
|
BORDER="0"
|
|
><TR
|
|
><TD
|
|
WIDTH="25"
|
|
ALIGN="CENTER"
|
|
VALIGN="TOP"
|
|
><IMG
|
|
SRC="../images/note.gif"
|
|
HSPACE="5"
|
|
ALT="Note"></TD
|
|
><TD
|
|
ALIGN="LEFT"
|
|
VALIGN="TOP"
|
|
><P
|
|
>It is even possible to specify a linefeed as a
|
|
delimiter. The trick is to actually embed a linefeed
|
|
(<B
|
|
CLASS="KEYCAP"
|
|
>RETURN</B
|
|
>) in the command sequence.</P
|
|
><P
|
|
> <TABLE
|
|
BORDER="1"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="SCREEN"
|
|
><TT
|
|
CLASS="PROMPT"
|
|
>bash$ </TT
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>cut -d'
|
|
' -f3,7,19 testfile</B
|
|
></TT
|
|
>
|
|
<TT
|
|
CLASS="COMPUTEROUTPUT"
|
|
>This is line 3 of testfile.
|
|
This is line 7 of testfile.
|
|
This is line 19 of testfile.</TT
|
|
>
|
|
</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
>
|
|
</P
|
|
><P
|
|
>Thank you, Jaka Kranjc, for pointing this out.</P
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
|
|
><P
|
|
>See also <A
|
|
HREF="mathc.html#BASE"
|
|
>Example 16-48</A
|
|
>.</P
|
|
></DD
|
|
><DT
|
|
><A
|
|
NAME="PASTEREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>paste</B
|
|
></DT
|
|
><DD
|
|
><P
|
|
>Tool for merging together different files into a single,
|
|
multi-column file. In combination with
|
|
<A
|
|
HREF="textproc.html#CUTREF"
|
|
>cut</A
|
|
>, useful for creating system log
|
|
files.
|
|
</P
|
|
><P
|
|
> <TABLE
|
|
BORDER="1"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="SCREEN"
|
|
><TT
|
|
CLASS="PROMPT"
|
|
>bash$ </TT
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>cat items</B
|
|
></TT
|
|
>
|
|
<TT
|
|
CLASS="COMPUTEROUTPUT"
|
|
>alphabet blocks
|
|
building blocks
|
|
cables</TT
|
|
>
|
|
|
|
<TT
|
|
CLASS="PROMPT"
|
|
>bash$ </TT
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>cat prices</B
|
|
></TT
|
|
>
|
|
<TT
|
|
CLASS="COMPUTEROUTPUT"
|
|
>$1.00/dozen
|
|
$2.50 ea.
|
|
$3.75</TT
|
|
>
|
|
|
|
<TT
|
|
CLASS="PROMPT"
|
|
>bash$ </TT
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>paste items prices</B
|
|
></TT
|
|
>
|
|
<TT
|
|
CLASS="COMPUTEROUTPUT"
|
|
>alphabet blocks $1.00/dozen
|
|
building blocks $2.50 ea.
|
|
cables $3.75</TT
|
|
></PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
>
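By default paste separates the columns with a tab. The -d option selects a different delimiter, and cut with the same delimiter can split the columns apart again. A minimal sketch, recreating the items and prices files above in a temporary directory (the workdir variable is only for illustration):

```shell
#!/bin/bash
workdir=$(mktemp -d)
printf '%s\n' 'alphabet blocks' 'building blocks' 'cables' > "$workdir/items"
printf '%s\n' '$1.00/dozen' '$2.50 ea.' '$3.75'            > "$workdir/prices"

paste -d: "$workdir/items" "$workdir/prices"                # item:price, one pair per line
paste -d: "$workdir/items" "$workdir/prices" | cut -d: -f2  # Recovers the prices column.

rm -r "$workdir"
```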
|
|
</P
|
|
></DD
|
|
><DT
|
|
><A
|
|
NAME="JOINREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>join</B
|
|
></DT
|
|
><DD
|
|
><P
|
|
>Consider this a special-purpose cousin of
|
|
<B
|
|
CLASS="COMMAND"
|
|
>paste</B
|
|
>. This powerful utility allows
|
|
merging two files in a meaningful fashion, which essentially
|
|
creates a simple version of a relational database.</P
|
|
><P
|
|
>The <B
|
|
CLASS="COMMAND"
|
|
>join</B
|
|
> command operates on
|
|
exactly two files, but pastes together only those lines
|
|
with a common tagged <A
|
|
HREF="special-chars.html#FIELDREF"
|
|
>field</A
|
|
>
|
|
(usually a numerical label), and writes the result to
|
|
<TT
|
|
CLASS="FILENAME"
|
|
>stdout</TT
|
|
>. The files to be joined should
|
|
be sorted according to the tagged field for the matchups
|
|
to work properly.</P
|
|
><P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>File: 1.data
|
|
|
|
100 Shoes
|
|
200 Laces
|
|
300 Socks</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></P
|
|
><P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>File: 2.data
|
|
|
|
100 $40.00
|
|
200 $1.00
|
|
300 $2.00</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></P
|
|
><P
|
|
> <TABLE
|
|
BORDER="1"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="SCREEN"
|
|
><TT
|
|
CLASS="PROMPT"
|
|
>bash$ </TT
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>join 1.data 2.data</B
|
|
></TT
|
|
>
|
|
<TT
|
|
CLASS="COMPUTEROUTPUT"
|
|
>File: 1.data 2.data
|
|
|
|
100 Shoes $40.00
|
|
200 Laces $1.00
|
|
300 Socks $2.00</TT
|
|
>
|
|
</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
>
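Since join expects both inputs to be sorted on the tagged field, unsorted data files are usually sorted first. The sketch below shows one way to do that, assuming the join field is the first field of each file; the helper name join_unsorted is hypothetical.

```shell
#!/bin/bash
# join_unsorted: hypothetical helper; joins two files on field 1
#+ after sorting temporary copies of them on that field.
join_unsorted () {
  local left right
  left=$(mktemp) right=$(mktemp)
  LC_ALL=C sort -k1,1 "$1" > "$left"
  LC_ALL=C sort -k1,1 "$2" > "$right"
  LC_ALL=C join "$left" "$right"   # Same locale as sort, so the orderings agree.
  rm -f "$left" "$right"
}
```

Running sort and join under the same locale matters, because join checks the ordering of its input using the current collation rules.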
|
|
</P
|
|
><DIV
|
|
CLASS="NOTE"
|
|
><P
|
|
></P
|
|
><TABLE
|
|
CLASS="NOTE"
|
|
WIDTH="90%"
|
|
BORDER="0"
|
|
><TR
|
|
><TD
|
|
WIDTH="25"
|
|
ALIGN="CENTER"
|
|
VALIGN="TOP"
|
|
><IMG
|
|
SRC="../images/note.gif"
|
|
HSPACE="5"
|
|
ALT="Note"></TD
|
|
><TD
|
|
ALIGN="LEFT"
|
|
VALIGN="TOP"
|
|
><P
|
|
>The tagged field appears only once in the
|
|
output.</P
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
|
|
></DD
|
|
><DT
|
|
><A
|
|
NAME="HEADREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>head</B
|
|
></DT
|
|
><DD
|
|
><P
|
|
>Lists the beginning of a file to <TT
|
|
CLASS="FILENAME"
|
|
>stdout</TT
|
|
>.
|
|
The default is <TT
|
|
CLASS="LITERAL"
|
|
>10</TT
|
|
> lines, but a different
|
|
number can be specified. The command has a number of
|
|
interesting options.
|
|
|
|
<DIV
|
|
CLASS="EXAMPLE"
|
|
><A
|
|
NAME="SCRIPTDETECTOR"
|
|
></A
|
|
><P
|
|
><B
|
|
>Example 16-13. Which files are scripts?</B
|
|
></P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>#!/bin/bash
|
|
# script-detector.sh: Detects scripts within a directory.
|
|
|
|
TESTCHARS=2 # Test first 2 characters.
|
|
SHABANG='#!' # Scripts begin with a "sha-bang."
|
|
|
|
for file in * # Traverse all the files in current directory.
|
|
do
|
|
if [[ `head -c$TESTCHARS "$file"` = "$SHABANG" ]]
|
|
# head -c2 #!
|
|
# The '-c' option to "head" outputs a specified
|
|
#+ number of characters, rather than lines (the default).
|
|
then
|
|
echo "File \"$file\" is a script."
|
|
else
|
|
echo "File \"$file\" is *not* a script."
|
|
fi
|
|
done
|
|
|
|
exit 0
|
|
|
|
# Exercises:
|
|
# ---------
|
|
# 1) Modify this script to take as an optional argument
|
|
#+ the directory to scan for scripts
|
|
#+ (rather than just the current working directory).
|
|
#
|
|
# 2) As it stands, this script gives "false positives" for
|
|
#+ Perl, awk, and other scripting language scripts.
|
|
# Correct this.</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
|
|
>
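Where the test above reads two characters with head -c, reading the whole first line with head -n 1 makes it possible to look at the interpreter named after the sha-bang. The following is a crude sketch under that assumption, meant for text files only; the function name is_bash_script is hypothetical and the pattern is deliberately loose.

```shell
#!/bin/bash
# is_bash_script: hypothetical helper.
# Succeeds if the first line of file $1 is a sha-bang line that mentions "bash".
is_bash_script () {
  case "$(head -n 1 "$1")" in
    '#!'*bash*) return 0 ;;   # e.g. "#!/bin/bash" or "#!/usr/bin/env bash"
    *)          return 1 ;;
  esac
}
```

A typical use would be: if is_bash_script "$file"; then echo "$file is a Bash script."; fi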
|
|
|
|
<DIV
|
|
CLASS="EXAMPLE"
|
|
><A
|
|
NAME="RND"
|
|
></A
|
|
><P
|
|
><B
|
|
>Example 16-14. Generating 10-digit random numbers</B
|
|
></P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>#!/bin/bash
|
|
# rnd.sh: Outputs a 10-digit random number
|
|
|
|
# Script by Stephane Chazelas.
|
|
|
|
head -c4 /dev/urandom | od -N4 -tu4 | sed -ne '1s/.* //p'
|
|
|
|
|
|
# =================================================================== #
|
|
|
|
# Analysis
|
|
# --------
|
|
|
|
# head:
|
|
# -c4 option takes first 4 bytes.
|
|
|
|
# od:
|
|
# -N4 option limits output to 4 bytes.
|
|
# -tu4 option selects unsigned decimal format for output.
|
|
|
|
# sed:
|
|
# -n option, in combination with "p" flag to the "s" command,
|
|
# outputs only matched lines.
|
|
|
|
|
|
|
|
# The author of this script explains the action of 'sed', as follows.
|
|
|
|
# head -c4 /dev/urandom | od -N4 -tu4 | sed -ne '1s/.* //p'
|
|
# ----------------------------------> |
|
|
|
|
# Assume output up to "sed" --------> |
|
|
# is 0000000 1198195154\n
|
|
|
|
# sed begins reading characters: 0000000 1198195154\n.
|
|
# Here it finds a newline character,
|
|
#+ so it is ready to process the first line (0000000 1198195154).
|
|
# It looks at its <range><action>s. The first and only one is
|
|
|
|
# range action
|
|
# 1 s/.* //p
|
|
|
|
# The line number is in the range, so it executes the action:
|
|
#+ tries to substitute the longest string ending with a space in the line
|
|
# ("0000000 ") with nothing (//), and if it succeeds, prints the result
|
|
# ("p" is a flag to the "s" command here, this is different
|
|
#+ from the "p" command).
|
|
|
|
# sed is now ready to continue reading its input. (Note that before
|
|
#+ continuing, if -n option had not been passed, sed would have printed
|
|
#+ the line once again).
|
|
|
|
# Now, sed reads the remainder of the characters, and finds the
|
|
#+ end of the file.
|
|
# It is now ready to process its 2nd line (which is also numbered '$' as
|
|
#+ it's the last one).
|
|
# It sees it is not matched by any <range>, so its job is done.
|
|
|
|
# In a few words, this sed command means:
|
|
# "On the first line only, remove any character up to the right-most space,
|
|
#+ then print it."
|
|
|
|
# A better way to do this would have been:
|
|
# sed -e 's/.* //;q'
|
|
|
|
# Here, two <range><action>s (could have been written
|
|
# sed -e 's/.* //' -e q):
|
|
|
|
# range action
|
|
# nothing (matches line) s/.* //
|
|
# nothing (matches line) q (quit)
|
|
|
|
# Here, sed only reads its first line of input.
|
|
# It performs both actions, and prints the line (substituted) before
|
|
#+ quitting (because of the "q" action) since the "-n" option is not passed.
|
|
|
|
# =================================================================== #
|
|
|
|
# An even simpler alternative to the above one-line script would be:
|
|
# head -c4 /dev/urandom| od -An -tu4
|
|
|
|
exit</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
|
|
>
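The simpler od -An form mentioned at the end of the script still pads its output with blanks. A minimal sketch that wraps it in a function and strips the padding follows; the name rand_u32 is hypothetical, and the result is an unsigned 32-bit value, so it has at most 10 digits rather than exactly 10.

```shell
#!/bin/bash
# rand_u32: hypothetical helper.
# Prints one unsigned 32-bit integer read from /dev/urandom.
rand_u32 () {
  head -c4 /dev/urandom |   # Take 4 random bytes.
    od -An -tu4 |           # Print them as one unsigned decimal, without the offset column.
    tr -d ' \n'             # Remove od's padding blanks and the trailing newline.
}
```

The value can then be used in arithmetic, for example echo $(( $(rand_u32) % 100 )) for a number between 0 and 99.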
|
|
|
|
See also <A
|
|
HREF="filearchiv.html#EX52"
|
|
>Example 16-39</A
|
|
>.</P
|
|
></DD
|
|
><DT
|
|
><A
|
|
NAME="TAILREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>tail</B
|
|
></DT
|
|
><DD
|
|
><P
|
|
>Lists the (tail) end of a file to <TT
|
|
CLASS="FILENAME"
|
|
>stdout</TT
|
|
>.
|
|
The default is <TT
|
|
CLASS="LITERAL"
|
|
>10</TT
|
|
> lines, but this can
|
|
be changed with the <TT
|
|
CLASS="OPTION"
|
|
>-n</TT
|
|
> option.
|
|
Commonly used to keep track of
|
|
changes to a system logfile, using the <TT
|
|
CLASS="OPTION"
|
|
>-f</TT
|
|
>
|
|
option, which outputs lines appended to the file.</P
|
|
><DIV
|
|
CLASS="EXAMPLE"
|
|
><A
|
|
NAME="EX12"
|
|
></A
|
|
><P
|
|
><B
|
|
>Example 16-15. Using <I
|
|
CLASS="FIRSTTERM"
|
|
>tail</I
|
|
> to monitor the system log</B
|
|
></P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>#!/bin/bash
|
|
|
|
filename=sys.log
|
|
|
|
cat /dev/null > $filename; echo "Creating / cleaning out file."
|
|
# Creates the file if it does not already exist,
|
|
#+ and truncates it to zero length if it does.
|
|
# : > filename and > filename also work.
|
|
|
|
tail /var/log/messages > $filename
|
|
# /var/log/messages must have world read permission for this to work.
|
|
|
|
echo "$filename contains tail end of system log."
|
|
|
|
exit 0</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
|
|
><DIV
|
|
CLASS="TIP"
|
|
><P
|
|
></P
|
|
><TABLE
|
|
CLASS="TIP"
|
|
WIDTH="90%"
|
|
BORDER="0"
|
|
><TR
|
|
><TD
|
|
WIDTH="25"
|
|
ALIGN="CENTER"
|
|
VALIGN="TOP"
|
|
><IMG
|
|
SRC="../images/tip.gif"
|
|
HSPACE="5"
|
|
ALT="Tip"></TD
|
|
><TD
|
|
ALIGN="LEFT"
|
|
VALIGN="TOP"
|
|
><P
|
|
>To list a specific line of a text file,
|
|
<A
|
|
HREF="special-chars.html#PIPEREF"
|
|
>pipe</A
|
|
> the output of
|
|
<B
|
|
CLASS="COMMAND"
|
|
>head</B
|
|
> to <B
|
|
CLASS="COMMAND"
|
|
>tail -n 1</B
|
|
>.
|
|
For example <TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>head -n 8 database.txt | tail
|
|
-n 1</B
|
|
></TT
|
|
> lists the 8th line of the file
|
|
<TT
|
|
CLASS="FILENAME"
|
|
>database.txt</TT
|
|
>.</P
|
|
><P
|
|
>To set a variable to a given block of a text file:
|
|
<TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>var=$(head -n $m $filename | tail -n $n)
|
|
|
|
# filename = name of file
|
|
# m = from beginning of file, number of lines to end of block
|
|
# n = number of lines to set variable to (trim from end of block)</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></P
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
|
|
><DIV
|
|
CLASS="NOTE"
|
|
><P
|
|
></P
|
|
><TABLE
|
|
CLASS="NOTE"
|
|
WIDTH="90%"
|
|
BORDER="0"
|
|
><TR
|
|
><TD
|
|
WIDTH="25"
|
|
ALIGN="CENTER"
|
|
VALIGN="TOP"
|
|
><IMG
|
|
SRC="../images/note.gif"
|
|
HSPACE="5"
|
|
ALT="Note"></TD
|
|
><TD
|
|
ALIGN="LEFT"
|
|
VALIGN="TOP"
|
|
><P
|
|
>Newer implementations of <B
|
|
CLASS="COMMAND"
|
|
>tail</B
|
|
>
|
|
deprecate the older <B
|
|
CLASS="COMMAND"
|
|
>tail -$LINES
|
|
filename</B
|
|
> usage. The standard <B
|
|
CLASS="COMMAND"
|
|
>tail -n $LINES
|
|
filename</B
|
|
> is correct.</P
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
|
|
><P
|
|
>See also <A
|
|
HREF="moreadv.html#EX41"
|
|
>Example 16-5</A
|
|
>, <A
|
|
HREF="filearchiv.html#EX52"
|
|
>Example 16-39</A
|
|
> and
|
|
<A
|
|
HREF="debugging.html#ONLINE"
|
|
>Example 32-6</A
|
|
>.</P
|
|
></DD
|
|
><DT
|
|
><A
|
|
NAME="GREPREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>grep</B
|
|
></DT
|
|
><DD
|
|
><P
|
|
>A multi-purpose file search tool that uses
|
|
<A
|
|
HREF="regexp.html#REGEXREF"
|
|
>Regular Expressions</A
|
|
>.
|
|
It was originally a command/filter in the
|
|
venerable <B
|
|
CLASS="COMMAND"
|
|
>ed</B
|
|
> line editor:
|
|
<TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>g/re/p</B
|
|
></TT
|
|
> -- <I
|
|
CLASS="FIRSTTERM"
|
|
>global -
|
|
regular expression - print</I
|
|
>.</P
|
|
><P
|
|
><P
|
|
><B
|
|
CLASS="COMMAND"
|
|
>grep</B
|
|
> <TT
|
|
CLASS="REPLACEABLE"
|
|
><I
|
|
>pattern</I
|
|
></TT
|
|
> [<TT
|
|
CLASS="REPLACEABLE"
|
|
><I
|
|
>file</I
|
|
></TT
|
|
>...]</P
|
|
>Search the target file(s) for
|
|
occurrences of <TT
|
|
CLASS="REPLACEABLE"
|
|
><I
|
|
>pattern</I
|
|
></TT
|
|
>, where
|
|
<TT
|
|
CLASS="REPLACEABLE"
|
|
><I
|
|
>pattern</I
|
|
></TT
|
|
> may be literal text
|
|
or a Regular Expression.</P
|
|
><P
|
|
> <TABLE
|
|
BORDER="1"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="SCREEN"
|
|
><TT
|
|
CLASS="PROMPT"
|
|
>bash$ </TT
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>grep '[rst]ystem.$' osinfo.txt</B
|
|
></TT
|
|
>
|
|
<TT
|
|
CLASS="COMPUTEROUTPUT"
|
|
>The GPL governs the distribution of the Linux operating system.</TT
|
|
>
|
|
</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
>
|
|
</P
|
|
><P
|
|
>If no target file(s) are specified, <B
CLASS="COMMAND"
>grep</B
>
works as a filter on <TT
CLASS="FILENAME"
>stdin</TT
>, as in
|
|
a <A
|
|
HREF="special-chars.html#PIPEREF"
|
|
>pipe</A
|
|
>.</P
|
|
><P
|
|
> <TABLE
|
|
BORDER="1"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="SCREEN"
|
|
><TT
|
|
CLASS="PROMPT"
|
|
>bash$ </TT
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>ps ax | grep clock</B
|
|
></TT
|
|
>
|
|
<TT
|
|
CLASS="COMPUTEROUTPUT"
|
|
>765 tty1 S 0:00 xclock
|
|
901 pts/1 S 0:00 grep clock</TT
|
|
>
|
|
</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
>
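As the listing above shows, the grep process itself turns up among the matches. A common workaround is to put one character of the pattern inside a bracket expression. The sketch below uses canned lines in place of real ps output so that the result is repeatable; the ps_listing function and its contents are made up for illustration.

```shell
#!/bin/bash
# ps_listing: hypothetical stand-in for "ps ax".
# $1 = the pattern as it would appear on the grep process's command line.
ps_listing () {
  printf '%s\n' '765 tty1     S      0:00 xclock' \
                "901 pts/1    S      0:00 grep $1"
}

ps_listing 'clock'   | grep 'clock'     # Two lines: "grep clock" matches too.
ps_listing '[c]lock' | grep '[c]lock'   # One line: the text "[c]lock" does not contain "clock".
```

The regular expression [c]lock still matches the string "clock", but the literal characters of the pattern no longer do, so the grep line drops out of its own results.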
|
|
</P
|
|
><P
|
|
>The <TT
|
|
CLASS="OPTION"
|
|
>-i</TT
|
|
> option causes a case-insensitive
|
|
search.</P
|
|
><P
|
|
>The <TT
|
|
CLASS="OPTION"
|
|
>-w</TT
|
|
> option matches only whole
|
|
words.</P
|
|
><P
|
|
>The <TT
|
|
CLASS="OPTION"
|
|
>-l</TT
|
|
> option lists only the files in which
|
|
matches were found, but not the matching lines.</P
|
|
><P
|
|
>The <TT
|
|
CLASS="OPTION"
|
|
>-r</TT
|
|
> (recursive) option searches files in
|
|
the current working directory and all subdirectories below
|
|
it.</P
|
|
><P
|
|
>The <TT
|
|
CLASS="OPTION"
|
|
>-n</TT
|
|
> option lists the matching lines,
|
|
together with line numbers.</P
|
|
><P
|
|
> <TABLE
|
|
BORDER="1"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="SCREEN"
|
|
><TT
|
|
CLASS="PROMPT"
|
|
>bash$ </TT
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>grep -n Linux osinfo.txt</B
|
|
></TT
|
|
>
|
|
<TT
|
|
CLASS="COMPUTEROUTPUT"
|
|
>2:This is a file containing information about Linux.
|
|
6:The GPL governs the distribution of the Linux operating system.</TT
|
|
>
|
|
</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
>
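The options above can be combined. The following sketch tries several of them on two small files created in a temporary directory; the file names and contents are made up for illustration, and -r is assumed to be available (it is in GNU and BSD grep, though not required by POSIX).

```shell
#!/bin/bash
dir=$(mktemp -d)
printf '%s\n' 'Linux is a kernel.' 'linuxes vary.' > "$dir/a.txt"
printf '%s\n' 'Nothing relevant here.'             > "$dir/b.txt"

grep -i  linux "$dir/a.txt"    # Both lines: case is ignored.
grep -iw linux "$dir/a.txt"    # First line only: "linuxes" is not the whole word.
grep -l  Linux "$dir"/*.txt    # Prints only the name of a.txt.
grep -rn kernel "$dir"         # Recursive search, with file name and line number.

rm -r "$dir"
```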
|
|
</P
|
|
><P
|
|
>The <TT
|
|
CLASS="OPTION"
|
|
>-v</TT
|
|
> (or <TT
|
|
CLASS="OPTION"
|
|
>--invert-match</TT
|
|
>)
|
|
option <I
|
|
CLASS="FIRSTTERM"
|
|
>filters out</I
|
|
> matches.
|
|
<TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>grep pattern1 *.txt | grep -v pattern2
|
|
|
|
# Matches all lines in "*.txt" files containing "pattern1",
|
|
# but ***not*** "pattern2". </PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></P
|
|
><P
|
|
>The <TT
|
|
CLASS="OPTION"
|
|
>-c</TT
|
|
> (<TT
|
|
CLASS="OPTION"
|
|
>--count</TT
|
|
>)
|
|
option gives a numerical count of matching lines, rather than
listing the lines themselves.
|
|
|
|
<TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
grep -c txt *.sgml   # (number of lines containing "txt" in each "*.sgml" file)
|
|
|
|
|
|
# grep -cz .
|
|
# ^ dot
|
|
# means count (-c) zero-separated (-z) items matching "."
|
|
# that is, non-empty ones (containing at least 1 character).
|
|
#
|
|
printf 'a b\nc d\n\n\n\n\n\000\n\000e\000\000\nf' | grep -cz . # 3
|
|
printf 'a b\nc d\n\n\n\n\n\000\n\000e\000\000\nf' | grep -cz '$' # 5
|
|
printf 'a b\nc d\n\n\n\n\n\000\n\000e\000\000\nf' | grep -cz '^' # 5
|
|
#
|
|
printf 'a b\nc d\n\n\n\n\n\000\n\000e\000\000\nf' | grep -c '$' # 9
|
|
# By default, newline chars (\n) separate items to match.
|
|
|
|
# Note that the -z option is GNU "grep" specific.
|
|
|
|
|
|
# Thanks, S.C.</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
>
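Note that the count refers to lines, so a line containing the pattern twice is counted once. To count every occurrence, one approach is to combine the -o option with wc -l. A minimal sketch on made-up input (the sample function is only for illustration):

```shell
#!/bin/bash
sample () { printf '%s\n' 'a.txt b.txt' 'c.txt' 'no match here'; }

sample | grep -c 'txt'            # 2: two lines contain "txt".
sample | grep -o 'txt' | wc -l    # 3: three separate occurrences.
```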
|
|
</P
|
|
><P
|
|
>The <TT
|
|
CLASS="OPTION"
|
|
>--color</TT
|
|
> (or <TT
|
|
CLASS="OPTION"
|
|
>--colour</TT
|
|
>)
|
|
option marks the matching string in color (on the console
|
|
or in an <I
|
|
CLASS="FIRSTTERM"
|
|
>xterm</I
|
|
> window). Since
|
|
<I
|
|
CLASS="FIRSTTERM"
|
|
>grep</I
|
|
> prints out each entire line
|
|
containing the matching pattern, this lets you see exactly
|
|
<EM
|
|
>what</EM
|
|
> is being matched. See also
|
|
the <TT
|
|
CLASS="OPTION"
|
|
>-o</TT
|
|
> option, which shows only the
|
|
matching portion of the line(s).</P
|
|
><DIV
|
|
CLASS="EXAMPLE"
|
|
><A
|
|
NAME="FROMSH"
|
|
></A
|
|
><P
|
|
><B
|
|
>Example 16-16. Printing out the <I
|
|
CLASS="FIRSTTERM"
|
|
>From</I
|
|
> lines in
|
|
stored e-mail messages</B
|
|
></P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>#!/bin/bash
|
|
# from.sh
|
|
|
|
# Emulates the useful 'from' utility in Solaris, BSD, etc.
|
|
# Echoes the "From" header line in all messages
|
|
#+ in your e-mail directory.
|
|
|
|
|
|
MAILDIR=~/mail/* # No quoting of variable. Why?
|
|
# Maybe check if-exists $MAILDIR: if [ -d $MAILDIR ] . . .
|
|
GREP_OPTS="-H -A 5 --color" # Show file, plus extra context lines
|
|
#+ and display "From" in color.
|
|
TARGETSTR="^From" # "From" at beginning of line.
|
|
|
|
for file in $MAILDIR # No quoting of variable.
|
|
do
|
|
grep $GREP_OPTS "$TARGETSTR" "$file"
|
|
# ^^^^^^^^^^ # Again, do not quote this variable.
|
|
echo
|
|
done
|
|
|
|
exit $?
|
|
|
|
# You might wish to pipe the output of this script to 'more'
|
|
#+ or redirect it to a file . . .</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
|
|
><P
|
|
>When invoked with more than one target file given,
|
|
<B
|
|
CLASS="COMMAND"
|
|
>grep</B
|
|
> specifies which file contains
|
|
matches.</P
|
|
><P
|
|
> <TABLE
|
|
BORDER="1"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="SCREEN"
|
|
><TT
|
|
CLASS="PROMPT"
|
|
>bash$ </TT
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>grep Linux osinfo.txt misc.txt</B
|
|
></TT
|
|
>
|
|
<TT
|
|
CLASS="COMPUTEROUTPUT"
|
|
>osinfo.txt:This is a file containing information about Linux.
|
|
osinfo.txt:The GPL governs the distribution of the Linux operating system.
|
|
misc.txt:The Linux operating system is steadily gaining in popularity.</TT
|
|
>
|
|
</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
>
|
|
</P
|
|
><DIV
|
|
CLASS="TIP"
|
|
><P
|
|
></P
|
|
><TABLE
|
|
CLASS="TIP"
|
|
WIDTH="90%"
|
|
BORDER="0"
|
|
><TR
|
|
><TD
|
|
WIDTH="25"
|
|
ALIGN="CENTER"
|
|
VALIGN="TOP"
|
|
><IMG
|
|
SRC="../images/tip.gif"
|
|
HSPACE="5"
|
|
ALT="Tip"></TD
|
|
><TD
|
|
ALIGN="LEFT"
|
|
VALIGN="TOP"
|
|
><P
|
|
>To force <B
|
|
CLASS="COMMAND"
|
|
>grep</B
|
|
> to show the filename
|
|
when searching only one target file, simply give
|
|
<TT
|
|
CLASS="FILENAME"
|
|
>/dev/null</TT
|
|
> as the second file.</P
|
|
><P
|
|
> <TABLE
|
|
BORDER="1"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="SCREEN"
|
|
><TT
|
|
CLASS="PROMPT"
|
|
>bash$ </TT
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>grep Linux osinfo.txt /dev/null</B
|
|
></TT
|
|
>
|
|
<TT
|
|
CLASS="COMPUTEROUTPUT"
|
|
>osinfo.txt:This is a file containing information about Linux.
|
|
osinfo.txt:The GPL governs the distribution of the Linux operating system.</TT
|
|
>
|
|
</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
>
|
|
</P
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
|
|
><P
|
|
>If there is a successful match, <B
|
|
CLASS="COMMAND"
|
|
>grep</B
|
|
>
|
|
returns an <A
|
|
HREF="exit-status.html#EXITSTATUSREF"
|
|
>exit status</A
|
|
>
|
|
of 0, which makes it useful in a condition test in a
|
|
script, especially in combination with the <TT
|
|
CLASS="OPTION"
|
|
>-q</TT
|
|
>
|
|
option to suppress output.
|
|
<TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>SUCCESS=0 # if grep lookup succeeds
|
|
word=Linux
|
|
filename=data.file
|
|
|
|
grep -q "$word" "$filename" # The "-q" option
|
|
#+ causes nothing to echo to stdout.
|
|
if [ $? -eq $SUCCESS ]
|
|
# if grep -q "$word" "$filename" can replace lines 5 - 7.
|
|
then
|
|
echo "$word found in $filename"
|
|
else
|
|
echo "$word not found in $filename"
|
|
fi</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
>
|
|
</P
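><P
>The exit status can also be tested directly, without storing it first. The following is only a sketch: <TT CLASS="USERINPUT">has_match</TT> is a hypothetical helper name, and the text is supplied in a variable rather than a file. A failed match normally yields an exit status of 1, and an error yields a status greater than 1.</P
><TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="90%"><TR><TD><FONT COLOR="#000000"><PRE CLASS="PROGRAMLISTING">
```shell
#!/bin/bash
# has_match: hypothetical helper name, used only in this sketch.
# Succeeds (exit status 0) if pattern $1 occurs in the text passed as $2.
has_match ()
{
  printf '%s\n' "$2" | grep -q -- "$1"
}

text='The GPL governs the distribution of the Linux operating system.'

if has_match Linux "$text"      # Test grep's exit status directly.
then
  echo "Linux found"
else
  echo "Linux not found"
fi

has_match Solaris "$text"
echo "Exit status for a failed match: $?"   # 1 (no line matched).
```
</PRE></FONT></TD></TR></TABLE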
|
|
><P
|
|
><A
|
|
HREF="debugging.html#ONLINE"
|
|
>Example 32-6</A
|
|
> demonstrates how to use
|
|
<B
|
|
CLASS="COMMAND"
|
|
>grep</B
|
|
> to search for a word pattern in
|
|
a system logfile.</P
|
|
><DIV
|
|
CLASS="EXAMPLE"
|
|
><A
|
|
NAME="GRP"
|
|
></A
|
|
><P
|
|
><B
|
|
>Example 16-17. Emulating <I
|
|
CLASS="FIRSTTERM"
|
|
>grep</I
|
|
> in a script</B
|
|
></P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>#!/bin/bash
|
|
# grp.sh: Rudimentary reimplementation of grep.
|
|
|
|
E_BADARGS=85
|
|
|
|
if [ -z "$1" ] # Check for argument to script.
|
|
then
|
|
echo "Usage: `basename $0` pattern"
|
|
exit $E_BADARGS
|
|
fi
|
|
|
|
echo
|
|
|
|
for file in * # Traverse all files in $PWD.
|
|
do
|
|
  output=$(sed -n /"$1"/p "$file")  # Command substitution.
|
|
|
|
if [ ! -z "$output" ] # What happens if "$output" is not quoted?
|
|
then
|
|
echo -n "$file: "
|
|
echo "$output"
|
|
fi # sed -ne "/$1/s|^|${file}: |p" is equivalent to above.
|
|
|
|
echo
|
|
done
|
|
|
|
echo
|
|
|
|
exit 0
|
|
|
|
# Exercises:
|
|
# ---------
|
|
# 1) Add newlines to output, if more than one match in any given file.
|
|
# 2) Add features.</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
|
|
><P
|
|
>How can <B
|
|
CLASS="COMMAND"
|
|
>grep</B
|
|
> search for two (or
|
|
more) separate patterns? What if you want
|
|
<B
|
|
CLASS="COMMAND"
|
|
>grep</B
|
|
> to display all lines in a file
|
|
or files that contain both <SPAN
|
|
CLASS="QUOTE"
|
|
>"pattern1"</SPAN
|
|
>
|
|
<EM
|
|
>and</EM
|
|
> <SPAN
|
|
CLASS="QUOTE"
|
|
>"pattern2"</SPAN
|
|
>?</P
|
|
><P
|
|
>One method is to <A
|
|
HREF="special-chars.html#PIPEREF"
|
|
>pipe</A
|
|
> the result of <B
|
|
CLASS="COMMAND"
|
|
>grep
|
|
pattern1</B
|
|
> to <B
|
|
CLASS="COMMAND"
|
|
>grep pattern2</B
|
|
>.</P
|
|
><P
|
|
>For example, given the following file:</P
|
|
><P
|
|
> <TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
># Filename: tstfile
|
|
|
|
This is a sample file.
|
|
This is an ordinary text file.
|
|
This file does not contain any unusual text.
|
|
This file is not unusual.
|
|
Here is some text.</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
>
|
|
</P
|
|
><P
|
|
>Now, let's search this file for lines containing
|
|
<EM
|
|
>both</EM
|
|
> <SPAN
|
|
CLASS="QUOTE"
|
|
>"file"</SPAN
|
|
> and
|
|
<SPAN
|
|
CLASS="QUOTE"
|
|
>"text"</SPAN
|
|
> . . . </P
|
|
><TABLE
|
|
BORDER="1"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="SCREEN"
|
|
><TT
|
|
CLASS="PROMPT"
|
|
>bash$ </TT
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>grep file tstfile</B
|
|
></TT
|
|
>
|
|
<TT
|
|
CLASS="COMPUTEROUTPUT"
|
|
># Filename: tstfile
|
|
This is a sample file.
|
|
This is an ordinary text file.
|
|
This file does not contain any unusual text.
|
|
This file is not unusual.</TT
|
|
>
|
|
|
|
<TT
|
|
CLASS="PROMPT"
|
|
>bash$ </TT
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>grep file tstfile | grep text</B
|
|
></TT
|
|
>
|
|
<TT
|
|
CLASS="COMPUTEROUTPUT"
|
|
>This is an ordinary text file.
|
|
This file does not contain any unusual text.</TT
|
|
></PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
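><P
>A second possibility, sketched here on the assumption that <B CLASS="COMMAND">grep -E</B> is available, is a single extended regular expression that accepts the two words in either order. The sample lines from <TT CLASS="FILENAME">tstfile</TT> are held in a variable so that the sketch is self-contained.</P
><TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="90%"><TR><TD><FONT COLOR="#000000"><PRE CLASS="PROGRAMLISTING">
```shell
#!/bin/bash
# Same sample lines as in "tstfile" above, held in a variable for this sketch.
lines='This is a sample file.
This is an ordinary text file.
This file does not contain any unusual text.
This file is not unusual.
Here is some text.'

# Method 1: two greps in a pipe (logical AND).
echo "$lines" | grep file | grep text

# Method 2: one grep -E, allowing the two words in either order.
echo "$lines" | grep -E 'file.*text|text.*file'
```
</PRE></FONT></TD></TR></TABLE
><P
>Both methods print the same two lines. The pipe is easier to extend to three or more patterns, since the number of orderings in the single expression grows quickly.</P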
|
|
><P
|
|
>Now, for an interesting recreational use
|
|
of <I
|
|
CLASS="FIRSTTERM"
|
|
>grep</I
|
|
> . . .</P
|
|
><DIV
|
|
CLASS="EXAMPLE"
|
|
><A
|
|
NAME="CWSOLVER"
|
|
></A
|
|
><P
|
|
><B
|
|
>Example 16-18. Crossword puzzle solver</B
|
|
></P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>#!/bin/bash
|
|
# cw-solver.sh
|
|
# This is actually a wrapper around a one-liner (line 46).
|
|
|
|
# Crossword puzzle and anagramming word game solver.
|
|
# You know *some* of the letters in the word you're looking for,
|
|
#+ so you need a list of all valid words
|
|
#+ with the known letters in given positions.
|
|
# For example: w...i....n
|
|
# 1???5????10
|
|
# w in position 1, 3 unknowns, i in the 5th, 4 unknowns, n at the end.
|
|
# (See comments at end of script.)
|
|
|
|
|
|
E_NOPATT=71
|
|
DICT=/usr/share/dict/word.lst
|
|
# ^^^^^^^^ Looks for word list here.
|
|
# ASCII word list, one word per line.
|
|
# If you happen to need an appropriate list,
|
|
#+ download the author's "yawl" word list package.
|
|
# http://ibiblio.org/pub/Linux/libs/yawl-0.3.2.tar.gz
|
|
# or
|
|
# http://bash.deta.in/yawl-0.3.2.tar.gz
|
|
|
|
|
|
if [ -z "$1" ] # If no word pattern specified
|
|
then #+ as a command-line argument . . .
|
|
echo #+ . . . then . . .
|
|
echo "Usage:" #+ Usage message.
|
|
echo
|
|
echo ""$0" \"pattern,\""
|
|
echo "where \"pattern\" is in the form"
|
|
echo "xxx..x.x..."
|
|
echo
|
|
echo "The x's represent known letters,"
|
|
echo "and the periods are unknown letters (blanks)."
|
|
echo "Letters and periods can be in any position."
|
|
echo "For example, try: sh cw-solver.sh w...i....n"
|
|
echo
|
|
exit $E_NOPATT
|
|
fi
|
|
|
|
echo
|
|
# ===============================================
|
|
# This is where all the work gets done.
|
|
grep ^"$1"$ "$DICT" # Yes, only one line!
|
|
# | |
|
|
#    ^ is the start-of-line regex anchor (each line of $DICT is one word).
|
|
#         $ is the end-of-line regex anchor.
|
|
|
|
# From _Stupid Grep Tricks_, vol. 1,
|
|
#+ a book the ABS Guide author may yet get around
|
|
#+ to writing . . . one of these days . . .
|
|
# ===============================================
|
|
echo
|
|
|
|
|
|
exit $? # Script terminates here.
|
|
# If there are too many words generated,
|
|
#+ redirect the output to a file.
|
|
|
|
$ sh cw-solver.sh w...i....n
|
|
|
|
wellington
|
|
workingman
|
|
workingmen</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
|
|
><P
|
|
><A
|
|
NAME="EGREPREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>egrep</B
|
|
>
|
|
-- <I
|
|
CLASS="FIRSTTERM"
|
|
>extended grep</I
|
|
> -- is the same
|
|
as <B
|
|
CLASS="COMMAND"
|
|
>grep -E</B
|
|
>. This uses a somewhat
|
|
different, extended set of <A
|
|
HREF="regexp.html#REGEXREF"
|
|
>Regular
|
|
Expressions</A
|
|
>, which can make the search a bit more
|
|
flexible. It also allows the boolean |
|
|
(<I
|
|
CLASS="FIRSTTERM"
|
|
>or</I
|
|
>) operator.
|
|
<TABLE
|
|
BORDER="1"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="SCREEN"
|
|
><TT
|
|
CLASS="PROMPT"
|
|
>bash$ </TT
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>egrep 'matches|Matches' file.txt</B
|
|
></TT
|
|
>
|
|
<TT
|
|
CLASS="COMPUTEROUTPUT"
|
|
>Line 1 matches.
|
|
Line 3 Matches.
|
|
Line 4 contains matches, but also Matches</TT
|
|
>
|
|
</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
>
|
|
</P
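><P
>For a simple alternation, plain <B CLASS="COMMAND">grep</B> with several <TT CLASS="OPTION">-e</TT> patterns gives the same result. A minimal sketch, using sample lines invented for the purpose:</P
><TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="90%"><TR><TD><FONT COLOR="#000000"><PRE CLASS="PROGRAMLISTING">
```shell
#!/bin/bash
# Sample lines invented for this sketch.
sample='Line 1 matches.
Line 2 does not.
Line 3 Matches.'

# Extended regex alternation.
echo "$sample" | grep -E 'matches|Matches'

# Same result using two -e patterns with plain grep.
echo "$sample" | grep -e 'matches' -e 'Matches'
```
</PRE></FONT></TD></TR></TABLE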
|
|
><P
|
|
><A
|
|
NAME="FGREPREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>fgrep</B
|
|
> --
|
|
<I
|
|
CLASS="FIRSTTERM"
|
|
>fixed-string grep</I
|
|
> -- is the same as
|
|
<B
|
|
CLASS="COMMAND"
|
|
>grep -F</B
|
|
>. It does a literal string search
|
|
(no <A
|
|
HREF="regexp.html#REGEXREF"
|
|
>Regular Expressions</A
|
|
>),
|
|
which generally speeds things up a bit.</P
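><P
>The practical effect of a literal search shows up when the pattern contains a regular-expression metacharacter. A short sketch, with sample data invented for the demonstration:</P
><TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="90%"><TR><TD><FONT COLOR="#000000"><PRE CLASS="PROGRAMLISTING">
```shell
#!/bin/bash
# Sample data invented for this sketch.
sample='version 1.5
version 105'

# As a regular expression, the dot matches any character: both lines print.
echo "$sample" | grep '1.5'

# As a fixed string, the dot is literal: only "version 1.5" prints.
echo "$sample" | grep -F '1.5'
```
</PRE></FONT></TD></TR></TABLE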
|
|
><DIV
|
|
CLASS="NOTE"
|
|
><P
|
|
></P
|
|
><TABLE
|
|
CLASS="NOTE"
|
|
WIDTH="90%"
|
|
BORDER="0"
|
|
><TR
|
|
><TD
|
|
WIDTH="25"
|
|
ALIGN="CENTER"
|
|
VALIGN="TOP"
|
|
><IMG
|
|
SRC="../images/note.gif"
|
|
HSPACE="5"
|
|
ALT="Note"></TD
|
|
><TD
|
|
ALIGN="LEFT"
|
|
VALIGN="TOP"
|
|
><P
|
|
>On some Linux distros, <B
|
|
CLASS="COMMAND"
|
|
>egrep</B
|
|
> and
|
|
<B
|
|
CLASS="COMMAND"
|
|
>fgrep</B
|
|
> are symbolic links to, or aliases for
|
|
<B
|
|
CLASS="COMMAND"
|
|
>grep</B
|
|
>, but invoked with the
|
|
<TT
|
|
CLASS="OPTION"
|
|
>-E</TT
|
|
> and <TT
|
|
CLASS="OPTION"
|
|
>-F</TT
|
|
> options,
|
|
respectively.</P
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
|
|
><DIV
|
|
CLASS="EXAMPLE"
|
|
><A
|
|
NAME="DICTLOOKUP"
|
|
></A
|
|
><P
|
|
><B
|
|
>Example 16-19. Looking up definitions in <I
|
|
CLASS="CITETITLE"
|
|
>Webster's 1913 Dictionary</I
|
|
></B
|
|
></P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>#!/bin/bash
|
|
# dict-lookup.sh
|
|
|
|
# This script looks up definitions in the 1913 Webster's Dictionary.
|
|
# This Public Domain dictionary is available for download
|
|
#+ from various sites, including
|
|
#+ Project Gutenberg (http://www.gutenberg.org/etext/247).
|
|
#
|
|
# Convert it from DOS to UNIX format (with only LF at end of line)
|
|
#+ before using it with this script.
|
|
# Store the file in plain, uncompressed ASCII text.
|
|
# Set DEFAULT_DICTFILE variable below to path/filename.
|
|
|
|
|
|
E_BADARGS=85
|
|
MAXCONTEXTLINES=50 # Maximum number of lines to show.
|
|
DEFAULT_DICTFILE="/usr/share/dict/webster1913-dict.txt"
|
|
# Default dictionary file pathname.
|
|
# Change this as necessary.
|
|
# Note:
|
|
# ----
|
|
# This particular edition of the 1913 Webster's
|
|
#+ begins each entry with an uppercase letter
|
|
#+ (lowercase for the remaining characters).
|
|
# Only the *very first line* of an entry begins this way,
|
|
#+ and that's why the search algorithm below works.
|
|
|
|
|
|
|
|
if [[ -z $(echo "$1" | sed -n '/^[A-Z]/p') ]]
|
|
# Must at least specify word to look up, and
|
|
#+ it must start with an uppercase letter.
|
|
then
|
|
echo "Usage: `basename $0` Word-to-define [dictionary-file]"
|
|
echo
|
|
echo "Note: Word to look up must start with capital letter,"
|
|
echo "with the rest of the word in lowercase."
|
|
echo "--------------------------------------------"
|
|
echo "Examples: Abandon, Dictionary, Marking, etc."
|
|
exit $E_BADARGS
|
|
fi
|
|
|
|
|
|
if [ -z "$2" ] # May specify different dictionary
|
|
#+ as an argument to this script.
|
|
then
|
|
dictfile=$DEFAULT_DICTFILE
|
|
else
|
|
dictfile="$2"
|
|
fi
|
|
|
|
# ---------------------------------------------------------
|
|
Definition=$(fgrep -A $MAXCONTEXTLINES "$1 \\" "$dictfile")
|
|
# Definitions in form "Word \..."
|
|
#
|
|
# And, yes, "fgrep" is fast enough
|
|
#+ to search even a very large text file.
|
|
|
|
|
|
# Now, snip out just the definition block.
|
|
|
|
echo "$Definition" |
|
|
sed -n '1,/^[A-Z]/p' |
|
|
# Print from first line of output
|
|
#+ to the first line of the next entry.
|
|
sed '$d' | sed '$d'
|
|
# Delete last two lines of output
|
|
#+ (blank line and first line of next entry).
|
|
# ---------------------------------------------------------
|
|
|
|
exit $?
|
|
|
|
# Exercises:
|
|
# ---------
|
|
# 1) Modify the script to accept any type of alphabetic input
|
|
# + (uppercase, lowercase, mixed case), and convert it
|
|
# + to an acceptable format for processing.
|
|
#
|
|
# 2) Convert the script to a GUI application,
|
|
# + using something like 'gdialog' or 'zenity' . . .
|
|
# The script will then no longer take its argument(s)
|
|
# + from the command-line.
|
|
#
|
|
# 3) Modify the script to parse one of the other available
|
|
# + Public Domain Dictionaries, such as the U.S. Census Bureau Gazetteer.</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
|
|
><DIV
|
|
CLASS="NOTE"
|
|
><P
|
|
></P
|
|
><TABLE
|
|
CLASS="NOTE"
|
|
WIDTH="90%"
|
|
BORDER="0"
|
|
><TR
|
|
><TD
|
|
WIDTH="25"
|
|
ALIGN="CENTER"
|
|
VALIGN="TOP"
|
|
><IMG
|
|
SRC="../images/note.gif"
|
|
HSPACE="5"
|
|
ALT="Note"></TD
|
|
><TD
|
|
ALIGN="LEFT"
|
|
VALIGN="TOP"
|
|
><P
|
|
>See also <A
|
|
HREF="contributed-scripts.html#QKY"
|
|
>Example A-41</A
|
|
> for an example
|
|
of speedy <I
|
|
CLASS="FIRSTTERM"
|
|
>fgrep</I
|
|
> lookup on a large
|
|
text file.</P
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
|
|
><P
|
|
><A
|
|
NAME="AGREPREF"
|
|
></A
|
|
></P
|
|
><P
|
|
><B
|
|
CLASS="COMMAND"
|
|
>agrep</B
|
|
> (<I
|
|
CLASS="FIRSTTERM"
|
|
>approximate
|
|
grep</I
|
|
>) extends the capabilities of
|
|
<B
|
|
CLASS="COMMAND"
|
|
>grep</B
|
|
> to approximate matching. The search
|
|
string may differ by a specified number of characters
|
|
from the resulting matches. This utility is not part of
|
|
the core Linux distribution.</P
|
|
><P
|
|
><A
|
|
NAME="ZEGREPREF"
|
|
></A
|
|
></P
|
|
><DIV
|
|
CLASS="TIP"
|
|
><P
|
|
></P
|
|
><TABLE
|
|
CLASS="TIP"
|
|
WIDTH="90%"
|
|
BORDER="0"
|
|
><TR
|
|
><TD
|
|
WIDTH="25"
|
|
ALIGN="CENTER"
|
|
VALIGN="TOP"
|
|
><IMG
|
|
SRC="../images/tip.gif"
|
|
HSPACE="5"
|
|
ALT="Tip"></TD
|
|
><TD
|
|
ALIGN="LEFT"
|
|
VALIGN="TOP"
|
|
><P
|
|
>To search compressed files, use
|
|
<B
|
|
CLASS="COMMAND"
|
|
>zgrep</B
|
|
>, <B
|
|
CLASS="COMMAND"
|
|
>zegrep</B
|
|
>, or
|
|
<B
|
|
CLASS="COMMAND"
|
|
>zfgrep</B
|
|
>. These also work on non-compressed
|
|
	      files, though more slowly than plain <B
|
|
CLASS="COMMAND"
|
|
>grep</B
|
|
>,
|
|
<B
|
|
CLASS="COMMAND"
|
|
>egrep</B
|
|
>, or <B
|
|
CLASS="COMMAND"
|
|
>fgrep</B
|
|
>.
|
|
They are handy for searching through a mixed set of files,
|
|
some compressed, some not.</P
|
|
><P
|
|
><A
|
|
NAME="BZGREPREF"
|
|
></A
|
|
></P
|
|
><P
|
|
>To search <A
|
|
HREF="filearchiv.html#BZIPREF"
|
|
>bzipped</A
|
|
>
|
|
files, use <B
|
|
CLASS="COMMAND"
|
|
>bzgrep</B
|
|
>.</P
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
|
|
></DD
|
|
><DT
|
|
><A
|
|
NAME="LOOKREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>look</B
|
|
></DT
|
|
><DD
|
|
><P
|
|
>The command <B
|
|
CLASS="COMMAND"
|
|
>look</B
|
|
> works like
|
|
<B
|
|
CLASS="COMMAND"
|
|
>grep</B
|
|
>, but does a lookup on
|
|
a <SPAN
|
|
CLASS="QUOTE"
|
|
>"dictionary,"</SPAN
|
|
> a sorted word list.
|
|
By default, <B
|
|
CLASS="COMMAND"
|
|
>look</B
|
|
> searches for a match
|
|
in <TT
|
|
CLASS="FILENAME"
|
|
>/usr/share/dict/words</TT
|
|
>, but a different
|
|
dictionary file may be specified.</P
|
|
><DIV
|
|
CLASS="EXAMPLE"
|
|
><A
|
|
NAME="LOOKUP"
|
|
></A
|
|
><P
|
|
><B
|
|
>Example 16-20. Checking words in a list for validity</B
|
|
></P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>#!/bin/bash
|
|
# lookup: Does a dictionary lookup on each word in a data file.
|
|
|
|
file=words.data # Data file from which to read words to test.
|
|
|
|
echo
|
|
echo "Testing file $file"
|
|
echo
|
|
|
|
while [ "$word" != end ] # Last word in data file.
|
|
do # ^^^
|
|
read word # From data file, because of redirection at end of loop.
|
|
  look "$word" > /dev/null  # Don't want to display lines in dictionary file.
|
|
# Searches for words in the file /usr/share/dict/words
|
|
#+ (usually a link to linux.words).
|
|
lookup=$? # Exit status of 'look' command.
|
|
|
|
if [ "$lookup" -eq 0 ]
|
|
then
|
|
echo "\"$word\" is valid."
|
|
else
|
|
echo "\"$word\" is invalid."
|
|
fi
|
|
|
|
done <"$file" # Redirects stdin to $file, so "reads" come from there.
|
|
|
|
echo
|
|
|
|
exit 0
|
|
|
|
# ----------------------------------------------------------------
|
|
# Code below line will not execute because of "exit" command above.
|
|
|
|
|
|
# Stephane Chazelas proposes the following, more concise alternative:
|
|
|
|
while read word && [[ $word != end ]]
|
|
do if look "$word" > /dev/null
|
|
then echo "\"$word\" is valid."
|
|
else echo "\"$word\" is invalid."
|
|
fi
|
|
done <"$file"
|
|
|
|
exit 0</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
|
|
></DD
|
|
><DT
|
|
><B
|
|
CLASS="COMMAND"
|
|
>sed</B
|
|
>, <B
|
|
CLASS="COMMAND"
|
|
>awk</B
|
|
></DT
|
|
><DD
|
|
><P
|
|
>Scripting languages especially suited for parsing text
|
|
files and command output. May be embedded singly or in
|
|
combination in pipes and shell scripts.</P
|
|
></DD
|
|
><DT
|
|
><B
|
|
CLASS="COMMAND"
|
|
><A
|
|
HREF="sedawk.html#SEDREF"
|
|
>sed</A
|
|
></B
|
|
></DT
|
|
><DD
|
|
><P
|
|
>Non-interactive <SPAN
|
|
CLASS="QUOTE"
|
|
>"stream editor"</SPAN
|
|
>, permits using
|
|
many <B
|
|
CLASS="COMMAND"
|
|
>ex</B
|
|
> commands in <A
|
|
HREF="timedate.html#BATCHPROCREF"
|
|
>batch</A
|
|
> mode. It finds many
|
|
uses in shell scripts.</P
|
|
></DD
|
|
><DT
|
|
><B
|
|
CLASS="COMMAND"
|
|
><A
|
|
HREF="awk.html#AWKREF"
|
|
>awk</A
|
|
></B
|
|
></DT
|
|
><DD
|
|
><P
|
|
>Programmable file extractor and formatter, good for
|
|
manipulating and/or extracting <A
|
|
HREF="special-chars.html#FIELDREF"
|
|
>fields</A
|
|
> (columns) in structured
|
|
text files. Its syntax is similar to C.</P
|
|
></DD
|
|
><DT
|
|
><A
|
|
NAME="WCREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>wc</B
|
|
></DT
|
|
><DD
|
|
><P
|
|
><I
|
|
CLASS="FIRSTTERM"
|
|
>wc</I
|
|
> gives a <SPAN
|
|
CLASS="QUOTE"
|
|
>"word
|
|
count"</SPAN
|
|
> on a file or I/O stream:
|
|
|
|
<TABLE
|
|
BORDER="1"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="SCREEN"
|
|
><TT
|
|
CLASS="PROMPT"
|
|
>bash$ </TT
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>wc /usr/share/doc/sed-4.1.2/README</B
|
|
></TT
|
|
>
|
|
<TT
|
|
CLASS="COMPUTEROUTPUT"
|
|
>13 70 447 README</TT
|
|
>
|
|
[13 lines 70 words 447 characters]</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></P
|
|
><P
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>wc -w</B
|
|
></TT
|
|
> gives only the word count.</P
|
|
><P
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>wc -l</B
|
|
></TT
|
|
> gives only the line count.</P
|
|
><P
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>wc -c</B
|
|
></TT
|
|
> gives only the byte count.</P
|
|
><P
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>wc -m</B
|
|
></TT
|
|
> gives only the character count.</P
|
|
><P
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>wc -L</B
|
|
></TT
|
|
> gives only the length of the longest line.</P
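><P
>A quick way to see the individual counts is to feed <B CLASS="COMMAND">wc</B> a small, known input. This is only a sketch with invented text; some versions of <B CLASS="COMMAND">wc</B> pad the number with leading spaces.</P
><TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="90%"><TR><TD><FONT COLOR="#000000"><PRE CLASS="PROGRAMLISTING">
```shell
#!/bin/bash
# Two lines, three words, 14 bytes (including the two newlines).
text='one two
three'

echo "$text" | wc -l    # Line count: 2
echo "$text" | wc -w    # Word count: 3
echo "$text" | wc -c    # Byte count: 14
```
</PRE></FONT></TD></TR></TABLE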
|
|
><P
|
|
>Using <B
|
|
CLASS="COMMAND"
|
|
>wc</B
|
|
> to count how many
|
|
<TT
|
|
CLASS="FILENAME"
|
|
>.txt</TT
|
|
> files are in the current working directory:
|
|
<TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>$ ls *.txt | wc -l
|
|
# Will work as long as none of the "*.txt" files
|
|
#+ have a linefeed embedded in their name.
|
|
|
|
# Alternative ways of doing this are:
|
|
# find . -maxdepth 1 -name \*.txt -print0 | grep -cz .
|
|
# (shopt -s nullglob; set -- *.txt; echo $#)
|
|
|
|
# Thanks, S.C.</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
>
|
|
</P
|
|
><P
|
|
>Using <B
|
|
CLASS="COMMAND"
|
|
>wc</B
|
|
> to total up the size of all the
|
|
	      files whose names begin with letters in the range d - h:
|
|
<TABLE
|
|
BORDER="1"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="SCREEN"
|
|
><TT
|
|
CLASS="PROMPT"
|
|
>bash$ </TT
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>wc [d-h]* | grep total | awk '{print $3}'</B
|
|
></TT
|
|
>
|
|
<TT
|
|
CLASS="COMPUTEROUTPUT"
|
|
>71832</TT
|
|
>
|
|
</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
>
|
|
</P
|
|
><P
|
|
>Using <B
|
|
CLASS="COMMAND"
|
|
>wc</B
|
|
> to count the lines containing the
|
|
word <SPAN
|
|
CLASS="QUOTE"
|
|
>"Linux"</SPAN
|
|
> in the main source file for
|
|
this book.
|
|
<TABLE
|
|
BORDER="1"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="SCREEN"
|
|
><TT
|
|
CLASS="PROMPT"
|
|
>bash$ </TT
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>grep Linux abs-book.sgml | wc -l</B
|
|
></TT
|
|
>
|
|
<TT
|
|
CLASS="COMPUTEROUTPUT"
|
|
>138</TT
|
|
>
|
|
</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
>
|
|
</P
|
|
><P
|
|
>See also <A
|
|
HREF="filearchiv.html#EX52"
|
|
>Example 16-39</A
|
|
> and <A
|
|
HREF="redircb.html#REDIR4"
|
|
>Example 20-8</A
|
|
>.</P
|
|
><P
|
|
>Certain commands include some of the
|
|
functionality of <B
|
|
CLASS="COMMAND"
|
|
>wc</B
|
|
> as options.
|
|
|
|
<TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>... | grep foo | wc -l
|
|
# This frequently used construct can be more concisely rendered.
|
|
|
|
... | grep -c foo
|
|
# Just use the "-c" (or "--count") option of grep.
|
|
|
|
# Thanks, S.C.</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></P
|
|
></DD
|
|
><DT
|
|
><A
|
|
NAME="TRREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>tr</B
|
|
></DT
|
|
><DD
|
|
><P
|
|
>Character translation filter.</P
|
|
><DIV
|
|
CLASS="CAUTION"
|
|
><P
|
|
></P
|
|
><TABLE
|
|
CLASS="CAUTION"
|
|
WIDTH="90%"
|
|
BORDER="0"
|
|
><TR
|
|
><TD
|
|
WIDTH="25"
|
|
ALIGN="CENTER"
|
|
VALIGN="TOP"
|
|
><IMG
|
|
SRC="../images/caution.gif"
|
|
HSPACE="5"
|
|
ALT="Caution"></TD
|
|
><TD
|
|
ALIGN="LEFT"
|
|
VALIGN="TOP"
|
|
><P
|
|
><A
|
|
HREF="special-chars.html#UCREF"
|
|
>Must use quoting and/or
|
|
brackets</A
|
|
>, as appropriate. Quotes prevent the
|
|
shell from reinterpreting the special characters in
|
|
<B
|
|
CLASS="COMMAND"
|
|
>tr</B
|
|
> command sequences. Brackets should be
|
|
quoted to prevent expansion by the shell. </P
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
|
|
><P
|
|
>Either <TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>tr "A-Z" "*" <filename</B
|
|
></TT
|
|
>
|
|
or <TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>tr A-Z \* <filename</B
|
|
></TT
|
|
> changes
|
|
all the uppercase letters in <TT
|
|
CLASS="FILENAME"
|
|
>filename</TT
|
|
>
|
|
to asterisks (writes to <TT
|
|
CLASS="FILENAME"
|
|
>stdout</TT
|
|
>).
|
|
On some systems this may not work, but <TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>tr A-Z
|
|
'[**]'</B
|
|
></TT
|
|
> will.</P
|
|
><P
|
|
><A
|
|
NAME="TROPTIONS"
|
|
></A
|
|
></P
|
|
><P
|
|
>The <TT
|
|
CLASS="OPTION"
|
|
>-d</TT
|
|
> option deletes a range of
|
|
characters.
|
|
<TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>echo "abcdef" # abcdef
|
|
echo "abcdef" | tr -d b-d # aef
|
|
|
|
|
|
tr -d 0-9 <filename
|
|
# Deletes all digits from the file "filename".</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></P
|
|
><P
|
|
>The <TT
|
|
CLASS="OPTION"
|
|
>--squeeze-repeats</TT
|
|
> (or
|
|
<TT
|
|
CLASS="OPTION"
|
|
>-s</TT
|
|
>) option deletes all but the
|
|
	      first instance of a run of consecutive identical characters.
|
|
This option is useful for removing excess <A
|
|
HREF="special-chars.html#WHITESPACEREF"
|
|
>whitespace</A
|
|
>.
|
|
|
|
|
|
|
|
<TABLE
|
|
BORDER="1"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="SCREEN"
|
|
><TT
|
|
CLASS="PROMPT"
|
|
>bash$ </TT
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>echo "XXXXX" | tr --squeeze-repeats 'X'</B
|
|
></TT
|
|
>
|
|
<TT
|
|
CLASS="COMPUTEROUTPUT"
|
|
>X</TT
|
|
></PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></P
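><P
>Two typical whitespace cleanups, given as a minimal sketch with made-up input:</P
><TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="90%"><TR><TD><FONT COLOR="#000000"><PRE CLASS="PROGRAMLISTING">
```shell
#!/bin/bash
# Collapse each run of spaces into a single space.
echo "too    many     spaces" | tr -s ' '
# too many spaces

# Squeeze repeated newlines to remove blank lines.
printf 'first\n\n\nsecond\n' | tr -s '\n'
# first
# second
```
</PRE></FONT></TD></TR></TABLE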
|
|
><P
|
|
>The <TT
|
|
CLASS="OPTION"
|
|
>-c</TT
|
|
> <SPAN
|
|
CLASS="QUOTE"
|
|
>"complement"</SPAN
|
|
>
|
|
option <I
|
|
CLASS="FIRSTTERM"
|
|
>inverts</I
|
|
> the character set to
|
|
match. With this option, <B
|
|
CLASS="COMMAND"
|
|
>tr</B
|
|
> acts only
|
|
upon those characters <EM
|
|
>not</EM
|
|
> matching
|
|
the specified set.</P
|
|
><P
|
|
> <TABLE
|
|
BORDER="1"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="SCREEN"
|
|
><TT
|
|
CLASS="PROMPT"
|
|
>bash$ </TT
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>echo "acfdeb123" | tr -c b-d +</B
|
|
></TT
|
|
>
|
|
<TT
|
|
CLASS="COMPUTEROUTPUT"
|
|
>+c+d+b++++</TT
|
|
></PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
>
|
|
</P
|
|
><P
|
|
>Note that <B
|
|
CLASS="COMMAND"
|
|
>tr</B
|
|
> recognizes <A
|
|
HREF="x17129.html#POSIXREF"
|
|
>POSIX character classes</A
|
|
>.
|
|
<A
|
|
NAME="AEN11502"
|
|
HREF="#FTN.AEN11502"
|
|
><SPAN
|
|
CLASS="footnote"
|
|
>[1]</SPAN
|
|
></A
|
|
>
|
|
</P
|
|
><P
|
|
> <TABLE
|
|
BORDER="1"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="SCREEN"
|
|
><TT
|
|
CLASS="PROMPT"
|
|
>bash$ </TT
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>echo "abcd2ef1" | tr '[:alpha:]' -</B
|
|
></TT
|
|
>
|
|
<TT
|
|
CLASS="COMPUTEROUTPUT"
|
|
>----2--1</TT
|
|
>
|
|
</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
>
|
|
</P
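><P
>The <TT CLASS="OPTION">-c</TT> option combines naturally with <TT CLASS="OPTION">-d</TT> and with character classes. The sketch below uses invented input; the filename-like string in the second command is purely illustrative.</P
><TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="90%"><TR><TD><FONT COLOR="#000000"><PRE CLASS="PROGRAMLISTING">
```shell
#!/bin/bash
# Keep only the digits: -c complements the set, -d deletes the complement.
# The newline is deleted too, so an echo is added to end the line.
echo "abcd2ef1" | tr -cd '[:digit:]'
echo
# 21

# Translate every character that is not a letter, digit, or newline
#+ into an underscore.
echo "my file (v2).txt" | tr -c '[:alnum:]\n' '_'
# my_file__v2__txt
```
</PRE></FONT></TD></TR></TABLE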
|
|
><DIV
|
|
CLASS="EXAMPLE"
|
|
><A
|
|
NAME="EX49"
|
|
></A
|
|
><P
|
|
><B
|
|
>Example 16-21. <I
|
|
CLASS="FIRSTTERM"
|
|
>toupper</I
|
|
>: Transforms a file
|
|
to all uppercase.</B
|
|
></P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>#!/bin/bash
|
|
# Changes a file to all uppercase.
|
|
|
|
E_BADARGS=85
|
|
|
|
if [ -z "$1" ] # Standard check for command-line arg.
|
|
then
|
|
echo "Usage: `basename $0` filename"
|
|
exit $E_BADARGS
|
|
fi
|
|
|
|
tr a-z A-Z <"$1"
|
|
|
|
# Same effect as above, but using POSIX character set notation:
|
|
# tr '[:lower:]' '[:upper:]' <"$1"
|
|
# Thanks, S.C.
|
|
|
|
# Or even . . .
|
|
# cat "$1" | tr a-z A-Z
|
|
# Or dozens of other ways . . .
|
|
|
|
exit 0
|
|
|
|
# Exercise:
|
|
# Rewrite this script to give the option of changing a file
|
|
#+ to *either* upper or lowercase.
|
|
# Hint: Use either the "case" or "select" command.</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
|
|
><DIV
|
|
CLASS="EXAMPLE"
|
|
><A
|
|
NAME="LOWERCASE"
|
|
></A
|
|
><P
|
|
><B
|
|
>Example 16-22. <I
|
|
CLASS="FIRSTTERM"
|
|
>lowercase</I
|
|
>: Changes all
|
|
filenames in working directory to lowercase.</B
|
|
></P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>#!/bin/bash
|
|
#
|
|
# Changes every filename in working directory to all lowercase.
|
|
#
|
|
# Inspired by a script of John Dubois,
|
|
#+ which was translated into Bash by Chet Ramey,
|
|
#+ and considerably simplified by the author of the ABS Guide.
|
|
|
|
|
|
for filename in * # Traverse all files in directory.
|
|
do
|
|
fname=`basename $filename`
|
|
n=`echo $fname | tr A-Z a-z` # Change name to lowercase.
|
|
if [ "$fname" != "$n" ] # Rename only files not already lowercase.
|
|
then
|
|
mv $fname $n
|
|
fi
|
|
done
|
|
|
|
exit $?
|
|
|
|
|
|
# Code below this line will not execute because of "exit".
|
|
#--------------------------------------------------------#
|
|
# To run it, delete script above line.
|
|
|
|
# The above script will not work on filenames containing blanks or newlines.
|
|
# Stephane Chazelas therefore suggests the following alternative:
|
|
|
|
|
|
for filename in * # Not necessary to use basename,
|
|
# since "*" won't return any file containing "/".
|
|
do n=`echo "$filename/" | tr '[:upper:]' '[:lower:]'`
|
|
# POSIX char set notation.
|
|
# Slash added so that trailing newlines are not
|
|
# removed by command substitution.
|
|
# Variable substitution:
|
|
n=${n%/} # Removes trailing slash, added above, from filename.
|
|
  [[ $filename == "$n" ]] || mv "$filename" "$n"
|
|
# Checks if filename already lowercase.
|
|
done
|
|
|
|
exit $?</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
|
|
><P
|
|
><A
|
|
NAME="TRD2U"
|
|
></A
|
|
></P
|
|
><DIV
|
|
CLASS="EXAMPLE"
|
|
><A
|
|
NAME="DU"
|
|
></A
|
|
><P
|
|
><B
|
|
>Example 16-23. <I
|
|
CLASS="FIRSTTERM"
|
|
>du</I
|
|
>: DOS to UNIX text file conversion.</B
|
|
></P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>#!/bin/bash
|
|
# Du.sh: DOS to UNIX text file converter.
|
|
|
|
E_WRONGARGS=85
|
|
|
|
if [ -z "$1" ]
|
|
then
|
|
echo "Usage: `basename $0` filename-to-convert"
|
|
exit $E_WRONGARGS
|
|
fi
|
|
|
|
NEWFILENAME=$1.unx
|
|
|
|
CR='\015' # Carriage return.
|
|
# 015 is octal ASCII code for CR.
|
|
# Lines in a DOS text file end in CR-LF.
|
|
# Lines in a UNIX text file end in LF only.
|
|
|
|
tr -d "$CR" < "$1" > "$NEWFILENAME"
|
|
# Delete CR's and write to new file.
|
|
|
|
echo "Original DOS text file is \"$1\"."
|
|
echo "Converted UNIX text file is \"$NEWFILENAME\"."
|
|
|
|
exit 0
|
|
|
|
# Exercise:
|
|
# --------
|
|
# Change the above script to convert from UNIX to DOS.</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
|
|
><DIV
|
|
CLASS="EXAMPLE"
|
|
><A
|
|
NAME="ROT13"
|
|
></A
|
|
><P
|
|
><B
|
|
>Example 16-24. <I
|
|
CLASS="FIRSTTERM"
|
|
>rot13</I
|
|
>: ultra-weak encryption.</B
|
|
></P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>#!/bin/bash
|
|
# rot13.sh: Classic rot13 algorithm,
|
|
# encryption that might fool a 3-year old
|
|
# for about 10 minutes.
|
|
|
|
# Usage: ./rot13.sh filename
|
|
# or ./rot13.sh <filename
|
|
# or ./rot13.sh and supply keyboard input (stdin)
|
|
|
|
cat "$@" | tr 'a-zA-Z' 'n-za-mN-ZA-M' # "a" goes to "n", "b" to "o" ...
|
|
# The cat "$@" construct
|
|
#+ permits input either from stdin or from files.
|
|
|
|
exit 0</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
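><P
>Because the alphabet has 26 letters, applying <I CLASS="FIRSTTERM">rot13</I> twice restores the original text, so the same command both encrypts and decrypts. A minimal sketch, wrapping the <B CLASS="COMMAND">tr</B> invocation from the example in a function:</P
><TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="90%"><TR><TD><FONT COLOR="#000000"><PRE CLASS="PROGRAMLISTING">
```shell
#!/bin/bash
# rot13: same tr invocation as in the example, wrapped in a function.
rot13 ()
{
  tr 'a-zA-Z' 'n-za-mN-ZA-M'
}

echo "Hello, World" | rot13            # Uryyb, Jbeyq
echo "Hello, World" | rot13 | rot13    # Hello, World
```
</PRE></FONT></TD></TR></TABLE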
|
|
><DIV
|
|
CLASS="EXAMPLE"
|
|
><A
|
|
NAME="CRYPTOQUOTE"
|
|
></A
|
|
><P
|
|
><B
|
|
>Example 16-25. Generating <SPAN
|
|
CLASS="QUOTE"
|
|
>"Crypto-Quote"</SPAN
|
|
> Puzzles</B
|
|
></P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>#!/bin/bash
# crypto-quote.sh: Encrypt quotes

#  Will encrypt famous quotes in a simple monoalphabetic substitution.
#  The result is similar to the "Crypto Quote" puzzles
#+ seen in the Op Ed pages of the Sunday paper.


key=ETAOINSHRDLUBCFGJMQPVWZYXK
# The "key" is nothing more than a scrambled alphabet.
# Changing the "key" changes the encryption.

# The 'cat "$@"' construction gets input either from stdin or from files.
# If using stdin, terminate input with a Control-D.
# Otherwise, specify filename as command-line parameter.

cat "$@" | tr "a-z" "A-Z" | tr "A-Z" "$key"
#        |  to uppercase  |     encrypt
# Will work on lowercase, uppercase, or mixed-case quotes.
# Passes non-alphabetic characters through unchanged.


# Try this script with something like:
# "Nothing so needs reforming as other people's habits."
# --Mark Twain
#
# Output is:
# "CFPHRCS QF CIIOQ MINFMBRCS EQ FPHIM GIFGUI'Q HETRPQ."
# --BEML PZERC

# To reverse the encryption:
# cat "$@" | tr "$key" "A-Z"


#  This simple-minded cipher can be broken by an average 12-year old
#+ using only pencil and paper.

exit 0

#  Exercise:
#  --------
#  Modify the script so that it will either encrypt or decrypt,
#+ depending on command-line argument(s).</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
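><P
>One possible shape for a two-way version is sketched below. The function name and the <TT CLASS="OPTION">-d</TT> flag are assumptions made for illustration; the key and both <B CLASS="COMMAND">tr</B> invocations are taken from the script above.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="90%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="PROGRAMLISTING"
>
```shell
#!/bin/bash
key=ETAOINSHRDLUBCFGJMQPVWZYXK     # Same scrambled alphabet as above.

# crypto_quote: hypothetical filter. Reads stdin, writes stdout.
# A first argument of "-d" decrypts; anything else encrypts.
crypto_quote () {
  if [ "$1" = "-d" ]
  then
    tr "a-z" "A-Z" | tr "$key" "A-Z"   # Map each key letter back to A-Z.
  else
    tr "a-z" "A-Z" | tr "A-Z" "$key"   # Map A-Z onto the key.
  fi
}

echo "Mark Twain" | crypto_quote
echo "Mark Twain" | crypto_quote | crypto_quote -d
```
</PRE
></FONT
></TD
></TR
></TABLE
><P
>Since the script folds everything to uppercase before encrypting, decryption recovers the text in uppercase only.</P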
|
|
><P
|
|
><A
|
|
NAME="JABH"
|
|
></A
|
|
>Of course, <I
|
|
CLASS="FIRSTTERM"
|
|
>tr</I
|
|
>
|
|
lends itself to <I
|
|
CLASS="FIRSTTERM"
|
|
>code
|
|
obfuscation</I
|
|
>.</P
|
|
><P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>#!/bin/bash
# jabh.sh

x="wftedskaebjgdBstbdbsmnjgz"
echo $x | tr "a-z" 'oh, turtleneck Phrase Jar!'

# Based on the Wikipedia "Just another Perl hacker" article.</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></P
|
|
><P
|
|
><A
|
|
NAME="TRVARIANTS"
|
|
></A
|
|
></P
|
|
><TABLE
|
|
CLASS="SIDEBAR"
|
|
BORDER="1"
|
|
CELLPADDING="5"
|
|
><TR
|
|
><TD
|
|
><DIV
|
|
CLASS="SIDEBAR"
|
|
><A
|
|
NAME="AEN11540"
|
|
></A
|
|
><P
|
|
><B
|
|
><I
|
|
CLASS="FIRSTTERM"
|
|
>tr</I
|
|
> variants</B
|
|
></P
|
|
><P
|
|
> The <B
|
|
CLASS="COMMAND"
|
|
>tr</B
|
|
> utility has two historic
|
|
variants. The BSD version does not use brackets
|
|
(<TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>tr a-z A-Z</B
|
|
></TT
|
|
>), but the SysV one does
|
|
(<TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>tr '[a-z]' '[A-Z]'</B
|
|
></TT
|
|
>). The GNU version
|
|
of <B
|
|
CLASS="COMMAND"
|
|
>tr</B
|
|
> resembles the BSD one.
|
|
</P
|
|
></DIV
|
|
></TD
|
|
></TR
|
|
></TABLE
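><P
>The three spellings can be compared side by side. This is a sketch with hypothetical function names. With GNU <B CLASS="COMMAND">tr</B> the bracketed form still works, because the brackets are ordinary characters that map to themselves; the POSIX character classes are a further alternative that does not depend on ASCII ranges.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="90%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="PROGRAMLISTING"
>
```shell
#!/bin/bash
# Hypothetical names; each filter uppercases stdin.
upper_range ()   { tr 'a-z' 'A-Z'; }              # BSD / GNU style.
upper_bracket () { tr '[a-z]' '[A-Z]'; }          # SysV style.
upper_class ()   { tr '[:lower:]' '[:upper:]'; }  # POSIX character classes.

# Always quote the bracketed forms, otherwise the shell
#+ may expand [a-z] as a filename pattern.
echo '[abc]' | upper_range
echo '[abc]' | upper_bracket
echo '[abc]' | upper_class
```
</PRE
></FONT
></TD
></TR
></TABLE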
|
|
></DD
|
|
><DT
|
|
><A
|
|
NAME="FOLDREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>fold</B
|
|
></DT
|
|
><DD
|
|
><P
|
|
>A filter that wraps lines of input to a specified width.
|
|
This is especially useful with the <TT
|
|
CLASS="OPTION"
|
|
>-s</TT
|
|
>
|
|
option, which breaks lines at word spaces (see <A
|
|
HREF="textproc.html#EX50"
|
|
>Example 16-26</A
|
|
> and <A
|
|
HREF="contributed-scripts.html#MAILFORMAT"
|
|
>Example A-1</A
|
|
>).</P
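><P
>The effect of <TT CLASS="OPTION">-s</TT> is easiest to see on a short string. A small sketch, with hypothetical function names and a sample string chosen only for illustration:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="90%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="PROGRAMLISTING"
>
```shell
#!/bin/bash
text='aaa bbb ccc'

hard_wrap () { fold -w "$1"; }      # Breaks at exactly $1 columns.
soft_wrap () { fold -s -w "$1"; }   # Breaks after a blank where possible.

echo "$text" | hard_wrap 6   # Splits "bbb" across two lines.
echo "$text" | soft_wrap 6   # Keeps every word intact.
```
</PRE
></FONT
></TD
></TR
></TABLE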
|
|
></DD
|
|
><DT
|
|
><A
|
|
NAME="FMTREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>fmt</B
|
|
></DT
|
|
><DD
|
|
><P
|
|
>Simple-minded file formatter, used as a filter in a
|
|
pipe to <SPAN
|
|
CLASS="QUOTE"
|
|
>"wrap"</SPAN
|
|
> long lines of text
|
|
output.</P
|
|
><DIV
|
|
CLASS="EXAMPLE"
|
|
><A
|
|
NAME="EX50"
|
|
></A
|
|
><P
|
|
><B
|
|
>Example 16-26. Formatted file listing.</B
|
|
></P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>#!/bin/bash

WIDTH=40                    # 40 columns wide.

b=`ls /usr/local/bin`       # Get a file listing...

echo $b | fmt -w $WIDTH

# Could also have been done by
#    echo $b | fold - -s -w $WIDTH

exit 0</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
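><P
>The two filters are not interchangeable in general: <B CLASS="COMMAND">fmt</B> refills a paragraph by joining short lines, whereas <B CLASS="COMMAND">fold</B> only breaks lines that are too long. The example above hides the difference because the unquoted <TT CLASS="USERINPUT"><B>echo $b</B></TT> has already joined the listing into one line. A sketch with made-up sample data:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="90%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="PROGRAMLISTING"
>
```shell
#!/bin/bash
# Three short lines (sample data for illustration only).
list='one
two
three'

echo "$list" | fmt -w 40        # Joins the lines into a single line.
echo "$list" | fold -s -w 40    # Leaves the three lines as they are.
```
</PRE
></FONT
></TD
></TR
></TABLE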
|
|
><P
|
|
>See also <A
|
|
HREF="moreadv.html#EX41"
|
|
>Example 16-5</A
|
|
>.</P
|
|
><DIV
|
|
CLASS="TIP"
|
|
><P
|
|
></P
|
|
><TABLE
|
|
CLASS="TIP"
|
|
WIDTH="90%"
|
|
BORDER="0"
|
|
><TR
|
|
><TD
|
|
WIDTH="25"
|
|
ALIGN="CENTER"
|
|
VALIGN="TOP"
|
|
><IMG
|
|
SRC="../images/tip.gif"
|
|
HSPACE="5"
|
|
ALT="Tip"></TD
|
|
><TD
|
|
ALIGN="LEFT"
|
|
VALIGN="TOP"
|
|
><P
|
|
>A powerful alternative to <B
|
|
CLASS="COMMAND"
|
|
>fmt</B
|
|
> is
|
|
Kamil Toman's <B
|
|
CLASS="COMMAND"
|
|
>par</B
|
|
>
|
|
utility, available from <A
|
|
HREF="http://www.cs.berkeley.edu/~amc/Par/"
|
|
TARGET="_top"
|
|
>http://www.cs.berkeley.edu/~amc/Par/</A
|
|
>.
|
|
</P
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
|
|
></DD
|
|
><DT
|
|
><A
|
|
NAME="COLREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>col</B
|
|
></DT
|
|
><DD
|
|
><P
|
|
>This deceptively named filter removes reverse line feeds
|
|
from an input stream. It also attempts to replace
|
|
whitespace with equivalent tabs. The chief use of
|
|
<B
|
|
CLASS="COMMAND"
|
|
>col</B
|
|
> is in filtering the output
|
|
from certain text processing utilities, such as
|
|
<B
|
|
CLASS="COMMAND"
|
|
>groff</B
|
|
> and <B
|
|
CLASS="COMMAND"
|
|
>tbl</B
|
|
>.</P
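><P
>A common related use is stripping the backspace overstrikes that <I CLASS="FIRSTTERM">nroff</I>-style formatters use for bold and underlined text. The sketch below assumes that <B CLASS="COMMAND">col</B> is installed and uses its <TT CLASS="OPTION">-b</TT> option, which discards backspaces and keeps only the last character written to each column. The function name is hypothetical.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="90%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="PROGRAMLISTING"
>
```shell
#!/bin/bash
# strip_overstrike: hypothetical filter name. Reads stdin, writes stdout.
strip_overstrike () {
  col -b
}

# Typical use (not run here):
#   groff -Tascii -man page.1 | strip_overstrike > page.txt
```
</PRE
></FONT
></TD
></TR
></TABLE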
|
|
></DD
|
|
><DT
|
|
><A
|
|
NAME="COLUMNREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>column</B
|
|
></DT
|
|
><DD
|
|
><P
|
|
>Column formatter. This filter transforms list-type
|
|
text output into a <SPAN
|
|
CLASS="QUOTE"
|
|
>"pretty-printed"</SPAN
|
|
> table
|
|
by inserting tabs at appropriate places.</P
|
|
><DIV
|
|
CLASS="EXAMPLE"
|
|
><A
|
|
NAME="COL"
|
|
></A
|
|
><P
|
|
><B
|
|
>Example 16-27. Using <I
|
|
CLASS="FIRSTTERM"
|
|
>column</I
|
|
> to format a directory
|
|
listing</B
|
|
></P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>#!/bin/bash
# colms.sh
# A minor modification of the example file in the "column" man page.


(printf "PERMISSIONS LINKS OWNER GROUP SIZE MONTH DAY HH:MM PROG-NAME\n" \
; ls -l | sed 1d) | column -t
#         ^^^^^^           ^^

#  The "sed 1d" in the pipe deletes the first line of output,
#+ which would be "total        N",
#+ where "N" is the total number of disk blocks
#+ used by the files that "ls -l" lists.

# The -t option to "column" pretty-prints a table.

exit 0</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
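><P
>Input that uses a delimiter other than whitespace can be aligned by naming the delimiter with <TT CLASS="OPTION">-s</TT>. A sketch, assuming <B CLASS="COMMAND">column</B> is installed; the sample records and the function name are invented for illustration.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="90%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="PROGRAMLISTING"
>
```shell
#!/bin/bash
# Colon-separated sample records (made up).
records='root:0:/root
daemon:1:/usr/sbin'

# align_fields: hypothetical helper. $1 is the input field delimiter.
align_fields () {
  column -t -s "$1"
}

# Typical use:
#   echo "$records" | align_fields ':'
```
</PRE
></FONT
></TD
></TR
></TABLE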
|
|
></DD
|
|
><DT
|
|
><A
|
|
NAME="COLRMREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>colrm</B
|
|
></DT
|
|
><DD
|
|
><P
|
|
>Column removal filter. This removes columns (characters)
|
|
from a file and writes the file, lacking the range of
|
|
specified columns, back to <TT
|
|
CLASS="FILENAME"
|
|
>stdout</TT
|
|
>.
|
|
<TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>colrm 2 4 <filename</B
|
|
></TT
|
|
> removes the
|
|
second through fourth characters from each line of the
|
|
text file <TT
|
|
CLASS="FILENAME"
|
|
>filename</TT
|
|
>.</P
|
|
><DIV
|
|
CLASS="CAUTION"
|
|
><P
|
|
></P
|
|
><TABLE
|
|
CLASS="CAUTION"
|
|
WIDTH="90%"
|
|
BORDER="0"
|
|
><TR
|
|
><TD
|
|
WIDTH="25"
|
|
ALIGN="CENTER"
|
|
VALIGN="TOP"
|
|
><IMG
|
|
SRC="../images/caution.gif"
|
|
HSPACE="5"
|
|
ALT="Caution"></TD
|
|
><TD
|
|
ALIGN="LEFT"
|
|
VALIGN="TOP"
|
|
><P
|
|
>If the file contains tabs or nonprintable
|
|
characters, this may cause unpredictable
|
|
behavior. In such cases, consider using
|
|
<A
|
|
HREF="textproc.html#EXPANDREF"
|
|
>expand</A
|
|
> and
|
|
<B
|
|
CLASS="COMMAND"
|
|
>unexpand</B
|
|
> in a pipe preceding
|
|
<B
|
|
CLASS="COMMAND"
|
|
>colrm</B
|
|
>.</P
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
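><P
>A sketch of the pipe suggested in the caution, next to an equivalent built on <B CLASS="COMMAND">cut</B> for systems that lack <B CLASS="COMMAND">colrm</B>. Both function names are hypothetical, and the column numbers are the ones used in the text above.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="90%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="PROGRAMLISTING"
>
```shell
#!/bin/bash
# Remove character columns 2 through 4 from each line of stdin.
# "expand" first converts tabs to spaces, so that columns are predictable.
remove_cols_colrm () {
  expand | colrm 2 4
}

# Same result with cut: keep column 1 and everything from column 5 on.
remove_cols_cut () {
  expand | cut -c1,5-
}

echo "abcdefgh" | remove_cols_cut
```
</PRE
></FONT
></TD
></TR
></TABLE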
|
|
></DD
|
|
><DT
|
|
><A
|
|
NAME="NLREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>nl</B
|
|
></DT
|
|
><DD
|
|
><P
|
|
>Line numbering filter: <TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>nl filename</B
|
|
></TT
|
|
>
|
|
lists <TT
|
|
CLASS="FILENAME"
|
|
>filename</TT
|
|
> to
|
|
<TT
|
|
CLASS="FILENAME"
|
|
>stdout</TT
|
|
>, but inserts consecutive
|
|
numbers at the beginning of each non-blank line. If
<TT
CLASS="FILENAME"
>filename</TT
> is omitted, it operates on
<TT
CLASS="FILENAME"
>stdin</TT
>.</P
|
|
><P
|
|
>The output of <B
|
|
CLASS="COMMAND"
|
|
>nl</B
|
|
> is very similar to
|
|
<TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>cat -b</B
|
|
></TT
|
|
>, since, by default,
<B
CLASS="COMMAND"
>nl</B
> does not number blank lines.</P
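><P
>The four variants can be compared on a three-line sample whose middle line is blank. This is a small sketch; the sample text and the counting helper are assumptions added for illustration.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="90%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="PROGRAMLISTING"
>
```shell
#!/bin/bash
sample='first

third'

# count_numbered: hypothetical helper.
# Counts how many lines of stdin begin with a line number.
count_numbered () { grep -c '^ *[0-9]'; }

echo "$sample" | nl         # Blank line is printed, but not numbered.
echo "$sample" | nl -b a    # "-b a" numbers all lines, blank ones too.
echo "$sample" | cat -b     # Numbers like plain "nl".
echo "$sample" | cat -n     # Numbers like "nl -b a".
```
</PRE
></FONT
></TD
></TR
></TABLE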
|
|
><DIV
|
|
CLASS="EXAMPLE"
|
|
><A
|
|
NAME="LNUM"
|
|
></A
|
|
><P
|
|
><B
|
|
>Example 16-28. <I
|
|
CLASS="FIRSTTERM"
|
|
>nl</I
|
|
>: A self-numbering script.</B
|
|
></P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>#!/bin/bash
# line-number.sh

# This script echoes itself twice to stdout with its lines numbered.

echo "     line number = $LINENO" # 'nl' sees this as line 4
#                                   (nl does not number blank lines).
#                                   'cat -n' sees it correctly as line #6.

nl `basename $0`

echo; echo  # Now, let's try it with 'cat -n'

cat -n `basename $0`
# The difference is that 'cat -n' numbers the blank lines.
# Note that 'nl -ba' will also do so.

exit 0
# -----------------------------------------------------------------</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
|
|
></DD
|
|
><DT
|
|
><A
|
|
NAME="PRREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>pr</B
|
|
></DT
|
|
><DD
|
|
><P
|
|
>Print formatting filter. This will paginate files
|
|
(or <TT
|
|
CLASS="FILENAME"
|
|
>stdin</TT
|
|
>) into sections suitable for
|
|
hard copy printing or viewing on screen. Various options
|
|
permit row and column manipulation, joining lines, setting
|
|
margins, numbering lines, adding page headers, and merging
|
|
files, among other things. The <B
|
|
CLASS="COMMAND"
|
|
>pr</B
|
|
>
|
|
command combines much of the functionality of
|
|
<B
|
|
CLASS="COMMAND"
|
|
>nl</B
|
|
>, <B
|
|
CLASS="COMMAND"
|
|
>paste</B
|
|
>,
|
|
<B
|
|
CLASS="COMMAND"
|
|
>fold</B
|
|
>, <B
|
|
CLASS="COMMAND"
|
|
>column</B
|
|
>, and
|
|
<B
|
|
CLASS="COMMAND"
|
|
>expand</B
|
|
>.</P
|
|
><P
|
|
><TT
|
|
CLASS="USERINPUT"
|
|
><B
|
|
>pr -o 5 --width=65 fileZZZ | more</B
|
|
></TT
|
|
>
|
|
gives a nice paginated listing to screen of
|
|
<TT
|
|
CLASS="FILENAME"
|
|
>fileZZZ</TT
|
|
> with the left margin offset by 5 columns
and the page width set to 65.</P
|
|
><P
|
|
>A particularly useful option is <TT
|
|
CLASS="OPTION"
|
|
>-d</TT
|
|
>,
|
|
forcing double-spacing (same effect as <B
|
|
CLASS="COMMAND"
|
|
>sed G</B
|
|
>).</P
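><P
>A short sketch of the options mentioned above. It adds <TT CLASS="OPTION">-t</TT>, which suppresses the page header and trailer so that only the text itself is shown; the function names are hypothetical.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="90%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="PROGRAMLISTING"
>
```shell
#!/bin/bash
# Hypothetical filter names; each reads stdin and writes stdout.
double_space_pr ()  { pr -t -d; }     # Double-space, no header/trailer.
double_space_sed () { sed G; }        # Append an empty line to each line.
indent_five ()      { pr -t -o 5; }   # Offset each line by 5 columns.

printf 'alpha\nbeta\n' | double_space_pr
printf 'alpha\nbeta\n' | indent_five
```
</PRE
></FONT
></TD
></TR
></TABLE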
|
|
></DD
|
|
><DT
|
|
><A
|
|
NAME="GETTEXTREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>gettext</B
|
|
></DT
|
|
><DD
|
|
><P
|
|
>The GNU <B
|
|
CLASS="COMMAND"
|
|
>gettext</B
|
|
> package is a set of
|
|
utilities for <A
|
|
HREF="localization.html"
|
|
>localizing</A
|
|
>
|
|
and translating the text output of programs into foreign
|
|
languages. While originally intended for C programs, it
|
|
now supports quite a number of programming and scripting
|
|
languages.</P
|
|
><P
|
|
>The <B
|
|
CLASS="COMMAND"
|
|
>gettext</B
|
|
>
|
|
<EM
|
|
>program</EM
|
|
> works on shell scripts. See
|
|
the <TT
|
|
CLASS="REPLACEABLE"
|
|
><I
|
|
>info page</I
|
|
></TT
|
|
>.</P
|
|
></DD
|
|
><DT
|
|
><A
|
|
NAME="MSGFMTREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>msgfmt</B
|
|
></DT
|
|
><DD
|
|
><P
|
|
>A program for generating binary
|
|
message catalogs. It is used for <A
|
|
HREF="localization.html"
|
|
>localization</A
|
|
>.</P
|
|
></DD
|
|
><DT
|
|
><A
|
|
NAME="ICONVREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>iconv</B
|
|
></DT
|
|
><DD
|
|
><P
|
|
>A utility for converting file(s) to a different encoding
|
|
(character set). Its chief use is for <A
|
|
HREF="localization.html"
|
|
>localization</A
|
|
>.</P
|
|
><P
|
|
> <TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
># Convert a string from UTF-8 to UTF-16 and append it to the BookList.
function write_utf8_string {
    STRING=$1
    BOOKLIST=$2
    echo -n "$STRING" | iconv -f UTF8 -t UTF16 | \
    cut -b 3- | tr -d \\n >> "$BOOKLIST"
    #  "cut -b 3-" drops the first two bytes of the output
    #+ (normally the byte-order mark written by iconv).
}

#  From Peter Knowles' "booklistgen.sh" script
#+ for converting files to Sony Librie/PRS-50X format.
#  (http://booklistgensh.peterknowles.com)</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
>
|
|
</P
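><P
>When the byte-order mark is unwanted, an alternative to trimming it afterwards is to request an encoding with explicit byte order. The sketch below assumes that the local <B CLASS="COMMAND">iconv</B> knows the encoding name <TT CLASS="USERINPUT"><B>UTF-16LE</B></TT> (run <TT CLASS="USERINPUT"><B>iconv -l</B></TT> to check); the function names are hypothetical.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="90%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="PROGRAMLISTING"
>
```shell
#!/bin/bash
# Hypothetical filter names; each reads stdin and writes stdout.
utf8_to_utf16le () { iconv -f UTF-8 -t UTF-16LE; }
utf16le_to_utf8 () { iconv -f UTF-16LE -t UTF-8; }

# Three ASCII characters become six bytes, two per character.
printf 'abc' | utf8_to_utf16le | wc -c

# A round trip returns the original bytes ("h", e-acute, "llo").
printf 'h\303\251llo\n' | utf8_to_utf16le | utf16le_to_utf8
```
</PRE
></FONT
></TD
></TR
></TABLE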
|
|
></DD
|
|
><DT
|
|
><A
|
|
NAME="RECODEREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>recode</B
|
|
></DT
|
|
><DD
|
|
><P
|
|
>Consider this a fancier version of
|
|
<B
|
|
CLASS="COMMAND"
|
|
>iconv</B
|
|
>, above. It is a very versatile utility
for converting a file to a different encoding scheme.
|
|
Note that <I
|
|
CLASS="FIRSTTERM"
|
|
>recode</I
|
|
> is not part of the
|
|
standard Linux installation.</P
|
|
></DD
|
|
><DT
|
|
><A
|
|
NAME="TEXREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>TeX</B
|
|
>, <A
|
|
NAME="GSREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>gs</B
|
|
></DT
|
|
><DD
|
|
><P
|
|
><B
|
|
CLASS="COMMAND"
|
|
>TeX</B
|
|
> and <B
|
|
CLASS="COMMAND"
|
|
>PostScript</B
|
|
>
|
|
are text markup languages used for preparing copy for
|
|
printing or formatted video display.</P
|
|
><P
|
|
><B
|
|
CLASS="COMMAND"
|
|
>TeX</B
|
|
> is Donald Knuth's elaborate
|
|
typesetting system. It is often convenient to write a
|
|
shell script encapsulating all the options and arguments
|
|
passed to one of these markup languages.</P
|
|
><P
|
|
><I
|
|
CLASS="FIRSTTERM"
|
|
>Ghostscript</I
|
|
>
|
|
(<B
|
|
CLASS="COMMAND"
|
|
>gs</B
|
|
>) is a GPL-ed PostScript
|
|
interpreter.</P
|
|
></DD
|
|
><DT
|
|
><A
|
|
NAME="TEXEXECREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>texexec</B
|
|
></DT
|
|
><DD
|
|
><P
|
|
>Utility for processing <I
|
|
CLASS="FIRSTTERM"
|
|
>TeX</I
|
|
> and
|
|
<I
|
|
CLASS="FIRSTTERM"
|
|
>pdf</I
|
|
> files. Found in
|
|
<TT
|
|
CLASS="FILENAME"
|
|
>/usr/bin</TT
|
|
>
|
|
on many Linux distros, it is actually a <A
|
|
HREF="wrapper.html#SHWRAPPER"
|
|
>shell wrapper</A
|
|
> that
|
|
calls <A
|
|
HREF="wrapper.html#PERLREF"
|
|
>Perl</A
|
|
> to invoke
|
|
<I
|
|
CLASS="FIRSTTERM"
|
|
>TeX</I
|
|
>.</P
|
|
><P
|
|
> <TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>texexec --pdfarrange --result=Concatenated.pdf *pdf

#  Concatenates all the pdf files in the current working directory
#+ into the merged file, Concatenated.pdf . . .
#  (The --pdfarrange option repaginates a pdf file. See also --pdfcombine.)
#  The above command-line could be parameterized and put into a shell script.</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
>
|
|
</P
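><P
>One way such a parameterized wrapper might look is sketched here. The function name <TT CLASS="USERINPUT"><B>pdfcat</B></TT> and the <TT CLASS="USERINPUT"><B>TEXEXEC</B></TT> override variable are assumptions made for illustration; only the two <B CLASS="COMMAND">texexec</B> options shown above are used.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="90%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="PROGRAMLISTING"
>
```shell
#!/bin/bash
E_WRONGARGS=85

# pdfcat: hypothetical wrapper.
# Usage: pdfcat result.pdf input1.pdf [input2.pdf ...]
pdfcat () {
  if [ "$#" -lt 2 ]
  then
    echo "Usage: pdfcat result.pdf input1.pdf [input2.pdf ...]" >&2
    return $E_WRONGARGS
  fi

  local result=$1
  shift                       # Remaining arguments are the input files.

  #  TEXEXEC may name a substitute command, for example "echo"
  #+ to preview the command line without running texexec.
  "${TEXEXEC:-texexec}" --pdfarrange --result="$result" "$@"
}
```
</PRE
></FONT
></TD
></TR
></TABLE
><P
>Running <TT CLASS="USERINPUT"><B>TEXEXEC=echo pdfcat out.pdf a.pdf b.pdf</B></TT> prints the command line that would be executed, which is a convenient dry run.</P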
|
|
></DD
|
|
><DT
|
|
><A
|
|
NAME="ENSCRIPTREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>enscript</B
|
|
></DT
|
|
><DD
|
|
><P
|
|
>Utility for converting a plain text file to PostScript.</P
|
|
><P
|
|
>For example, <B
|
|
CLASS="COMMAND"
|
|
>enscript filename.txt -p filename.ps</B
|
|
>
|
|
produces the PostScript output file
|
|
<TT
|
|
CLASS="FILENAME"
|
|
>filename.ps</TT
|
|
>.</P
|
|
></DD
|
|
><DT
|
|
><A
|
|
NAME="GROFFREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>groff</B
|
|
>, <A
|
|
NAME="TBLREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>tbl</B
|
|
>, <A
|
|
NAME="EQNREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>eqn</B
|
|
></DT
|
|
><DD
|
|
><P
|
|
>Yet another text markup and display formatting language
|
|
is <B
|
|
CLASS="COMMAND"
|
|
>groff</B
|
|
>. This is the enhanced GNU version
|
|
of the venerable UNIX <B
|
|
CLASS="COMMAND"
|
|
>roff/troff</B
|
|
> display
|
|
and typesetting package. <A
|
|
HREF="basic.html#MANREF"
|
|
>Manpages</A
|
|
>
|
|
use <B
|
|
CLASS="COMMAND"
|
|
>groff</B
|
|
>.</P
|
|
><P
|
|
>The <B
|
|
CLASS="COMMAND"
|
|
>tbl</B
|
|
> table processing utility
|
|
is considered part of <B
|
|
CLASS="COMMAND"
|
|
>groff</B
|
|
>, as its
|
|
function is to convert table markup into
|
|
<B
|
|
CLASS="COMMAND"
|
|
>groff</B
|
|
> commands.</P
|
|
><P
|
|
>The <B
|
|
CLASS="COMMAND"
|
|
>eqn</B
|
|
> equation processing utility
|
|
is likewise part of <B
|
|
CLASS="COMMAND"
|
|
>groff</B
|
|
>, and
|
|
its function is to convert equation markup into
|
|
<B
|
|
CLASS="COMMAND"
|
|
>groff</B
|
|
> commands.</P
|
|
><DIV
|
|
CLASS="EXAMPLE"
|
|
><A
|
|
NAME="MANVIEW"
|
|
></A
|
|
><P
|
|
><B
|
|
>Example 16-29. <I
|
|
CLASS="FIRSTTERM"
|
|
>manview</I
|
|
>: Viewing formatted manpages</B
|
|
></P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="90%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>#!/bin/bash
# manview.sh: Formats the source of a man page for viewing.

#  This script is useful when writing man page source.
#  It lets you look at the intermediate results on the fly
#+ while working on it.

E_WRONGARGS=85

if [ -z "$1" ]
then
  echo "Usage: `basename "$0"` filename"
  exit $E_WRONGARGS
fi

# ---------------------------
groff -Tascii -man "$1" | less
# From the man page for groff.
# ---------------------------

#  If the man page includes tables and/or equations,
#+ then the above code will barf.
#  The following line can handle such cases.
#
#   gtbl < "$1" | geqn -Tlatin1 | groff -Tlatin1 -mtty-char -man
#
#   Thanks, S.C.

exit $?   # See also the "maned.sh" script.</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
|
|
><P
|
|
>See also <A
|
|
HREF="contributed-scripts.html#MANED"
|
|
>Example A-39</A
|
|
>.</P
|
|
></DD
|
|
><DT
|
|
><A
|
|
NAME="LEXREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>lex</B
|
|
>, <A
|
|
NAME="YACCREF"
|
|
></A
|
|
><B
|
|
CLASS="COMMAND"
|
|
>yacc</B
|
|
></DT
|
|
><DD
|
|
><P
|
|
><A
|
|
NAME="FLEXREF"
|
|
></A
|
|
></P
|
|
><P
|
|
>The <B
|
|
CLASS="COMMAND"
|
|
>lex</B
|
|
> lexical analyzer produces
|
|
programs for pattern matching. This has been replaced
|
|
by the nonproprietary <B
|
|
CLASS="COMMAND"
|
|
>flex</B
|
|
> on Linux
|
|
systems.</P
|
|
><P
|
|
><A
|
|
NAME="BISONREF"
|
|
></A
|
|
></P
|
|
><P
|
|
>The <B
|
|
CLASS="COMMAND"
|
|
>yacc</B
|
|
> utility creates a
|
|
parser based on a set of specifications. This has been
|
|
replaced by the nonproprietary <B
|
|
CLASS="COMMAND"
|
|
>bison</B
|
|
>
|
|
on Linux systems.</P
|
|
></DD
|
|
></DL
|
|
></DIV
|
|
></DIV
|
|
><H3
|
|
CLASS="FOOTNOTES"
|
|
>Notes</H3
|
|
><TABLE
|
|
BORDER="0"
|
|
CLASS="FOOTNOTES"
|
|
WIDTH="100%"
|
|
><TR
|
|
><TD
|
|
ALIGN="LEFT"
|
|
VALIGN="TOP"
|
|
WIDTH="5%"
|
|
><A
|
|
NAME="FTN.AEN11502"
|
|
HREF="textproc.html#AEN11502"
|
|
><SPAN
|
|
CLASS="footnote"
|
|
>[1]</SPAN
|
|
></A
|
|
></TD
|
|
><TD
|
|
ALIGN="LEFT"
|
|
VALIGN="TOP"
|
|
WIDTH="95%"
|
|
><P
|
|
>This is only true of the GNU version of
|
|
<B
|
|
CLASS="COMMAND"
|
|
>tr</B
|
|
>, not the generic version often found on
|
|
commercial UNIX systems.</P
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
><DIV
|
|
CLASS="NAVFOOTER"
|
|
><HR
|
|
ALIGN="LEFT"
|
|
WIDTH="100%"><TABLE
|
|
SUMMARY="Footer navigation table"
|
|
WIDTH="100%"
|
|
BORDER="0"
|
|
CELLPADDING="0"
|
|
CELLSPACING="0"
|
|
><TR
|
|
><TD
|
|
WIDTH="33%"
|
|
ALIGN="left"
|
|
VALIGN="top"
|
|
><A
|
|
HREF="timedate.html"
|
|
ACCESSKEY="P"
|
|
>Prev</A
|
|
></TD
|
|
><TD
|
|
WIDTH="34%"
|
|
ALIGN="center"
|
|
VALIGN="top"
|
|
><A
|
|
HREF="index.html"
|
|
ACCESSKEY="H"
|
|
>Home</A
|
|
></TD
|
|
><TD
|
|
WIDTH="33%"
|
|
ALIGN="right"
|
|
VALIGN="top"
|
|
><A
|
|
HREF="filearchiv.html"
|
|
ACCESSKEY="N"
|
|
>Next</A
|
|
></TD
|
|
></TR
|
|
><TR
|
|
><TD
|
|
WIDTH="33%"
|
|
ALIGN="left"
|
|
VALIGN="top"
|
|
>Time / Date Commands</TD
|
|
><TD
|
|
WIDTH="34%"
|
|
ALIGN="center"
|
|
VALIGN="top"
|
|
><A
|
|
HREF="external.html"
|
|
ACCESSKEY="U"
|
|
>Up</A
|
|
></TD
|
|
><TD
|
|
WIDTH="33%"
|
|
ALIGN="right"
|
|
VALIGN="top"
|
|
>File and Archiving Commands</TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
|
|
></BODY
|
|
></HTML
|
|
>