588 lines
9.0 KiB
HTML
588 lines
9.0 KiB
HTML
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
|
|
<HTML
|
|
><HEAD
|
|
><TITLE
|
|
>Awk</TITLE
|
|
><META
|
|
NAME="GENERATOR"
|
|
CONTENT="Modular DocBook HTML Stylesheet Version 1.7"><LINK
|
|
REL="HOME"
|
|
TITLE="Advanced Bash-Scripting Guide"
|
|
HREF="index.html"><LINK
|
|
REL="UP"
|
|
TITLE="A Sed and Awk Micro-Primer"
|
|
HREF="sedawk.html"><LINK
|
|
REL="PREVIOUS"
|
|
TITLE="Sed"
|
|
HREF="x23170.html"><LINK
|
|
REL="NEXT"
|
|
TITLE="Parsing and Managing Pathnames"
|
|
HREF="pathmanagement.html"></HEAD
|
|
><BODY
|
|
CLASS="SECT1"
|
|
BGCOLOR="#FFFFFF"
|
|
TEXT="#000000"
|
|
LINK="#0000FF"
|
|
VLINK="#840084"
|
|
ALINK="#0000FF"
|
|
><DIV
|
|
CLASS="NAVHEADER"
|
|
><TABLE
|
|
SUMMARY="Header navigation table"
|
|
WIDTH="100%"
|
|
BORDER="0"
|
|
CELLPADDING="0"
|
|
CELLSPACING="0"
|
|
><TR
|
|
><TH
|
|
COLSPAN="3"
|
|
ALIGN="center"
|
|
>Advanced Bash-Scripting Guide: </TH
|
|
></TR
|
|
><TR
|
|
><TD
|
|
WIDTH="10%"
|
|
ALIGN="left"
|
|
VALIGN="bottom"
|
|
><A
|
|
HREF="x23170.html"
|
|
ACCESSKEY="P"
|
|
>Prev</A
|
|
></TD
|
|
><TD
|
|
WIDTH="80%"
|
|
ALIGN="center"
|
|
VALIGN="bottom"
|
|
>Appendix C. A Sed and Awk Micro-Primer</TD
|
|
><TD
|
|
WIDTH="10%"
|
|
ALIGN="right"
|
|
VALIGN="bottom"
|
|
><A
|
|
HREF="pathmanagement.html"
|
|
ACCESSKEY="N"
|
|
>Next</A
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
><HR
|
|
ALIGN="LEFT"
|
|
WIDTH="100%"></DIV
|
|
><DIV
|
|
CLASS="SECT1"
|
|
><H1
|
|
CLASS="SECT1"
|
|
><A
|
|
NAME="AWK"
|
|
></A
|
|
>C.2. Awk</H1
|
|
><P
|
|
><A
|
|
NAME="AWKREF"
|
|
></A
|
|
></P
|
|
><P
|
|
><I
|
|
CLASS="FIRSTTERM"
|
|
>Awk</I
|
|
>
|
|
<A
|
|
NAME="AEN23443"
|
|
HREF="#FTN.AEN23443"
|
|
><SPAN
|
|
CLASS="footnote"
|
|
>[1]</SPAN
|
|
></A
|
|
>
|
|
is a full-featured text processing language with a syntax
|
|
reminiscent of <I
|
|
CLASS="FIRSTTERM"
|
|
>C</I
|
|
>. While it possesses an
|
|
extensive set of operators and capabilities, we will cover only
|
|
a few of these here - the ones most useful in shell scripts.</P
|
|
><P
|
|
>Awk breaks each line of input passed to it into
|
|
<A
|
|
NAME="FIELDREF2"
|
|
></A
|
|
>
|
|
<A
|
|
HREF="special-chars.html#FIELDREF"
|
|
>fields</A
|
|
>. By default, a field
|
|
is a string of consecutive characters delimited by <A
|
|
HREF="special-chars.html#WHITESPACEREF"
|
|
>whitespace</A
|
|
>, though there are options
|
|
for changing this. Awk parses and operates on each separate
|
|
field. This makes it ideal for handling structured text files
|
|
-- especially tables -- data organized into consistent chunks,
|
|
such as rows and columns.</P
|
|
><P
|
|
><A
|
|
HREF="varsubn.html#SNGLQUO"
|
|
>Strong quoting</A
|
|
> and <A
|
|
HREF="special-chars.html#CODEBLOCKREF"
|
|
>curly brackets</A
|
|
> enclose blocks of
|
|
awk code within a shell script.</P
|
|
><P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="100%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
># $1 is field #1, $2 is field #2, etc.
|
|
|
|
echo one two | awk '{print $1}'
|
|
# one
|
|
|
|
echo one two | awk '{print $2}'
|
|
# two
|
|
|
|
# But what is field #0 ($0)?
|
|
echo one two | awk '{print $0}'
|
|
# one two
|
|
# All the fields!
|
|
|
|
|
|
awk '{print $3}' $filename
|
|
# Prints field #3 of file $filename to stdout.
|
|
|
|
awk '{print $1 $5 $6}' $filename
|
|
# Prints fields #1, #5, and #6 of file $filename.
|
|
|
|
awk '{print $0}' $filename
|
|
# Prints the entire file!
|
|
# Same effect as: cat $filename . . . or . . . sed '' $filename</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></P
|
|
><P
|
|
>We have just seen the awk <I
|
|
CLASS="FIRSTTERM"
|
|
>print</I
|
|
> command
|
|
in action. The only other feature of awk we need to deal with
|
|
here is variables. Awk handles variables similarly to shell
|
|
scripts, though a bit more flexibly.</P
|
|
><P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="100%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>{ total += ${column_number} }</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
>
|
|
This adds the value of <TT
|
|
CLASS="PARAMETER"
|
|
><I
|
|
>column_number</I
|
|
></TT
|
|
> to
|
|
the running total of <TT
|
|
CLASS="PARAMETER"
|
|
><I
|
|
>total</I
|
|
></TT
|
|
>>. Finally, to print
|
|
<SPAN
|
|
CLASS="QUOTE"
|
|
>"total"</SPAN
|
|
>, there is an <B
|
|
CLASS="COMMAND"
|
|
>END</B
|
|
> command
|
|
block, executed after the script has processed all its input.
|
|
<TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="100%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>END { print total }</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></P
|
|
><P
|
|
>Corresponding to the <B
|
|
CLASS="COMMAND"
|
|
>END</B
|
|
>, there is a
|
|
<B
|
|
CLASS="COMMAND"
|
|
>BEGIN</B
|
|
>, for a code block to be performed before awk
|
|
starts processing its input.</P
|
|
><P
|
|
>The following example illustrates how <B
|
|
CLASS="COMMAND"
|
|
>awk</B
|
|
> can
|
|
add text-parsing tools to a shell script.</P
|
|
><DIV
|
|
CLASS="EXAMPLE"
|
|
><A
|
|
NAME="LETTERCOUNT2"
|
|
></A
|
|
><P
|
|
><B
|
|
>Example C-1. Counting Letter Occurrences</B
|
|
></P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="100%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="PROGRAMLISTING"
|
|
>#! /bin/sh
|
|
# letter-count2.sh: Counting letter occurrences in a text file.
|
|
#
|
|
# Script by nyal [nyal@voila.fr].
|
|
# Used in ABS Guide with permission.
|
|
# Recommented and reformatted by ABS Guide author.
|
|
# Version 1.1: Modified to work with gawk 3.1.3.
|
|
# (Will still work with earlier versions.)
|
|
|
|
|
|
INIT_TAB_AWK=""
|
|
# Parameter to initialize awk script.
|
|
count_case=0
|
|
FILE_PARSE=$1
|
|
|
|
E_PARAMERR=85
|
|
|
|
usage()
|
|
{
|
|
echo "Usage: letter-count.sh file letters" 2>&1
|
|
# For example: ./letter-count2.sh filename.txt a b c
|
|
exit $E_PARAMERR # Too few arguments passed to script.
|
|
}
|
|
|
|
if [ ! -f "$1" ] ; then
|
|
echo "$1: No such file." 2>&1
|
|
usage # Print usage message and exit.
|
|
fi
|
|
|
|
if [ -z "$2" ] ; then
|
|
echo "$2: No letters specified." 2>&1
|
|
usage
|
|
fi
|
|
|
|
shift # Letters specified.
|
|
for letter in `echo $@` # For each one . . .
|
|
do
|
|
INIT_TAB_AWK="$INIT_TAB_AWK tab_search[${count_case}] = \
|
|
\"$letter\"; final_tab[${count_case}] = 0; "
|
|
# Pass as parameter to awk script below.
|
|
count_case=`expr $count_case + 1`
|
|
done
|
|
|
|
# DEBUG:
|
|
# echo $INIT_TAB_AWK;
|
|
|
|
cat $FILE_PARSE |
|
|
# Pipe the target file to the following awk script.
|
|
|
|
# ---------------------------------------------------------------------
|
|
# Earlier version of script:
|
|
# awk -v tab_search=0 -v final_tab=0 -v tab=0 -v \
|
|
# nb_letter=0 -v chara=0 -v chara2=0 \
|
|
|
|
awk \
|
|
"BEGIN { $INIT_TAB_AWK } \
|
|
{ split(\$0, tab, \"\"); \
|
|
for (chara in tab) \
|
|
{ for (chara2 in tab_search) \
|
|
{ if (tab_search[chara2] == tab[chara]) { final_tab[chara2]++ } } } } \
|
|
END { for (chara in final_tab) \
|
|
{ print tab_search[chara] \" => \" final_tab[chara] } }"
|
|
# ---------------------------------------------------------------------
|
|
# Nothing all that complicated, just . . .
|
|
#+ for-loops, if-tests, and a couple of specialized functions.
|
|
|
|
exit $?
|
|
|
|
# Compare this script to letter-count.sh.</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
|
|
><P
|
|
>For simpler examples of awk within shell scripts, see:
|
|
<P
|
|
></P
|
|
><OL
|
|
TYPE="1"
|
|
><LI
|
|
><P
|
|
><A
|
|
HREF="internal.html#EX44"
|
|
>Example 15-14</A
|
|
></P
|
|
></LI
|
|
><LI
|
|
><P
|
|
><A
|
|
HREF="redircb.html#REDIR4"
|
|
>Example 20-8</A
|
|
></P
|
|
></LI
|
|
><LI
|
|
><P
|
|
><A
|
|
HREF="filearchiv.html#STRIPC"
|
|
>Example 16-32</A
|
|
></P
|
|
></LI
|
|
><LI
|
|
><P
|
|
><A
|
|
HREF="wrapper.html#COLTOTALER"
|
|
>Example 36-5</A
|
|
></P
|
|
></LI
|
|
><LI
|
|
><P
|
|
><A
|
|
HREF="ivr.html#COLTOTALER2"
|
|
>Example 28-2</A
|
|
></P
|
|
></LI
|
|
><LI
|
|
><P
|
|
><A
|
|
HREF="internal.html#COLTOTALER3"
|
|
>Example 15-20</A
|
|
></P
|
|
></LI
|
|
><LI
|
|
><P
|
|
><A
|
|
HREF="procref1.html#PIDID"
|
|
>Example 29-3</A
|
|
></P
|
|
></LI
|
|
><LI
|
|
><P
|
|
><A
|
|
HREF="procref1.html#CONSTAT"
|
|
>Example 29-4</A
|
|
></P
|
|
></LI
|
|
><LI
|
|
><P
|
|
><A
|
|
HREF="loops1.html#FILEINFO"
|
|
>Example 11-3</A
|
|
></P
|
|
></LI
|
|
><LI
|
|
><P
|
|
><A
|
|
HREF="extmisc.html#BLOTOUT"
|
|
>Example 16-61</A
|
|
></P
|
|
></LI
|
|
><LI
|
|
><P
|
|
><A
|
|
HREF="randomvar.html#SEEDINGRANDOM"
|
|
>Example 9-16</A
|
|
></P
|
|
></LI
|
|
><LI
|
|
><P
|
|
><A
|
|
HREF="moreadv.html#IDELETE"
|
|
>Example 16-4</A
|
|
></P
|
|
></LI
|
|
><LI
|
|
><P
|
|
><A
|
|
HREF="string-manipulation.html#SUBSTRINGEX"
|
|
>Example 10-6</A
|
|
></P
|
|
></LI
|
|
><LI
|
|
><P
|
|
><A
|
|
HREF="assortedtips.html#SUMPRODUCT"
|
|
>Example 36-19</A
|
|
></P
|
|
></LI
|
|
><LI
|
|
><P
|
|
><A
|
|
HREF="loops1.html#USERLIST"
|
|
>Example 11-9</A
|
|
></P
|
|
></LI
|
|
><LI
|
|
><P
|
|
><A
|
|
HREF="wrapper.html#PRASC"
|
|
>Example 36-4</A
|
|
></P
|
|
></LI
|
|
><LI
|
|
><P
|
|
><A
|
|
HREF="mathc.html#HYPOT"
|
|
>Example 16-53</A
|
|
></P
|
|
></LI
|
|
><LI
|
|
><P
|
|
><A
|
|
HREF="asciitable.html#ASCII3SH"
|
|
>Example T-3</A
|
|
></P
|
|
></LI
|
|
></OL
|
|
>
|
|
</P
|
|
><P
|
|
>That's all the awk we'll cover here, folks, but there's lots
|
|
more to learn. See the appropriate references in the <A
|
|
HREF="biblio.html"
|
|
><I
|
|
>Bibliography</I
|
|
></A
|
|
>.</P
|
|
></DIV
|
|
><H3
|
|
CLASS="FOOTNOTES"
|
|
>Notes</H3
|
|
><TABLE
|
|
BORDER="0"
|
|
CLASS="FOOTNOTES"
|
|
WIDTH="100%"
|
|
><TR
|
|
><TD
|
|
ALIGN="LEFT"
|
|
VALIGN="TOP"
|
|
WIDTH="5%"
|
|
><A
|
|
NAME="FTN.AEN23443"
|
|
HREF="awk.html#AEN23443"
|
|
><SPAN
|
|
CLASS="footnote"
|
|
>[1]</SPAN
|
|
></A
|
|
></TD
|
|
><TD
|
|
ALIGN="LEFT"
|
|
VALIGN="TOP"
|
|
WIDTH="95%"
|
|
><P
|
|
>Its name derives from the initials of its authors,
|
|
<B
|
|
CLASS="COMMAND"
|
|
>A</B
|
|
>ho, <B
|
|
CLASS="COMMAND"
|
|
>W</B
|
|
>einberg, and
|
|
<B
|
|
CLASS="COMMAND"
|
|
>K</B
|
|
>ernighan.</P
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
><DIV
|
|
CLASS="NAVFOOTER"
|
|
><HR
|
|
ALIGN="LEFT"
|
|
WIDTH="100%"><TABLE
|
|
SUMMARY="Footer navigation table"
|
|
WIDTH="100%"
|
|
BORDER="0"
|
|
CELLPADDING="0"
|
|
CELLSPACING="0"
|
|
><TR
|
|
><TD
|
|
WIDTH="33%"
|
|
ALIGN="left"
|
|
VALIGN="top"
|
|
><A
|
|
HREF="x23170.html"
|
|
ACCESSKEY="P"
|
|
>Prev</A
|
|
></TD
|
|
><TD
|
|
WIDTH="34%"
|
|
ALIGN="center"
|
|
VALIGN="top"
|
|
><A
|
|
HREF="index.html"
|
|
ACCESSKEY="H"
|
|
>Home</A
|
|
></TD
|
|
><TD
|
|
WIDTH="33%"
|
|
ALIGN="right"
|
|
VALIGN="top"
|
|
><A
|
|
HREF="pathmanagement.html"
|
|
ACCESSKEY="N"
|
|
>Next</A
|
|
></TD
|
|
></TR
|
|
><TR
|
|
><TD
|
|
WIDTH="33%"
|
|
ALIGN="left"
|
|
VALIGN="top"
|
|
>Sed</TD
|
|
><TD
|
|
WIDTH="34%"
|
|
ALIGN="center"
|
|
VALIGN="top"
|
|
><A
|
|
HREF="sedawk.html"
|
|
ACCESSKEY="U"
|
|
>Up</A
|
|
></TD
|
|
><TD
|
|
WIDTH="33%"
|
|
ALIGN="right"
|
|
VALIGN="top"
|
|
>Parsing and Managing Pathnames</TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
|
|
></BODY
|
|
></HTML
|
|
> |