<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<META NAME="GENERATOR" CONTENT="Mozilla/4.06 [en] (X11; I; Linux 2.0.34 i686) [Netscape]">
</HEAD>
<BODY TEXT="#000000" BGCOLOR="#FFFFFF" LINK="#0000EE" VLINK="#551A8B" ALINK="#FF0000">
A regular expression that consists solely of
<UL>
<LI>
a <TT>Character</TT> matches that character.</LI>

<BR>
<LI>
a character class <TT>'[' (Character|Character'-'Character)+ ']'</TT>
matches any character in that class. A <TT>Character</TT> is considered
an element of a class if it is listed in the class or if its code lies
within a listed character range <TT>Character'-'Character</TT>. For instance,
<TT>[a0-3\n]</TT> matches the characters
<P><TT>a 0 1 2 3 \n</TT></LI>

<BR>
<LI>
a negated character class <TT>'[^' (Character|Character'-'Character)+
']'</TT> matches all characters that are not elements of the class.</LI>

<BR>
<LI>
a string <TT>'"' StringCharacter+ '"'</TT> matches the
exact text enclosed in double quotes. All meta characters except <TT>\</TT>
and <TT>"</TT> lose their special meaning inside a string.</LI>
</UL>
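<P>The class and string rules above can be tried out with a short Java
sketch. It uses <TT>java.util.regex</TT>, which is not the JFlex engine and
has its own syntax; it is assumed here only because its bracket classes behave
the same way for this example. The class name <TT>CharClassDemo</TT> and its
members are hypothetical, and <TT>Pattern.quote</TT> stands in for a JFlex
string in double quotes.</P>

```java
import java.util.regex.Pattern;

final class CharClassDemo {
    // [a0-3\n]: the listed characters 'a' and '\n' plus the range '0'..'3'.
    static final Pattern LISTED = Pattern.compile("[a0-3\\n]");
    // [^a0-3\n]: every character that is not an element of the class above.
    static final Pattern NEGATED = Pattern.compile("[^a0-3\\n]");
    // Stand-in for the JFlex string "a+b": quoting makes '+' an ordinary character.
    static final Pattern LITERAL = Pattern.compile(Pattern.quote("a+b"));

    // True if the whole input is matched by the pattern.
    static boolean matches(Pattern p, String input) {
        return p.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"a", "0", "3", "\n", "4", "b"}) {
            String shown = s.replace("\n", "\\n");
            System.out.println(shown + " in class: " + matches(LISTED, s)
                    + ", in negated class: " + matches(NEGATED, s));
        }
        System.out.println("literal a+b: " + matches(LITERAL, "a+b"));
    }
}
```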
<UL>
<LI>
a macro usage <TT>'{' Identifier '}'</TT> matches the input that is matched
by the right-hand side of the macro named "<TT>Identifier</TT>".</LI>

<BR>
<LI>
a predefined character class matches any of the characters in that
class. The predefined character classes are:
<P><TT>.</TT> contains all characters except <TT>\n</TT>.</LI>
</UL>
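<P>As a rough illustration of these two items, the following Java sketch
imitates a macro by pasting its right-hand side into a larger expression and
checks the <TT>.</TT> class against a newline. It relies on
<TT>java.util.regex</TT>, not on JFlex; the names <TT>MacroDotDemo</TT>,
<TT>DIGIT</TT> and <TT>NUMBER</TT> are hypothetical. Note that the Java
<TT>.</TT> by default also excludes other line terminators such as
<TT>\r</TT>, so only the <TT>\n</TT> case carries over directly.</P>

```java
import java.util.regex.Pattern;

final class MacroDotDemo {
    // Hypothetical macro: the constant name plays the role of the Identifier,
    // the string is its right-hand side.
    static final String DIGIT = "[0-9]";
    // A macro usage is imitated by inserting the right-hand side, grouped so
    // that the following '+' applies to the whole macro body.
    static final Pattern NUMBER = Pattern.compile("(?:" + DIGIT + ")+");
    // The predefined class '.', which does not contain '\n'.
    static final Pattern DOT = Pattern.compile(".");

    // True if the whole input is matched by the pattern.
    static boolean matches(Pattern p, String input) {
        return p.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println("NUMBER matches 2024: " + matches(NUMBER, "2024"));
        System.out.println("NUMBER matches 20x4: " + matches(NUMBER, "20x4"));
        System.out.println(". matches x: " + matches(DOT, "x"));
        System.out.println(". matches \\n: " + matches(DOT, "\n"));
    }
}
```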
If <TT>a</TT> and <TT>b</TT> are regular expressions, then
<DL COMPACT>
<DT>
<TT>a | b</TT></DT>

<DD>
(union) is the regular expression that matches all input that is matched
by <TT>a</TT> or by <TT>b</TT>.</DD>

<DT>
<TT>a b</TT></DT>

<DD>
(concatenation) is the regular expression that matches the input matched
by <TT>a</TT> followed by the input matched by <TT>b</TT>.</DD>

<DT>
<TT>a*</TT></DT>

<DD>
(Kleene closure) matches zero or more repetitions of the input matched
by <TT>a</TT>.</DD>

<DT>
<TT>a+</TT></DT>

<DD>
is equivalent to <TT>aa*</TT></DD>

<DT>
<TT>a?</TT></DT>

<DD>
matches the empty input or the input matched by <TT>a</TT></DD>

<DT>
<TT>a{n}</TT></DT>

<DD>
is equivalent to <TT>n</TT> times the concatenation of <TT>a</TT>. For
instance, <TT>a{4}</TT> is equivalent to the expression <TT>a a a a</TT>.
The decimal integer <TT>n</TT> must be positive.</DD>

<DT>
<TT>a{n,m}</TT></DT>

<DD>
is equivalent to at least <TT>n</TT> times and at most <TT>m</TT> times
the concatenation of <TT>a</TT>. For instance, <TT>a{2,4}</TT> is equivalent
to the expression <TT>a a a? a?</TT>. Both <TT>n</TT> and <TT>m</TT> are
non-negative decimal integers, and <TT>m</TT> must not be smaller than <TT>n</TT>.</DD>

<DT>
<TT>( a )</TT></DT>

<DD>
matches the same input as <TT>a</TT>.</DD>
</DL>
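<P>The equivalences stated in this list can be spot-checked with a small Java
sketch. It assumes <TT>java.util.regex</TT>, which shares the operators
<TT>*</TT>, <TT>+</TT>, <TT>?</TT>, <TT>{n}</TT> and <TT>{n,m}</TT> but is not
JFlex; in particular the expressions are written without the spaces used
above. The class <TT>RepetitionDemo</TT> and the method
<TT>sameOnRuns</TT> are hypothetical names, and the check covers only runs of
the character <TT>a</TT> up to a given length, so it is a test, not a proof.</P>

```java
import java.util.regex.Pattern;

final class RepetitionDemo {
    // Returns true if both expressions accept exactly the same inputs among
    // "", "a", "aa", ... up to maxLen repetitions of 'a'.
    static boolean sameOnRuns(String left, String right, int maxLen) {
        Pattern l = Pattern.compile(left);
        Pattern r = Pattern.compile(right);
        StringBuilder run = new StringBuilder();
        for (int len = 0; len <= maxLen; len++) {
            String s = run.toString();
            if (l.matcher(s).matches() != r.matcher(s).matches()) {
                return false;
            }
            run.append('a');
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("a+ vs aa*: " + sameOnRuns("a+", "aa*", 6));
        System.out.println("a{4} vs aaaa: " + sameOnRuns("a{4}", "aaaa", 6));
        System.out.println("a{2,4} vs aaa?a?: " + sameOnRuns("a{2,4}", "aaa?a?", 6));
    }
}
```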
In a lexical rule, a regular expression <TT>r</TT> may be preceded by a
'<TT>^</TT>' (the beginning-of-line operator). <TT>r</TT> is then matched
only at the beginning of a line in the input. A line begins after each
<TT>\r|\n|\r\n</TT>
and at the beginning of the input. The preceding line terminator in the input
is not consumed and can be matched by another rule.
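<P>A comparable behaviour can be observed in Java, as a sketch under the
assumption that <TT>^</TT> with <TT>Pattern.MULTILINE</TT> in
<TT>java.util.regex</TT> is close enough to the operator described here (it
also treats a few further characters as line terminators). The class
<TT>LineStartDemo</TT> and the method <TT>lineStartOffsets</TT> are
hypothetical names.</P>

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LineStartDemo {
    // Returns the offsets at which `word` occurs at the beginning of a line.
    static int[] lineStartOffsets(String word, String input) {
        // MULTILINE makes '^' match after line terminators as well as at the
        // beginning of the input; '^' itself consumes no characters.
        Matcher m = Pattern.compile("^" + Pattern.quote(word), Pattern.MULTILINE)
                .matcher(input);
        int[] found = new int[input.length() + 1];
        int count = 0;
        while (m.find()) {
            found[count++] = m.start();
        }
        return Arrays.copyOf(found, count);
    }

    public static void main(String[] args) {
        // "foo" occurs at offsets 0, 8 and 13; offset 8 is in the middle of a line.
        String input = "foo\nbar foo\r\nfoo";
        System.out.println(Arrays.toString(lineStartOffsets("foo", input)));
    }
}
```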
<P> <A HREF="lopes.html#decl">Return to Using JFlex</A>
</BODY>
</HTML>