java.lang.Object
  main.BetterTokenizer
The BetterTokenizer class takes an input stream and parses it into "tokens", allowing the tokens to be read one at a time. The parsing process is controlled by a table and a number of flags that can be set to various states. The stream tokenizer can recognize identifiers, numbers, quoted strings, and various comment styles.

Each byte read from the input stream is regarded as a character in the range '\u0000' through '\u00FF'. The character value is used to look up five possible attributes of the character: white space, alphabetic, numeric, string quote, and comment character. Each character can have zero or more of these attributes.

In addition, an instance has four flags. These flags indicate:
- Whether ends of line are returned as tokens or treated as white space (see eolIsSignificant).
- Whether C-style comments are recognized and skipped (see slashStarComments).
- Whether C++-style comments are recognized and skipped (see slashSlashComments).
- Whether word tokens are automatically lowercased (see lowerCaseMode).

A typical application first constructs an instance of this class, sets up the syntax tables, and then calls the nextToken method in a loop until it returns the value TT_EOF.

@author James Gosling, modified by GvonD
@version 1.21g, 03/04/99
@see java.io.BetterTokenizer#nextToken()
@see java.io.BetterTokenizer#TT_EOF
@since JDK1.0
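The construct, configure, loop pattern described above can be sketched as follows. This is a minimal sketch, not the author's code: it assumes BetterTokenizer behaves like the JDK's java.io.StreamTokenizer, which exposes the same method and field names, and uses that class as a stand-in. The class name TokenLoopSketch and the helper countTokens are hypothetical.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical example class; StreamTokenizer stands in for BetterTokenizer.
final class TokenLoopSketch {
    // Counts the tokens in the text by calling nextToken until TT_EOF.
    static int countTokens(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.wordChars('_', '_'); // syntax setup: allow '_' inside word tokens
        int count = 0;
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countTokens("max_len = 10 ;"));
    }
}
```

With the stand-in class, the input "max_len = 10 ;" yields four tokens: one word, the '=' character, one number, and the ';' character.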
Field Summary
double | nval | If the current token is a number, this field contains the value of that number.
java.lang.String | sval | If the current token is a word token, this field contains a string giving the characters of the word token.
static int | TT_EOF | A constant indicating that the end of the stream has been read.
static int | TT_EOL | A constant indicating that the end of the line has been read.
static int | TT_NUMBER | A constant indicating that a number token has been read.
static int | TT_WORD | A constant indicating that a word token has been read.
int | ttype | After a call to the nextToken method, this field contains the type of the token just read.
Constructor Summary
BetterTokenizer(java.io.InputStream is) | Creates a stream tokenizer that parses the specified input stream.
BetterTokenizer(java.io.Reader r) | Creates a tokenizer that parses the given character stream.
Method Summary
void | commentChar(int ch) | Specifies that the character argument starts a single-line comment.
void | eolIsSignificant(boolean flag) | Determines whether or not ends of line are treated as tokens.
int | lineno() | Returns the current line number.
void | lowerCaseMode(boolean fl) | Determines whether or not word tokens are automatically lowercased.
int | nextToken() | Parses the next token from the input stream of this tokenizer.
void | ordinaryChar(int ch) | Specifies that the character argument is "ordinary" in this tokenizer.
void | ordinaryChars(int low, int hi) | Specifies that all characters c in the range low <= c <= hi are "ordinary" in this tokenizer.
void | parseNumbers() | Specifies that numbers should be parsed by this tokenizer.
void | pushBack() | Causes the next call to the nextToken method of this tokenizer to return the current value in the ttype field, and not to modify the value in the nval or sval field.
void | quoteChar(int ch) | Specifies that matching pairs of this character delimit string constants in this tokenizer.
void | resetSyntax() | Resets this tokenizer's syntax table so that all characters are "ordinary." See the ordinaryChar method for more information on a character being ordinary.
void | slashSlashComments(boolean flag) | Determines whether or not the tokenizer recognizes C++-style comments.
void | slashStarComments(boolean flag) | Determines whether or not the tokenizer recognizes C-style comments.
java.lang.String | toString() | Returns the string representation of the current stream token.
void | whitespaceChars(int low, int hi) | Specifies that all characters c in the range low <= c <= hi are white space characters.
void | wordChars(int low, int hi) | Specifies that all characters c in the range low <= c <= hi are word constituents.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Field Detail
public int ttype
After a call to the nextToken method, this field contains the type of the token just read. For a single character token, its value is the single character, converted to an integer. For a quoted string token (see quoteChar(int)), its value is the quote character. Otherwise, its value is one of the following:
- TT_WORD indicates that the token is a word.
- TT_NUMBER indicates that the token is a number.
- TT_EOL indicates that the end of line has been read. The field can only have this value if the eolIsSignificant method has been called with the argument true.
- TT_EOF indicates that the end of the input stream has been reached.
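A dispatch on ttype might look like the sketch below. It assumes BetterTokenizer matches the behavior of java.io.StreamTokenizer, which is used here as a stand-in; TokenTypeSketch and describe are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; StreamTokenizer stands in for BetterTokenizer.
final class TokenTypeSketch {
    // Labels each token according to the value left in ttype.
    static List<String> describe(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                switch (st.ttype) {
                    case StreamTokenizer.TT_WORD:
                        out.add("WORD:" + st.sval);
                        break;
                    case StreamTokenizer.TT_NUMBER:
                        out.add("NUMBER:" + st.nval);
                        break;
                    case StreamTokenizer.TT_EOL:
                        out.add("EOL"); // only seen after eolIsSignificant(true)
                        break;
                    case '"':
                    case '\'':
                        // Quoted string: ttype holds the quote character.
                        out.add("STRING:" + st.sval);
                        break;
                    default:
                        // Single character token: ttype is the character.
                        out.add("CHAR:" + (char) st.ttype);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(describe("width = 42 \"px\""));
    }
}
```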
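public static final int TT_EOF
A constant indicating that the end of the stream has been read.
public static final int TT_EOL
A constant indicating that the end of the line has been read.
public static final int TT_NUMBER
A constant indicating that a number token has been read.
public static final int TT_WORD
A constant indicating that a word token has been read.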
public java.lang.String sval
If the current token is a word token, this field contains a string giving the characters of the word token.

The current token is a word when the value of the ttype field is TT_WORD. The current token is a quoted string token when the value of the ttype field is a quote character.

@see java.io.BetterTokenizer#quoteChar(int)
@see java.io.BetterTokenizer#TT_WORD
@see java.io.BetterTokenizer#ttype
@since JDK1.0
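public double nval
If the current token is a number, this field contains the value of that number. The current token is a number when the value of the ttype field is TT_NUMBER.

@see java.io.BetterTokenizer#TT_NUMBER
@see java.io.BetterTokenizer#ttype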
Constructor Detail
public BetterTokenizer(java.io.InputStream is)
Creates a stream tokenizer that parses the specified input stream. The default state described for this constructor is:
- Characters 'A' through 'Z', 'a' through 'z', and '\u00A0' through '\u00FF' are considered to be alphabetic.
- Characters '\u0000' through '\u0020' are considered to be white space.
- '/' is a comment character.
- Single quote '\'' and double quote '"' are string quote characters.

The input stream can also be converted into a character stream and passed to the Reader constructor, for example:

    Reader r = new BufferedReader(new InputStreamReader(is));
    BetterTokenizer st = new BetterTokenizer(r);

@param is an input stream.
@see java.io.BufferedReader
@see java.io.InputStreamReader
@see java.io.BetterTokenizer#BetterTokenizer(java.io.Reader)
public BetterTokenizer(java.io.Reader r)
Creates a tokenizer that parses the given character stream.
Method Detail
public void resetSyntax()
Resets this tokenizer's syntax table so that all characters are "ordinary." See the ordinaryChar method for more information on a character being ordinary.

@see java.io.BetterTokenizer#ordinaryChar(int)
public void wordChars(int low, int hi)
Specifies that all characters c in the range low <= c <= hi are word constituents. A word token consists of a word constituent followed by zero or more word constituents or number constituents.

@param low the low end of the range.
@param hi the high end of the range.
public void whitespaceChars(int low, int hi)
Specifies that all characters c in the range low <= c <= hi are white space characters. White space characters serve only to separate tokens in the input stream.

@param low the low end of the range.
@param hi the high end of the range.
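To illustrate how resetSyntax, wordChars, and whitespaceChars combine, the sketch below splits a comma-separated line into fields. It uses java.io.StreamTokenizer as a stand-in on the assumption that BetterTokenizer keeps the same semantics; SyntaxTableSketch and splitFields are hypothetical names.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; StreamTokenizer stands in for BetterTokenizer.
final class SyntaxTableSketch {
    // Splits a line on commas and blanks by rebuilding the syntax table.
    static List<String> splitFields(String line) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(line));
        st.resetSyntax();             // every character starts out "ordinary"
        st.wordChars('!', '~');       // printable ASCII characters form words
        st.whitespaceChars(',', ','); // commas now only separate tokens
        st.whitespaceChars(0, ' ');   // control characters and space too
        List<String> fields = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD) {
                    fields.add(st.sval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(splitFields("alpha, beta,42"));
    }
}
```

Because parseNumbers is not called after resetSyntax, the stand-in class returns "42" as a word token rather than as a number.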
public void ordinaryChars(int low, int hi)
Specifies that all characters c in the range low <= c <= hi are "ordinary" in this tokenizer. See the ordinaryChar method for more information on a character being ordinary.

@param low the low end of the range.
@param hi the high end of the range.
@see java.io.BetterTokenizer#ordinaryChar(int)
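public void ordinaryChar(int ch)
Specifies that the character argument is "ordinary" in this tokenizer. When the parser encounters an ordinary character, it treats it as a single-character token and sets the ttype field to the character value.

@param ch the character.
@see java.io.BetterTokenizer#ttype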
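As an illustration, the sketch below makes '/' ordinary so that it is returned as a token instead of starting a comment. It relies on java.io.StreamTokenizer as a stand-in, where '/' is a comment character by default, and assumes BetterTokenizer shares that default; OrdinaryCharSketch and tokens are hypothetical names.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; StreamTokenizer stands in for BetterTokenizer.
final class OrdinaryCharSketch {
    // Lists tokens as strings, optionally treating '/' as an ordinary character.
    static List<String> tokens(String text, boolean slashIsOrdinary) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        if (slashIsOrdinary) {
            st.ordinaryChar('/'); // '/' no longer starts a comment
        }
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD) {
                    out.add(st.sval);
                } else if (st.ttype == StreamTokenizer.TT_NUMBER) {
                    out.add(String.valueOf(st.nval));
                } else {
                    out.add(String.valueOf((char) st.ttype));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("a/b", true));
        System.out.println(tokens("a/b", false));
    }
}
```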
public void commentChar(int ch)
Specifies that the character argument starts a single-line comment.
public void quoteChar(int ch)
Specifies that matching pairs of this character delimit string constants in this tokenizer.

When the nextToken method encounters a string constant, the ttype field is set to the string delimiter and the sval field is set to the body of the string.

If a string quote character is encountered, then a string is recognized, consisting of all characters after (but not including) the string quote character, up to (but not including) the next occurrence of that same string quote character, or a line terminator, or end of file. The usual escape sequences such as "\n" and "\t" are recognized and converted to single characters as the string is parsed.

@param ch the character.
@see java.io.BetterTokenizer#nextToken()
@see java.io.BetterTokenizer#sval
@see java.io.BetterTokenizer#ttype
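The quoting rules can be tried out with a sketch like the one below, which returns the body of the first quoted string it finds. java.io.StreamTokenizer is used as a stand-in under the assumption that BetterTokenizer parses quotes the same way; QuoteCharSketch and firstQuoted are hypothetical names.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical example class; StreamTokenizer stands in for BetterTokenizer.
final class QuoteCharSketch {
    // Returns the body of the first string delimited by `quote`, or null.
    static String firstQuoted(String text, char quote) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.quoteChar(quote); // matching pairs of `quote` delimit strings
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == quote) {
                    return st.sval; // ttype is the delimiter, sval the body
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(firstQuoted("x = |hello world|", '|'));
    }
}
```

In the stand-in class, an escape sequence such as a backslash followed by t inside the quoted text is converted to a single tab character in sval.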
public void parseNumbers()
Specifies that numbers should be parsed by this tokenizer. The syntax table of this tokenizer is modified so that each of the twelve characters:

    0 1 2 3 4 5 6 7 8 9 . -

has the "numeric" attribute.

When the parser encounters a word token that has the format of a double precision floating-point number, it treats the token as a number rather than a word, by setting the ttype field to the value TT_NUMBER and putting the numeric value of the token into the nval field.

@see java.io.BetterTokenizer#nval
@see java.io.BetterTokenizer#TT_NUMBER
@see java.io.BetterTokenizer#ttype
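A short sketch of reading numeric tokens through nval follows. It assumes BetterTokenizer parses numbers as java.io.StreamTokenizer does and uses that class as a stand-in; ParseNumbersSketch and sumNumbers are hypothetical names.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical example class; StreamTokenizer stands in for BetterTokenizer.
final class ParseNumbersSketch {
    // Adds up every number token in the text and ignores everything else.
    static double sumNumbers(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.parseNumbers(); // digits, '.' and '-' get the numeric attribute
        double sum = 0.0;
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_NUMBER) {
                    sum += st.nval;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumNumbers("3 apples 4.5 pears -1"));
    }
}
```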
public void eolIsSignificant(boolean flag)
Determines whether or not ends of line are treated as tokens. If the flag argument is true, the nextToken method returns TT_EOL and also sets the ttype field to this value when an end of line is read.

A line is a sequence of characters ending with either a carriage-return character ('\r') or a newline character ('\n'). In addition, a carriage-return character followed immediately by a newline character is treated as a single end-of-line token.

If the flag is false, end-of-line characters are treated as white space and serve only to separate tokens.

@param flag true indicates that end-of-line characters are separate tokens; false indicates that end-of-line characters are white space.
@see java.io.BetterTokenizer#nextToken()
@see java.io.BetterTokenizer#ttype
@see java.io.BetterTokenizer#TT_EOL
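The effect of treating ends of line as tokens can be sketched by counting tokens per line. The example uses java.io.StreamTokenizer as a stand-in, assuming BetterTokenizer handles line terminators identically; EolSketch and tokensPerLine are hypothetical names.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; StreamTokenizer stands in for BetterTokenizer.
final class EolSketch {
    // Returns the number of tokens found on each line of the text.
    static List<Integer> tokensPerLine(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.eolIsSignificant(true); // report TT_EOL instead of skipping it
        List<Integer> counts = new ArrayList<>();
        int current = 0;
        try {
            while (true) {
                int type = st.nextToken();
                if (type == StreamTokenizer.TT_EOL) {
                    counts.add(current); // a line just ended
                    current = 0;
                } else if (type == StreamTokenizer.TT_EOF) {
                    if (current > 0) {
                        counts.add(current); // last line had no terminator
                    }
                    break;
                } else {
                    current++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(tokensPerLine("a b\nc\r\nd e f"));
    }
}
```

Note that the "\r\n" pair in the sample input produces a single TT_EOL, in line with the description above.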
public void slashStarComments(boolean flag)
Determines whether or not the tokenizer recognizes C-style comments. If the flag argument is true, this stream tokenizer recognizes C-style comments. All text between successive occurrences of /* and */ is discarded.

If the flag argument is false, then C-style comments are not treated specially.

@param flag true indicates to recognize and ignore C-style comments.
public void slashSlashComments(boolean flag)
Determines whether or not the tokenizer recognizes C++-style comments. If the flag argument is true, this stream tokenizer recognizes C++-style comments. Any occurrence of two consecutive slash characters ('/') is treated as the beginning of a comment that extends to the end of the line.

If the flag argument is false, then C++-style comments are not treated specially.

@param flag true indicates to recognize and ignore C++-style comments.
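The two comment flags can be compared side by side with the sketch below. It first makes '/' ordinary so that only the flags decide how slashes are handled. java.io.StreamTokenizer serves as a stand-in on the assumption that BetterTokenizer behaves the same way; CommentSketch and words are hypothetical names.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; StreamTokenizer stands in for BetterTokenizer.
final class CommentSketch {
    // Collects word tokens with C and C++ comment recognition on or off.
    static List<String> words(String text, boolean recognizeComments) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.ordinaryChar('/'); // drop the default single-character comment
        st.slashStarComments(recognizeComments);
        st.slashSlashComments(recognizeComments);
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD) {
                    out.add(st.sval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        String source = "a /* skip */ b // rest\nc";
        System.out.println(words(source, true));
        System.out.println(words(source, false));
    }
}
```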
public void lowerCaseMode(boolean fl)
Determines whether or not word tokens are automatically lowercased. If the flag argument is true, then the value in the sval field is lowercased whenever a word token is returned (the ttype field has the value TT_WORD) by the nextToken method of this tokenizer.

If the flag argument is false, then the sval field is not modified.

@param fl true indicates that all word tokens should be lowercased.
@see java.io.BetterTokenizer#nextToken()
@see java.io.BetterTokenizer#ttype
@see java.io.BetterTokenizer#TT_WORD
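A brief sketch of the lowercasing flag, again using java.io.StreamTokenizer as a stand-in and assuming BetterTokenizer behaves the same way; LowerCaseSketch and words are hypothetical names.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; StreamTokenizer stands in for BetterTokenizer.
final class LowerCaseSketch {
    // Collects word tokens, lowercased by the tokenizer when requested.
    static List<String> words(String text, boolean lowerCase) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.lowerCaseMode(lowerCase);
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD) {
                    out.add(st.sval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(words("Hello WORLD", true));
    }
}
```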
public int nextToken() throws java.io.IOException
Parses the next token from the input stream of this tokenizer. The type of the next token is returned in the ttype field. Additional information about the token may be in the nval field or the sval field of this tokenizer.

Typical clients of this class first set up the syntax tables and then sit in a loop calling nextToken to parse successive tokens until TT_EOF is returned.

@return the value of the ttype field.
@exception java.io.IOException if an I/O error occurs.
@see java.io.BetterTokenizer#nval
@see java.io.BetterTokenizer#sval
@see java.io.BetterTokenizer#ttype
public void pushBack()
Causes the next call to the nextToken method of this tokenizer to return the current value in the ttype field, and not to modify the value in the nval or sval field.

@see java.io.BetterTokenizer#nextToken()
@see java.io.BetterTokenizer#nval
@see java.io.BetterTokenizer#sval
@see java.io.BetterTokenizer#ttype
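One common use of pushBack is one-token lookahead. The sketch below parses names that may be followed by "= number"; when the token after a name is not '=', it is pushed back so the loop sees it again. java.io.StreamTokenizer is a stand-in here, on the assumption that BetterTokenizer matches it; PushBackSketch, parse, and the default value 0.0 are hypothetical choices.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical example class; StreamTokenizer stands in for BetterTokenizer.
final class PushBackSketch {
    // Parses "name" or "name = number" entries; bare names map to 0.0.
    static Map<String, Double> parse(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        Map<String, Double> out = new LinkedHashMap<>();
        try {
            while (st.nextToken() == StreamTokenizer.TT_WORD) {
                String name = st.sval;
                double value = 0.0; // hypothetical default for bare names
                if (st.nextToken() == '=') {
                    if (st.nextToken() == StreamTokenizer.TT_NUMBER) {
                        value = st.nval;
                    }
                } else {
                    // Not an assignment: let the loop read this token again.
                    st.pushBack();
                }
                out.put(name, value);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(parse("a = 1 b c = 3"));
    }
}
```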
public int lineno()
Returns the current line number.
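For illustration, line numbers can be attached to word tokens as in the sketch below. It assumes BetterTokenizer counts lines as java.io.StreamTokenizer does, starting at 1, and uses that class as a stand-in; LineNumberSketch and wordsWithLines are hypothetical names.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; StreamTokenizer stands in for BetterTokenizer.
final class LineNumberSketch {
    // Prefixes each word token with the line number reported by lineno().
    static List<String> wordsWithLines(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD) {
                    out.add(st.lineno() + ":" + st.sval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(wordsWithLines("one\ntwo\n\nthree"));
    }
}
```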
public java.lang.String toString()
Returns the string representation of the current stream token, as specified by the ttype, nval, and sval fields.

@see java.io.BetterTokenizer#nval
@see java.io.BetterTokenizer#sval
@see java.io.BetterTokenizer#ttype