com.borland.primetime.editor
Class AbstractScanner

java.lang.Object
  |
  +--com.borland.primetime.editor.AbstractScanner
All Implemented Interfaces:
Scanner

public abstract class AbstractScanner
extends java.lang.Object
implements Scanner

This class is the abstract base for language scanner classes used for syntax highlighting.

The scanner operates on one line at a time and is designed to update the syntax information incrementally. It starts scanning on the line where the change began and stops when it reaches a line whose styles are unchanged from the previous scan.

Subclasses extend this class to deal with the keywords specific to the language the scanner is being built for. Note that all of the token recognition methods follow Java rules by default, but any of them can be overridden in a subclass to deal with languages that have different rules. For example, if the language has different comment rules, the checkComment(...) method needs to be overridden as well.

See Also:
EditorDocument.StyledLeafElement, EditorDocument.RunInfo, Scanner

Field Summary
protected  int bp
          The buffer pointer.
protected  char[] buf
          The input buffer.
protected  char ch
          The character which the scanner is currently considering.
protected  EditorDocument currentDocument
          The current document containing the lines we are parsing.
protected  int currentIndex
          The document line we're currently parsing.
protected  int endIndex
          The line of the document after which we will consider stopping parsing.
static char EOF
          The scanner adds '\0' as the EOF marker to each token stream.
static int IN_COMMENT
          One of the scanner states: scanner is processing a comment.
static int IN_JAVA_DOC
          One of the scanner states: scanner is processing a JavaDoc comment.
static int NORMAL
          One of the scanner states: scanner is not in any special state.
protected  int startIndex
          The line of the document where we started parsing.
protected  int stateFlags
          The scanner state after the last token was read.
 
Constructor Summary
AbstractScanner()
           
 
Method Summary
protected  int checkComment(int initialState)
          Check if the current character, in 'ch', is part of a Java comment, and if so, advance the buffer pointer past the entire comment.
protected  int checkIdentifier(int initialState)
          Check if the current character, in 'ch', is the start of a Java identifier, and if so, advance the buffer pointer past the identifier.
protected  int checkNumber(int initialState)
          Check if the current character, in 'ch', is part of a Java number, and if so, advance the buffer pointer past the entire number.
protected  int checkString(int initialState)
          Check if the current character, in 'ch', is part of a Java string, and if so, advance the buffer pointer past the entire string.
protected  int checkSymbol(int initialState)
          Check if the current character, in 'ch', is part of a Java symbol, and if so, advance the buffer pointer past the entire symbol.
protected  int checkWhitespace(int initialState)
          Check if the current character, in 'ch', is a Java whitespace character, and if so, advance the buffer pointer till the next non-whitespace character.
protected  void initialize(javax.swing.text.Segment text)
          Called before starting a scan.
protected  void initialize(javax.swing.text.Segment text, javax.swing.text.Segment text2)
          Specialized (internal) version of initialize.
protected static int initMap(java.util.HashMap map, java.lang.String[] words, boolean caseSensitive)
          Internal routine to initialize a HashMap with an array of strings.
protected abstract  boolean isExtendedKeyword(java.lang.String str)
          One of the abstract functions that a scanner derived from AbstractScanner has to implement.
protected abstract  boolean isKeyword(java.lang.String str)
          One of the abstract functions that a scanner derived from AbstractScanner has to implement.
protected  boolean isSymbol(char ch)
          The default implementation of checkSymbol will call this method to determine whether or not a particular character is a valid Java symbol.
protected  boolean isValidIdentifierPart(char ch)
          The default implementation of checkIdentifier will call this method to determine whether or not a particular character can be a part of a Java identifier.
protected  boolean isValidIdentifierStart(char ch)
          The default implementation of checkIdentifier will call this method to determine whether or not a particular character can be the start of a Java identifier.
protected  int nextToken(int initialState)
          Normally called by scanLine to read the next token.
 void parse(javax.swing.event.DocumentEvent e)
          Called externally to do a parse.
protected  boolean scanLine(EditorDocument.StyledLeafElement leaf, EditorDocument.StyledLeafElement leaf2, int initialState)
          Specialized (internal) version of scanLine.
protected  boolean scanLine(EditorDocument.StyledLeafElement leaf, int initialState)
          Called by parse to actually scan a leaf of the document (which corresponds to a line of text).
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

EOF

public static final char EOF
The scanner adds '\0' as the EOF marker to each token stream.

NORMAL

public static final int NORMAL
One of the scanner states: scanner is not in any special state.

IN_COMMENT

public static final int IN_COMMENT
One of the scanner states: scanner is processing a comment.

IN_JAVA_DOC

public static final int IN_JAVA_DOC
One of the scanner states: scanner is processing a JavaDoc comment.

stateFlags

protected int stateFlags
The scanner state after the last token was read. This is set by the individual check<...> methods and is passed as a parameter to each check<...> method that gets called. The state of the scanner at the end of each line is stored in the relevant line element, and at the start of each line the scanner is initialized with the state of the previous line.

buf

protected char[] buf
The input buffer.

bp

protected int bp
The buffer pointer.

ch

protected char ch
The character which the scanner is currently considering. This is usually, but not always, the character in buf[bp].

currentIndex

protected int currentIndex
The document line we're currently parsing.

currentDocument

protected EditorDocument currentDocument
The current document containing the lines we are parsing.

startIndex

protected int startIndex
The line of the document where we started parsing. startIndex might be adjusted as we go along, since sometimes we want to restart the scan at a different line.

endIndex

protected int endIndex
The line of the document after which we will consider stopping parsing. We will keep parsing AFTER this line as long as we find that something in a line has changed. endIndex might be adjusted as we go along, since sometimes we want to end the scan at a different line depending on the circumstances.
Constructor Detail

AbstractScanner

public AbstractScanner()
Method Detail

initMap

protected static int initMap(java.util.HashMap map,
                             java.lang.String[] words,
                             boolean caseSensitive)
Internal routine to initialize a HashMap with an array of strings. We either store the strings as is, or we uppercase them. As an extra, the length of the longest string is returned.
Parameters:
map - the HashMap to fill
words - the words to put into the hash map as keys
caseSensitive - if true, store the words as is, if false, store the words after uppercasing them.
Returns:
the length of the longest word in the map, which callers can use to skip the lookup for any string that is too long to match. The default return value is zero.
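
The documented behavior of initMap can be approximated with the following minimal sketch. It is not the PrimeTime source: the class name InitMapSketch, the helper contains, the generic type parameters, and the choice of value stored under each key are assumptions made for illustration. Only the keys and the returned length are described above.

```java
import java.util.HashMap;
import java.util.Locale;

// Hypothetical stand-in for illustration; not the PrimeTime implementation.
class InitMapSketch {

    // Fills the map with the given words as keys and returns the length
    // of the longest key (zero when there are no words).
    static int initMap(HashMap<String, String> map, String[] words, boolean caseSensitive) {
        int longest = 0;
        for (String word : words) {
            // Store the word as is, or uppercased for case-insensitive languages.
            String key = caseSensitive ? word : word.toUpperCase(Locale.ROOT);
            map.put(key, key); // assumption: the stored value is not documented
            longest = Math.max(longest, key.length());
        }
        return longest;
    }

    // Hypothetical helper: uses the returned length to skip lookups that cannot match.
    static boolean contains(HashMap<String, String> map, int longest,
                            String str, boolean caseSensitive) {
        if (str.length() > longest) {
            return false; // longer than every stored word, so no lookup is needed
        }
        return map.containsKey(caseSensitive ? str : str.toUpperCase(Locale.ROOT));
    }
}
```

A subclass for a case-insensitive language could call such a routine once with its keyword list and keep the returned length next to the map, so that an isKeyword implementation can reject long identifiers without hashing them.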

parse

public void parse(javax.swing.event.DocumentEvent e)
Called externally to do a parse. The starting and ending positions of the parse are determined by the offset and length of the incoming DocumentEvent. To handle some corner cases, the parser also parses one extra line before the line containing the starting position. Parsing continues until the EOF is reached or, once past the ending position, until a line is reached that has not changed since the last parse (as indicated by a 'false' return value of scanLine). In other words, the parser always parses at least to the ending position or the EOF, whichever comes first. At the end of the parse this routine forces an update of the UI.
Specified by:
parse in interface Scanner
Parameters:
e - The DocumentEvent that generated this parse call. From the offset and length of this event, the scanner determines where to start parsing and how much of the document must be processed.
See Also:
scanLine(com.borland.primetime.editor.EditorDocument.StyledLeafElement, int)
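
The stop rule used by parse can be illustrated with a small, self-contained sketch. It is not the PrimeTime implementation: the class name, the array-based document, the constant values, and the use of the end-of-line state as the "changed" signal are all assumptions made for illustration (the real scanLine works on leaf elements and their style information).

```java
// Hypothetical model of the incremental rescan loop; not the PrimeTime source.
class IncrementalParseSketch {
    static final int NORMAL = 0;     // assumed value
    static final int IN_COMMENT = 1; // assumed value

    // Returns the scanner state at the end of the line. Only block
    // comments are tracked, which is enough to show state carry-over.
    static int scanState(String line, int state) {
        int i = 0;
        while (i < line.length()) {
            if (state == IN_COMMENT) {
                int close = line.indexOf("*/", i);
                if (close < 0) return IN_COMMENT; // comment runs past this line
                state = NORMAL;
                i = close + 2;
            } else {
                int open = line.indexOf("/*", i);
                if (open < 0) return NORMAL;
                state = IN_COMMENT;
                i = open + 2;
            }
        }
        return state;
    }

    // Rescans after an edit covering lines firstChanged..lastChanged and
    // returns the number of lines scanned. endStates holds the state stored
    // for each line by the previous scan and is updated in place.
    static int parse(String[] lines, int[] endStates, int firstChanged, int lastChanged) {
        int start = Math.max(0, firstChanged - 1); // one extra line before the change
        int scanned = 0;
        for (int i = start; i < lines.length; i++) {
            // Each line starts in the state the previous line ended in.
            int initialState = (i == 0) ? NORMAL : endStates[i - 1];
            int newState = scanState(lines[i], initialState);
            boolean changed = newState != endStates[i];
            endStates[i] = newState;
            scanned++;
            // Stop only beyond the ending position, at the first unchanged line.
            if (!changed && i > lastChanged) break;
        }
        return scanned;
    }
}
```

In this model, an edit that opens a block comment forces a rescan of every following line up to the end of the document, while an edit that leaves the state untouched stops one line past the ending position.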

scanLine

protected boolean scanLine(EditorDocument.StyledLeafElement leaf,
                           int initialState)
Called by parse to actually scan a leaf of the document (which corresponds to a line of text). It will call nextToken until the end of the line is reached. This method handles adding the data to the RunInfo object, and assigning the RunInfo object to the leaf.
Parameters:
leaf - The leaf of the document that needs to be scanned.
initialState - The flags in effect at the start of this line. This state carries over from previous lines, and retains information such as whether the scanner should treat this line as part of a multi-line comment block.
Returns:
true if parsing should continue with the next line, false if this line has not changed since the last parse. The parse routine depends on this: as long as scanLine returns true, parse has not yet found an unchanged line.

scanLine

protected boolean scanLine(EditorDocument.StyledLeafElement leaf,
                           EditorDocument.StyledLeafElement leaf2,
                           int initialState)
Specialized (internal) version of scanLine. Called by parse to actually scan two leaves of the document (which correspond to a line of text). It will call nextToken until the end of the line is reached. This method handles adding the data to the RunInfo object, and assigning the RunInfo object to the leaf.
Parameters:
leaf - The leaf of the document that needs to be scanned.
leaf2 - The second leaf of the document that needs to be scanned.
initialState - The flags in effect at the start of this line. This state carries over from previous lines, and retains information such as whether the scanner should treat this line as part of a multi-line comment block.
Returns:
true if parsing should continue with the next line, false if this line has not changed since the last parse. The parse routine depends on this: as long as scanLine returns true, parse has not yet found an unchanged line.

initialize

protected void initialize(javax.swing.text.Segment text)
Called before starting a scan. The characters of the text are loaded into the character buffer (buf), the buffer is terminated with the EOF character, and 'ch' and 'bp' are initialized.
Parameters:
text - A Segment object that contains the text that needs to be scanned.

initialize

protected void initialize(javax.swing.text.Segment text,
                          javax.swing.text.Segment text2)
Specialized (internal) version of initialize. Called before starting a scan. The characters of the texts are loaded into the character buffer (buf), the buffer is terminated with the EOF character, and 'ch' and 'bp' are initialized.
Parameters:
text - A Segment object that contains the first part of the text that needs to be scanned.
text2 - A Segment object that contains the second part of the text that needs to be scanned.
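
Both initialize variants can be pictured with the following sketch. javax.swing.text.Segment and its public array, offset and count fields are part of the JDK. Everything else (the class name, the nextChar helper, and the assumption that bp starts at 0 with ch set to the first buffered character) is hypothetical and not taken from the PrimeTime source.

```java
import javax.swing.text.Segment;

// Hypothetical illustration of buffer setup; not the PrimeTime implementation.
class InitializeSketch {
    static final char EOF = '\0';

    char[] buf; // the input buffer
    int bp;     // the buffer pointer
    char ch;    // the character currently under consideration

    // Copies the segment into buf and terminates it with the EOF marker.
    void initialize(Segment text) {
        buf = new char[text.count + 1];
        System.arraycopy(text.array, text.offset, buf, 0, text.count);
        buf[text.count] = EOF;
        bp = 0;       // assumption: scanning starts at the first character
        ch = buf[bp];
    }

    // Two-segment variant: both texts end up back to back in one buffer.
    void initialize(Segment text, Segment text2) {
        buf = new char[text.count + text2.count + 1];
        System.arraycopy(text.array, text.offset, buf, 0, text.count);
        System.arraycopy(text2.array, text2.offset, buf, text.count, text2.count);
        buf[buf.length - 1] = EOF;
        bp = 0;
        ch = buf[bp];
    }

    // Hypothetical helper: advances one character and stays put on EOF.
    void nextChar() {
        if (ch != EOF) {
            ch = buf[++bp];
        }
    }
}
```

Terminating the buffer with EOF lets the recognition loops test the current character alone, without a separate bounds check on every step.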

nextToken

protected int nextToken(int initialState)
Normally called by scanLine to read the next token. The various check<...> functions are called in the order dictated by the Java syntax, and as soon as a character sequence is recognized as a token, this function returns with a value indicating the token, as defined in BasicStyleMap.
Parameters:
initialState - The state of the scanner when nextToken was called.
Returns:
the type of the next token as defined in BasicStyleMap, or -1 in case of EOF.
See Also:
BasicStyleMap.CARET, BasicStyleMap.SELECTION, BasicStyleMap.INPUT_METHOD, BasicStyleMap.PLAIN, BasicStyleMap.WHITESPACE, BasicStyleMap.COMMENT, BasicStyleMap.RESERVED_WORD, BasicStyleMap.IDENTIFIER, BasicStyleMap.SYMBOL, BasicStyleMap.STRING, BasicStyleMap.NUMBER, BasicStyleMap.EXTRA_KEYWORD, BasicStyleMap.ILLEGAL, BasicStyleMap.PREPROCESSOR
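
The dispatch pattern described for nextToken (try each check<...> method in turn, return the first result that is not -1) might look like the sketch below. The token values, the order of the checks, the simplified recognition rules, and the ILLEGAL fallback for unrecognized characters are assumptions; the real constants live in BasicStyleMap and the real order is not documented here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of check-method dispatch; not the PrimeTime source.
class NextTokenSketch {
    // Assumed token values; the real ones are defined in BasicStyleMap.
    static final int WHITESPACE = 1, IDENTIFIER = 2, NUMBER = 3, ILLEGAL = 4;
    static final char EOF = '\0';

    final char[] buf;
    int bp;
    char ch;

    NextTokenSketch(String line) {
        buf = (line + EOF).toCharArray(); // buffer terminated with the EOF marker
        ch = buf[bp];
    }

    void advance() { ch = buf[++bp]; }

    // Each check returns its token type, or -1 if ch does not start that token.
    int checkWhitespace(int initialState) {
        if (!Character.isWhitespace(ch)) return -1;
        while (Character.isWhitespace(ch)) advance();
        return WHITESPACE;
    }

    int checkIdentifier(int initialState) {
        if (!Character.isLetter(ch)) return -1;
        while (Character.isLetterOrDigit(ch)) advance();
        return IDENTIFIER;
    }

    int checkNumber(int initialState) {
        if (!Character.isDigit(ch)) return -1;
        while (Character.isDigit(ch)) advance();
        return NUMBER;
    }

    // Tries the checks in a fixed order; the first match wins.
    int nextToken(int initialState) {
        if (ch == EOF) return -1;
        int token;
        if ((token = checkWhitespace(initialState)) != -1) return token;
        if ((token = checkIdentifier(initialState)) != -1) return token;
        if ((token = checkNumber(initialState)) != -1) return token;
        advance(); // assumption: unrecognized characters are consumed one at a time
        return ILLEGAL;
    }

    // Collects the token types of a whole line, as a scanLine-style loop would.
    static List<Integer> tokenTypes(String line) {
        NextTokenSketch scanner = new NextTokenSketch(line);
        List<Integer> types = new ArrayList<>();
        for (int t = scanner.nextToken(0); t != -1; t = scanner.nextToken(0)) {
            types.add(t);
        }
        return types;
    }
}
```

Because every check leaves bp untouched when it returns -1, the order of the checks only matters for character sequences that more than one check would accept.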

isKeyword

protected abstract boolean isKeyword(java.lang.String str)
One of the abstract functions that a scanner derived from AbstractScanner has to implement. The default implementation of checkIdentifier will call this method to determine whether or not a particular string is a keyword.
Parameters:
str - The string to be checked
Returns:
true if str is a keyword, false otherwise.
See Also:
checkIdentifier(int)

isExtendedKeyword

protected abstract boolean isExtendedKeyword(java.lang.String str)
One of the abstract functions that a scanner derived from AbstractScanner has to implement. The default implementation of checkIdentifier will call this method to determine whether or not a particular string is an extended keyword.
Parameters:
str - the string to be checked
Returns:
true if str is an extended keyword, false otherwise.
See Also:
checkIdentifier(int)

isValidIdentifierStart

protected boolean isValidIdentifierStart(char ch)
The default implementation of checkIdentifier will call this method to determine whether or not a particular character can be the start of a Java identifier. A scanner derived from AbstractScanner will want to override this method if the source language is not Java.
Parameters:
ch - the char to be checked
Returns:
true if ch can be the start of a Java identifier, false otherwise.
See Also:
checkIdentifier(int)

isValidIdentifierPart

protected boolean isValidIdentifierPart(char ch)
The default implementation of checkIdentifier will call this method to determine whether or not a particular character can be a part of a Java identifier. A scanner derived from AbstractScanner will want to override this method if the source language is not Java.
Parameters:
ch - the char to be checked
Returns:
true if ch can be part of a Java identifier, false otherwise.
See Also:
checkIdentifier(int)

isSymbol

protected boolean isSymbol(char ch)
The default implementation of checkSymbol will call this method to determine whether or not a particular character is a valid Java symbol. A scanner derived from AbstractScanner will want to override this method if the source language is not Java.
Parameters:
ch - the char to be checked
Returns:
true if ch is a valid Java symbol, false otherwise.
See Also:
checkSymbol(int)

checkWhitespace

protected int checkWhitespace(int initialState)
Check if the current character, in 'ch', is a Java whitespace character, and if so, advance the buffer pointer till the next non-whitespace character. A scanner derived from AbstractScanner will want to override this method if the source language is not Java.
Parameters:
initialState - The current parser state
Returns:
The BasicStyleMap.WHITESPACE constant if the current character is whitespace, -1 otherwise.
See Also:
BasicStyleMap.WHITESPACE

checkComment

protected int checkComment(int initialState)
Check if the current character, in 'ch', is part of a Java comment, and if so, advance the buffer pointer past the entire comment. This might cause the scanner to run to the end of the buffer, the EOF mark, without finding the end of the current comment. The 'stateFlags' variable can be checked after this function is called to determine if the scanner has found the end of the comment or not. A scanner derived from AbstractScanner will want to override this method if the source language is not Java.
Parameters:
initialState - The current parser state
Returns:
The BasicStyleMap.COMMENT constant if the current character is part of a comment section, -1 otherwise.
See Also:
BasicStyleMap.COMMENT
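
The interaction between checkComment and stateFlags can be sketched as follows. This is an illustration under assumptions, not the PrimeTime source: the numeric values of the state and token constants are invented, and the handling of "//" line comments and of the "/**/" corner case is a guess at reasonable Java behavior.

```java
// Hypothetical illustration of comment scanning with carried state.
class CheckCommentSketch {
    static final int NORMAL = 0, IN_COMMENT = 1, IN_JAVA_DOC = 2; // assumed values
    static final int COMMENT = 5;                                  // assumed value
    static final char EOF = '\0';

    final char[] buf;
    int bp;
    char ch;
    int stateFlags;

    CheckCommentSketch(String line) {
        buf = (line + EOF).toCharArray();
        ch = buf[bp];
    }

    void advance() { ch = buf[++bp]; }

    int checkComment(int initialState) {
        stateFlags = initialState;
        if (stateFlags == NORMAL) {
            if (ch != '/') return -1;
            char next = buf[bp + 1]; // safe: ch is not EOF, so bp + 1 is in range
            if (next == '/') {
                while (ch != EOF) advance(); // line comment runs to end of buffer
                return COMMENT;
            }
            if (next != '*') return -1;
            advance();
            advance(); // skip the opening "/*"
            stateFlags = IN_COMMENT;
            if (ch == '*' && buf[bp + 1] != '/') {
                stateFlags = IN_JAVA_DOC; // "/**" that is not the empty comment "/**/"
            }
        }
        // Inside a block comment: look for the closing "*/".
        while (ch != EOF) {
            if (ch == '*' && buf[bp + 1] == '/') {
                advance();
                advance();
                stateFlags = NORMAL;
                return COMMENT;
            }
            advance();
        }
        // Reached EOF without a terminator; stateFlags still marks the open comment.
        return COMMENT;
    }
}
```

After a call that runs to EOF, stateFlags remains IN_COMMENT or IN_JAVA_DOC, which is the value a scanLine-style caller would store for the line and pass in as initialState for the next one.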

checkIdentifier

protected int checkIdentifier(int initialState)
Check if the current character, in 'ch', is the start of a Java identifier, and if so, advance the buffer pointer past the identifier. In the default implementation, this method also checks if the identifier is a keyword or an extended keyword. A scanner derived from AbstractScanner will want to override this method if the source language is not Java.
Parameters:
initialState - The current parser state
Returns:
One of BasicStyleMap.RESERVED_WORD, BasicStyleMap.EXTRA_KEYWORD, or BasicStyleMap.IDENTIFIER if the current character starts a Java identifier, -1 otherwise.
See Also:
BasicStyleMap.RESERVED_WORD, BasicStyleMap.EXTRA_KEYWORD, BasicStyleMap.IDENTIFIER, isValidIdentifierStart(char), isValidIdentifierPart(char), isKeyword(java.lang.String), isExtendedKeyword(java.lang.String)
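
How checkIdentifier might combine the four helper methods is sketched below. The token values, the tiny keyword sets, and the decision to test isKeyword before isExtendedKeyword are assumptions made for illustration. Character.isJavaIdentifierStart and Character.isJavaIdentifierPart are real JDK methods, but whether the default implementation uses them is not stated here.

```java
import java.util.Set;

// Hypothetical illustration of identifier and keyword classification.
class CheckIdentifierSketch {
    // Assumed token values; the real ones are defined in BasicStyleMap.
    static final int IDENTIFIER = 2, RESERVED_WORD = 6, EXTRA_KEYWORD = 7;
    static final char EOF = '\0';

    // Tiny sample sets; a real subclass would supply its language's full lists.
    private static final Set<String> KEYWORDS = Set.of("class", "int", "return");
    private static final Set<String> EXTENDED = Set.of("String");

    final char[] buf;
    int bp;
    char ch;

    CheckIdentifierSketch(String line) {
        buf = (line + EOF).toCharArray();
        ch = buf[bp];
    }

    void advance() { ch = buf[++bp]; }

    boolean isKeyword(String str) { return KEYWORDS.contains(str); }

    boolean isExtendedKeyword(String str) { return EXTENDED.contains(str); }

    boolean isValidIdentifierStart(char c) {
        return Character.isJavaIdentifierStart(c);
    }

    boolean isValidIdentifierPart(char c) {
        // The EOF test is needed: isJavaIdentifierPart accepts '\0' as an
        // ignorable character, which would run the scan past the buffer.
        return c != EOF && Character.isJavaIdentifierPart(c);
    }

    int checkIdentifier(int initialState) {
        if (!isValidIdentifierStart(ch)) return -1;
        int start = bp;
        while (isValidIdentifierPart(ch)) advance();
        String word = new String(buf, start, bp - start);
        if (isKeyword(word)) return RESERVED_WORD;
        if (isExtendedKeyword(word)) return EXTRA_KEYWORD;
        return IDENTIFIER;
    }
}
```

A scanner for another language would typically only need to replace the two keyword predicates and, where the identifier rules differ, the two character predicates.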

checkNumber

protected int checkNumber(int initialState)
Check if the current character, in 'ch', is part of a Java number, and if so, advance the buffer pointer past the entire number. A scanner derived from AbstractScanner will want to override this method if the source language is not Java.
Parameters:
initialState - The current parser state
Returns:
The BasicStyleMap.NUMBER constant if the current character is part of a number, -1 otherwise.
See Also:
BasicStyleMap.NUMBER

checkString

protected int checkString(int initialState)
Check if the current character, in 'ch', is part of a Java string, and if so, advance the buffer pointer past the entire string. A scanner derived from AbstractScanner will want to override this method if the source language is not Java.
Parameters:
initialState - The current parser state
Returns:
The BasicStyleMap.STRING constant if the current character is part of a string, -1 otherwise.
See Also:
BasicStyleMap.STRING
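
A string check in the same style could be sketched as follows. This is an assumption-laden illustration, not the PrimeTime source: it only handles double-quoted literals with backslash escapes, treats an unterminated string as running to the EOF mark, and uses an invented value for the STRING constant.

```java
// Hypothetical illustration of string literal scanning.
class CheckStringSketch {
    static final int STRING = 8; // assumed value; the real one is in BasicStyleMap
    static final char EOF = '\0';

    final char[] buf;
    int bp;
    char ch;

    CheckStringSketch(String line) {
        buf = (line + EOF).toCharArray();
        ch = buf[bp];
    }

    void advance() { ch = buf[++bp]; }

    int checkString(int initialState) {
        if (ch != '"') return -1;
        advance(); // skip the opening quote
        while (ch != EOF && ch != '"') {
            if (ch == '\\' && buf[bp + 1] != EOF) {
                advance(); // skip the backslash so an escaped quote does not end the string
            }
            advance();
        }
        if (ch == '"') advance(); // consume the closing quote if there is one
        return STRING;
    }
}
```

Since Java string literals cannot span lines, this sketch needs no scanner state of its own; a language with multi-line strings would have to record an open string in stateFlags in the same way checkComment does for comments.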

checkSymbol

protected int checkSymbol(int initialState)
Check if the current character, in 'ch', is part of a Java symbol, and if so, advance the buffer pointer past the entire symbol. A scanner derived from AbstractScanner will want to override this method if the source language is not Java.
Parameters:
initialState - The current parser state
Returns:
The BasicStyleMap.SYMBOL constant if the current character is part of a symbol, -1 otherwise.
See Also:
BasicStyleMap.SYMBOL, isSymbol(char)