Class Tokenizer

java.lang.Object
mars.assembler.Tokenizer

public class Tokenizer extends Object
A tokenizer is capable of tokenizing a complete MIPS program, or a given line from a MIPS program. Since MIPS is line-oriented, each line defines a complete statement. Tokenizing is the process of analyzing the input MIPS program for the purpose of recognizing each MIPS language element. The types of language elements are known as "tokens". MIPS tokens are defined in the TokenTypes class.

Example:
The MIPS statement
    here: lw $t3, 8($t4) #load third member of array
generates the following token list:
    IDENTIFIER, COLON, OPERATOR, REGISTER_NAME, COMMA, INTEGER_5, LEFT_PAREN, REGISTER_NAME, RIGHT_PAREN, COMMENT
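The mapping from that statement to its token list can be illustrated with a small self-contained sketch. This is a toy scanner, not the MARS implementation: the class ToyLineTokenizer, the short mnemonic list, and the INTEGER_OTHER name are assumptions made for illustration. Only the token type names that appear in the example above are taken from this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy illustration only: NOT the MARS Tokenizer implementation.
final class ToyLineTokenizer {

    // Hypothetical, deliberately tiny list of instruction mnemonics.
    private static final List<String> MNEMONICS = Arrays.asList("lw", "sw", "add", "addi");

    private static boolean isDelimiter(char c) {
        return Character.isWhitespace(c) || c == '#' || c == ',' || c == ':' || c == '(' || c == ')';
    }

    /** Returns the token type names for one line of MIPS source. */
    static List<String> tokenTypes(String line) {
        List<String> types = new ArrayList<>();
        int i = 0;
        while (i < line.length()) {
            char c = line.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }
            if (c == '#') { types.add("COMMENT"); break; } // comment runs to end of line
            if (c == ',') { types.add("COMMA"); i++; continue; }
            if (c == ':') { types.add("COLON"); i++; continue; }
            if (c == '(') { types.add("LEFT_PAREN"); i++; continue; }
            if (c == ')') { types.add("RIGHT_PAREN"); i++; continue; }

            // Otherwise consume a word up to the next delimiter and classify it.
            int start = i;
            while (i < line.length() && !isDelimiter(line.charAt(i))) i++;
            String word = line.substring(start, i);

            if (word.startsWith("$")) {
                types.add("REGISTER_NAME");
            } else if (word.matches("\\d{1,2}") && Integer.parseInt(word) <= 31) {
                types.add("INTEGER_5"); // fits in 5 bits
            } else if (word.matches("\\d+")) {
                types.add("INTEGER_OTHER"); // hypothetical name for larger integers
            } else if (MNEMONICS.contains(word)) {
                types.add("OPERATOR");
            } else {
                types.add("IDENTIFIER");
            }
        }
        return types;
    }

    public static void main(String[] args) {
        System.out.println(tokenTypes("here: lw $t3, 8($t4) #load third member of array"));
    }
}
```

The sketch classifies a word as OPERATOR only when it appears in the mnemonic list, which is why the label here is reported as IDENTIFIER while lw is not.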
Version:
August 2003
Author:
Pete Sanderson
  • Constructor Details

    • Tokenizer

      public Tokenizer()
      Simple constructor. Initializes empty error list.
    • Tokenizer

      public Tokenizer(MIPSprogram program)
      Constructor for use with existing MIPSprogram. Designed to be used with Macro feature.
      Parameters:
      program - A previously-existing MIPSprogram object or null if none.
  • Method Details

    • tokenize

      public ArrayList<TokenList> tokenize(MIPSprogram p) throws ProcessingException
      Tokenizes a complete MIPS program. MIPS is line-oriented (not free-format), so tokenizing proceeds one source line at a time.
      Parameters:
      p - The MIPSprogram to be tokenized.
      Returns:
      An ArrayList representing the tokenized program. Each list member is a TokenList that represents a tokenized source statement from the MIPS program.
      Throws:
      ProcessingException
    • tokenizeExampleInstruction

      public TokenList tokenizeExampleInstruction(String example) throws ProcessingException
      Used only to create a token list for the example provided with each instruction specification.
      Parameters:
      example - The example MIPS instruction to be tokenized.
      Returns:
      A TokenList representing the tokenized instruction. Each list member is a Token that represents one language element.
      Throws:
      ProcessingException - This occurs only if the instruction specification itself contains one or more lexical (i.e. token) errors.
    • tokenizeLine

      public TokenList tokenizeLine(int lineNum, String theLine)
      Tokenizes one line of source code. If lexical errors are discovered, they are noted in an ErrorMessage object which is added to this Tokenizer's ErrorList. Does NOT throw an exception, so that tokenizing can continue past the first error.
      Parameters:
      lineNum - line number from source code (used in error message)
      theLine - String containing source code
      Returns:
      the generated token list for that line
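The collect-and-continue behavior described for tokenizeLine can be sketched in isolation. The class below is hypothetical and does not use any MARS types: CollectingScanner, its error strings, and its toy rule for what counts as a lexical error are all assumptions chosen to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the collect-and-continue pattern.
// CollectingScanner and its error strings are illustrative, not MARS classes.
final class CollectingScanner {
    private final List<String> errors = new ArrayList<>();

    /** Scans one line; malformed words are recorded as errors, never thrown. */
    List<String> scanLine(int lineNum, String line) {
        List<String> tokens = new ArrayList<>();
        for (String word : line.trim().split("[\\s,]+")) {
            if (word.isEmpty()) continue; // blank line
            boolean startsWithDigit = Character.isDigit(word.charAt(0));
            if (startsWithDigit && !word.matches("\\d+")) {
                // Toy rule: a word that starts like a number but is not one is invalid.
                errors.add("line " + lineNum + ": invalid token \"" + word + "\"");
                continue; // keep going so later errors on the line are reported too
            }
            tokens.add(word);
        }
        return tokens;
    }

    /** Returns the errors accumulated so far. */
    List<String> getErrors() {
        return errors;
    }
}
```

Because nothing is thrown during scanning, a caller can process every line and then inspect the accumulated errors once, which is the same reason this page gives for not throwing on the first lexical error.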
    • tokenizeLine

      public TokenList tokenizeLine(int lineNum, String theLine, ErrorList callerErrorList)
      Will tokenize one line of source code. If lexical errors are discovered, they are noted in an ErrorMessage object which is added to the provided ErrorList instead of the Tokenizer's error list. Will NOT throw an exception.
      Parameters:
      lineNum - line number from source code (used in error message)
      theLine - String containing source code
      callerErrorList - errors will go into this list instead of tokenizer's list.
      Returns:
      the generated token list for that line
    • tokenizeLine

      public TokenList tokenizeLine(int lineNum, String theLine, ErrorList callerErrorList, boolean doEqvSubstitutes)
      Will tokenize one line of source code. If lexical errors are discovered, they are noted in an ErrorMessage object which is added to the provided ErrorList instead of the Tokenizer's error list. Will NOT throw an exception.
      Parameters:
      lineNum - line number from source code (used in error message)
      theLine - String containing source code
      callerErrorList - errors will go into this list instead of tokenizer's list.
      doEqvSubstitutes - boolean param set true to perform .eqv substitutions, else false
      Returns:
      the generated token list for that line
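The effect of the doEqvSubstitutes flag can be pictured with a minimal sketch, assuming that an .eqv definition acts as a whole-token textual replacement. The class EqvSketch, its define method, and the symbol LIMIT are hypothetical and are not part of the MARS API; the real substitution logic may differ in detail.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: whole-token replacement controlled by a flag.
// Not the MARS implementation of .eqv handling.
final class EqvSketch {
    private final Map<String, String> eqvTable = new HashMap<>();

    /** Records that 'symbol' should be replaced by 'replacement'. */
    void define(String symbol, String replacement) {
        eqvTable.put(symbol, replacement);
    }

    /** Splits a line into words, replacing defined symbols only when the flag is true. */
    List<String> words(String line, boolean doEqvSubstitutes) {
        List<String> out = new ArrayList<>();
        for (String w : line.trim().split("[\\s,]+")) {
            if (w.isEmpty()) continue; // blank line
            boolean substitute = doEqvSubstitutes && eqvTable.containsKey(w);
            out.add(substitute ? eqvTable.get(w) : w);
        }
        return out;
    }
}
```

With LIMIT defined as 20, the line "addi $t0, $zero, LIMIT" yields 20 as its last word when the flag is true and LIMIT when it is false.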
    • tokenizeLine

      public TokenList tokenizeLine(MIPSprogram program, int lineNum, String theLine, boolean doEqvSubstitutes)
      Will tokenize one line of source code. If lexical errors are discovered, they are noted in an ErrorMessage object which is added to the Tokenizer's ErrorList (this overload takes no caller-supplied ErrorList). Will NOT throw an exception.
      Parameters:
      program - MIPSprogram containing this line of source
      lineNum - line number from source code (used in error message)
      theLine - String containing source code
      doEqvSubstitutes - boolean param set true to perform .eqv substitutions, else false
      Returns:
      the generated token list for that line
    • getErrors

      public ErrorList getErrors()
      Fetch this Tokenizer's error list.
      Returns:
      the error list