Puma Reference Manual | Puma::Syntax Class Reference
Syntactic analysis base class.
#include <Puma/Syntax.h>
Inherited by Puma::CSyntax.
Classes | |
class | State |
Parser state, the current position in the token stream. | |
Public Member Functions | |
CTree * | run (TokenProvider &tp) |
Start the parse process. | |
template<class T > | |
CTree * | run (TokenProvider &tp, CTree *(T::*rule)()) |
Start the parse process at a specific grammar rule. | |
virtual void | configure (Config &c) |
Configure the syntactic analysis object. | |
TokenProvider * | provider () const |
Get the token provider from which the parsed tokens are read. | |
Token * | problem () const |
Get the last token that could not be parsed. | |
bool | error () const |
Check if errors occurred during the parse process. | |
bool | look_ahead (int token_type, unsigned n=1) |
Look-ahead n core language tokens and check if the n-th token has the given type. | |
bool | look_ahead (int *token_types, unsigned n=1) |
Look-ahead n core language tokens and check if the n-th token has one of the given types. | |
int | look_ahead (unsigned n=1) |
Look-ahead one core language token. | |
bool | consume () |
Consume all tokens until the next core language token. | |
Public Attributes | |
TokenProvider * | token_provider |
Token provider for getting the tokens to parse. | |
Protected Member Functions | |
Syntax (Builder &b, Semantic &s) | |
Constructor. | |
virtual | ~Syntax () |
Destructor. | |
template<class T > | |
bool | parse (CTree *(T::*rule)()) |
Parse the given grammar rule. | |
template<class T > | |
bool | seq (CTree *(T::*rule)()) |
Parse a sequence of the given grammar rule. | |
template<class T > | |
bool | seq (bool(T::*rule)()) |
Parse a sequence of the given grammar rule. | |
template<class T > | |
bool | list (CTree *(T::*rule)(), int separator, bool trailing_separator=false) |
Parse a sequence of rule-separator pairs. | |
template<class T > | |
bool | list (CTree *(T::*rule)(), int *separators, bool trailing_separator=false) |
Parse a sequence of rule-separator pairs. | |
template<class T > | |
bool | list (bool(T::*rule)(), int separator, bool trailing_separator=false) |
Parse a sequence of rule-separator pairs. | |
template<class T > | |
bool | list (bool(T::*rule)(), int *separators, bool trailing_separator=false) |
Parse a sequence of rule-separator pairs. | |
template<class T > | |
bool | catch_error (CTree *(T::*rule)(), const char *msg, int *finish_tokens, int *skip_tokens) |
Parse a grammar rule automatically catching parse errors. | |
bool | parse (int token_type) |
Parse a token with the given type. | |
bool | parse (int *token_types) |
Parse a token with one of the given types. | |
bool | parse_token (int token_type) |
Parse a token with the given type. | |
bool | opt (bool dummy) const |
Optional rule parsing. | |
Builder & | builder () const |
Get the syntax tree builder. | |
Semantic & | semantic () const |
Get the semantic analysis object. | |
virtual CTree * | trans_unit () |
Top parse rule to be reimplemented for a specific grammar. | |
virtual void | handle_directive () |
Handle a compiler directive token. | |
State | save_state () |
Save the current parser state. | |
void | forget_state () |
Forget the saved parser state. | |
void | restore_state () |
Restore the saved parser state. | |
void | restore_state (State state) |
Restore the saved parser state to the given state. | |
void | set_state (State state) |
Overwrite the parser state with the given state. | |
bool | accept (CTree *tree, State state) |
Accept the given syntax tree node. | |
CTree * | accept (CTree *tree) |
Accept the given syntax tree node. | |
Token * | locate_token () |
Skip all non-core language tokens until the next core-language token is read. | |
void | skip () |
Skip the current token. | |
void | skip_block (int start, int end) |
Skip all tokens between start and end, including start and end token. | |
void | skip_curly_block () |
Skip all tokens between '{' and '}', including '{' and '}'. | |
void | skip_round_block () |
Skip all tokens between '(' and ')', including '(' and ')'. | |
void | parse_block (int start, int end) |
Parse all tokens between start and end, including start and end token. | |
void | parse_curly_block () |
Parse all tokens between '{' and '}', including '{' and '}'. | |
void | parse_round_block () |
Parse all tokens between '(' and ')', including '(' and ')'. | |
bool | skip (int stop_token, bool inclusive=true) |
Skip all tokens until a token with the given type is read. | |
bool | skip (int *stop_tokens, bool inclusive=true) |
Skip all tokens until a token with one of the given types is read. | |
bool | is_in (int token_type, int *token_types) const |
Check if the given token type is in the set of given token types. |
Syntactic analysis base class.
Implements the top-down parsing algorithm (recursive descent parser). To be derived from to implement parsers for specific grammars. Provides infinite look-ahead.
This class uses a tree builder object (see Builder) to create the syntax tree, and a semantic analysis object (see Semantic) to perform required semantic analyses of the parsed code.
The parse process is started by calling Syntax::run() with a token provider as argument. Using the token provider this method reads the first core language token from the input source code and tries to parse it by applying the top grammar rule.
return parse(&Puma::Syntax::trans_unit) ? builder().Top() : (Puma::CTree*)0;
The top grammar rule has to be provided by reimplementing method Syntax::trans_unit(). It may call sub-rules according to the implemented language-specific grammar. Example:
Puma::CTree *MySyntax::trans_unit() {
  return parse(&MySyntax::block_seq) ? builder().block_seq() : (Puma::CTree*)0;
}
For context-sensitive grammars it may be necessary for the grammar rules to perform initial semantic analyses of the parsed code (to differentiate ambiguous syntactic constructs, resolve names, detect errors, and so on). Example:
Puma::CTree *MySyntax::block() {
  // '{' instruction instruction ... '}'
  if (parse(TOK_OPEN_CURLY)) {         // parse '{'
    semantic().enter_block();          // enter block scope
    seq(&MySyntax::instruction);       // parse sequence of instructions
    semantic().leave_block();          // leave block scope
    if (parse(TOK_CLOSE_CURLY)) {      // parse '}'
      return builder().block();        // build syntax tree for the block
    }
  }
  return (Puma::CTree*)0;              // rule failed
}
If a rule could be parsed successfully the tree builder is used to create a CTree based syntax tree (fragment) for the parsed rule. Failing grammar rules shall return NULL. The result of the top grammar rule is the root node of the abstract syntax tree for the whole input source code.
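The rule-based, backtracking style described above can be illustrated without Puma itself. The following is a minimal sketch, not Puma code: `ToyParser`, its token constants, and the helper `parses()` are hypothetical names, the "parser state" is assumed to be just an index into a token vector, and no syntax tree is built. It only shows how a rule is tried, how the position is rewound when the rule fails, and how a block rule like the one above composes token matches and a sequence.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical token types; these are not Puma's token constants.
enum Tok { OPEN_CURLY = 1, CLOSE_CURLY, INSTR };

// Toy stand-in for a Syntax-like parser. The whole parser state is
// assumed to be a single index into the token vector.
class ToyParser {
public:
  explicit ToyParser(std::vector<int> tokens) : toks_(std::move(tokens)) {}

  // Top rule: one block, followed by the end of the input.
  bool run() { return parse(&ToyParser::block) && pos_ == toks_.size(); }

private:
  // Try a rule; if it fails, rewind to the position saved before the attempt.
  bool parse(bool (ToyParser::*rule)()) {
    std::size_t saved = pos_;            // save state
    if ((this->*rule)()) return true;
    pos_ = saved;                        // restore state
    return false;
  }

  // Match a single token of the given type.
  bool parse(int type) {
    if (pos_ < toks_.size() && toks_[pos_] == type) { ++pos_; return true; }
    return false;
  }

  // One or more repetitions of a rule.
  bool seq(bool (ToyParser::*rule)()) {
    if (!parse(rule)) return false;
    while (parse(rule)) {}
    return true;
  }

  bool instruction() { return parse(INSTR); }

  // block: '{' instruction ... '}'
  bool block() {
    if (!parse(OPEN_CURLY)) return false;
    seq(&ToyParser::instruction);        // result ignored, as in the example above
    return parse(CLOSE_CURLY);
  }

  std::vector<int> toks_;
  std::size_t pos_ = 0;
};

// Convenience wrapper: true if the token sequence forms exactly one block.
bool parses(std::vector<int> tokens) { return ToyParser(std::move(tokens)).run(); }
```

Calling `parses({OPEN_CURLY, INSTR, INSTR, CLOSE_CURLY})` succeeds, while a sequence missing the closing token fails and leaves the toy parser rewound to its start position. A real Syntax subclass would additionally save and restore the builder and semantic objects, which this sketch leaves out.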
Puma::Syntax::Syntax | ( | Builder & | b, | |
Semantic & | s | |||
) | [protected] |
Constructor.
b | The syntax tree builder. | |
s | The semantic analysis object. |
virtual Puma::Syntax::~Syntax | ( | ) | [inline, protected, virtual] |
Destructor.
CTree * Puma::Syntax::accept | ( | CTree * | tree | ) | [protected] |
Accept the given syntax tree node.
Returns the given node.
tree | Tree to accept. |
bool Puma::Syntax::accept | ( | CTree * | tree, | |
State | state | |||
) | [protected] |
Accept the given syntax tree node.
If the node is NULL then the parser state is restored to the given state. Otherwise all saved states are discarded.
tree | Tree to accept. | |
state | The saved state. |
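The contract of the two-argument accept() can be sketched in isolation. This is an illustration under assumptions, not the Puma implementation: `Node` and `ToyState` are hypothetical, the state is modelled as a plain position, and the return value (true for a non-NULL node) is assumed, since this page documents only the return type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Node {};                        // hypothetical stand-in for a CTree node

struct ToyState {
  std::size_t pos = 0;                 // current position in the token stream
  std::vector<std::size_t> saved;      // stack of saved positions

  // Remember the current position and hand it back to the caller.
  std::size_t save_state() { saved.push_back(pos); return pos; }

  // NULL node: rewind to `state`. Otherwise: discard all saved states.
  // Returning whether the node was non-NULL is an assumption.
  bool accept(Node *tree, std::size_t state) {
    if (!tree) { pos = state; return false; }
    saved.clear();
    return true;
  }
};
```

A rule can thus save the state on entry and end with a single accept() call that either commits the parsed node or rewinds the parser.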
Builder & Puma::Syntax::builder | ( | ) | const [inline, protected] |
Get the syntax tree builder.
Reimplemented in Puma::CCSyntax.
bool Puma::Syntax::catch_error | ( | CTree *(T::*)() | rule, | |
const char * | msg, | |||
int * | finish_tokens, | |||
int * | skip_tokens | |||
) | [inline, protected] |
Parse a grammar rule automatically catching parse errors.
rule | The rule to parse. | |
msg | The error message to show if the rule fails. | |
finish_tokens | Set of token types that abort parsing the rule. | |
skip_tokens | If the rule fails skip all tokens until a token is read that has one of the types given here. |
virtual void Puma::Syntax::configure | ( | Config & | c | ) | [inline, virtual] |
Configure the syntactic analysis object.
c | The configuration object. |
Reimplemented in Puma::CCSyntax, and Puma::CSyntax.
bool Puma::Syntax::consume | ( | ) | [inline] |
Consume all tokens until the next core language token.
bool Puma::Syntax::error | ( | ) | const [inline] |
Check if errors occurred during the parse process.
void Puma::Syntax::forget_state | ( | ) | [protected] |
Forget the saved parser state.
void Puma::Syntax::handle_directive | ( | ) | [inline, protected, virtual] |
Handle a compiler directive token.
The default handling is to skip the compiler directive.
Reimplemented in Puma::CSyntax.
bool Puma::Syntax::is_in | ( | int | token_type, | |
int * | token_types | |||
) | const [protected] |
Check if the given token type is in the set of given token types.
token_type | The token type to check. | |
token_types | The set of token types. |
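Token type sets are passed as plain `int *` arrays. How such an array is terminated is not stated on this page; the sketch below assumes a terminating 0 entry. `is_in_set` is a hypothetical free function used for illustration, not the member function itself.

```cpp
#include <cassert>

// Membership test over a token type set.
// Assumption: the array ends with a 0 entry (not confirmed by this page).
bool is_in_set(int token_type, const int *token_types) {
  for (const int *t = token_types; *t != 0; ++t) {
    if (*t == token_type) return true;
  }
  return false;
}
```

Under this assumption a set would be declared as, for example, `int stop[] = { 3, 7, 0 };`, and 0 itself cannot be a member of a set.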
bool Puma::Syntax::list | ( | bool(T::*)() | rule, | |
int * | separators, | |||
bool | trailing_separator = false | |||
) | [inline, protected] |
Parse a sequence of rule-separator pairs.
rule | The rule to parse at least once. | |
separators | The separator tokens. | |
trailing_separator | True if a trailing separator token is allowed. |
bool Puma::Syntax::list | ( | bool(T::*)() | rule, | |
int | separator, | |||
bool | trailing_separator = false | |||
) | [inline, protected] |
Parse a sequence of rule-separator pairs.
rule | The rule to parse at least once. | |
separator | The separator token. | |
trailing_separator | True if a trailing separator token is allowed. |
bool Puma::Syntax::list | ( | CTree *(T::*)() | rule, | |
int * | separators, | |||
bool | trailing_separator = false | |||
) | [inline, protected] |
Parse a sequence of rule-separator pairs.
rule | The rule to parse at least once. | |
separators | The separator tokens. | |
trailing_separator | True if a trailing separator token is allowed. |
bool Puma::Syntax::list | ( | CTree *(T::*)() | rule, | |
int | separator, | |||
bool | trailing_separator = false | |||
) | [inline, protected] |
Parse a sequence of rule-separator pairs.
rule | The rule to parse at least once. | |
separator | The separator token. | |
trailing_separator | True if a trailing separator token is allowed. |
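The shape of input accepted by the list() overloads can be sketched with a standalone function. This is an illustration only: `parse_list`, `ITEM`, and `COMMA` are hypothetical, the rule is reduced to matching a single token, and the handling of a dangling separator when `trailing_separator` is false (left unconsumed here) is an assumption about behaviour this page does not specify.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum Tok { ITEM = 1, COMMA };          // hypothetical token types

// Matches ITEM (COMMA ITEM)* starting at toks[pos], plus one optional final
// COMMA when trailing_separator is true. On success pos is advanced past the
// matched tokens; on failure pos is left unchanged.
bool parse_list(const std::vector<int> &toks, std::size_t &pos,
                bool trailing_separator = false) {
  std::size_t p = pos;
  if (p >= toks.size() || toks[p] != ITEM) return false;  // at least one item
  ++p;
  while (p < toks.size() && toks[p] == COMMA) {
    if (p + 1 < toks.size() && toks[p + 1] == ITEM) {
      p += 2;                          // separator followed by another item
    } else {
      if (trailing_separator) ++p;     // consume the dangling separator
      break;                           // otherwise leave it unconsumed
    }
  }
  pos = p;
  return true;
}
```

For the input `ITEM COMMA ITEM COMMA` the function stops after the second item by default and consumes all four tokens when `trailing_separator` is true.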
Token* Puma::Syntax::locate_token | ( | ) | [protected] |
Skip all non-core language tokens until the next core-language token is read.
int Puma::Syntax::look_ahead | ( | unsigned | n = 1 |
) | [inline] |
Look-ahead one core language token.
n | The number of tokens to look-ahead. |
bool Puma::Syntax::look_ahead | ( | int * | token_types, | |
unsigned | n = 1 | |||
) |
Look-ahead n core language tokens and check if the n-th token has one of the given types.
token_types | The possible types of the n-th token. | |
n | The number of tokens to look-ahead. |
bool Puma::Syntax::look_ahead | ( | int | token_type, | |
unsigned | n = 1 | |||
) |
Look-ahead n core language tokens and check if the n-th token has the given type.
token_type | The type of the n-th token. | |
n | The number of tokens to look-ahead. |
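Look-ahead counts only core language tokens. The sketch below shows one way this could work, assuming each token carries a flag saying whether it belongs to the core language (as opposed to, say, white space, comments, or directives). `Token`, `peek_type`, and `peek_is` are hypothetical names, and returning 0 when too few tokens remain is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Token {
  int type;
  bool core;   // false for tokens that are not part of the core language
};

// Type of the n-th core token at or after pos, or 0 if fewer than n remain.
// Nothing is consumed: pos is passed by value.
int peek_type(const std::vector<Token> &toks, std::size_t pos, unsigned n = 1) {
  if (n == 0) return 0;
  for (std::size_t i = pos; i < toks.size(); ++i) {
    if (toks[i].core && --n == 0) return toks[i].type;
  }
  return 0;
}

// True if the n-th core token at or after pos has the given type.
bool peek_is(const std::vector<Token> &toks, std::size_t pos,
             int token_type, unsigned n = 1) {
  return peek_type(toks, pos, n) == token_type;
}
```

Because the position is never advanced, a rule can inspect several tokens ahead before deciding which alternative to parse.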
bool Puma::Syntax::opt | ( | bool | dummy | ) | const [inline, protected] |
Optional rule parsing.
Always succeeds regardless of the argument.
dummy | Dummy parameter, is not evaluated. |
bool Puma::Syntax::parse | ( | int * | token_types | ) | [protected] |
Parse a token with one of the given types.
token_types | The token types. |
bool Puma::Syntax::parse | ( | int | token_type | ) | [inline, protected] |
Parse a token with the given type.
token_type | The token type. |
bool Puma::Syntax::parse | ( | CTree *(T::*)() | rule | ) | [inline, protected] |
Parse the given grammar rule.
Saves the current state of the builder, semantic, and token provider objects.
rule | The rule to parse. |
void Puma::Syntax::parse_block | ( | int | start, | |
int | end | |||
) | [protected] |
Parse all tokens between start and end, including start and end token.
start | The start token type. | |
end | The end token type. |
void Puma::Syntax::parse_curly_block | ( | ) | [protected] |
Parse all tokens between '{' and '}', including '{' and '}'.
void Puma::Syntax::parse_round_block | ( | ) | [protected] |
Parse all tokens between '(' and ')', including '(' and ')'.
bool Puma::Syntax::parse_token | ( | int | token_type | ) | [protected] |
Parse a token with the given type.
token_type | The token type. |
Token * Puma::Syntax::problem | ( | ) | const [inline] |
Get the last token that could not be parsed.
TokenProvider* Puma::Syntax::provider | ( | ) | const [inline] |
Get the token provider from which the parsed tokens are read.
void Puma::Syntax::restore_state | ( | State | state | ) | [protected] |
Restore the saved parser state to the given state.
Triggers restoring the syntax and semantic trees.
state | The state to which to restore. |
void Puma::Syntax::restore_state | ( | ) | [protected] |
Restore the saved parser state.
Triggers restoring the syntax and semantic trees to the saved state.
CTree * Puma::Syntax::run | ( | TokenProvider & | tp, | |
CTree *(T::*)() | rule | |||
) | [inline] |
Start the parse process at a specific grammar rule.
tp | The token provider from where to get the tokens to parse. | |
rule | The grammar rule where to start. |
CTree* Puma::Syntax::run | ( | TokenProvider & | tp | ) |
Start the parse process.
tp | The token provider from where to get the tokens to parse. |
State Puma::Syntax::save_state | ( | ) | [protected] |
Save the current parser state.
Calls save_state() on the builder, semantic, and token provider objects.
Semantic & Puma::Syntax::semantic | ( | ) | const [inline, protected] |
Get the semantic analysis object.
Reimplemented in Puma::CCSyntax.
bool Puma::Syntax::seq | ( | bool(T::*)() | rule | ) | [inline, protected] |
Parse a sequence of the given grammar rule.
rule | The rule to parse at least once. |
bool Puma::Syntax::seq | ( | CTree *(T::*)() | rule | ) | [inline, protected] |
Parse a sequence of the given grammar rule.
rule | The rule to parse at least once. |
void Puma::Syntax::set_state | ( | State | state | ) | [protected] |
Overwrite the parser state with the given state.
state | The new parser state. |
bool Puma::Syntax::skip | ( | int * | stop_tokens, | |
bool | inclusive = true | |||
) | [protected] |
Skip all tokens until a token with one of the given types is read.
stop_tokens | The token types at which to stop. | |
inclusive | If true, the stop token is skipped too. |
bool Puma::Syntax::skip | ( | int | stop_token, | |
bool | inclusive = true | |||
) | [protected] |
Skip all tokens until a token with the given type is read.
stop_token | The type of the token at which to stop. | |
inclusive | If true, the stop token is skipped too. |
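The effect of the `inclusive` flag can be shown with a small standalone function. `skip_until` is a hypothetical name, the token stream is modelled as a vector of type codes, and the return value (true if a stop token was found) is an assumption, since this page documents only the return type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Advance pos to the first token of type stop_token. If inclusive is true,
// pos ends up just behind the stop token, otherwise directly on it.
// If no stop token is found, pos moves to the end and false is returned.
bool skip_until(const std::vector<int> &toks, std::size_t &pos,
                int stop_token, bool inclusive = true) {
  for (std::size_t i = pos; i < toks.size(); ++i) {
    if (toks[i] == stop_token) {
      pos = inclusive ? i + 1 : i;
      return true;
    }
  }
  pos = toks.size();
  return false;
}
```

Skipping non-inclusively is useful when the stop token still has to be matched by the calling rule, for example a closing brace.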
void Puma::Syntax::skip | ( | ) | [protected] |
Skip the current token.
void Puma::Syntax::skip_block | ( | int | start, | |
int | end | |||
) | [protected] |
Skip all tokens between start and end, including start and end token.
start | The start token type. | |
end | The end token type. |
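Skipping a delimited block usually has to account for nested blocks. Whether skip_block() tracks nesting is not stated on this page; the sketch below assumes it does and counts depth. `skip_balanced` is a hypothetical free function, and unlike the member function it reports failure through a return value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Advance pos past a balanced start ... end block, including both delimiters.
// Returns false and leaves pos unchanged if no block starts at pos or the
// block is never closed.
bool skip_balanced(const std::vector<int> &toks, std::size_t &pos,
                   int start, int end) {
  if (pos >= toks.size() || toks[pos] != start) return false;
  int depth = 0;
  for (std::size_t i = pos; i < toks.size(); ++i) {
    if (toks[i] == start) {
      ++depth;                         // entering a (possibly nested) block
    } else if (toks[i] == end && --depth == 0) {
      pos = i + 1;                     // just behind the matching end token
      return true;
    }
  }
  return false;
}
```

With depth counting, the inner pair in `{ x { } } y` does not end the skip early; the position ends up on `y`.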
void Puma::Syntax::skip_curly_block | ( | ) | [protected] |
Skip all tokens between '{' and '}', including '{' and '}'.
void Puma::Syntax::skip_round_block | ( | ) | [protected] |
Skip all tokens between '(' and ')', including '(' and ')'.
CTree * Puma::Syntax::trans_unit | ( | ) | [inline, protected, virtual] |
Top parse rule to be reimplemented for a specific grammar.
Reimplemented in Puma::CSyntax.
TokenProvider * Puma::Syntax::token_provider |
Token provider for getting the tokens to parse.