DynGenPar
Dynamic Generalized Parser
Public Member Functions | Protected Attributes | List of all members
DynGenPar::Parser Class Reference

main class More...

Inheritance diagram for DynGenPar::Parser:
Inheritance graph

Public Member Functions

 Parser (TokenSource *tokenSource)
 
virtual ~Parser ()
 
bool isToken (CatArg cat) const
 
void addToken (CatArg cat)
 
bool isLiteral (const QList< Cat > &list) const
 is a given list of categories a literal? More...
 
void initCaches ()
 clears all caches, then computes the nullable categories and the initial graph More...
 
void addRule (CatArg cat, const Rule &rule)
 adds a new rule to the grammar, updates the nullable categories and the initial graph and clears the other caches More...
 
void loadCfg (const Cfg &cfg)
 
Cfg getCfg ()
 get a Cfg object back from the parser, for serialization More...
 
bool loadPmcfg (const Pmcfg &pmcfg)
 loads a PMCFG in standard form, converting it to the internal representation More...
 
bool addPmcfgRule (Pmcfg &pmcfg, CatArg cat, const Rule &rule)
 adds a new rule to the grammar (both the PMCFG and the internal representation), updates the nullable categories and the initial graph and clears the other caches More...
 
QList< Matchparse (int *errorPos=0, Cat *errorToken=0, int *incrementalPos=0, QList< StackItem > *incrementalStacks=0, QList< Match > *incrementalMatches=0, LexerState *lexerState=0)
 parse the input string More...
 
QList< Matchparse (ParseState *parseState)
 overloaded version using ParseState, for bindings More...
 
void saveState (ParseState *parseState)
 saves the current parser state, only meaningful during a parse operation More...
 
Predictions computePredictions (const QList< StackItem > &stacks) const
 compute a set of predictions from the stacks produced by an incremental parse More...
 
Predictions computePredictions (const ParseState &parseState) const
 overloaded version using ParseState, for bindings More...
 
QHash< Cat, QSet< Cat > > expandNonterminalPrediction (CatArg cat) const
 expand a nonterminal prediction to the possible initial tokens and the nonterminals they immediately reduce to (for categorization), using recursive descent More...
 
QHash< Cat, QSet< Cat > > expandNonterminalPredictionC (CatArg cat)
 expand a nonterminal prediction to the possible initial tokens and the nonterminals they immediately reduce to (for categorization), using recursive descent More...
 
MultiPredictions computeMultiPredictions (const QList< StackItem > &stacks) const
 compute a set of multi-token predictions from the stacks produced by an incremental parse More...
 
MultiPredictions computeMultiPredictions (const ParseState &parseState) const
 overloaded version using ParseState, for bindings More...
 
QHash< Cat, QSet< QList< Cat > > > expandNonterminalPredictionMulti (CatArg cat) const
 expand a nonterminal prediction to the possible initial nonempty literals (strings of one or more tokens) and the nonterminals they immediately reduce to (for categorization), using recursive descent More...
 
QHash< Cat, QSet< QList< Cat > > > expandNonterminalPredictionMultiC (CatArg cat)
 expand a nonterminal prediction to the possible initial nonempty literals (strings of one or more tokens) and the nonterminals they immediately reduce to (for categorization), using recursive descent More...
 
ConstrainedPredictions computeConstrainedPredictions (const QList< StackItem > &stacks) const
 compute a set of predictions from the stacks produced by an incremental parse More...
 
ConstrainedPredictions computeConstrainedPredictions (const ParseState &parseState) const
 overloaded version using ParseState, for bindings More...
 
QHash< Cat, QSet< Cat > > expandNonterminalPredictionC (CatArg cat, const NextTokenConstraints &nextTokenConstraints)
 expand a nonterminal prediction to the possible initial tokens and the nonterminals they immediately reduce to (for categorization), using recursive descent More...
 
QHash< Cat, QSet< Cat > > expandNonterminalPredictionC (CatArg cat, const QList< NextTokenConstraints > &nextTokenConstraintsList)
 expand a nonterminal prediction to the possible initial tokens and the nonterminals they immediately reduce to (for categorization), using recursive descent More...
 
ConstrainedMultiPredictions computeConstrainedMultiPredictions (const QList< StackItem > &stacks) const
 compute a set of multi-token predictions from the stacks produced by an incremental parse More...
 
ConstrainedMultiPredictions computeConstrainedMultiPredictions (const ParseState &parseState) const
 overloaded version using ParseState, for bindings More...
 
QHash< Cat, QSet< QList< Cat > > > expandNonterminalPredictionMultiC (CatArg cat, const NextTokenConstraints &nextTokenConstraints)
 expand a nonterminal prediction to the possible initial nonempty literals (strings of one or more tokens) and the nonterminals they immediately reduce to (for categorization), using recursive descent More...
 
QHash< Cat, QSet< QList< Cat > > > expandNonterminalPredictionMultiC (CatArg cat, const QList< NextTokenConstraints > &nextTokenConstraintsList)
 expand a nonterminal prediction to the possible initial nonempty literals (strings of one or more tokens) and the nonterminals they immediately reduce to (for categorization), using recursive descent More...
 

Public Attributes

grammar
RuleSet rules
 grammar rules More...
 
TokenSet tokens
 tokens More...
 
Cat startCat
 start category More...
 
additional information needed for PMCFGs
QHash< Cat, QPair< Cat, QList< Cat > > > pseudoCats
 pseudo-categories, used to represent PMCFGs internally More...
 
QHash< Cat, QPair< Cat, int > > componentCats
 maps categories which represent components of a multi-component category to the category and component index they represent More...
 
QHash< Cat, QList< Cat > > catComponents
 maps multi-component categories to the list of their components More...
 

Protected Attributes

TokenSourceinputSource
 input source More...
 

Detailed Description

main class

Definition at line 1158 of file dyngenpar.h.

Constructor & Destructor Documentation

◆ Parser()

DynGenPar::Parser::Parser ( TokenSource tokenSource)
inline

Definition at line 1161 of file dyngenpar.h.

◆ ~Parser()

virtual DynGenPar::Parser::~Parser ( )
inlinevirtual

Definition at line 1163 of file dyngenpar.h.

Member Function Documentation

◆ addPmcfgRule()

bool DynGenPar::Parser::addPmcfgRule ( Pmcfg pmcfg,
CatArg  cat,
const Rule rule 
)

adds a new rule to the grammar (both the PMCFG and the internal representation), updates the nullable categories and the initial graph and clears the other caches

Returns
true on success, false on failure
Note
Functions can be added by simply calling Pmcfg::addFunction on the pmcfg object (for named functions) or appending to the pmcfg object's Pmcfg::functions member (for unnamed functions).

Definition at line 974 of file dyngenpar.cpp.

◆ addRule()

void DynGenPar::Parser::addRule ( CatArg  cat,
const Rule rule 
)

adds a new rule to the grammar, updates the nullable categories and the initial graph and clears the other caches

Definition at line 689 of file dyngenpar.cpp.

◆ addToken()

void DynGenPar::Parser::addToken ( CatArg  cat)
inline

Definition at line 1165 of file dyngenpar.h.

◆ computeConstrainedMultiPredictions() [1/2]

ConstrainedMultiPredictions DynGenPar::Parser::computeConstrainedMultiPredictions ( const QList< StackItem > &  stacks) const

compute a set of multi-token predictions from the stacks produced by an incremental parse

Returns
a table of extended categories which are valid continuations for the current input

An extended category is either a nonterminal or a "literal", meaning a nonempty list of tokens appearing in sequence in a rule.

The table is represented as a (possibly multi-valued) hash table mapping a list of categories (containing either exactly one nonterminal or at least one terminal) to

  1. another list of categories representing the completed literal in case of a literal (e.g. if the complete literal is "abc" and the user already entered "a", the entry will have key "bc" and value "abc"), and reproducing the key otherwise,
  2. the nonterminal the literal appears in (or the nonterminal itself if the predicted category is a nonterminal) and
  3. for nonterminals, associated next token constraints.

For terminal/literal predictions, the constraints are immediately validated. For nonterminal predictions, this must be done during expansion.

Definition at line 3299 of file dyngenpar.cpp.

◆ computeConstrainedMultiPredictions() [2/2]

ConstrainedMultiPredictions DynGenPar::Parser::computeConstrainedMultiPredictions ( const ParseState parseState) const
inline

overloaded version using ParseState, for bindings

Definition at line 1240 of file dyngenpar.h.

◆ computeConstrainedPredictions() [1/2]

ConstrainedPredictions DynGenPar::Parser::computeConstrainedPredictions ( const QList< StackItem > &  stacks) const

compute a set of predictions from the stacks produced by an incremental parse

Returns
a table of (terminal or nonterminal) categories which are valid continuations for the current input and associated next token constraints

For terminal predictions, the constraints are immediately validated. For nonterminal predictions, this must be done during expansion.

Warning
This prediction method does not support multi-token literals.

Definition at line 3147 of file dyngenpar.cpp.

◆ computeConstrainedPredictions() [2/2]

ConstrainedPredictions DynGenPar::Parser::computeConstrainedPredictions ( const ParseState parseState) const
inline

overloaded version using ParseState, for bindings

Definition at line 1229 of file dyngenpar.h.

◆ computeMultiPredictions() [1/2]

MultiPredictions DynGenPar::Parser::computeMultiPredictions ( const QList< StackItem > &  stacks) const

compute a set of multi-token predictions from the stacks produced by an incremental parse

Returns
a table of extended categories which are valid continuations for the current input.

An extended category is either a nonterminal or a "literal", meaning a nonempty list of tokens appearing in sequence in a rule.

The table is represented as a (possibly multi-valued) hash table mapping a list of categories (containing either exactly one nonterminal or at least one terminal) to

  1. another list of categories representing the completed literal in case of a literal (e.g. if the complete literal is "abc" and the user already entered "a", the entry will have key "bc" and value "abc"), and reproducing the key otherwise, and
  2. the nonterminal the literal appears in (or the nonterminal itself if the predicted category is a nonterminal).
Warning
This prediction method does not support next token constraints.

Definition at line 2806 of file dyngenpar.cpp.

◆ computeMultiPredictions() [2/2]

MultiPredictions DynGenPar::Parser::computeMultiPredictions ( const ParseState parseState) const
inline

overloaded version using ParseState, for bindings

Definition at line 1218 of file dyngenpar.h.

◆ computePredictions() [1/2]

Predictions DynGenPar::Parser::computePredictions ( const QList< StackItem > &  stacks) const

compute a set of predictions from the stacks produced by an incremental parse

Returns
a set of (terminal or nonterminal) categories which are valid continuations for the current input
Warning
This prediction method does not support multi-token literals nor next token constraints.

Definition at line 2515 of file dyngenpar.cpp.

◆ computePredictions() [2/2]

Predictions DynGenPar::Parser::computePredictions ( const ParseState parseState) const
inline

overloaded version using ParseState, for bindings

Definition at line 1210 of file dyngenpar.h.

◆ expandNonterminalPrediction()

QHash< Cat, QSet< Cat > > DynGenPar::Parser::expandNonterminalPrediction ( CatArg  cat) const

expand a nonterminal prediction to the possible initial tokens and the nonterminals they immediately reduce to (for categorization), using recursive descent

only follow the leftmost branch and ignore left recursion because it does not affect the starting tokens

also expand each category only once because we do not need the full parse trees, only the last category

Warning
This expansion method does not honor context-sensitive constraints (PMCFG constraints, next token constraints) attached to the skipped epsilon matches.

Definition at line 2575 of file dyngenpar.cpp.

◆ expandNonterminalPredictionC() [1/3]

QHash< Cat, QSet< Cat > > DynGenPar::Parser::expandNonterminalPredictionC ( CatArg  cat)

expand a nonterminal prediction to the possible initial tokens and the nonterminals they immediately reduce to (for categorization), using recursive descent

only follow the leftmost branch and ignore left recursion because it does not affect the starting tokens

also match all the nullable categories encountered to epsilon, and collect and enforce any context-sensitive constraints

Definition at line 2774 of file dyngenpar.cpp.

◆ expandNonterminalPredictionC() [2/3]

QHash< Cat, QSet< Cat > > DynGenPar::Parser::expandNonterminalPredictionC ( CatArg  cat,
const NextTokenConstraints nextTokenConstraints 
)

expand a nonterminal prediction to the possible initial tokens and the nonterminals they immediately reduce to (for categorization), using recursive descent

only follow the leftmost branch and ignore left recursion because it does not affect the starting tokens

also match all the nullable categories encountered to epsilon, and collect and enforce any context-sensitive constraints

This overload also enforces the next token constraints passed as a second argument.

Definition at line 3215 of file dyngenpar.cpp.

◆ expandNonterminalPredictionC() [3/3]

QHash< Cat, QSet< Cat > > DynGenPar::Parser::expandNonterminalPredictionC ( CatArg  cat,
const QList< NextTokenConstraints > &  nextTokenConstraintsList 
)

expand a nonterminal prediction to the possible initial tokens and the nonterminals they immediately reduce to (for categorization), using recursive descent

only follow the leftmost branch and ignore left recursion because it does not affect the starting tokens

also match all the nullable categories encountered to epsilon, and collect and enforce any context-sensitive constraints

This overload also enforces the disjunctive (inclusive OR) list of next token constraints passed as a second argument, i.e. if any of the next token constraint sets in nextTokenConstraintsList matches, the prediction is accepted.

Definition at line 3242 of file dyngenpar.cpp.

◆ expandNonterminalPredictionMulti()

QHash< Cat, QSet< QList< Cat > > > DynGenPar::Parser::expandNonterminalPredictionMulti ( CatArg  cat) const

expand a nonterminal prediction to the possible initial nonempty literals (strings of one or more tokens) and the nonterminals they immediately reduce to (for categorization), using recursive descent

only follow the leftmost branch and ignore left recursion because it does not affect the starting tokens

also expand each category only once because we do not need the full parse trees, only the last category

Warning
This expansion method does not honor context-sensitive constraints (PMCFG constraints, next token constraints) attached to the skipped epsilon matches.

Definition at line 2918 of file dyngenpar.cpp.

◆ expandNonterminalPredictionMultiC() [1/3]

QHash< Cat, QSet< QList< Cat > > > DynGenPar::Parser::expandNonterminalPredictionMultiC ( CatArg  cat)

expand a nonterminal prediction to the possible initial nonempty literals (strings of one or more tokens) and the nonterminals they immediately reduce to (for categorization), using recursive descent

only follow the leftmost branch and ignore left recursion because it does not affect the starting tokens

also match all the nullable categories encountered to epsilon, and collect and enforce any context-sensitive constraints

Definition at line 3124 of file dyngenpar.cpp.

◆ expandNonterminalPredictionMultiC() [2/3]

QHash< Cat, QSet< QList< Cat > > > DynGenPar::Parser::expandNonterminalPredictionMultiC ( CatArg  cat,
const NextTokenConstraints nextTokenConstraints 
)

expand a nonterminal prediction to the possible initial nonempty literals (strings of one or more tokens) and the nonterminals they immediately reduce to (for categorization), using recursive descent

only follow the leftmost branch and ignore left recursion because it does not affect the starting tokens

also match all the nullable categories encountered to epsilon, and collect and enforce any context-sensitive constraints

This overload also enforces the next token constraints passed as a second argument.

Definition at line 3397 of file dyngenpar.cpp.

◆ expandNonterminalPredictionMultiC() [3/3]

QHash< Cat, QSet< QList< Cat > > > DynGenPar::Parser::expandNonterminalPredictionMultiC ( CatArg  cat,
const QList< NextTokenConstraints > &  nextTokenConstraintsList 
)

expand a nonterminal prediction to the possible initial nonempty literals (strings of one or more tokens) and the nonterminals they immediately reduce to (for categorization), using recursive descent

only follow the leftmost branch and ignore left recursion because it does not affect the starting tokens

also match all the nullable categories encountered to epsilon, and collect and enforce any context-sensitive constraints

This overload also enforces the disjunctive (inclusive OR) list of next token constraints passed as a second argument, i.e. if any of the next token constraint sets in nextTokenConstraintsList matches, the prediction is accepted.

Definition at line 3425 of file dyngenpar.cpp.

◆ getCfg()

Cfg DynGenPar::Parser::getCfg ( )
inline

get a Cfg object back from the parser, for serialization

Definition at line 1176 of file dyngenpar.h.

◆ initCaches()

void DynGenPar::Parser::initCaches ( void  )

clears all caches, then computes the nullable categories and the initial graph

should be called after each direct grammar modification (addRule takes care of updating the caches, which is more efficient than clearing.)

Definition at line 578 of file dyngenpar.cpp.

◆ isLiteral()

bool DynGenPar::Parser::isLiteral ( const QList< Cat > &  list) const

is a given list of categories a literal?

Returns
true if the given list of categories is a literal, i.e. contains only tokens, false otherwise

Definition at line 553 of file dyngenpar.cpp.

◆ isToken()

bool DynGenPar::Parser::isToken ( CatArg  cat) const
inline

Definition at line 1164 of file dyngenpar.h.

◆ loadCfg()

void DynGenPar::Parser::loadCfg ( const Cfg cfg)
inline

Definition at line 1169 of file dyngenpar.h.

◆ loadPmcfg()

bool DynGenPar::Parser::loadPmcfg ( const Pmcfg pmcfg)

loads a PMCFG in standard form, converting it to the internal representation

Rules containing categories which are neither tokens nor have rules are discarded, as they're unreachable and as we cannot transform them without knowing the dimension of the unused categories.

Returns
true on success, false on failure
Warning
The parser rules may be in an inconsistent state if the loading failed.

Definition at line 932 of file dyngenpar.cpp.

◆ parse() [1/2]

QList< Match > DynGenPar::Parser::parse ( int *  errorPos = 0,
Cat errorToken = 0,
int *  incrementalPos = 0,
QList< StackItem > *  incrementalStacks = 0,
QList< Match > *  incrementalMatches = 0,
LexerState lexerState = 0 
)

parse the input string

Returns
the list of matches
Parameters
[out]errorPosif non-NULL, is filled with -1 on success and with the number of accepted tokens before the error occurred on error (Caution: This might not be a good position indicator to show to the end user. For some token sources, lexerState contains a more user-centric position indicator which can be obtained through that token source's API.)
[out]errorTokenif non-NULL, is filled with 0 (epsilon) on success and with the token triggering the error on error.
[in,out]incrementalPosshould be NULL for a non-incremental parse, a pointer to a negative integer to start an incremental parse or a pointer to a nonnegative integer to continue an incremental parse. It will be set to the current end of input if non-NULL.
[in,out]incrementalStacksshould be NULL for a non-incremental, non-predictive parse or a pointer to QList<StackItem> (initially empty) for an incremental parse or when needed for prediction. It represents the internal parser states. Normally, this will be the list of stacks at the end of the parsing process. However, if an error occurred, that list would always be empty, so instead, we return the list of stacks before the token triggering the error, thus allowing to use the prediction functionality to report what token would have been expected instead of the faulty one. (An empty list of stacks means that the input was expected to end before the faulty token.)
[in,out]incrementalMatchesshould be NULL for a non-incremental parse. For an incremental parse, it can be NULL if you do not need to get your matches back a second time when there is no new input. If set to non-NULL, it will be returned as is if there is no new input in an incremental parse, or updated to the current return value otherwise.
[in,out]lexerStateif non-NULL, is filled with the lexer state at the end of the (incremental) parsing process. (In case of an error, it is filled with the lexer state where the error occurred, i.e. before shifting the faulty token, to allow reporting error positions accurately.) It is useful to allow rewinding to a previous position with a stateful lexer. It can be NULL for a non-incremental or a sequential incremental parse (i.e. if rewinding is not needed) or if a stateless token source (stateless lexer, token buffer etc.) is used. When starting a new incremental parse, the lexer state pointed to should be a null (default-constructed) LexerState. If the lexerState pointer is NULL, all rewind operations for the lexer will be passed a null (default-constructed) LexerState; stateful lexers will then fail any rewind operations.
Note
An "error" is defined as a place at which it is no longer possible to continue parsing. This does not include incomplete input, i.e. input which can be continued to valid input, but does not form valid input by itself. If an empty list of matches is returned without an error being flagged, this means that the input was incomplete. Use one of the prediction functions (e.g. computePredictions) to list possible continuations ("expected ..."). Incremental parsing can be used to process additional input given by the user.

Definition at line 2437 of file dyngenpar.cpp.

◆ parse() [2/2]

QList<Match> DynGenPar::Parser::parse ( ParseState parseState)
inline

overloaded version using ParseState, for bindings

Definition at line 1185 of file dyngenpar.h.

◆ saveState()

void DynGenPar::Parser::saveState ( ParseState parseState)
inline

saves the current parser state, only meaningful during a parse operation

This method is intended to be used in callbacks executed within parse. When called with no running parse operation, parseState will be reset.

Definition at line 1196 of file dyngenpar.h.

Member Data Documentation

◆ catComponents

QHash<Cat, QList<Cat> > DynGenPar::Parser::catComponents

maps multi-component categories to the list of their components

used during the import of PMCFG rules in the standard representation

can be left out if the internal representation is used

Definition at line 1300 of file dyngenpar.h.

◆ componentCats

QHash<Cat, QPair<Cat, int> > DynGenPar::Parser::componentCats

maps categories which represent components of a multi-component category to the category and component index they represent

also used to look up whether a category is a component of a multi-component category

Definition at line 1294 of file dyngenpar.h.

◆ inputSource

TokenSource* DynGenPar::Parser::inputSource
protected

input source

Definition at line 1305 of file dyngenpar.h.

◆ pseudoCats

QHash<Cat, QPair<Cat, QList<Cat> > > DynGenPar::Parser::pseudoCats

pseudo-categories, used to represent PMCFGs internally

maps a pseudo-category to:

  1. the actual component of the multidimensional category this pseudo-category stands for - just substituting this for the pseudo-category results in the context-free approximation of the PMCFG
  2. the list of pseudo-categories resulting from the use of the SAME argument (not just the same category) in the same context - all those pseudo-categories, when used in the same context, have to be expanded using matching rules; we use top-down expansion for all except the first encountered one to guarantee the same expansion as the one obtained while reducing the first one

Note that, when parsing a PMCFG, the initial graph and the set of nullable categories are the ones for the context-free approximation. The PMCFG constraints are only evaluated during the matching resp. reducing steps.

Also note that tokens are always 1-dimensional, so tokens may not be pseudo-categories nor the actual component for a pseudo-category.

Definition at line 1288 of file dyngenpar.h.

◆ rules

RuleSet DynGenPar::Parser::rules

grammar rules

Warning
Modifying the grammar directly requires initCaches, use addRule if possible.

Definition at line 1258 of file dyngenpar.h.

◆ startCat

Cat DynGenPar::Parser::startCat

start category

Definition at line 1262 of file dyngenpar.h.

◆ tokens

TokenSet DynGenPar::Parser::tokens

tokens

Definition at line 1260 of file dyngenpar.h.


The documentation for this class was generated from the following files: