DynGenPar – Dynamic Generalized Parser

by Kevin Kofler

DynGenPar is part of the FMathL project (my contributions to FMathL).

Acknowledgements: I was enrolled as a PhD student at the Faculty of Mathematics of the University of Vienna, Austria from 2007 to 2017. I successfully defended my thesis on October 13, 2017. My advisor was Prof. Arnold Neumaier, who initiated the FMathL project. DynGenPar constitutes the main output of my PhD thesis. I was also employed by the university in research assistant positions funded by the Austrian Science Fund FWF, whose financial support (under contract numbers P20631, P23554 and P22239) is gratefully acknowledged. (Other members of the FMathL project also received funding from FWF project P20631.) My current employer DAGOPT Optimization Technologies GmbH holds parts of the copyright of DynGenPar (since release 3).

DynGenPar is an innovative parser based on a new principle combining bottom-up and top-down features of traditional parsers. The most unique feature of the algorithm is the possibility to add rules to the grammar at almost any time, even during parsing. DynGenPar has the following characterizing properties:

Dynamic = The grammar is not hardcoded as in usual table-driven approaches, such as (Generalized) LR or Earley's algorithm. Instead, the algorithm works on a runtime representation of the grammar, which allows efficient handling of dynamic grammar changes. To decide when and how to shift or reduce, we use, instead of the usual LR tables, the initial graph, a graph which is easily updated as the grammar changes, along with some runtime top-down information.

Generalized = The algorithm exhaustively parses ambiguous grammars. In addition, epsilon productions are considered in order to support arbitrary CFGs. (Left recursion is naturally supported thanks to the bottom-up nature of the algorithm.) We use graph-structured (DAG-structured) stacks similar to the ones used in Tomita's Generalized LR algorithm. As additional generalizations, we also support Parallel Multiple Context-Free Grammars (PMCFGs) and next token constraints (useful, e.g. for scannerless parsing).

Due to the dynamic design, a parser generator is not needed. Instead, the parser can simply be used as a library.

DynGenPar supports dynamic grammar additions, incremental parsing, prediction, rule labels, custom parse actions, arbitrary token sources, hierarchical parsing, parallel multiple context-free grammars (PMCFGs), and next token constraints.

The implementation is licensed under the GNU General Public License, version 2 or later. A commercial license for use in non-GPL software will be offered in the near future through DAGOPT Optimization Technologies GmbH.

You can cite the following publication:
Kevin Kofler and Arnold Neumaier: DynGenPar – A Dynamic Generalized Parser for Common Mathematical Language. Proceedings of CICM 2012 (DML track), Bremen, Germany (Springer LNAI 7362), 2012.
Please include a citation like the above if you use my work in any research paper.
A preprint (final author version) can be downloaded for free below, the editor version is subject to the usual Springer subscription system.