-
-
Notifications
You must be signed in to change notification settings - Fork 8
Writing Semantic action rules
The Earley Algorithm parser builds an Abstract Syntax Tree as it parses. Then in semantic analysis
this tree is traversed to accomplish whatever action is needed. For
example in uncompyle6 the
action is to reproduce Python source code from a parse of (modified)
CPython bytecode instructions. In our examples
expr
and
expr2
,
the example programs, semantic analysis evaluates arithmetic
expressions.
The semantic actions class in SPARK is usually derived from the
GenericASTTraversal
class. That class provides methods for pre- and
post-order tree traversal. Most of the time pre-order traversal is
used. Since pre-order traversal also allows for hooks to be run after
all nodes complete, it can do a more than a strict pre-order
traversal.
The way to indicate a semantic action to take on an GenericASTTraversal
nonterminal is to create a method in the subclass of
GenericaASTTraversal
and preface the nonterminal name with n_
. For
example to have a semantic action for nonterminal stmt
, you would
define method n_stmt
.
After a while, you'll probably find many routines which do similar
sorts of things. For example, in the Python 2 grammar there are
grammar rules for continue
, pass
, and break
that do about the
same thing: they should start on a new line because they start a new
statement, they print their corresponding reserved word, and end the
line since they end that statement.
In uncompyle6, we often have a tree node which should descend n-1 of its children in order and separate the string fragment results from each with a comma and a space. The last node is usually an operation that is performed on prior instructions, and so that is usually handled by the grammar rule itself.
Coming back to the Python 2 grammar, you could write for the pass
statement rule:
def n_pass(self, n):
self.write(self.indent) # start a new line
self.writeln('pass') # write "pass" and finish line
and do the corresponding thing for continue
and break
.
To reduce tedium, what's generally done instead is to put rules into a table and have C-style sprintf format specifiers to indicate these common kinds of actions to take. Doing this in the Python2 grammar we instead have rules:
TABLE_DIRECT = {
'break_stmt': ( '%|break\n', ),
'continue_stmt': ( '%|continue\n', ),
'pass_stmt': ( '%|pass\n', ),
...
}
The %|
format specifier indicates starting on a new line.
The kinds of specifiers is very much tied to how one write the grammar and what kind of activity is needed. For example in uncompyle6 I often need to copy attributes from one node over to one of its siblings when I need be able to give a deparse by program offset. That kind of operation however is not needed when deparsing the entire program, so no offset information needs to be stored.
The method template_engine
in the semantic rules is where we define such table-driven rules.
See also Table-driven Semantic Actions from the uncompyle6 documentation for a more description full-blown system of table-directed semantic actions.