HOME

TheInfoList



OR:

The Parser Grammar Engine (PGE, originally the Parrot Grammar Engine) is a
compiler In computing, a compiler is a computer program that Translator (computing), translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primaril ...
and
runtime system In computer programming, a runtime system or runtime environment is a sub-system that exists in the computer where a program is created, as well as in the computers where the program is intended to be run. The name comes from the compile time ...
for Raku rules for the
Parrot virtual machine Parrot is a discontinued register-based process virtual machine designed to run dynamic languages efficiently. It is possible to compile Parrot assembly language and Parrot intermediate representation (PIR, an intermediate language) to Parr ...
. PGE uses these ''rules'' to convert a
parsing expression grammar In computer science, a parsing expression grammar (PEG) is a type of analytic formal grammar, i.e. it describes a formal language in terms of a set of rules for recognizing strings in the language. The formalism was introduced by Bryan Ford in 20 ...
into Parrot
bytecode Bytecode (also called portable code or p-code) is a form of instruction set designed for efficient execution by a software interpreter. Unlike human-readable source code, bytecodes are compact numeric codes, constants, and references (normal ...
. It is therefore compiling rules into a program, unlike most virtual machines and runtimes, which store regular expressions in a secondary internal format that is then interpreted at runtime by a regular expression engine. The rules format used by PGE can express any
regular expression A regular expression (shortened as regex or regexp), sometimes referred to as rational expression, is a sequence of characters that specifies a match pattern in text. Usually such patterns are used by string-searching algorithms for "find" ...
and most
formal grammar A formal grammar is a set of Terminal and nonterminal symbols, symbols and the Production (computer science), production rules for rewriting some of them into every possible string of a formal language over an Alphabet (formal languages), alphabe ...
s, and as such it forms the first link in the compiler chain for all of Parrot's front-end languages. When executed, the bytecode generated by PGE will parse text as described in the input rules, generating a parse tree. The parse tree can be manipulated directly, or fed into the next stage of the ''Parrot compiler toolchain'' to generate an
abstract syntax tree An abstract syntax tree (AST) is a data structure used in computer science to represent the structure of a program or code snippet. It is a tree representation of the abstract syntactic structure of text (often source code) written in a formal ...
(AST) from which code can be generated; if the grammar describes a programming language.


History

Originally named ''P6GE'' and written in C, PGE was translated to native Parrot and renamed not long after its initial release in November 2004. Its author is Patrick R. Michaud. PGE was written to reduce the amount of work needed to implement a compiler on Parrot. It was also written to allow Perl 6 to easily self-host, though current Pugs development no longer uses PGE as its main rules back-end in favor of a native engine named PCR.


Internals

PGE combines three styles of parsing: * Raku rules * an operator-precedence parser * custom parse subroutines The primary form is Raku rules, so a PGE rule might look like this for an addition-only grammar: rule term rule number rule expr The operator precedence parser allows an operator table to be built and used directly in a Perl 6 rule style parser like so: rule expr is optable rule term rule number proto term: is precedence('=') is parsed(&term) proto infix:+ is looser('term:') This accomplishes the same goal of defining a simple, addition-only grammar, but does so using a combination of a Raku style regex/rules for term and number and a shift-reduce optable for everything else.


Code generation

Though PGE outputs code which will parse the grammar described by a rule, and can be used at runtime to handle simple grammars and regular expressions found in code, its main purpose is to parse
high-level programming language A high-level programming language is a programming language with strong Abstraction (computer science), abstraction from the details of the computer. In contrast to low-level programming languages, it may use natural language ''elements'', be ea ...
s. The Parrot compiler toolchain is broken into several parts, of which PGE is the first. PGE converts source code to
parse tree A parse tree or parsing tree (also known as a derivation tree or concrete syntax tree) is an ordered, rooted tree that represents the syntactic structure of a string according to some context-free grammar. The term ''parse tree'' itself is use ...
s. The ''tree grammar engine'' (TGE) then converts these into a ''Parrot abstract syntax trees'' (PAST). A second TGE pass then converts a PAST into ''Parrot
opcode In computing, an opcode (abbreviated from operation code) is an enumerated value that specifies the operation to be performed. Opcodes are employed in hardware devices such as arithmetic logic units (ALUs), central processing units (CPUs), and ...
syntax trees'' (POST) which can be directly transformed into executable bytecode.


References


External links

*{{cite web , url=http://www.pmichaud.com/2006/pres/yapc-parsers/start.html , title=Parsers, Perl 6 Rules, and the Parrot Grammar Engine , date=2006-06-28 * Perl Formal languages Pattern matching Beta software Compilers Interpreters (computing)