Parser Grammar Engine
   HOME

TheInfoList



OR:

The Parser Grammar Engine (PGE, originally the Parrot Grammar Engine) is a
compiler In computing, a compiler is a computer program that translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primarily used for programs that ...
and runtime for
Raku rules Raku may refer to: * Lake Raku, an artificial lake in Tallinn, Estonia * Raku ware, a type of pottery used in the Japanese tea ceremony * Raku, Nepal, a village in the Karnali Zone * ''RAkU'' (ballet), a ballet by Yuri Possokhov * Raku (progra ...
for the
Parrot virtual machine Parrot was a register-based process virtual machine designed to run dynamic languages efficiently. It is possible to compile Parrot assembly language and Parrot intermediate representation (PIR, an intermediate language) to Parrot bytecode ...
. PGE uses these ''rules'' to convert a
parsing expression grammar In computer science, a parsing expression grammar (PEG) is a type of analytic formal grammar, i.e. it describes a formal language in terms of a set of rules for recognizing strings in the language. The formalism was introduced by Bryan Ford in ...
into Parrot
bytecode Bytecode (also called portable code or p-code) is a form of instruction set designed for efficient execution by a software interpreter. Unlike human-readable source code, bytecodes are compact numeric codes, constants, and references (norma ...
. It is therefore compiling rules into a program, unlike most virtual machines and runtimes, which store regular expressions in a secondary internal format that is then interpreted at runtime by a regular expression engine. The rules format used by PGE can express any
regular expression A regular expression (shortened as regex or regexp; sometimes referred to as rational expression) is a sequence of characters that specifies a search pattern in text. Usually such patterns are used by string-searching algorithms for "find" ...
and most
formal grammar In formal language theory, a grammar (when the context is not given, often called a formal grammar for clarity) describes how to form strings from a language's alphabet that are valid according to the language's syntax. A grammar does not describe ...
s, and as such it forms the first link in the compiler chain for all of Parrot's front-end languages. When executed, the bytecode generated by PGE will parse text as described in the input rules, generating a parse tree. The parse tree can be manipulated directly, or fed into the next stage of the Parrot compiler toolchain in order to generate an AST from which code generation can occur (if the grammar describes a programming language).


History

Originally named ''P6GE'' and written in C, PGE was translated to native Parrot and renamed not long after its initial release in November 2004. Its author is Patrick R. Michaud. PGE was written in order to reduce the amount of work required to implement a compiler on top of Parrot. It was also written to allow Perl 6 to easily self-host, though current
Pugs The Pug is a breed of dog originally from China, with physically distinctive features of a wrinkly, short-muzzled face and curled tail. The breed has a fine, glossy coat that comes in a variety of colors, most often light brown (fawn) or blac ...
development no longer uses PGE as its primary rules back-end in favor of a native engine called PCR.


Internals

PGE combines three styles of parsing: * Raku rules * an operator precedence parser * custom parse subroutines The primary form is Raku rules, so a PGE rule might look like this for an addition-only grammar: rule term rule number rule expr The operator precedence parser allows an operator table to be built and used directly in a Perl 6 rule style parser like so: rule expr is optable rule term rule number proto term: is precedence('=') is parsed(&term) proto infix:+ is looser('term:') This accomplishes the same goal of defining a simple, addition-only grammar, but does so using a combination of a Raku style regex/rules for term and number and a shift-reduce optable for everything else.


Code generation

Though PGE outputs code which will parse the grammar described by a rule, and can be used at run time to handle simple grammars and regular expressions found in code, its primary purpose is for the parsing of
high level language In computer science, a high-level programming language is a programming language with strong abstraction from the details of the computer. In contrast to low-level programming languages, it may use natural language ''elements'', be easier to use, ...
s. The Parrot compiler toolchain is broken into several parts, of which PGE is the first. PGE converts source code to
parse tree A parse tree or parsing tree or derivation tree or concrete syntax tree is an ordered, rooted tree that represents the syntactic structure of a string according to some context-free grammar. The term ''parse tree'' itself is used primarily in comp ...
s. The Tree Grammar Engine (TGE) then converts these into Parrot Abstract Syntax Trees (PAST). A second TGE pass then converts a PAST into Parrot Opcode Syntax Trees (POST) which can be directly transformed into executable bytecode.


References


External links

*{{cite web , url=http://www.pmichaud.com/2006/pres/yapc-parsers/start.html , title=Parsers, Perl 6 Rules, and the Parrot Grammar Engine , date=2006-06-28 * Perl Formal languages Pattern matching Beta software Compilers Interpreters (computing)