HOME

TheInfoList



OR:

In
computer science Computer science is the study of computation, information, and automation. Computer science spans Theoretical computer science, theoretical disciplines (such as algorithms, theory of computation, and information theory) to Applied science, ...
, an abstract semantic
graph Graph may refer to: Mathematics *Graph (discrete mathematics), a structure made of vertices and edges **Graph theory, the study of such graphs and their properties *Graph (topology), a topological space resembling a graph in the sense of discret ...
(ASG) or term graph is a form of
abstract syntax In computer science, the abstract syntax of data is its structure described as a data type (possibly, but not necessarily, an abstract data type), independent of any particular representation or encoding. This is particularly used in the represent ...
in which an expression of a
formal Formal, formality, informal or informality imply the complying with, or not complying with, some set of requirements ( forms, in Ancient Greek). They may refer to: Dress code and events * Formal wear, attire for formal events * Semi-formal atti ...
or
programming language A programming language is a system of notation for writing computer programs. Programming languages are described in terms of their Syntax (programming languages), syntax (form) and semantics (computer science), semantics (meaning), usually def ...
is represented by a
graph Graph may refer to: Mathematics *Graph (discrete mathematics), a structure made of vertices and edges **Graph theory, the study of such graphs and their properties *Graph (topology), a topological space resembling a graph in the sense of discret ...
whose vertices are the expression's subterms. An ASG is at a higher level of abstraction than an
abstract syntax tree An abstract syntax tree (AST) is a data structure used in computer science to represent the structure of a program or code snippet. It is a tree representation of the abstract syntactic structure of text (often source code) written in a formal ...
(or AST), which is used to express the
syntactic structure In linguistics, syntax ( ) is the study of how words and morphemes combine to form larger units such as phrases and sentences. Central concerns of syntax include word order, grammatical relations, hierarchical sentence structure (constituenc ...
of an expression or program. ASGs are more complex and concise than ASTs because they may contain shared subterms (also known as "common subexpressions"). Abstract semantic graphs are often used as an
intermediate representation An intermediate representation (IR) is the data structure or code used internally by a compiler or virtual machine to represent source code. An IR is designed to be conducive to further processing, such as optimization and translation. A "good" ...
by
compilers In computing, a compiler is a computer program that translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primarily used for programs tha ...
to store the results of performing common subexpression elimination upon abstract syntax trees. ASTs are
trees In botany, a tree is a perennial plant with an elongated stem, or trunk, usually supporting branches and leaves. In some usages, the definition of a tree may be narrower, e.g., including only woody plants with secondary growth, only p ...
and are thus incapable of representing shared terms. ASGs are usually directed acyclic graphs (DAG), although in some applications graphs containing cycles may be permitted. For example, a graph containing a cycle might be used to represent the
recursive Recursion occurs when the definition of a concept or process depends on a simpler or previous version of itself. Recursion is used in a variety of disciplines ranging from linguistics to logic. The most common application of recursion is in m ...
expressions that are commonly used in
functional programming language In computer science, functional programming is a programming paradigm where programs are constructed by applying and composing functions. It is a declarative programming paradigm in which function definitions are trees of expressions that map ...
s as non- looping
iteration Iteration is the repetition of a process in order to generate a (possibly unbounded) sequence of outcomes. Each repetition of the process is a single iteration, and the outcome of each iteration is then the starting point of the next iteration. ...
constructs. The mutability of these types of graphs, is studied in the field of graph rewriting. The nomenclature ''term graph'' is associated with the field of term graph rewriting, which involves the transformation and processing of expressions by the specification of rewriting rules, whereas ''abstract semantic graph'' is used when discussing
linguistics Linguistics is the scientific study of language. The areas of linguistic analysis are syntax (rules governing the structure of sentences), semantics (meaning), Morphology (linguistics), morphology (structure of words), phonetics (speech sounds ...
,
programming languages A programming language is a system of notation for writing computer programs. Programming languages are described in terms of their syntax (form) and semantics (meaning), usually defined by a formal language. Languages usually provide features ...
,
type systems In computer programming, a type system is a logical system comprising a set of rules that assigns a property called a ''type'' (for example, integer, floating point, string) to every '' term'' (a word, phrase, or other set of symbols). Usual ...
and compilation. Abstract syntax trees are not capable of sharing subexpression nodes because it is not possible for a node in a proper tree to have more than one parent. Although this conceptual simplicity is appealing, it may come at the cost of redundant representation and, in turn, possibly inefficiently duplicating the computation of identical terms. For this reason ASGs are often used as an
intermediate language An intermediate representation (IR) is the data structure or code used internally by a compiler or virtual machine to represent source code. An IR is designed to be conducive to further processing, such as optimization and translation. A "good" ...
at a subsequent compilation stage to abstract syntax tree construction via parsing. An abstract semantic graph is typically constructed from an abstract syntax tree by a process of enrichment and abstraction. The enrichment can for example be the addition of
back-pointer In computer science, a pointer is an object in many programming languages that stores a memory address. This can be that of another value located in computer memory, or in some cases, that of memory-mapped computer hardware. A pointer ''re ...
s, edges from an
identifier An identifier is a name that identifies (that is, labels the identity of) either a unique object or a unique ''class'' of objects, where the "object" or class may be an idea, person, physical countable object (or class thereof), or physical mass ...
node (where a variable is being used) to a node representing the declaration of that variable. The abstraction can entail the removal of details which are relevant only in
parsing Parsing, syntax analysis, or syntactic analysis is a process of analyzing a String (computer science), string of Symbol (formal), symbols, either in natural language, computer languages or data structures, conforming to the rules of a formal gramm ...
, not for semantics.


Example: Code Refactoring

For example, consider the case of
code refactoring In computer programming and software design, code refactoring is the process of restructuring existing source code—changing the '' factoring''—without changing its external behavior. Refactoring is intended to improve the design, structure, ...
. To represent the implementation of a function that takes an input argument, the received parameter is conventionally given an arbitrary, distinct
name A name is a term used for identification by an external observer. They can identify a class or category of things, or a single thing, either uniquely, or within a given context. The entity identified by a name is called its referent. A person ...
in the source code so that it can be referenced. The abstract representation of this conceptual entity, a "function argument" instance, will likely be mentioned in the function signature, and also one or more times within the implementation code body. Since the function as a whole is the parent of both its header or "signature" information as well as its implementation body, an AST would not be able to use the same node to co-identify the multiple uses or appearances of the argument entity. This is solved by the DAG nature of an ASG. A key advantage of having a single, distinct node identity for any given code element is that each element's properties are, by definition, uniquely stored. This simplifies refactoring operations, because there is exactly one existential nexus for any given property instantiation. If the developer decides to change a property value such as the "name" of any code element (the "function argument" in this example), the ASG inherently exposes that value in exactly one place, and it follows that any such property changes are implicitly, trivially, and immediately propagated globally.


See also

*
Ontology (computer science) In information science, an ontology encompasses a representation, formal naming, and definitions of the categories, properties, and relations between the concepts, data, or entities that pertain to one, many, or all domains of discourse. More ...
*
Semantic Web The Semantic Web, sometimes known as Web 3.0, is an extension of the World Wide Web through standards set by the World Wide Web Consortium (W3C). The goal of the Semantic Web is to make Internet data machine-readable. To enable the encoding o ...
* Semantic Grid


References


Further reading

* * * *{{Cite conference , last=Raghavan , first=Shruti , last2=Rohana , first2=Rosanne , last3=Leon , first3=David , last4=Podgurski , first4=Andy , last5=Augustine , first5=Vinay , date=2004 , title=20th IEEE International Conference on Software Maintenance, 2004. Proceedings. , conference=IEEE International Conference on Software Maintenance , pages=188–197 , citeseerx=10.1.1.228.9292 , doi=10.1109/icsm.2004.1357803 , isbn=0-7695-2213-0 , archive-url=https://web.archive.org/web/20080117121308/http://www.citeulike.org/user/hayashi/article/259537 , archive-date=2008-01-17 , access-date=2007-05-01 , chapter-url=http://www.citeulike.org/user/hayashi/article/259537 , chapter=Dex: a semantic-graph differencing tool for studying changes in large code bases , url-status=dead Graph data structures Formal languages