Glasgow Haskell Compiler
   HOME

TheInfoList



OR:

The Glasgow Haskell Compiler (GHC) is a native or
machine code In computer programming, machine code is computer code consisting of machine language instructions, which are used to control a computer's central processing unit (CPU). For conventional binary computers, machine code is the binaryOn nonb ...
compiler In computing, a compiler is a computer program that Translator (computing), translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primaril ...
for the functional
programming language A programming language is a system of notation for writing computer programs. Programming languages are described in terms of their Syntax (programming languages), syntax (form) and semantics (computer science), semantics (meaning), usually def ...
Haskell Haskell () is a general-purpose, statically typed, purely functional programming language with type inference and lazy evaluation. Designed for teaching, research, and industrial applications, Haskell pioneered several programming language ...
. It provides a
cross-platform software Within computing, cross-platform software (also called multi-platform software, platform-agnostic software, or platform-independent software) is computer software that is designed to work in several computing platforms. Some cross-platform soft ...
environment for writing and testing Haskell code and supports many extensions,
libraries A library is a collection of Book, books, and possibly other Document, materials and Media (communication), media, that is accessible for use by its members and members of allied institutions. Libraries provide physical (hard copies) or electron ...
, and optimisations that streamline the process of generating and executing code. GHC is the most commonly used Haskell compiler. It is
free and open-source software Free and open-source software (FOSS) is software available under a license that grants users the right to use, modify, and distribute the software modified or not to everyone free of charge. FOSS is an inclusive umbrella term encompassing free ...
released under a
BSD license BSD licenses are a family of permissive free software licenses, imposing minimal restrictions on the use and distribution of covered software. This is in contrast to copyleft licenses, which have share-alike requirements. The original BSD lic ...
.


History

GHC originally begun in 1989 as a prototype, written in
Lazy ML Lennart Augustsson is a Sweden, Swedish computer scientist. He was formerly a lecturer at the computer science, Computing Science Department at Chalmers University of Technology. His research field is functional programming and implementations of ...
(LML) by Kevin Hammond at the
University of Glasgow The University of Glasgow (abbreviated as ''Glas.'' in Post-nominal letters, post-nominals; ) is a Public university, public research university in Glasgow, Scotland. Founded by papal bull in , it is the List of oldest universities in continuous ...
. Later that year, the prototype was completely rewritten in Haskell, except for its
parser Parsing, syntax analysis, or syntactic analysis is a process of analyzing a string of symbols, either in natural language, computer languages or data structures, conforming to the rules of a formal grammar by breaking it into parts. The term '' ...
, by Cordelia Hall, Will Partain, and Simon Peyton Jones. Its first beta release was on 1 April 1991. Later releases added a strictness analyzer and language extensions such as monadic I/O, mutable arrays, unboxed data types, concurrent and parallel programming models (such as
software transactional memory In computer science, software transactional memory (STM) is a concurrency control mechanism analogous to database transactions for controlling access to shared memory in concurrent computing. It is an alternative to lock-based synchronization. ST ...
and
data parallelism Data parallelism is parallelization across multiple processors in parallel computing environments. It focuses on distributing the data across different nodes, which operate on the data in parallel. It can be applied on regular data structures like ...
) and a profiler. Peyton Jones, and Marlow, later moved to
Microsoft Research Microsoft Research (MSR) is the research subsidiary of Microsoft. It was created in 1991 by Richard Rashid, Bill Gates and Nathan Myhrvold with the intent to advance state-of-the-art computing and solve difficult world problems through technologi ...
in
Cambridge Cambridge ( ) is a List of cities in the United Kingdom, city and non-metropolitan district in the county of Cambridgeshire, England. It is the county town of Cambridgeshire and is located on the River Cam, north of London. As of the 2021 Unit ...
, where they continued to be primarily responsible for developing GHC. GHC also contains code from more than three hundred other contributors. From 2009 to about 2014, third-party contributions to GHC were funded by the Industrial Haskell Group.


GHC names

Since early releases the official website has referred to GHC as ''The Glasgow Haskell Compiler'', whereas in the executable version command it is identified as ''The Glorious Glasgow Haskell Compilation System''. This has been reflected in the documentation. Initially, it had the internal name of ''The Glamorous Glasgow Haskell Compiler''.


Architecture

GHC is written in Haskell, but the
runtime system In computer programming, a runtime system or runtime environment is a sub-system that exists in the computer where a program is created, as well as in the computers where the program is intended to be run. The name comes from the compile time ...
for Haskell, essential to run programs, is written in C and C--. GHC's front end, incorporating the lexer, parser and typechecker, is designed to preserve as much information about the source language as possible until after
type inference Type inference, sometimes called type reconstruction, refers to the automatic detection of the type of an expression in a formal language. These include programming languages and mathematical type systems, but also natural languages in some bran ...
is complete, toward the goal of providing clear error messages to users. After type checking, the Haskell code is desugared into a typed
intermediate language An intermediate representation (IR) is the data structure or code used internally by a compiler or virtual machine to represent source code. An IR is designed to be conducive to further processing, such as optimization and translation. A "good" ...
known as "Core" (based on
System F System F (also polymorphic lambda calculus or second-order lambda calculus) is a typed lambda calculus that introduces, to simply typed lambda calculus, a mechanism of universal quantification over types. System F formalizes parametric polymorph ...
, extended with let and case expressions). Core has been extended to support generalized algebraic datatypes in its
type system In computer programming, a type system is a logical system comprising a set of rules that assigns a property called a ''type'' (for example, integer, floating point, string) to every '' term'' (a word, phrase, or other set of symbols). Usu ...
, and is now based on an extension to System F known as System FC. In the tradition of type-directed compiling, GHC's simplifier, or "middle end", where most of the optimizations implemented in GHC are performed, is structured as a series of source-to-source transformations on Core code. The analyses and transformations performed in this compiler stage include demand analysis (a generalization of strictness analysis), application of user-defined rewrite rules (including a set of rules included in GHC's standard libraries that performs foldr/build fusion), unfolding (called " inlining" in more traditional compilers), let-floating, an analysis that determines which function arguments can be unboxed, constructed product result analysis, specialization of overloaded functions, and a set of simpler local transformations such as constant folding and
beta reduction In mathematical logic, the lambda calculus (also written as ''λ''-calculus) is a formal system for expressing computation based on function abstraction and application using variable Name binding, binding and Substitution (algebra), substitution ...
. The back end of the compiler transforms Core code into an internal representation of C--, via an intermediate language STG (short for "Spineless Tagless G-machine"). The C-- code can then take one of three routes: it is either printed as C code for compilation with GCC, converted directly into native machine code (the traditional " code generation" phase), or converted to
LLVM IR LLVM, also called LLVM Core, is a target-independent optimizer and code generator. It can be used to develop a frontend for any programming language and a backend for any instruction set architecture. LLVM is designed around a language-indepen ...
for compilation with LLVM. In all three cases, the resultant native code is finally linked against the GHC runtime system to produce an executable.


Language

GHC complies with the language standards, both ''Haskell 98'' and ''Haskell 2010''. It also supports many optional extensions to the Haskell standard: for example, the
software transactional memory In computer science, software transactional memory (STM) is a concurrency control mechanism analogous to database transactions for controlling access to shared memory in concurrent computing. It is an alternative to lock-based synchronization. ST ...
(STM) library, which allows for Composable Memory Transactions.


Extensions to Haskell

Many extensions to Haskell have been proposed. These provide features not described in the language specification, or they redefine existing constructs. As such, each extension may not be supported by all Haskell implementations. There is an ongoing effort to describe extensions and select those which will be included in future versions of the language specification. The extensions supported by the Glasgow Haskell Compiler include: * Unboxed types and operations. These represent the primitive datatypes of the underlying hardware, without the indirection of a pointer to the heap or the possibility of deferred evaluation. Numerically intensive code can be significantly faster when coded using these types. * The ability to specify strict evaluation for a value, pattern binding, or datatype field. * More convenient syntax for working with modules, patterns,
list comprehension A list comprehension is a syntactic construct available in some programming languages for creating a list based on existing lists. It follows the form of the mathematical '' set-builder notation'' (''set comprehension'') as distinct from the use o ...
s, operators, records, and tuples. *
Syntactic sugar In computer science, syntactic sugar is syntax within a programming language that is designed to make things easier to read or to express. It makes the language "sweeter" for human use: things can be expressed more clearly, more concisely, or in an ...
for computing with arrows and recursively-defined monadic values. Both of these concepts extend the monadic -notation provided in standard Haskell. * A significantly more powerful system of types and typeclasses, described below. * Template Haskell, a system for compile-time
metaprogramming Metaprogramming is a computer programming technique in which computer programs have the ability to treat other programs as their data. It means that a program can be designed to read, generate, analyse, or transform other programs, and even modi ...
. Expressions can be written to produce Haskell code in the form of an
abstract syntax tree An abstract syntax tree (AST) is a data structure used in computer science to represent the structure of a program or code snippet. It is a tree representation of the abstract syntactic structure of text (often source code) written in a formal ...
. These expressions are typechecked and evaluated at compile time; the generated code is then included as if it were part of the original code. Together with the ability to reflect on definitions, this provides a powerful tool for further extensions to the language. * Quasi-quotation, which allows the user to define new concrete syntax for expressions and patterns. Quasi-quotation is useful when a metaprogram written in Haskell manipulates code written in a language other than Haskell. * Generic typeclasses, which specify functions solely in terms of the algebraic structure of the types they operate on. * Parallel evaluation of expressions using multiple CPU cores. This does ''not'' require explicitly spawning threads. The distribution of work happens implicitly, based on annotations provided in the program. * Compiler pragmas for directing optimizations such as
inline expansion In computing, inline expansion, or inlining, is a manual or compiler optimization that replaces a function call site with the body of the called function. Inline expansion is similar to macro expansion, but occurs during compiling, without cha ...
and specializing functions for particular types. * Customizable rewrite rules are rules describing how to replace one expression with an equivalent, but more efficiently evaluated expression. These are used within core
data structure In computer science, a data structure is a data organization and storage format that is usually chosen for Efficiency, efficient Data access, access to data. More precisely, a data structure is a collection of data values, the relationships amo ...
libraries to improve performance throughout application-level code. * Record dot syntax. Provides
syntactic sugar In computer science, syntactic sugar is syntax within a programming language that is designed to make things easier to read or to express. It makes the language "sweeter" for human use: things can be expressed more clearly, more concisely, or in an ...
for accessing the fields of a (potentially nested) record which is similar to the syntax of many other programming languages.


Type system extensions

An expressive static type system is one of the major defining features of Haskell. Accordingly, much of the work in extending the language has been directed towards
data type In computer science and computer programming, a data type (or simply type) is a collection or grouping of data values, usually specified by a set of possible values, a set of allowed operations on these values, and/or a representation of these ...
s and
type class In computer science, a type class is a type system construct that supports ad hoc polymorphism. This is achieved by adding constraints to type variables in parametrically polymorphic types. Such a constraint typically involves a type class T a ...
es. The Glasgow Haskell Compiler supports an extended type system based on the theoretical System FC. Major extensions to the type system include: * Arbitrary-rank and impredicative polymorphism. Essentially, a polymorphic function or datatype constructor may require that one of its arguments is also polymorphic. *
Generalized algebraic data type In functional programming, a generalized algebraic data type (GADT, also first-class phantom type, guarded recursive datatype, or equality-qualified type) is a generalization of a Parametric polymorphism, parametric algebraic data type (ADT). Ove ...
s. Each constructor of a polymorphic datatype can encode information into the resulting type. A function which pattern-matches on this type can use the per-constructor type information to perform more specific operations on data. * Existential types. These can be used to "bundle" some data together with operations on that data, in such a way that the operations can be used without exposing the specific type of the underlying data. Such a value is very similar to an
object Object may refer to: General meanings * Object (philosophy), a thing, being, or concept ** Object (abstract), an object which does not exist at any particular time or place ** Physical object, an identifiable collection of matter * Goal, an a ...
as found in
object-oriented programming Object-oriented programming (OOP) is a programming paradigm based on the concept of '' objects''. Objects can contain data (called fields, attributes or properties) and have actions they can perform (called procedures or methods and impl ...
languages. * Data types that do not actually contain any values. These can be useful to represent data in type-level
metaprogramming Metaprogramming is a computer programming technique in which computer programs have the ability to treat other programs as their data. It means that a program can be designed to read, generate, analyse, or transform other programs, and even modi ...
. * Type families: user-defined functions from types to types. Whereas parametric polymorphism provides the same structure for every type instantiation, type families provide ''ad hoc'' polymorphism with implementations that can differ between instantiations. Use cases include content-aware optimizing containers and type-level metaprogramming. * Implicit function parameters that have dynamic scope. These are represented in types in much the same way as type class constraints. * Linear types (GHC 9.0) Extensions relating to
type class In computer science, a type class is a type system construct that supports ad hoc polymorphism. This is achieved by adding constraints to type variables in parametrically polymorphic types. Such a constraint typically involves a type class T a ...
es include: * A type class may be parametrized on more than one type. Thus a type class can describe not only a set of types, but an ''n''-ary
relation Relation or relations may refer to: General uses * International relations, the study of interconnection of politics, economics, and law on a global level * Interpersonal relationship, association or acquaintance between two or more people * ...
on types. * Functional dependencies, which constrain parts of that relation to be a mathematical function on types. That is, the constraint specifies that some type class parameter is completely determined once some other set of parameters is fixed. This guides the process of
type inference Type inference, sometimes called type reconstruction, refers to the automatic detection of the type of an expression in a formal language. These include programming languages and mathematical type systems, but also natural languages in some bran ...
in situations where otherwise there would be ambiguity. * Significantly relaxed rules regarding the allowable shape of type class instances. When these are enabled in full, the type class system becomes a
Turing-complete In computability theory, a system of data-manipulation rules (such as a model of computation, a computer's instruction set, a programming language, or a cellular automaton) is said to be Turing-complete or computationally universal if it can be ...
language for
logic programming Logic programming is a programming, database and knowledge representation paradigm based on formal logic. A logic program is a set of sentences in logical form, representing knowledge about some problem domain. Computation is performed by applyin ...
at compile time. * Type families, as described above, may also be associated with a type class. * The automatic generation of certain type class instances is extended in several ways. New type classes for
generic programming Generic programming is a style of computer programming in which algorithms are written in terms of data types ''to-be-specified-later'' that are then ''instantiated'' when needed for specific types provided as parameters. This approach, pioneer ...
and common recursion patterns are supported. Also, when a new type is declared as
isomorphic In mathematics, an isomorphism is a structure-preserving mapping or morphism between two structures of the same type that can be reversed by an inverse mapping. Two mathematical structures are isomorphic if an isomorphism exists between the ...
to an existing type, any type class instance declared for the underlying type may be lifted to the new type "for free".


Portability

Versions of GHC are available for several system or
computing platform A computing platform, digital platform, or software platform is the infrastructure on which software is executed. While the individual components of a computing platform may be obfuscated under layers of abstraction, the ''summation of the requi ...
, including
Windows Windows is a Product lining, product line of Proprietary software, proprietary graphical user interface, graphical operating systems developed and marketed by Microsoft. It is grouped into families and subfamilies that cater to particular sec ...
and most varieties of
Unix Unix (, ; trademarked as UNIX) is a family of multitasking, multi-user computer operating systems that derive from the original AT&T Unix, whose development started in 1969 at the Bell Labs research center by Ken Thompson, Dennis Ritchie, a ...
(such as
Linux Linux ( ) is a family of open source Unix-like operating systems based on the Linux kernel, an kernel (operating system), operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically package manager, pac ...
,
FreeBSD FreeBSD is a free-software Unix-like operating system descended from the Berkeley Software Distribution (BSD). The first version was released in 1993 developed from 386BSD, one of the first fully functional and free Unix clones on affordable ...
,
OpenBSD OpenBSD is a security-focused operating system, security-focused, free software, Unix-like operating system based on the Berkeley Software Distribution (BSD). Theo de Raadt created OpenBSD in 1995 by fork (software development), forking NetBSD ...
, and
macOS macOS, previously OS X and originally Mac OS X, is a Unix, Unix-based operating system developed and marketed by Apple Inc., Apple since 2001. It is the current operating system for Apple's Mac (computer), Mac computers. With ...
).''Platforms'' at gitlab.haskell.org
/ref> GHC has also been ported to several different processor architectures.


See also

* Hugs (interpreter) * Yhc * Haskell Platform


References


External links

* {{DEFAULTSORT:Ghc Cross-platform free software Free and open source compilers Free Haskell implementations History of computing in the United Kingdom Software using the BSD license University of Glasgow