
The Glasgow Haskell Compiler (GHC) is an open-source native code compiler for the functional programming language Haskell. It provides a cross-platform environment for writing and testing Haskell code, and it supports numerous extensions, libraries, and optimisations that streamline the process of generating and executing code. GHC is the most commonly used Haskell compiler. The lead developers are Simon Peyton Jones and Simon Marlow.


History

GHC started in 1989 as a prototype, written in LML (Lazy ML) by Kevin Hammond at the University of Glasgow. Later that year, the prototype was completely rewritten in Haskell, except for its parser, by Cordelia Hall, Will Partain, and Simon Peyton Jones. Its first beta release was on 1 April 1991. Subsequent releases added a strictness analyzer as well as language extensions such as monadic I/O, mutable arrays, unboxed data types, concurrent and parallel programming models (such as software transactional memory and data parallelism), and a profiler.

Peyton Jones and Marlow later moved to Microsoft Research in Cambridge, England, where they continued to be primarily responsible for developing GHC. GHC also contains code from more than three hundred other contributors. Since 2009, third-party contributions to GHC have been funded by the Industrial Haskell Group.


GHC Names

Since early releases the official website has referred to GHC as ''The Glasgow Haskell Compiler'', whereas in the executable version command it is identified as ''The Glorious Glasgow Haskell Compilation System''. This has been reflected in the documentation. Initially, it had the internal name of ''The Glamorous Glasgow Haskell Compiler''.


Architecture

GHC itself is written in Haskell, but the runtime system for Haskell, essential to run programs, is written in C and C--.

GHC's front end, incorporating the lexer, parser and typechecker, is designed to preserve as much information about the source language as possible until after type inference is complete, toward the goal of providing clear error messages to users. After type checking, the Haskell code is desugared into a typed intermediate language known as "Core" (based on System F, extended with let and case expressions). Core has been extended to support generalized algebraic datatypes in its type system, and is now based on an extension to System F known as System FC.

In the tradition of type-directed compilation, GHC's simplifier, or "middle end", where most of the optimizations implemented in GHC are performed, is structured as a series of source-to-source transformations on Core code. The analyses and transformations performed in this compiler stage include demand analysis (a generalization of strictness analysis), application of user-defined rewrite rules (including a set of rules in GHC's standard libraries that performs foldr/build fusion), unfolding (called "inlining" in more traditional compilers), let-floating, an analysis that determines which function arguments can be unboxed, constructed product result analysis, specialization of overloaded functions, and a set of simpler local transformations such as constant folding and beta reduction.

The back end of the compiler transforms Core code into an internal representation of C--, via an intermediate language STG (short for "Spineless Tagless G-machine"). The C-- code can then take one of three routes: it is either printed as C code for compilation with GCC, converted directly into native machine code (the traditional "code generation" phase), or converted to LLVM IR for compilation with LLVM. In all three cases, the resultant native code is finally linked against the GHC runtime system to produce an executable.

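The pipeline described above can be observed on a small program. The following is a minimal sketch, not taken from the GHC sources: the function name `sumSquares` is hypothetical, and whether the intermediate list is fused away entirely depends on the GHC version and optimisation flags in use.

```haskell
module Main (main) where

-- A list pipeline of the kind the simplifier works on: a producer
-- ([1 .. n]), a transformer (map) and a consumer (sum). At -O the
-- library rewrite rules may fuse these so that no intermediate list
-- is built; this is an expectation, not a guarantee.
sumSquares :: Int -> Int
sumSquares n = sum (map (\x -> x * x) [1 .. n])

main :: IO ()
main = print (sumSquares 10)  -- prints 385
```

Compiling with `ghc -O2 -ddump-simpl Main.hs` prints the Core produced by the simplifier, and `-ddump-cmm` prints the C-- representation. Adding `-fllvm` selects the LLVM route instead of the native code generator, assuming a compatible LLVM toolchain is installed.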

Language

GHC complies with the language standards, both ''Haskell 98'' and ''Haskell 2010''. It also supports many optional extensions to the Haskell standard: for example, the software transactional memory (STM) library, which allows for Composable Memory Transactions.
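
As an illustration of what "composable" means here, the sketch below uses the `Control.Concurrent.STM` module from the `stm` package that ships with GHC. The `transfer` and `demo` functions and the account balances are hypothetical, chosen only for this example.

```haskell
module Main (main) where

import Control.Concurrent.STM
  ( STM, TVar, atomically, check, modifyTVar', newTVarIO
  , readTVar, readTVarIO )

-- Move an amount between two accounts. This is only a description of a
-- transaction; nothing happens until it is run with 'atomically'.
transfer :: TVar Int -> TVar Int -> Int -> STM ()
transfer from to amount = do
  balance <- readTVar from
  check (balance >= amount)           -- block and retry until funds suffice
  modifyTVar' from (subtract amount)
  modifyTVar' to (+ amount)

-- Two transfers sequenced into one larger transaction: other threads
-- can never observe the state between them.
demo :: IO (Int, Int)
demo = do
  a <- newTVarIO 100
  b <- newTVarIO 0
  atomically (transfer a b 30 >> transfer b a 10)
  (,) <$> readTVarIO a <*> readTVarIO b

main :: IO ()
main = demo >>= print  -- prints (80,20)
```

The point of the design is that `transfer` can be reused inside a larger transaction without exposing any locks, which is difficult to achieve with conventional lock-based code.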


Extensions to Haskell

A number of extensions to Haskell have been proposed. These extensions provide features not described in the language specification, or they redefine existing constructs. As such, a given extension may not be supported by all Haskell implementations. There is an ongoing effort to describe extensions and select those which will be included in future versions of the language specification.

The extensions supported by the Glasgow Haskell Compiler include:
* Unboxed types and operations. These represent the primitive datatypes of the underlying hardware, without the indirection of a pointer to the heap or the possibility of deferred evaluation. Numerically intensive code can be significantly faster when coded using these types.
* The ability to specify strict evaluation for a value, pattern binding, or datatype field.
* More convenient syntax for working with modules, patterns, list comprehensions, operators, records, and tuples.
* Syntactic sugar for computing with arrows and recursively-defined monadic values. Both of these concepts extend the monadic ''do''-notation provided in standard Haskell.
* A significantly more powerful system of types and typeclasses, described below.
* Template Haskell, a system for compile-time metaprogramming. A programmer can write expressions that produce Haskell code in the form of an abstract syntax tree. These expressions are typechecked and evaluated at compile time; the generated code is then included as if it were written directly by the programmer. Together with the ability to reflect on definitions, this provides a powerful tool for further extensions to the language.
* Quasi-quotation, which allows the user to define new concrete syntax for expressions and patterns. Quasi-quotation is useful when a metaprogram written in Haskell manipulates code written in a language other than Haskell.
* Generic typeclasses, which specify functions solely in terms of the algebraic structure of the types they operate on.
* Parallel evaluation of expressions using multiple CPU cores. This does ''not'' require explicitly spawning threads. The distribution of work happens implicitly, based on annotations provided by the programmer.
* Compiler pragmas for directing optimizations such as inline expansion and specializing functions for particular types.
* Customizable rewrite rules. The programmer can provide rules describing how to replace one expression with an equivalent but more efficiently evaluated expression. These are used within core datastructure libraries to provide improved performance throughout application-level code.
* Record dot syntax, which provides syntactic sugar for accessing the fields of a (potentially nested) record, similar to the syntax of many other programming languages.
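
Two of the extensions listed above, strictness annotations on pattern bindings and customizable rewrite rules, can be sketched as follows. The functions `sumStrict` and `mapInt` and the rule name are hypothetical and exist only for this illustration. Rewrite rules are applied only when optimisation is enabled, and the program prints the same result whether or not the rule fires.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main (main) where

-- Bang patterns: the accumulator is forced at every step, so no chain
-- of unevaluated additions builds up on the heap.
sumStrict :: [Int] -> Int
sumStrict = go 0
  where
    go !acc []       = acc
    go !acc (y : ys) = go (acc + y) ys

-- A hand-written map, kept out of line so the rule below has a chance
-- to match before the definition is unfolded.
mapInt :: (Int -> Int) -> [Int] -> [Int]
mapInt _ []       = []
mapInt f (y : ys) = f y : mapInt f ys
{-# NOINLINE mapInt #-}

-- A rewrite rule: two traversals may be replaced by a single one.
-- GHC does not check that the two sides are actually equivalent.
{-# RULES
"mapInt/mapInt" forall f g xs. mapInt f (mapInt g xs) = mapInt (f . g) xs
  #-}

main :: IO ()
main = do
  print (sumStrict [1 .. 100])                   -- prints 5050
  print (mapInt (+ 1) (mapInt (* 2) [1, 2, 3]))  -- prints [3,5,7]
```

Because the compiler trusts the programmer that both sides of a rule mean the same thing, an incorrect rule can silently change a program's behaviour under optimisation.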


Type system extensions

An expressive static type system is one of the major defining features of Haskell. Accordingly, much of the work in extending the language has been directed towards types and type classes.

The Glasgow Haskell Compiler supports an extended type system based on the theoretical System FC. Major extensions to the type system include:
* Arbitrary-rank and impredicative polymorphism. Essentially, a polymorphic function or datatype constructor may require that one of its arguments is itself polymorphic.
* Generalized algebraic data types. Each constructor of a polymorphic datatype can encode information into the resulting type. A function which pattern-matches on this type can use the per-constructor type information to perform more specific operations on data.
* Existential types. These can be used to "bundle" some data together with operations on that data, in such a way that the operations can be used without exposing the specific type of the underlying data. Such a value is very similar to an object as found in object-oriented programming languages.
* Data types that do not actually contain any values. These can be useful to represent data in type-level metaprogramming.
* Type families: user-defined functions from types to types. Whereas parametric polymorphism provides the same structure for every type instantiation, type families provide ''ad hoc'' polymorphism with implementations that can differ between instantiations. Use cases include content-aware optimizing containers and type-level metaprogramming.
* Implicit function parameters that have dynamic scope. These are represented in types in much the same way as type class constraints.
* Linear types (since GHC 9.0).

Extensions relating to type classes include:
* A type class may be parametrized on more than one type. Thus a type class can describe not only a set of types, but an ''n''-ary relation on types.
* Functional dependencies, which constrain parts of that relation to be a mathematical function on types. That is, the constraint specifies that some type class parameter is completely determined once some other set of parameters is fixed. This guides the process of type inference in situations where otherwise there would be ambiguity.
* Significantly relaxed rules regarding the allowable shape of type class instances. When these are enabled in full, the type class system becomes a Turing-complete language for logic programming at compile time.
* Type families, as described above, may also be associated with a type class.
* The automatic generation of certain type class instances is extended in several ways. New type classes for generic programming and common recursion patterns are supported. Additionally, when a new type is declared as isomorphic to an existing type, any type class instance declared for the underlying type may be lifted to the new type "for free".
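
The following sketch illustrates two of the type system extensions listed above: generalized algebraic data types and existential types. The `Expr` and `Showable` types and their constructors are hypothetical names introduced for this example.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE GADTs #-}
module Main (main) where

-- A GADT: each constructor fixes the type index of the result, so
-- ill-typed expressions such as Add (BoolE True) (IntE 1) are rejected
-- by the type checker.
data Expr a where
  IntE   :: Int -> Expr Int
  BoolE  :: Bool -> Expr Bool
  Add    :: Expr Int -> Expr Int -> Expr Int
  IsZero :: Expr Int -> Expr Bool
  If     :: Expr Bool -> Expr a -> Expr a -> Expr a

-- Pattern matching refines the type variable 'a' in each branch, so
-- the evaluator needs no runtime tags or failure cases.
eval :: Expr a -> a
eval (IntE n)   = n
eval (BoolE b)  = b
eval (Add x y)  = eval x + eval y
eval (IsZero x) = eval x == 0
eval (If c t e) = if eval c then eval t else eval e

-- An existential type: the wrapped value's type is hidden and only the
-- Show operations bundled with it remain usable.
data Showable = forall s. Show s => MkShowable s

describe :: Showable -> String
describe (MkShowable s) = show s

main :: IO ()
main = do
  print (eval (If (IsZero (Add (IntE 1) (IntE (-1)))) (IntE 10) (IntE 20)))  -- prints 10
  mapM_ (putStrLn . describe) [MkShowable (1 :: Int), MkShowable True]
```

The list passed to `mapM_` holds values of different underlying types, which is what makes the existential wrapper resemble an object with a fixed interface.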


Portability

Versions of GHC are available for several platforms, including Windows and most varieties of Unix (such as Linux, FreeBSD, OpenBSD, and macOS) (see ''Platforms'' at gitlab.haskell.org). GHC has also been ported to several different processor architectures.


See also

* Hugs
* Yhc
* Haskell Platform


External links


The GHC homepage