Pandoc is a free-software document converter, widely used as a writing tool (especially by scholars) and as a basis for publishing workflows. It was created by John MacFarlane, a philosophy professor at the University of California, Berkeley.
Functionality
Pandoc dubs itself a "markup format" converter. It can take a document in one of the supported formats and convert only its markup to another format. Maintaining the look and feel of the document is not a priority.
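A conversion of this kind can be sketched as a small Python wrapper around the `pandoc` command line. This is a minimal sketch, not part of pandoc itself: the helper names and file names are hypothetical, while `-f`, `-t` and `-o` are pandoc's options for the input format, output format and output file. When the formats are omitted, pandoc tries to infer them from the file extensions.

```python
import shutil
import subprocess


def build_convert_command(source, target, from_format=None, to_format=None):
    """Return the argument list for a basic pandoc conversion (hypothetical helper)."""
    command = ["pandoc", source, "-o", target]
    if from_format:
        command += ["-f", from_format]  # explicit input format
    if to_format:
        command += ["-t", to_format]  # explicit output format
    return command


def convert(source, target, from_format=None, to_format=None):
    """Run the conversion if a pandoc executable is on PATH; otherwise return None."""
    if shutil.which("pandoc") is None:
        return None
    command = build_convert_command(source, target, from_format, to_format)
    return subprocess.run(command, check=True)


if __name__ == "__main__":
    # "paper.md" and "paper.html" are placeholder file names.
    print(build_convert_command("paper.md", "paper.html", "markdown", "html"))
```

Building the argument list separately from running it keeps the command easy to inspect before anything is executed.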
Plug-ins for custom formats can also be written in Lua, which has been used to create an exporting tool for the Journal Article Tag Suite, for example.
CiteProc
An included CiteProc option allows pandoc to use bibliographic data from reference management software in any of five formats: BibTeX, BibLaTeX, CSL JSON, CSL YAML, or RIS. The information is automatically transformed into a citation in various styles (such as APA, Chicago, or MLA) using an implementation of the Citation Style Language.
This allows the program to serve as a simpler alternative to LaTeX for producing academic writing in Markdown with inline citation keys. Alternatively, the program can convert any bibliographic data stream in the accepted formats into a list of citations in a chosen style.
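The citation workflow described above might be driven from Python as follows. This is a sketch under stated assumptions: it assumes a pandoc version that provides the `--citeproc` option, and the helper name, the citation key `doe2020` and all file names are hypothetical. `--bibliography` and `--csl` are pandoc's options for the bibliographic data file and the Citation Style Language style file.

```python
# A Markdown fragment with an inline citation key in pandoc's [@key] syntax.
# The key "doe2020" is a placeholder and must exist in the bibliography file.
SAMPLE_MARKDOWN = "Conversion preserves structure, not appearance [@doe2020, p. 33].\n"


def build_citeproc_command(source, target, bibliography, csl_style=None):
    """Return the pandoc argument list for rendering citations (hypothetical helper)."""
    command = [
        "pandoc",
        source,
        "--citeproc",  # resolve citation keys into formatted citations
        f"--bibliography={bibliography}",  # BibTeX, BibLaTeX, CSL JSON, CSL YAML or RIS data
        "-o",
        target,
    ]
    if csl_style:
        command.append(f"--csl={csl_style}")  # style file, for example an APA or MLA style
    return command


if __name__ == "__main__":
    # All file names here are placeholders.
    print(build_citeproc_command("paper.md", "paper.html", "refs.bib", "apa.csl"))
```

If no style file is given, pandoc falls back to a default citation style.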
Supported file formats
Input formats
The input format with the most support is an extended version of Markdown. Pandoc can also read the following formats:
* Creole
* DocBook
* EPUB
* FictionBook (FB2)
* Haddock
* HTML
* Jira wiki markup
* Journal Article Tag Suite (JATS)
* JSON
* LaTeX
* Lightweight markup language
* man
* Markdown: Strict, CommonMark, GitHub Flavored Markdown (GFM), MultiMarkdown (MMD) and Markdown Extra (PHP Extra) variants
* OpenDocument (ODT)
* OPML
* Office Open XML: Microsoft Word variant
* Org-mode
* reStructuredText
* Textile
* txt2tags (t2t)
* Wiki markup: MediaWiki, Muse, TikiWiki, TWiki and Vimwiki variants
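Because the set of readers varies between pandoc versions, the list above can be checked against the local installation. The sketch below assumes a `pandoc` executable on the PATH and uses its `--list-input-formats` option, which prints one format name per line. The function names are hypothetical.

```python
import shutil
import subprocess


def parse_format_list(output):
    """Split pandoc's one-name-per-line output into a list, skipping blank lines."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def list_input_formats():
    """Return the input formats of the installed pandoc, or None if it is missing."""
    if shutil.which("pandoc") is None:
        return None
    result = subprocess.run(
        ["pandoc", "--list-input-formats"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_format_list(result.stdout)


if __name__ == "__main__":
    formats = list_input_formats()
    print("pandoc not found" if formats is None else formats)
```

The matching `--list-output-formats` option reports the writers in the same way.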
Output formats
Pandoc can create files in the following output formats, which are not necessarily the same set of formats as the input formats:
* AsciiDoc
* ConTeXt
* DocBook: Versions 4 and 5
* EPUB: Versions 2 and 3
* FictionBook (FB2)
* Haddock
* HTML: HTML4 and HTML5 variants, respectively compliant with XHTML 1.0 Transitional and XHTML Strict
* InDesign ICML
* Jira wiki markup
* Journal Article Tag Suite (JATS)
* JSON
* LaTeX
* man
* Markdown: Strict, CommonMark, GitHub Flavored Markdown (GFM), MultiMarkdown (MMD) and Markdown Extra (PHP Extra) variants
* OpenDocument (ODT/ODF)
* OPML
* Office Open XML: Microsoft Word and Microsoft PowerPoint variants
* Org-mode
* PDF (needs a third-party add-on like ConTeXt, pdfroff, wkhtmltopdf, weasyprint or prince)
* Plain text
* reStructuredText
* Rich Text Format (RTF)
* TEI
* Texinfo
* Textile
* Web-based slideshows: LaTeX Beamer, Slideous, Slidy, DZSlides, reveal.js and S5 variants
* Wiki markup: DokuWiki, MediaWiki, Muse, TikiWiki, TWiki and Vimwiki variants
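Since PDF output depends on an external program, the engine usually has to be named when pandoc is invoked. The following sketch builds such a command using pandoc's `--pdf-engine` option. The engine names are the command-line spellings of the add-ons listed above and may not all be accepted by every pandoc version; the helper name and file names are hypothetical.

```python
# Engines named in the PDF entry above, in their command-line spelling.
# Availability depends on the pandoc version and on what is installed locally.
PDF_ENGINES = ("context", "pdfroff", "wkhtmltopdf", "weasyprint", "prince")


def build_pdf_command(source, target, engine):
    """Return the pandoc argument list for PDF output (hypothetical helper)."""
    if engine not in PDF_ENGINES:
        raise ValueError(f"unsupported PDF engine in this sketch: {engine}")
    return ["pandoc", source, f"--pdf-engine={engine}", "-o", target]


if __name__ == "__main__":
    # "paper.md" and "paper.pdf" are placeholder file names.
    print(build_pdf_command("paper.md", "paper.pdf", "weasyprint"))
```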
See also
* Round-trip format conversion