
Complex text layout (CTL), or complex text rendering, is the typesetting of writing systems in which the shape or positioning of a grapheme depends on its relation to other graphemes. The term is used in the field of software internationalization, where each grapheme is a character. Scripts which require CTL for proper display may be known as complex scripts. Examples include the Arabic alphabet and scripts of the Brahmic family, such as Devanagari, Khmer script or the Thai alphabet.

Many scripts do not require CTL. For instance, the Latin alphabet or Chinese characters can be typeset by simply displaying each character one after another in straight rows or columns. However, even these scripts have alternate forms or optional features (such as cursive writing) which require CTL to produce on computers.


Characteristics requiring CTL

The main characteristics of CTL complexity are:
* Bi-directional text, where characters may be written either right-to-left or left-to-right.
* Context-sensitive shaping and ligatures, where a character may change its shape depending on its location and/or the surrounding characters. For example, a character in Arabic script can have as many as four different shape-forms, depending on context.
* Ordering, where the displayed order of the characters is not the same as the logical order. For example, in Devanagari, which is written from left to right, the grapheme for "short i" appears to the left of ("before") the consonant that it follows: in ''ki'', the ''-i'' should render on the left, with its bow reaching over the ''k-'' to its right.

Not all occurrences of these characteristics require CTL. For example, the Greek alphabet has context-sensitive shaping of the letter sigma, which appears as ς at the end of a word and σ elsewhere. However, these two forms are normally stored as different characters; for instance, Unicode has both σ (U+03C3) and ς (U+03C2), and does not treat them as equivalent. For collation and comparison purposes, software should consider the string "δῖος Ἀχιλλεύς" equivalent to "δῖοσ Ἀχιλλεύσ", but for typesetting purposes they are distinct and CTL is not required to choose the correct form.
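
The characteristics listed above, and the sigma exception, can be inspected with Python's standard `unicodedata` module. This is only a sketch for looking at the stored characters; it does not perform any layout. The helper names (`bidi_classes`, `compat_base`, `char_names`, `equal_for_comparison`) and the sample strings are illustrative assumptions, not part of any CTL library.

```python
import unicodedata


def bidi_classes(text):
    """Return the Unicode bidirectional class of each character."""
    return [unicodedata.bidirectional(ch) for ch in text]


def compat_base(ch):
    """Fold a compatibility presentation form back to its base letter."""
    return unicodedata.normalize("NFKC", ch)


def char_names(text):
    """Return the Unicode character names in stored (logical) order."""
    return [unicodedata.name(ch) for ch in text]


def equal_for_comparison(a, b):
    """Compare two strings after case folding, which maps final sigma to sigma."""
    return a.casefold() == b.casefold()


# Bi-directional text: Latin 'a', Arabic BEH (U+0628), Hebrew ALEF (U+05D0).
MIXED = "a\u0628\u05d0"

# Shaping: the isolated, final, initial and medial presentation forms of
# Arabic BEH, which Unicode encodes as compatibility characters.
BEH_FORMS = ["\ufe8f", "\ufe90", "\ufe91", "\ufe92"]

# Ordering: Devanagari "ki" is stored consonant first, vowel sign second,
# even though the vowel sign is drawn to the left of the consonant.
KI = "\u0915\u093f"

# Sigma: the two forms are distinct stored characters.
FINAL = "δῖος Ἀχιλλεύς"
MEDIAL = FINAL.replace("ς", "σ")

if __name__ == "__main__":
    print(bidi_classes(MIXED))
    print({compat_base(form) for form in BEH_FORMS})
    print(char_names(KI))
    print(FINAL == MEDIAL, equal_for_comparison(FINAL, MEDIAL))
```

The four Arabic presentation forms all fold to the single letter U+0628, which is why a renderer, not the stored text, normally chooses the shape. The two sigma spellings compare unequal as raw strings but equal after case folding, matching the distinction drawn above between comparison and typesetting.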


Implementations

Most text-rendering software that is capable of CTL includes information about specific scripts, and so can render them correctly without font files needing to supply instructions on how to lay out characters. Such software is usually provided in a library; examples include:
* Core Text for macOS
* Uniscribe (with Universal Shaping Engine) and DirectWrite for Microsoft Windows
* HarfBuzz, a cross-platform library
* Pango, a cross-platform library which nowadays incorporates HarfBuzz

However, such software is unable to properly render any script for which it lacks instructions, which can include many minority scripts. The alternative approach is to include the rendering instructions in the font file itself. Rendering software still needs to be capable of reading and following the instructions, but this is relatively simple. Examples of this latter approach include Apple Advanced Typography (AAT) and Graphite. Both of these names encompass both the instruction format and the software supporting it; AAT is included on Apple operating systems, while Graphite is available for Microsoft Windows and Linux-based systems.

The OpenType format is primarily intended for systems using the first approach (layout knowledge in the renderer, not the font), but it has a few features that assist with CTL, such as contextual ligatures. AAT and Graphite instructions can be embedded in OpenType font files.
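
The difference between the two approaches can be illustrated with a deliberately tiny toy model. Everything in it is hypothetical: the `Font` class, the `BUILTIN_RULES` table, the glyph names and the `shape` function are invented for illustration and do not reflect the API or data formats of HarfBuzz, Uniscribe, AAT, Graphite or OpenType.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Font:
    """Hypothetical font: a name plus optional embedded substitution rules."""
    name: str
    # Second approach: rules travel inside the font (character run -> glyph name).
    rules: Dict[str, str] = field(default_factory=dict)


# First approach: script knowledge is built into the renderer itself.
BUILTIN_RULES: Dict[str, Dict[str, str]] = {
    "Latn": {"fi": "f_i.liga"},
}


def shape(text: str, script: str, font: Font) -> List[str]:
    """Map text to glyph names using font rules if present, else built-in ones."""
    if font.rules:
        rules = font.rules
    elif script in BUILTIN_RULES:
        rules = BUILTIN_RULES[script]
    else:
        # The renderer knows nothing about this script and the font does not help.
        raise LookupError(f"no shaping rules for script {script!r}")

    glyphs: List[str] = []
    i = 0
    while i < len(text):
        # Try the longest run first so multi-character rules win.
        for length in range(len(text) - i, 0, -1):
            run = text[i:i + length]
            if run in rules:
                glyphs.append(rules[run])
                i += length
                break
        else:
            # No rule matched: fall back to a one-to-one glyph name.
            glyphs.append(f"uni{ord(text[i]):04X}")
            i += 1
    return glyphs


if __name__ == "__main__":
    print(shape("fin", "Latn", Font("plain")))
    print(shape("ab", "Zzzz", Font("smart", {"ab": "a_b"})))
```

In this sketch a font without rules can only be shaped for scripts the renderer already knows, while a font that carries its own rules works even for a script tag the renderer has never seen, which mirrors the trade-off described above.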


See also

* Typography
* Unicode
* Writing systems which require complex text layout:
** Arabic alphabet
** Most of the Brahmic family of scripts
** N'Ko script
** Tengwar (diacritics and numbers)



External links


* Examples of complex rendering: SIL International's examples of complex writing systems around the world
* Complex Text Layout: The Open Group's Desktop Technologies
* Supporting Indic Scripts in Mozilla (also other CTL scripts)
* Project SILA: Graphite and Mozilla integration project
* CTL Architecture in Solaris: Solaris Globalization Whitepapers
* Complex Scripts: Microsoft Global Development and Computing Portal
* Theppitak's Homepage: information about Thai language processing
* HarfBuzz's page at Freedesktop.org
* D-Type Unicode Text Module: portable software library for complex text
* BidiRenderer: an application that illustrates the shaping and layout of complex text in bidirectional paragraphs using FriBidi, FreeType, and HarfBuzz
* Tehreer-Android: a library that gives full control over text-related technologies such as the bidirectional algorithm, OpenType shaping, text typesetting and text rendering
* Tehreer-Cocoa: standalone font/text engine for iOS