In writing, a space is a blank area that separates words, sentences, and other written or printed glyphs (characters). Conventions for spacing vary among languages, and in some languages the spacing rules are complex. Inter-word spaces ease the reader's task of identifying words and avoid outright ambiguities such as "now here" vs. "nowhere". They also provide convenient guides for where a human or program may start new lines.

Typesetting can use spaces of varying widths, just as it can use graphic characters of varying widths. Unlike graphic characters, typeset spaces are commonly stretched in order to align text. A typewriter, on the other hand, typically has only one width for all characters, including spaces. Following widespread acceptance of the typewriter, some typewriter conventions influenced typography and the design of printed works.

Computer representation of text makes it possible to get around mechanical and physical limitations such as character widths in at least two ways:

* Character encodings such as Unicode provide spaces of several widths, which are encoded using distinct numeric code points. For example, Unicode U+0020 is the "normal" space character, but U+00A0 adds the meaning that a new line should not be started there, while U+2003 represents a space with a fixed width of one em. Collectively, such characters are called whitespace characters.
* Formatting and drawing languages and software commonly provide much more flexibility in spacing. For example, SVG, PostScript, and countless other languages enable drawing characters at specific (x,y) coordinates on a screen or page. By drawing each word at a specific starting coordinate, such programs need not "draw" spaces at all (this can lead to difficulties in extracting the correct text back out). Similarly, word processors can "fully justify" text, stretching inter-word spaces to make all lines the same length (as can mechanical Linotype machines). Precision is limited by the physical capabilities of output devices.
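The three code points named above can be inspected directly. The following is a minimal Python sketch, using only the standard `unicodedata` module, that prints the official Unicode name of each one; the helper name `describe` is hypothetical and exists only for this illustration.

```python
import unicodedata

# The three space characters mentioned in the text above.
SPACES = ["\u0020", "\u00A0", "\u2003"]


def describe(ch: str) -> str:
    """Return the code point in U+XXXX form followed by its Unicode name."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


descriptions = [describe(ch) for ch in SPACES]
for line in descriptions:
    print(line)
```

Although all three look like blank areas on screen, they are distinct characters, so string comparison and searching treat them as different.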


Use in natural languages


Between words

Modern English uses a space to separate words, but not all languages follow this practice. According to Paul Saenger in ''Space Between Words: The Origins of Silent Reading'', Ancient Hebrew and Arabic did use spaces, partly to compensate in clarity for the lack of written vowels when no ''mater lectionis'' was used for a vowel, though in the Middle Ages they sometimes omitted spaces when vowel points were marked. The earliest Greek script also used interpuncts to divide words rather than spacing, although this practice was soon displaced by ''scriptio continua''. In Latin, spaces and interpuncts often came to be dropped in favor of ''scriptio continua'', and were not used to separate words again until roughly AD 600–800. Word spacing was later used by Irish and Anglo-Saxon scribes, beginning after the creation of the Carolingian minuscule by Alcuin of York and the scribes’ adoption of it. Spacing became standard in Renaissance Italy and France, and then in Byzantium by the end of the 16th century; it entered the Slavic languages written in Cyrillic in the 17th century, and Sanskrit only in modern times.

CJK languages do not use spaces when dealing with text containing mostly Chinese characters and kana. In Japanese, spaces may occasionally be used to separate people’s family names from given names, to denote omitted particles (especially the topic particle ''wa''), and for certain literary or artistic effects. Modern Korean, however, has spaces as an essential part of its writing system (because of Western influence), given the phonetic nature of the hangul script, which requires word dividers to avoid ambiguity, as opposed to Chinese characters, which are mostly very distinguishable from each other. In Korean, spaces are used to separate chunks of nouns, nouns and particles, adjectives, and verbs; for certain compounds or phrases, spaces may be used or not, as in the phrase for “Republic of Korea,” usually spelled without spaces as 대한민국 rather than with a space as 대한 민국.

Runic texts use either an interpunct-like or a colon-like punctuation mark to separate words. There are two Unicode characters dedicated to this: U+16EB RUNIC SINGLE PUNCTUATION (᛫) and U+16EC RUNIC MULTIPLE PUNCTUATION (᛬).


Between sentences

Languages with a Latin-derived alphabet have used various methods of sentence spacing since the advent of movable type in the 15th century.

* One space (sometimes called ''French spacing'', ''q.v.''). This is a common convention in most countries that use the ISO basic Latin alphabet for published and final written work, as well as digital (World Wide Web) media. Web browsers usually do not differentiate between single and multiple spaces in source code when displaying text, unless the text is given a "white-space" CSS property. Without this being set, collapsing strings of spaces to a single space allows HTML source code to be spaced in a more machine-readable way, at the expense of control over the spacing of the rendered page.
* Double space (''English spacing''). It is sometimes claimed that this convention stems from the use of the monospaced font on typewriters. However, instructions to use more spacing between sentences than between words date back centuries, and two spaces on a typewriter was the closest approximation to typesetters' previous rules aimed at improving readability. Wider spacing continued to be used by both typesetters and typists until the Second World War, after which typesetters gradually transitioned to word spacing between sentences in published print, while typists continued the practice of using two spaces.
* One widened space, typically one-and-a-third to slightly less than twice as wide as a word space. This spacing was sometimes used in typesetting before the 19th century. It has also been used in other non-typewriter typesetting systems such as the Linotype machine and the TeX system. Modern computer-based digital fonts can adjust the spacing after terminal punctuation as well, creating a space slightly wider than a standard word space.

There has been some controversy regarding the proper amount of sentence spacing in typeset material. The ''Elements of Typographic Style'' states that only a single word space is required for sentence spacing. Psychological studies suggest "readers benefit from having two spaces after periods."
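The browser behaviour described in the first list item, where runs of spaces in the source are rendered as one space, can be approximated in a few lines. This is only a rough Python sketch under stated assumptions: the function name `collapse_spaces` is hypothetical, and real CSS whitespace processing has further rules (for example at the start and end of lines) that are not modelled here.

```python
import re

# Space, tab, line feed, carriage return and form feed; the no-break
# space U+00A0 is deliberately left out, so it is never collapsed.
_COLLAPSIBLE = re.compile(r"[ \t\n\r\f]+")


def collapse_spaces(source: str) -> str:
    """Replace every run of collapsible whitespace with a single space."""
    return _COLLAPSIBLE.sub(" ", source)


print(collapse_spaces("One.  Two.\n   Three."))
```

Under this approximation, a double space typed after a period in the source does not survive into the displayed text, which is why two-space sentence spacing is usually invisible on web pages.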


Unit symbols and numbers

The International System of Units (SI) prescribes inserting a space between a number and a unit of measurement (the space being regarded as an implied multiplication sign) but never between a prefix and a base unit; a space (or a multiplication dot) should also be used between units in compound units. The only exception to this rule is the traditional symbolic notation of angles: degree (e.g., 30°), minute of arc (e.g., 22′), and second of arc (e.g., 8″). The SI also prescribes the use of a space (often typographically a thin space) as a thousands separator where required. Both the point and the comma are reserved as decimal markers. Sometimes a narrow non-breaking space or a non-breaking space, respectively, is recommended instead (as in, for example, IEEE and IEC standards), so that automatic line wrapping does not separate values from units or split the parts of a compound unit.
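As an illustration of the non-breaking variants mentioned above, the Python sketch below groups the integer digits of a value with U+202F NARROW NO-BREAK SPACE and joins the value to its unit with U+00A0 NO-BREAK SPACE. The function name `format_quantity` is hypothetical, the choice of these two characters is one possible reading of the recommendation, and the sketch does not group digits after the decimal point or handle the angle exception.

```python
NNBSP = "\u202f"  # NARROW NO-BREAK SPACE, used here as thousands separator
NBSP = "\u00a0"   # NO-BREAK SPACE, used here between value and unit


def format_quantity(value: float, unit: str) -> str:
    """Format a value and unit so that line wrapping cannot split them."""
    # Python's "," format option groups the integer digits in threes;
    # the commas are then swapped for narrow no-break spaces.
    grouped = f"{value:,}".replace(",", NNBSP)
    return f"{grouped}{NBSP}{unit}"


print(format_quantity(1234567.5, "kg"))
```

The decimal point is left as the decimal marker, in keeping with the rule that the point and the comma are reserved for that role.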


Encoding

Unicode defines many variants of a single whitespace character, with various properties.

In URLs, a space is percent-encoded as %20, its ASCII/UTF-8 representation.
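Percent-encoding of spaces can be tried with Python's standard `urllib.parse` module. The sketch below is only an illustration; the sample strings are arbitrary, and the second call shows how a space other than U+0020 is encoded from its UTF-8 bytes.

```python
from urllib.parse import quote, unquote

# An ordinary space (U+0020) is one byte, 0x20, in ASCII and UTF-8.
encoded = quote("now here")
decoded = unquote(encoded)

# A no-break space (U+00A0) is two bytes, 0xC2 0xA0, in UTF-8.
nbsp_encoded = quote("\u00a0")

print(encoded)
print(decoded)
print(nbsp_encoded)
```

Decoding with `unquote` restores the original string, so the encoding is lossless.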


Types of spaces

* Figure space
* Non-breaking space
* Thin space
* Visible space
* Zero-width space
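Not every character in this list is classified as whitespace by Unicode. The Python sketch below looks up the general category of one code point for each list entry and checks it with `str.isspace`. The code point chosen for each entry is an assumption made for this illustration, in particular the use of U+2423 OPEN BOX as the "visible space" symbol.

```python
import unicodedata

# One representative code point per list entry (assumed mapping).
SPACE_TYPES = {
    "figure space": "\u2007",
    "non-breaking space": "\u00a0",
    "thin space": "\u2009",
    "visible space": "\u2423",   # OPEN BOX, a printing symbol
    "zero-width space": "\u200b",
}

# Map each label to (general category, result of str.isspace).
report = {
    label: (unicodedata.category(ch), ch.isspace())
    for label, ch in SPACE_TYPES.items()
}

for label, (category, is_space) in report.items():
    print(f"{label}: category={category}, isspace={is_space}")
```

Category `Zs` marks space separators. The zero-width space falls in the format category `Cf` and the open box in the symbol category `So`, so code that strips or splits on whitespace leaves both untouched.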


See also

* Em (typography)
* En (typography)
* Halfwidth and fullwidth forms
* Internal field separator
* Sentence spacing in digital media
* Underscore
* Whitespace character

