Word Joiner

The word joiner (WJ) is a Unicode format character which is used to indicate that line breaking should not occur at its position. It does not affect the formation of ligatures or cursive joining and is ignored for the purpose of text segmentation. It has been encoded since Unicode version 3.2 (released in 2002) as U+2060 WORD JOINER. The word joiner replaces the ''zero-width no-break space'' (''ZWNBSP'', U+FEFF) in its role as a no-break space of zero width. The ''ZWNBSP'' was originally, and still is, used as the byte order mark (BOM) at the start of a file. However, if encountered elsewhere, it should, according to Unicode, be treated as a word joiner, that is, a no-break space of zero width. The deliberate use of U+FEFF for this purpose is deprecated as of Unicode 3.2, with the ''word joiner'' strongly preferred (Unicode FAQ, "UTF-8, UTF-16, UTF-32 & BOM", ''"What should I do with U+FEFF in the middle of a file?"'').
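
The behaviour described above can be illustrated with a minimal Python sketch. The helper names `join_without_break` and `normalize_zwnbsp` are hypothetical and not part of any library. Dropping a leading U+FEFF is an assumption made here on the basis that, at the start of a text, it serves as a byte order mark; an application that needs to preserve the BOM would handle it differently.

```python
import unicodedata

WORD_JOINER = "\u2060"  # U+2060 WORD JOINER
ZWNBSP = "\ufeff"       # U+FEFF, also used as the byte order mark


def join_without_break(left: str, right: str) -> str:
    """Hypothetical helper: glue two strings with a word joiner so that
    a line break should not occur between them."""
    return f"{left}{WORD_JOINER}{right}"


def normalize_zwnbsp(text: str) -> str:
    """Hypothetical helper: drop one leading U+FEFF (assumed to be a BOM)
    and replace any later U+FEFF with the preferred U+2060."""
    if text.startswith(ZWNBSP):
        text = text[1:]
    return text.replace(ZWNBSP, WORD_JOINER)


# The standard library's Unicode database reports the character's
# name and its general category (Cf is "format").
print(unicodedata.name(WORD_JOINER))
print(unicodedata.category(WORD_JOINER))

# The joined string gains exactly one invisible code point.
print(len(join_without_break("100", "km")))

# A leading BOM is removed; an interior U+FEFF becomes U+2060.
print(normalize_zwnbsp("\ufefffoo\ufeffbar") == "foo\u2060bar")
```

Whether a break is actually suppressed depends on the renderer's line-breaking implementation; the sketch only shows where the character is placed in the text.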


See also

* Byte order mark, which uses the U+FEFF (ZWNBSP) character
* Zero-width space
* Zero-width joiner, which in scripts such as Arabic or the Indic scripts causes two characters to be shown in a connected form, even if they otherwise would not be.


References
