
The Extended Speech Assessment Methods Phonetic Alphabet (X-SAMPA) is a variant of SAMPA developed in 1995 by John C. Wells, professor of phonetics at University College London. It is designed to unify the individual language SAMPA alphabets and to extend SAMPA to cover the entire range of characters in the 1993 version of the International Phonetic Alphabet (IPA). The result is a SAMPA-inspired remapping of the IPA into 7-bit ASCII.

SAMPA was devised as a hack to work around the inability of text encodings to represent IPA symbols. Later, as Unicode support for IPA symbols became more widespread, the need for a separate, computer-readable system for representing the IPA in ASCII decreased. However, X-SAMPA is still useful as the basis for an input method for true IPA.
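As a rough illustration of the input-method idea, the following Python sketch converts a handful of X-SAMPA symbols to their IPA equivalents. The function name `to_ipa` and the table `XSAMPA_TO_IPA` are hypothetical, the table covers only a few symbols, and passing lower-case letters through unchanged is a simplification (a real input method would need the full symbol inventory and diacritic handling).

```python
# Deliberately tiny, illustrative subset of the X-SAMPA inventory.
XSAMPA_TO_IPA = {
    "S": "ʃ",
    "T": "θ",
    "N": "ŋ",
    "I": "ɪ",
    "@": "ə",
    ":": "ː",
    '"': "ˈ",
    "r\\": "ɹ",  # "r" with a backslash suffix is a different symbol from "r"
}


def to_ipa(text: str) -> str:
    """Convert an X-SAMPA string to IPA using the small table above."""
    out = []
    i = 0
    while i < len(text):
        # A backslash is a suffix, so it belongs to the character before it.
        width = 2 if text[i + 1:i + 2] == "\\" else 1
        unit = text[i:i + width]
        if unit in XSAMPA_TO_IPA:
            out.append(XSAMPA_TO_IPA[unit])
        elif len(unit) == 1 and ((unit.isascii() and unit.islower()) or unit.isspace()):
            # Simplification: lower-case ASCII letters and spaces pass through.
            out.append(unit)
        else:
            raise ValueError(f"symbol not in this sketch's table: {unit!r}")
        i += width
    return "".join(out)


print(to_ipa('"TIN'))    # ˈθɪŋ
print(to_ipa("r\\IN"))   # ɹɪŋ
```

Treating a character plus a trailing backslash as one lookup unit keeps pairs such as r and r\ from being confused.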


Summary


Notes

* The IPA symbols that are ordinary lower-case letters have the same value in X-SAMPA as they do in the IPA.
* X-SAMPA uses backslashes as modifying suffixes to create new symbols. For example, O is a distinct sound from O\, to which it bears no relation. Such use of the backslash character can be a problem, since many programs interpret it as an escape character for the character following it. For example, such X-SAMPA symbols do not work in EMU, so backslashes must be replaced with some other symbol (e.g., an asterisk: '*') when adding phonemic transcription to an EMU speech database. The backslash has no fixed meaning.
* X-SAMPA diacritics follow the symbols they modify. Except for ~ for nasalization, = for syllabicity, and ` for retroflexion and rhotacization, diacritics are joined to the character with the underscore character _.
* The underscore character is also used to encode the IPA tiebar: k_p codes for k͡p.
* The numbers _1 to _6 are reserved diacritics as shorthand for language-specific tone numbers.
* The IETF language tags registry has assigned fonxsamp as the subtag for text transcribed in X-SAMPA.
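The backslash issue described in the notes can be sketched in Python. The helper names (`split_units`, `swap_backslashes`, `restore_backslashes`) are hypothetical, and the asterisk substitution assumes that '*' does not otherwise occur in the transcription. The sketch only groups each character with its backslash suffix; it does not interpret diacritics or tiebars.

```python
import re

# One unit = any non-space, non-backslash character, optionally followed by
# a single backslash suffix, so "O" and "O\" come out as different units.
UNIT = re.compile(r"[^\s\\]\\?")


def split_units(text: str) -> list[str]:
    """Group each character with its backslash suffix, if it has one."""
    return UNIT.findall(text)


def swap_backslashes(text: str, stand_in: str = "*") -> str:
    """Replace backslashes for tools that treat them as escape characters."""
    return text.replace("\\", stand_in)


def restore_backslashes(text: str, stand_in: str = "*") -> str:
    """Undo swap_backslashes (assumes stand_in was not already in the text)."""
    return text.replace(stand_in, "\\")


# A raw string stops Python itself from treating "\" as an escape character.
sample = r"O\ O r\ r"
print(split_units(sample))        # ['O\\', 'O', 'r\\', 'r']
print(swap_backslashes(sample))   # O* O r* r
```

Note that the printed list shows each backslash doubled because Python's `repr` escapes it; the strings themselves contain a single backslash.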


Lower-case symbols


Diacritics

X-SAMPA       Description
_}            no audible release
`             rhotacization in vowels, retroflexion in consonants (IPA uses separate symbols for consonants, see t` for an example)
~ (or _~)     nasalization
_A            advanced tongue root
_a            apical
_B            extra low tone
_B_L          low rising tone
_c            less rounded
_d            dental
_e            velarized or pharyngealized; also see 5
<F>           global fall
_F (or _\)    falling tone
_G            velarized
_H            high tone
_H_T          high rising tone
_h            aspirated
_j (or ')     palatalized
_k            creaky voice
_L            low tone
_l            lateral release
_M            mid tone
_m            laminal
_N            linguolabial
_n            nasal release
_O            more rounded
_o            lowered
_q            retracted tongue root
<R>           global rise
_R            rising tone
_R_F          rising falling tone
_r            raised
_T            extra high tone
_t            breathy voice
_v            voiced
_w            labialized
_X            extra-short
_x            mid-centralized
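
The diacritic entries above lend themselves to a simple lookup. The Python sketch below is one possible way to decode a run of diacritics; the names `DIACRITICS` and `describe_diacritics` are hypothetical and the table holds only a subset of the entries. Matching the longest key first is a design assumption, made so that a compound such as _B_L is read as one tone diacritic and not as _B followed by _L.

```python
import re

# Subset of the diacritic entries listed above (symbol -> description).
DIACRITICS = {
    "`": "rhotacization in vowels, retroflexion in consonants",
    "~": "nasalization",
    "_~": "nasalization",
    "'": "palatalized",
    "_j": "palatalized",
    "_h": "aspirated",
    "_w": "labialized",
    "_d": "dental",
    "_B": "extra low tone",
    "_L": "low tone",
    "_B_L": "low rising tone",
    "_H": "high tone",
    "_T": "extra high tone",
    "_H_T": "high rising tone",
}

# Longest keys first, so "_B_L" wins over "_B" followed by "_L".
_PATTERN = re.compile(
    "|".join(re.escape(key) for key in sorted(DIACRITICS, key=len, reverse=True))
)


def describe_diacritics(tail: str) -> list[str]:
    """Describe a run of diacritics, e.g. the "_h_w" part of "t_h_w"."""
    found = []
    pos = 0
    while pos < len(tail):
        match = _PATTERN.match(tail, pos)
        if match is None:
            raise ValueError(f"diacritic not in this sketch's table: {tail[pos:]!r}")
        found.append(DIACRITICS[match.group()])
        pos = match.end()
    return found


print(describe_diacritics("_h_w"))   # ['aspirated', 'labialized']
print(describe_diacritics("_B_L"))   # ['low rising tone']
```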


Charts


Consonants

* Asterisks (*) mark sounds that do not have X-SAMPA symbols. Daggers (†) mark IPA symbols that have recently been added to Unicode. Since April 2008, the latter is the case for the labiodental flap, symbolized by a right-hook v in the IPA: ⱱ. A convention for the labiodental flap does not yet exist in X-SAMPA.


Vowels


See also

* Comparison of ASCII encodings of the International Phonetic Alphabet
* List of phonetics topics
* SAMPA, a language-specific predecessor of X-SAMPA
* SAMPA chart for English


References


External links


* Computer-coding the IPA: A proposed extension of SAMPA
* X-SAMPA to IPA to CXS converter
* Web-based translator for X-SAMPA documents. Produces Unicode text, XML text, PostScript, PDF, or LaTeX TIPA.
* Z-SAMPA, a backward-compatible extension of X-SAMPA sometimes used for conlangs