
International email arises from the combined provision of ''internationalized domain names'' (IDN) and ''email address internationalization'' (EAI). The result is email that contains international characters (characters which do not exist in the ASCII character set), encoded as UTF-8, in the email header and in supporting mail transfer protocols. The most significant aspect of this is the allowance of email addresses (also known as email identities) in most of the world's writing systems, at both interface and transport levels.


Email addresses

Traditional email addresses are limited to characters from the English alphabet and a few other special characters. The following are valid traditional email addresses:

* [email protected] (English, ASCII)
* [email protected] (English, ASCII)
* user+mailbox/[email protected] (English, ASCII)
* !#$%&'*+-/=?^_`[email protected] (English, ASCII)
* "Abc@def"@example.com (English, ASCII)
* "Fred\ Bloggs"@example.com (English, ASCII)
* "Joe.\\Blow"@example.com (English, ASCII)

A user whose language is written in Cyrillic might wish to use, for example, ''бацка.махно'' as their identifier, but be forced to use a transcription such as ''[email protected]'' or even some completely unrelated identifier instead. The same is true of Chinese, Japanese, and other users whose languages are not written in Latin script, and it also applies to users from non-English-speaking European countries whose desired addresses might contain diacritics (e.g. ''André'' or ''Płużyna''). As a result, email users are forced to identify themselves using non-native scripts, which may result in errors due to the ambiguity of transliteration (for example, иван.сергеев may become ivan.sergeev, ivan.sergeyev, or something else). Alternatively, developers of email systems must compensate by converting identifiers from their native scripts to ASCII and back again at the user interface layer.

International email, by contrast, uses Unicode characters encoded as UTF-8, which allows the text of addresses to be encoded in most of the world's writing systems. The following are all valid ''international email addresses'':

* (Chinese, Unicode)
* ಬೆಂಬಲ@ಡೇಟಾಮೇಲ್.ಭಾರತ (Kannada, Unicode)
* अजय@डाटा.भारत (Hindi, Unicode)
* квіточка@пошта.укр (Ukrainian, Unicode)
* χρήστης@παράδειγμα.ελ (Greek, Unicode)
* Dörte@Sörensen.example.com (German, Unicode)
* коля@пример.рф (Russian, Unicode)
* مثال@موقع.عر (Arabic, Unicode)
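The difference between the two kinds of address can be illustrated with a short Python sketch. The helper name `describe_address` is hypothetical, and splitting at the last `@` is a simplification rather than a full address parser; the function only reports which parts fall outside ASCII and how many bytes the address occupies once encoded as UTF-8.

```python
def describe_address(address: str) -> dict:
    """Split an address at its last '@' and report which parts are ASCII-only.

    Illustrative helper, not a validator: it does not check address syntax.
    """
    local, _, domain = address.rpartition("@")
    return {
        "local": local,
        "domain": domain,
        "local_is_ascii": local.isascii(),
        "domain_is_ascii": domain.isascii(),
        "utf8_length": len(address.encode("utf-8")),
    }


if __name__ == "__main__":
    samples = ['"Abc@def"@example.com', "Dörte@Sörensen.example.com", "коля@пример.рф"]
    for sample in samples:
        print(describe_address(sample))
```

Splitting at the last `@` keeps quoted local parts such as `"Abc@def"` intact, which is why `rpartition` is used here instead of `split`.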


UTF-8 headers

Although the traditional format of the email header section allows non-ASCII characters to be included in the value portion of some header fields using MIME-encoded words (e.g. in display names or in a ''Subject'' header field), MIME encoding must not be used to encode other information in a header, such as an email address, or header fields like ''Message-ID'' or ''Received''. Moreover, MIME encoding requires extra processing of the header to convert the data to and from its MIME-encoded word representation, and harms the readability of the header section. The 2012 standards RFC 6532 and RFC 6531 allow the inclusion of Unicode characters in header content using UTF-8 encoding, and their transmission via SMTP, but in practice support is only slowly rolling out.
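As a rough illustration of the extra processing involved, the following Python sketch uses the standard library's `email.header` module to convert a ''Subject'' value to its MIME-encoded word form and back, and contrasts that with the raw UTF-8 form that RFC 6532 permits. The sample subject text is made up for the example.

```python
from email.header import Header, decode_header, make_header

subject = "Grüße aus Köln"  # sample text, chosen for illustration only

# Traditional approach: the value is turned into an ASCII-only MIME-encoded word.
encoded = Header(subject, "utf-8").encode()

# Reading it back requires an explicit decoding step.
decoded = str(make_header(decode_header(encoded)))

# RFC 6532 approach: the header value is carried directly as UTF-8 bytes.
raw_line = f"Subject: {subject}".encode("utf-8")

if __name__ == "__main__":
    print(encoded)   # ASCII-only, not human-readable
    print(decoded)   # original text recovered after decoding
    print(raw_line)  # UTF-8 bytes, readable once decoded as UTF-8
```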


Interoperability via downgrading

Domain internationalization works by downgrading. UTF-8 parts, known as U-Labels, are transformed into A-Labels via an ''ad-hoc'' method called IDNA. For example, sörensen.example.com is encoded as xn--srensen-90a.example.com. In 2003, when the need was addressed, that seemed easier than checking that all DNS software could handle UTF-8 strings, although in theory DNS can transport binary data. This encoding must be applied before issuing DNS queries.

Since traditional email standards constrain all email header values to ASCII-only characters, the presence of UTF-8 characters in email headers can make the transport of such email less stable and reliable, because some email servers do not support these characters. Compliance with UTF-8 strings must be checked software package by software package (see Adoption below).

The IETF proposed an experimental method by which email could be downgraded into the legacy all-ASCII format that all standard email servers support. This proposal was deemed too cumbersome: the meaning of the left-hand side of an email address is local to the target server, so there is no way to check whether xn--''something'' is a valid user name in some domain. The proposal was obsoleted in 2012.
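The U-Label to A-Label conversion described above can be tried with Python's built-in `idna` codec. This is a minimal sketch: the built-in codec follows the original 2003 IDNA specification, so results for some names may differ from software that implements later IDNA revisions.

```python
domain = "sörensen.example.com"

# U-Label form -> A-Label form, as needed before issuing a DNS query.
a_label_form = domain.encode("idna")

# A-Label form -> U-Label form, as a user interface might display it.
u_label_form = a_label_form.decode("idna")

if __name__ == "__main__":
    print(a_label_form)  # b'xn--srensen-90a.example.com'
    print(u_label_form)  # sörensen.example.com
```

Only the domain part is converted this way. The local part of an address has no equivalent ASCII form that a sender can compute, which is the reason the downgrading proposal for whole addresses did not work out.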


Standards framework

The set of Internet RFC documents RFC 6530, RFC 6531, RFC 6532, and RFC 6533, all published in February 2012, defines the mechanisms and protocol extensions needed to fully support internationalized email addresses. These changes include an SMTP extension and an extension of the email header syntax to accommodate UTF-8 data. The document set also discusses key assumptions and issues in deploying fully internationalized email. Unicode also has recommended ''Email Security Profiles for Identifiers''.
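The SMTP extension in RFC 6531 is negotiated with the `SMTPUTF8` keyword: a server lists it in its EHLO reply, and a client adds it as a parameter to the MAIL command. The Python sketch below shows that handshake logic on plain strings. The function names and the sample server reply are hypothetical, and the check is simplified to the sender address only, whereas a real client would also consider recipients and header content.

```python
def ehlo_keywords(response: str) -> set:
    """Collect the extension keywords from a multi-line EHLO reply."""
    keywords = set()
    # The first reply line carries the server greeting, not an extension.
    for line in response.splitlines()[1:]:
        parts = line[4:].split()
        if line.startswith("250") and parts:
            keywords.add(parts[0].upper())
    return keywords


def mail_command(sender: str, keywords: set) -> str:
    """Build a MAIL command, adding the SMTPUTF8 parameter when it is needed."""
    if sender.isascii():
        return f"MAIL FROM:<{sender}>"
    if "SMTPUTF8" not in keywords:
        raise ValueError("server does not advertise SMTPUTF8")
    return f"MAIL FROM:<{sender}> SMTPUTF8"


# Hypothetical EHLO reply, written for this example.
SAMPLE_EHLO = (
    "250-mx.example.com greets client.example.net\n"
    "250-8BITMIME\n"
    "250-SMTPUTF8\n"
    "250 SIZE 35882577"
)

if __name__ == "__main__":
    advertised = ehlo_keywords(SAMPLE_EHLO)
    print(sorted(advertised))
    print(mail_command("коля@пример.рф", advertised))
```

Raising an error when the server lacks the extension mirrors the situation described under downgrading: a non-ASCII address cannot be rewritten into an ASCII one by the sender.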


Adoption

* 2010-10-29: Afilias and the .JO Registry bring native-language email to Arabic Internet users (PRWeb).
* 2013-11-14: The Bat! email client implemented support for internationalized domain names (IDN) in email addresses.
* 2014-07-15: The Postfix mailer started supporting internationalized email, also known as EAI or SMTPUTF8, defined in RFC 6530 through RFC 6533. Initial support was made available in development version 20140715 and reached a stable release with 3.0.0 on 2015-02-08. It supports UTF-8 in SMTP or LMTP sender addresses, recipient addresses, and message header values.
* 2014-07-19: XgenPlus Email Server started supporting IDN-based email (SMTPUTF8), especially for the .भारत domain.
* 2014-08-05: Google announced that Gmail would recognize addresses that contain accented or non-Latin characters, with more support for internationalization to follow. Its mail servers (MX MTAs) announce support for the ''SMTP Extension for Internationalized Email'' (SMTPUTF8, RFC 6531).
* 2014-09-30: Message Systems announced that its product ''Momentum'' (versions 4.1 and 3.6.5) provides SMTPUTF8 support, the email address internationalization extension to SMTP, allowing email to be sent to recipients with non-Western addresses.
* 2014-10-22: Version 2.10.0 of the Amavis mail content filter was released, adding support for SMTPUTF8, EAI, and IDN.
* 2016-10-18: Data Xgen Technologies launched a free linguistic email address service under the name "DATAMAIL". In support of Digital India, this Indian-made email app supports IDN (internationalized domain names) in Hindi (हिन्दी), Gujarati (ગુજરાતી), Urdu (اردو), Punjabi (ਪੰਜਾਬੀ ਦੇ), Tamil (தமிழ்), Telugu (తెలుగు), Bengali (বাংলা), Marathi (मराठी), and English in Latin script. DATAMAIL has also launched international languages for countries using Arabic (العَرَبِيَّة), Russian (Русский), and Chinese (汉语/漢語) as their base language.
* 2016-12-07: почта.рус launches fully Russian (Cyrillic) email at a press conference in Moscow.
* 2017-03-07: Apple's App Store approves publication of an iOS app with EAI support.
* 2017-12-03: Chief Minister Vasundhara Raje of Rajasthan launches a free email service on the rajasthan.in and राजस्थान.भारत domains. Rajasthan becomes the world's first state to provide an email address to every citizen in their own language.
* 2017-12-27: Microsoft announces upcoming IDN email support in Office 365 and announces that its partner XgenPlus will host IDN mailboxes.
* 2018-01-03: Microsoft adds email internationalization to Exchange Online.
* 2018-09-18: Courier-MTA releases support for Unicode email messages, in UTF-8, for all Courier packages. In addition, Courier-IMAP uses Unicode (UTF-8) for the names of maildir folders.
* 2020-07-29: DataMail launches Kannada-language email addresses to break the language barrier.

The ICANN-sponsored Universal Acceptance Working Group is working to make EAI accepted in more places and publishes annual reports on acceptance.


See also

* Internationalized domain name
* Email Address Internationalization
* Unicode and email



External links


* EAI Working Group Status Page, Internet Engineering Task Force (IETF)