
In computing, a locale is a set of parameters that defines the user's language, region and any special variant preferences that the user wants to see in their user interface. Usually a locale identifier consists of at least a language code and a country/region code. Locale is an important aspect of internationalization (i18n).


General locale settings

These settings usually include the following display (output) format settings:
* Number format setting (LC_NUMERIC, C/C++)
* Character classification, case conversion settings (LC_CTYPE, C/C++)
* Date-time format setting (LC_TIME, C/C++)
* String collation setting (LC_COLLATE, C/C++)
* Currency format setting (LC_MONETARY, C/C++)
* Paper size setting (LC_PAPER, ISO 30112)
* Color temperature setting
* UI font setting (especially for CJKV language)
* Location setting (country or region)
* ANSI character set setting (for Microsoft Windows)

Locale settings govern how output is formatted for a given locale, so time zone information and daylight saving time are not usually part of them. Input format settings are less common and are mostly defined on a per-application basis.
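As a rough illustration of "formatting output given a locale", the sketch below formats the same number with two different sets of separators. The `NUMBER_FORMATS` table and the `format_number` helper are hypothetical names invented for this example; a real system would take these values from its locale data (for instance the LC_NUMERIC category) rather than from a hard-coded dictionary.

```python
# Hypothetical per-locale settings table; real systems read these from
# locale data (for example the LC_NUMERIC category), not a hard-coded dict.
NUMBER_FORMATS = {
    "en_US": {"decimal": ".", "grouping": ","},
    "de_DE": {"decimal": ",", "grouping": "."},
}


def format_number(value: float, locale_id: str, places: int = 2) -> str:
    """Format value using the separators registered for locale_id."""
    settings = NUMBER_FORMATS[locale_id]
    # Start from Python's locale-independent form, e.g. "1,234,567.89".
    plain = f"{value:,.{places}f}"
    # Swap separators via a placeholder so the two replacements don't collide.
    return (
        plain.replace(",", "\0")
        .replace(".", settings["decimal"])
        .replace("\0", settings["grouping"])
    )


print(format_number(1234567.891, "en_US"))
print(format_number(1234567.891, "de_DE"))
```

The value itself never changes; only its presentation does, which is why such settings are described as output format settings.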


Programming and markup language support

In the following environments, and in other (nowadays) Unicode-based environments, locale identifiers are defined in a format similar to BCP 47:
* C
* C++
* Eiffel
* Java
* .NET Framework
* REBOL
* Ruby
* Perl
* PHP
* Python
* XML
* JSP
* JavaScript

They are usually defined with just ISO 639 (language) and ISO 3166-1 alpha-2 (2-letter country) codes.
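A minimal sketch of building such an identifier from a language code and an optional country code is shown here. The function name `make_locale_tag` is hypothetical, and the check covers only the shape of the codes (letters and length), not whether they are actually registered. Full BCP 47 tags can carry further subtags (script, variant, extensions) that this sketch ignores.

```python
import re

# Simplified shapes only: an ISO 639 language code (2 or 3 letters) and an
# ISO 3166-1 alpha-2 region code (2 letters). Full BCP 47 allows more subtags.
_LANGUAGE = re.compile(r"[A-Za-z]{2,3}")
_REGION = re.compile(r"[A-Za-z]{2}")


def make_locale_tag(language, region=None):
    """Join a language code and an optional region code with a hyphen."""
    if not _LANGUAGE.fullmatch(language):
        raise ValueError(f"unexpected language code: {language!r}")
    # Conventional casing: lowercase language, uppercase region.
    tag = language.lower()
    if region is not None:
        if not _REGION.fullmatch(region):
            raise ValueError(f"unexpected region code: {region!r}")
        tag += "-" + region.upper()
    return tag


print(make_locale_tag("en", "US"))
print(make_locale_tag("cs"))
```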


International standards

In standard C and C++, locale is defined in "categories" of LC_COLLATE (text collation), LC_CTYPE (character class), LC_MONETARY (currency format), LC_NUMERIC (number format), and LC_TIME (time format). The special LC_ALL category can be used to set all locale settings. The C and C++ standards define no standard locale names other than the "minimal locale" name "C", although the POSIX format is a commonly used baseline.
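The categories can be tried out from Python, whose standard `locale` module wraps the C `setlocale` and `localeconv` functions. The sketch below selects the minimal "C" locale through LC_ALL and then queries each category in turn. It uses only the "C" name because that is the one locale name the standards guarantee; the `CATEGORIES` dictionary is just a helper for this example.

```python
import locale

# Select the minimal "C" locale for every category at once via LC_ALL.
locale.setlocale(locale.LC_ALL, "C")

# Helper mapping for this example: category name -> module constant.
CATEGORIES = {
    "LC_COLLATE": locale.LC_COLLATE,
    "LC_CTYPE": locale.LC_CTYPE,
    "LC_MONETARY": locale.LC_MONETARY,
    "LC_NUMERIC": locale.LC_NUMERIC,
    "LC_TIME": locale.LC_TIME,
}

# Calling setlocale with only a category queries the current setting.
current = {name: locale.setlocale(cat) for name, cat in CATEGORIES.items()}
for name, value in current.items():
    print(f"{name}={value}")

# localeconv() exposes the numeric conventions of the active locale.
decimal_point = locale.localeconv()["decimal_point"]
print("decimal point:", repr(decimal_point))
```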


POSIX platforms

On POSIX platforms such as Unix, Linux and others, locale identifiers are defined in a way similar to the BCP 47 definition of language tags, but the locale variant modifier is defined differently, and the character set is optionally included as a part of the identifier. The POSIX or "XPG" format is [language[_territory][.codeset][@modifier]]. (For example, Australian English using the UTF-8 encoding is en_AU.UTF-8.) Separately, ISO/IEC 15897 describes a different form, though it is highly dubious whether it is used at all.

The following example shows the output of the locale command for the Czech language (cs), Czech Republic (CZ), with explicit UTF-8 encoding:

    $ locale
    LANG=cs_CZ.UTF-8
    LC_CTYPE="cs_CZ.UTF-8"
    LC_NUMERIC="cs_CZ.UTF-8"
    LC_TIME="cs_CZ.UTF-8"
    LC_COLLATE="cs_CZ.UTF-8"
    LC_MONETARY="cs_CZ.UTF-8"
    LC_MESSAGES="cs_CZ.UTF-8"
    LC_PAPER="cs_CZ.UTF-8"
    LC_NAME="cs_CZ.UTF-8"
    LC_ADDRESS="cs_CZ.UTF-8"
    LC_TELEPHONE="cs_CZ.UTF-8"
    LC_MEASUREMENT="cs_CZ.UTF-8"
    LC_IDENTIFICATION="cs_CZ.UTF-8"
    LC_ALL=
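An identifier such as cs_CZ.UTF-8 can be split into its parts with a short regular expression. This is a sketch under the assumption that the identifier follows the language, territory, codeset, modifier layout described above; the names `PosixLocale` and `parse_posix_locale` are hypothetical, and the pattern does not check that the parts name real languages, territories or encodings.

```python
import re
from typing import NamedTuple, Optional


class PosixLocale(NamedTuple):
    language: str
    territory: Optional[str]
    codeset: Optional[str]
    modifier: Optional[str]


# language, then optional "_territory", ".codeset" and "@modifier" parts.
_POSIX_LOCALE = re.compile(
    r"(?P<language>[^_.@]+)"
    r"(?:_(?P<territory>[^.@]+))?"
    r"(?:\.(?P<codeset>[^@]+))?"
    r"(?:@(?P<modifier>.+))?"
)


def parse_posix_locale(identifier: str) -> PosixLocale:
    """Split a POSIX-style locale identifier into its components."""
    match = _POSIX_LOCALE.fullmatch(identifier)
    if match is None:
        raise ValueError(f"not a POSIX-style locale identifier: {identifier!r}")
    # Optional parts that are absent come back as None.
    return PosixLocale(**match.groupdict())


print(parse_posix_locale("cs_CZ.UTF-8"))
print(parse_posix_locale("en_AU.UTF-8"))
```

Every part after the language is optional, so the bare name "C" also parses, with only the language field filled in.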


Specifics for Microsoft platforms

Windows uses specific language and territory strings. The locale identifier (LCID) for unmanaged code on Microsoft Windows is a number such as 1033 for English (United States), or 2057 for English (United Kingdom), or 1041 for Japanese (Japan). These numbers consist of a language code (lower 10 bits) and a culture code (upper bits), and are therefore often written in hexadecimal notation, such as 0x0409, 0x0809 or 0x0411.

Microsoft is starting to introduce managed code application programming interfaces (APIs) for .NET that use this format. One of the first to be generally released is a function to mitigate issues with internationalized domain names, but more are in Windows Vista Beta 1. Starting with Windows Vista, new functions that use BCP 47 locale names have been introduced to replace nearly all LCID-based APIs. A POSIX-like locale name format is available in the UCRT (Universal C Run Time) of Windows 10 and 11.
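The bit layout of an LCID described above can be checked with a few lines of arithmetic. The sketch below follows the simplified description in this section (lower 10 bits for the language code, everything above for the culture code); `split_lcid` and the two constants are hypothetical names, not part of any Windows API.

```python
LANGUAGE_BITS = 10
LANGUAGE_MASK = (1 << LANGUAGE_BITS) - 1  # 0x3FF, the lower 10 bits


def split_lcid(lcid):
    """Return (language code, culture code) for a numeric LCID."""
    return lcid & LANGUAGE_MASK, lcid >> LANGUAGE_BITS


for lcid in (1033, 2057, 1041):
    language, culture = split_lcid(lcid)
    print(f"{lcid} = 0x{lcid:04X}: language=0x{language:02X}, culture=0x{culture:02X}")
```

English (United States) and English (United Kingdom) share the language code 0x09 and differ only in the upper bits, which is easier to see in hexadecimal (0x0409 versus 0x0809) than in decimal (1033 versus 2057).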


See also

* Internationalization and localization
* ISO 639 language codes
* ISO 3166-1 alpha-2 region codes
* ISO 15924 script codes
* IETF language tag
* C localization functions
* CCSID
* Code page
* Common Locale Data Repository
* Date and time representation by country
* AppLocale

