In computing, gettext is an internationalization and localization (i18n and l10n) system commonly used for writing multilingual programs on Unix-like computer operating systems. One of the main benefits of gettext is that it separates programming from translating. The most commonly used implementation of gettext is GNU gettext, released by the GNU Project in 1995. The runtime library is libintl. gettext provides an option to use different strings for any number of plural forms of nouns, but this feature has no support for grammatical gender.
History
Initially, POSIX provided no means of localizing messages. Two proposals were raised in the late 1980s: the 1988 Uniforum gettext and the 1989 X/Open catgets (XPG-3 § 5). Sun Microsystems implemented the first gettext in 1993. The Unix and POSIX developers never really agreed on which interface to use (the other option being the X/Open catgets), so many C libraries, including glibc, implemented both. Whether gettext should be part of POSIX was still a point of debate in the Austin Group, even though its old rival catgets had already fallen out of use. Concerns cited included its dependence on the system-set locale (a global variable subject to multithreading problems) and its support for newer C-language extensions involving wide strings.
The GNU Project decided that the message-as-key approach of gettext is simpler and friendlier. (Most other systems, including catgets, require the developer to come up with "key" names for every string.) They released GNU gettext, a free software implementation of the system, in 1995.
Gettext, GNU or not, has since been ported to many programming languages. The simplicity of po and widespread editor support even led to its adoption in non-program contexts for text documents or as an intermediate between other localization formats, with converters like po4a (po for anything) and Translate Toolkit emerging to provide such a bridge.
Operation
Programming

The basic interface of gettext is the gettext() function, which accepts a string that the user will see in the original language, usually English. To save typing time, and to reduce code clutter, this function is commonly aliased to _:
printf(gettext("My name is %s.\n"), my_name);
printf(_("My name is %s.\n"), my_name); // same, but shorter
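Before such calls can return translations, a program normally selects the user's locale and tells the library where its message catalogs live. The following is a minimal sketch using the libintl functions setlocale(), bindtextdomain() and textdomain(). The domain name "name" and the directory "/usr/share/locale" are placeholders assumed for illustration, as are the helper names init_translations() and format_greeting().

```c
#include <libintl.h>
#include <locale.h>
#include <stdio.h>

#define _(msgid) gettext(msgid)

/* Hypothetical values: the text domain and catalog directory are
   placeholders chosen for this sketch. */
#define EXAMPLE_DOMAIN    "name"
#define EXAMPLE_LOCALEDIR "/usr/share/locale"

/* Select the user's locale and point gettext at the catalogs. */
void init_translations(void)
{
    setlocale(LC_ALL, "");
    bindtextdomain(EXAMPLE_DOMAIN, EXAMPLE_LOCALEDIR);
    textdomain(EXAMPLE_DOMAIN);
}

/* Format the (possibly translated) greeting into buf.
   Returns the value reported by snprintf(). */
int format_greeting(char *buf, size_t size, const char *my_name)
{
    return snprintf(buf, size, _("My name is %s.\n"), my_name);
}
```

A program would call init_translations() once at startup and then use _() wherever user-visible text is produced. When no catalog is installed for the selected language, format_greeting() simply produces the English text.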
gettext() then uses the supplied strings as keys for looking up translations, and returns the original string when no translation is available. This is in contrast to POSIX catgets(), AmigaOS GetString(), or Microsoft Windows LoadString(), where a programmatic ID (often an integer) is used. To handle the case where the same original-language text can have different meanings, gettext has functions like pgettext() that accept an additional "context" string.
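One way to picture a context-aware lookup is as an ordinary gettext() call on a combined key. The sketch below is modeled on how GNU gettext is understood to join the context and the message with an EOT byte ('\004') and to fall back to the bare message when nothing is found. The function name context_gettext() and the 512-byte key buffer are assumptions made for this example, not part of any library.

```c
#include <libintl.h>
#include <stdio.h>

/* Look up msgid under a context. The lookup key is the context,
   an EOT byte ('\004'), then the msgid. */
const char *context_gettext(const char *context, const char *msgid)
{
    char key[512];  /* arbitrary size chosen for this sketch */
    int len = snprintf(key, sizeof key, "%s\004%s", context, msgid);

    if (len < 0 || (size_t)len >= sizeof key)
        return msgid;  /* key did not fit: fall back to the original */

    const char *translation = gettext(key);

    /* gettext() returns its argument unchanged when no translation
       exists; in that case hand back the plain msgid, not the key. */
    return translation == key ? msgid : translation;
}
```

With this helper, context_gettext("menu", "Open") and context_gettext("door state", "Open") can resolve to different translations of the same English word, while both fall back to "Open" when no catalog is available.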
xgettext is run on the sources to produce a .pot (Portable Object Template) file, which contains a list of all the translatable strings extracted from the sources. Comments starting with /// are used to give translators hints, although other prefixes are also configurable to further limit the scope. One such common prefix is TRANSLATORS:.
For example, an input file with a comment might look like:
/// TRANSLATORS: %s contains the user's name as specified in Preferences
printf(_("My name is %s.\n"), my_name);
xgettext is run using the command:
xgettext -c /
The resultant .pot file looks like this with the comment (note that xgettext recognizes the string as a C-language printf format string):
#. TRANSLATORS: %s contains the user's name as specified in Preferences
#, c-format
#: src/name.c:36
msgid "My name is %s.\n"
msgstr ""
In POSIX shell scripts, gettext provides a gettext.sh library that one can include, which offers many of the same functions gettext provides in other languages.
GNU bash also has a simplified construct $"msgid" for the simple gettext function, although it depends on the C library to provide a gettext() function.
Translating
The translator derives a .po (Portable Object) file from the template using the msginit program, then fills out the translations. msginit initializes the translations so, for instance, for a French language translation, the command to run would be:
msginit --locale=fr --input=name.pot
This will create fr.po. The translator then edits the resultant file, either by hand or with a translation tool like Poedit, or Emacs with its editing mode for .po files. An edited entry will look like:
#: src/name.c:36
msgid "My name is %s.\n"
msgstr "Je m'appelle %s.\n"
Finally, the .po files are compiled with msgfmt into binary .mo (Machine Object) files. GNU gettext may use its own file name extension .gmo on systems with another gettext implementation. These are now ready for distribution with the software package.
GNU msgfmt can also perform some checks relevant to the format string used by the programming language. It also allows for outputting to language-specific formats other than MO; the X/Open equivalent is gencat.
In later phases of the development workflow, msgmerge can be used to "update" an old translation to a newer template. There is also msgunfmt for reverse-compiling .mo files, and many other utilities for batch processing.
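As background that this article does not itself state, the GNU gettext manual documents that a .mo file begins with the 32-bit magic number 0x950412de, written in the byte order of the machine that produced it. The sketch below shows how a tool might recognise a compiled catalog and its byte order from the first four bytes. The function name mo_byte_order() and its return convention are assumptions for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Inspect the first four bytes of a buffer.
   Returns 'L' for a little-endian MO file, 'B' for a big-endian one,
   and 0 if the buffer does not start with the MO magic number. */
int mo_byte_order(const unsigned char *data, size_t size)
{
    if (size < 4)
        return 0;

    /* Assemble the bytes as a little-endian 32-bit value. */
    uint32_t value = (uint32_t)data[0]
                   | (uint32_t)data[1] << 8
                   | (uint32_t)data[2] << 16
                   | (uint32_t)data[3] << 24;

    if (value == 0x950412deu)
        return 'L';  /* magic stored least significant byte first */
    if (value == 0xde120495u)
        return 'B';  /* same magic, bytes reversed */
    return 0;
}
```

A check like this only identifies the file type. Reading the actual strings requires parsing the rest of the header, which is omitted here.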
Running
The user, on Unix-type systems, sets the environment variable LC_MESSAGES, and the program will display strings in the selected language, if there is an .mo file for it.
Users on GNU variants can also use the environment variable LANGUAGE instead. Its main difference from the Unix variable is that it supports multiple languages, separated with a colon, for fallback.
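The colon-separated fallback can be sketched as a loop that tries each listed language in order and stops at the first one that has a catalog. This is a simplified illustration, not the library's actual algorithm, which also handles details such as territory and codeset variants. The names pick_language() and demo_has_catalog() are hypothetical, and the demo pretends that only French and English catalogs are installed.

```c
#include <stddef.h>
#include <string.h>

/* Copy into out the first language from a colon-separated list that
   is_available() accepts. Returns 1 on success, 0 if none matched. */
int pick_language(const char *list, int (*is_available)(const char *),
                  char *out, size_t out_size)
{
    while (*list != '\0') {
        size_t len = strcspn(list, ":");  /* length of this entry */

        if (len > 0 && len < out_size) {
            memcpy(out, list, len);
            out[len] = '\0';
            if (is_available(out))
                return 1;
        }
        list += len;
        if (*list == ':')
            list++;  /* skip the separator */
    }
    if (out_size > 0)
        out[0] = '\0';
    return 0;
}

/* Hypothetical availability check: only "fr" and "en" exist. */
int demo_has_catalog(const char *lang)
{
    return strcmp(lang, "fr") == 0 || strcmp(lang, "en") == 0;
}
```

With LANGUAGE set to "sl:fr:en", this sketch skips Slovene because no catalog is available and settles on French.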
Plural form
The ngettext() interface accounts for the count of a noun in the string. As with the convention of gettext(), it is often aliased to N_ in practical use. Consider the code sample:
// parameters: english singular, english plural, integer count
printf(ngettext("%d translated message", "%d translated messages", n), n);
A header in the "" (empty string) entry of the PO file stores some metadata, one of which is the plural form that the language uses, usually specified using a C-style ternary operator. Let's say we want to translate for the Slovene language:
msgid ""
msgstr ""
"..."
"Language: sl\n"
"Plural-Forms: nplurals=4; plural=(n%100==1 ? 1 : n%100==2 ? 2 : n%100==3 || n%100==4 ? 3 : 0);\n"
Since now there are four plural forms, the final po would look like:
#: src/msgfmt.c:876
#, c-format
msgid "%d translated message"
msgid_plural "%d translated messages"
msgstr[0] "%d prevedenih sporočil"
msgstr[1] "%d prevedeno sporočilo"
msgstr[2] "%d prevedeni sporočili"
msgstr[3] "%d prevedena sporočila"
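The way a plural rule selects one of the translated strings can be sketched directly in C, since the rule is itself a C-style expression. The function below assumes the four-form Slovene rule, in which counts ending in 01 select form 1, counts ending in 02 select form 2, counts ending in 03 or 04 select form 3, and everything else selects form 0. The name slovene_plural_index() is made up for this example.

```c
/* Return the index of the msgstr[] entry chosen for count n,
   following the four-form Slovene rule described above. */
unsigned long slovene_plural_index(unsigned long n)
{
    return n % 100 == 1 ? 1
         : n % 100 == 2 ? 2
         : (n % 100 == 3 || n % 100 == 4) ? 3
         : 0;
}
```

For a count of 2 the function returns 2, which corresponds to the dual form "%d prevedeni sporočili", and for a count of 5 it returns 0, which corresponds to "%d prevedenih sporočil".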
Reference plural rules for languages are provided by the Unicode Consortium. msginit also prefills the appropriate rule when creating a file for one specific language.
Implementations
In addition to C, gettext has the following implementations: C# for both ASP.NET and for WPF, Perl, PHP, Python, R, Scala, and Node.js.
GNU gettext has native support for Objective-C, but there is no support for the Swift programming language yet. A commonly used gettext implementation on these Cocoa platforms is POLocalizedString. The Microsoft Outlook for iOS team also provides a LocalizedStringsKit library with a gettext-like API.
See also
* gtranslator
* Poedit
* Translate Toolkit
* Virtaal
* Weblate
External links
* Official GNU gettext site: https://www.gnu.org/software/gettext/gettext.html