Okapi Framework

The Okapi Framework is a cross-platform and open-source set of components and applications that offer extensive support for localizing and translating documentation and software.


Architecture

The Okapi Framework is organized around the following parts:
* Interface Specifications: The framework's components and applications communicate through several common API sets, the interfaces. A few of them are defined as high-level specifications. Implementing these interfaces lets you plug new components into the overall framework. For example, all filters share the same API for parsing input files, so you can write utilities that work with any of the available filters.
* Format Specifications: Storing and exchanging data is an important part of the localization process. Using open standards for as many formats as possible increases interoperability. Whenever possible, the Okapi Framework makes use of existing standards such as XLIFF, SRX and TMX.
* Components: The Okapi Framework also includes a growing set of components that implement the different interface specifications. Some are basic, low-level parts that can be reused when programming higher-level components, while others are plug-ins that can be used directly in scripts or applications.
* Applications: Lastly, the framework provides end-user applications that can be used out of the box. These tools are built on the Okapi components and provide ready-made platforms for plugging in your own components.
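
The benefit of a shared filter API can be illustrated with a minimal Java sketch. The `SimpleFilter` interface, the `LineFilter` and `KeyValueFilter` classes, and the `WordCounter.countWords` utility are hypothetical names invented for this illustration; they are not the Okapi interfaces themselves. The sketch only shows how a utility written against one interface works with any implementation of it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface (not the actual Okapi API): every filter
// turns raw input into a list of translatable text units.
interface SimpleFilter {
    List<String> extract(String rawContent);
}

// Hypothetical filter: treats each non-empty line as one text unit.
class LineFilter implements SimpleFilter {
    @Override
    public List<String> extract(String rawContent) {
        List<String> units = new ArrayList<>();
        for (String line : rawContent.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                units.add(trimmed);
            }
        }
        return units;
    }
}

// Hypothetical filter: for "key=value" lines, only the value is translatable.
class KeyValueFilter implements SimpleFilter {
    @Override
    public List<String> extract(String rawContent) {
        List<String> units = new ArrayList<>();
        for (String line : rawContent.split("\\R")) {
            int eq = line.indexOf('=');
            if (eq >= 0) {
                String value = line.substring(eq + 1).trim();
                if (!value.isEmpty()) {
                    units.add(value);
                }
            }
        }
        return units;
    }
}

// A utility written once against the interface works with any filter.
class WordCounter {
    static int countWords(SimpleFilter filter, String rawContent) {
        int total = 0;
        for (String unit : filter.extract(rawContent)) {
            total += unit.split("\\s+").length;
        }
        return total;
    }
}
```

Calling `WordCounter.countWords(new KeyValueFilter(), content)` and `WordCounter.countWords(new LineFilter(), content)` uses the same utility code; only the filter passed in changes.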


Components

There are two main types of components:
* Filters: Several filter components are implemented, including filters for HTML, OpenOffice.org, Microsoft Office files, Java properties files, .NET ResX files, table-type files (e.g. CSV), Gettext PO files, XLIFF, SDLXLIFF, TMX, Qt TS files, regular-expression-based formats, XML (including support for the Internationalization Tag Set), IDML (InDesign Markup Language), etc.
* Utilities: Several utility components are implemented, including text extraction and merging, RTF-to-text conversion, encoding conversion, line-break conversion, term extraction, translation comparison, quality check, pseudo-translation, text rewriting, etc.
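
As an illustration of what a utility such as pseudo-translation does, here is a minimal Java sketch. It is not the Okapi implementation: the `PseudoTranslator` class and its `pseudoTranslate` method are hypothetical, and the chosen transformation (accenting lowercase vowels and wrapping the text in brackets) is just one common convention, assumed here for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Okapi: replaces lowercase ASCII vowels
// with accented look-alikes and wraps the result in brackets, so that
// untranslated or truncated strings are easy to spot in a user interface.
class PseudoTranslator {
    private static final Map<Character, Character> ACCENTED = new HashMap<>();
    static {
        ACCENTED.put('a', '\u00e1'); // a with acute
        ACCENTED.put('e', '\u00e9'); // e with acute
        ACCENTED.put('i', '\u00ed'); // i with acute
        ACCENTED.put('o', '\u00f3'); // o with acute
        ACCENTED.put('u', '\u00fa'); // u with acute
    }

    static String pseudoTranslate(String source) {
        StringBuilder result = new StringBuilder("[");
        for (char c : source.toCharArray()) {
            char mapped = ACCENTED.getOrDefault(c, c);
            result.append(mapped);
        }
        return result.append("]").toString();
    }
}
```

The text stays readable to a tester, while any string that still appears in plain ASCII without brackets was evidently not routed through the localization process.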


Applications

Some of the applications using the framework are:
* Rainbow: a toolbox to launch a large variety of localization tasks.
* Tikal: a command-line tool for basic localization tasks.
* Ratel: a WYSIWYG editor to create, test and maintain SRX segmentation rules.
* CheckMate: an application to perform quality checks on bilingual files.
* Longhorn: a batch processing server.
* Ocelot: a specialized XLIFF editor for review and linguistic QA tasks.
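
To give an idea of the kind of segmentation rules an SRX editor works with, the following Java sketch applies an ordered list of break and no-break rules to a text. Each rule pairs a pattern for the text before a candidate position with a pattern for the text after it, and the first rule that matches at a position decides. The `SegmentationRule` and `Segmenter` classes are hypothetical simplifications written for this example; they do not read SRX files and are not Okapi or Ratel code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical, simplified rule: a break or no-break decision guarded by
// a "before" pattern and an "after" pattern around a candidate position.
class SegmentationRule {
    final boolean isBreak;
    final Pattern before; // must match text ending exactly at the position
    final Pattern after;  // must match text starting exactly at the position

    SegmentationRule(boolean isBreak, String beforeRegex, String afterRegex) {
        this.isBreak = isBreak;
        this.before = Pattern.compile("(?:" + beforeRegex + ")\\z");
        this.after = Pattern.compile(afterRegex);
    }

    boolean matchesAt(String text, int pos) {
        return before.matcher(text.substring(0, pos)).find()
                && after.matcher(text.substring(pos)).lookingAt();
    }
}

class Segmenter {
    // Rules are tried in order; the first rule matching at a position wins.
    static List<String> segment(String text, List<SegmentationRule> rules) {
        List<String> segments = new ArrayList<>();
        int start = 0;
        for (int pos = 1; pos < text.length(); pos++) {
            for (SegmentationRule rule : rules) {
                if (rule.matchesAt(text, pos)) {
                    if (rule.isBreak) {
                        segments.add(text.substring(start, pos));
                        start = pos;
                    }
                    break; // an earlier no-break rule suppresses later rules
                }
            }
        }
        segments.add(text.substring(start));
        return segments;
    }
}
```

With a no-break rule for `Dr\.` followed by whitespace placed ahead of a general break rule for sentence-ending punctuation followed by whitespace, the abbreviation no longer splits a sentence. Rule order matters, which is why being able to test rules interactively is useful.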


License

All materials developed under the Okapi Framework project are licensed under the Apache License, version 2.0. Releases up to M32 were distributed under the GNU Lesser General Public License.


External links


Okapi Framework home page

Okapi Framework Wiki

Distribution downloads

Source repository

Screen shots

Okapi Tools users group

Okapi Framework on OpenHub (statistics)