ISO-639

From Wikistandards

Oooold. Read wikipedia, which has more on this (maybe not the english one) --88.65.182.148 21:49, 12 May 2008 (CEST)

ISO-639 is the ISO standard for identifying languages. Its codes are used mainly in metadata about text and similar material, so that the language can be identified in an automated environment. Two versions of the code are accepted, and a third, ISO-639-3, is currently awaiting approval. Hopefully this will happen by August 2006.

Alpha-2 code space

"Alpha-2" codes (for codes composed of 2 letters of the basic Latin alphabet) are used in ISO 639-1. Mathematically, the upper limit for the number of languages and language collections that can be so represented is 26 × 26 = 676. This is clearly insufficient to cover all languages, which led to the creation of ISO 639-2 and the use of Alpha-3 codes.
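The 676 figure can be checked by enumerating every two-letter combination. The following is a minimal sketch: `ALPHA2_SPACE` and `is_wellformed_alpha2` are illustrative names, not part of any standard or library, and the check only tests the shape of a code, not whether ISO 639-1 actually assigns it.

```python
from itertools import product
from string import ascii_lowercase

# Every two-letter combination of the basic Latin alphabet (aa, ab, ..., zz).
# This is the raw code space, not the list of codes ISO 639-1 actually assigns.
ALPHA2_SPACE = {"".join(pair) for pair in product(ascii_lowercase, repeat=2)}


def is_wellformed_alpha2(code: str) -> bool:
    """Return True if `code` has the shape of an Alpha-2 code (hypothetical helper)."""
    return code in ALPHA2_SPACE


print(len(ALPHA2_SPACE))  # 676
```

A string such as `eng` fails the check simply because it has three letters, which is the shape used by the Alpha-3 parts of the standard.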

Alpha-3 code space

"Alpha-3" codes (for codes composed of 3 letters of the basic Latin alphabet) are used in ISO 639-2 and ISO 639-3 and will eventually be used in ISO 639-5. Mathematically, the upper limit for the number of languages and language collections that can be so represented is 26 × 26 × 26 = 17,576.

The common use of Alpha-3 codes by three parts of ISO 639 requires some coordination within a larger system.

Part 2 defines four special codes (mul, und, mis, zxx) and a reserved range qaa-qtz (20 × 26 = 520 codes), and has 23 double entries (the B/T codes). This sums up to 520 + 23 + 4 = 547 codes that cannot be used in part 3 to represent languages or in part 5 to represent language families or groups. The remainder is 17,576 – 547 = 17,029.
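The arithmetic above can be reproduced with a short script. This is only a sketch: the count of 23 B/T double entries is taken from the text as given rather than derived from the code tables, and the variable names are made up for illustration.

```python
from string import ascii_lowercase

# Figures taken from the text above; the names are illustrative only.
SPECIAL_CODES = ("mul", "und", "mis", "zxx")
BT_DOUBLE_ENTRIES = 23  # count as stated in the text, not derived here

# Reserved range qaa-qtz: 'q', then a second letter a..t, then any third letter.
second_letters = ascii_lowercase[: ascii_lowercase.index("t") + 1]
reserved = ["q" + s + t for s in second_letters for t in ascii_lowercase]

total_space = 26 ** 3
unavailable = len(reserved) + BT_DOUBLE_ENTRIES + len(SPECIAL_CODES)
remaining = total_space - unavailable

print(len(reserved), unavailable, remaining)  # 520 547 17029
```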

Alpha-4 code space

"Alpha-4" codes (for codes composed of 4 letters of the basic Latin alphabet) will be used in ISO 639-6. Mathematically, the upper limit for the number of languages and dialects that can be so represented is 26 × 26 × 26 × 26 = 456,976.
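The three upper limits follow the same rule, 26 raised to the code length. A small sketch, with `code_space` as a hypothetical helper name, makes the growth from one code length to the next easy to compare:

```python
def code_space(length: int) -> int:
    """Number of distinct codes of `length` letters over the 26-letter basic Latin alphabet."""
    if length < 1:
        raise ValueError("length must be at least 1")
    return 26 ** length


for n in (2, 3, 4):
    print(f"Alpha-{n}: {code_space(n):,}")
```

These are theoretical ceilings on the number of distinct codes, not counts of codes that have actually been assigned.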

Future

There are several additions planned for this standard.

  • ISO-639-6: This aims to bring together all relevant information, such as language families but also dialects and orthographies, in a hierarchical way.

Usage

Many other standards base the languages they allow on the ISO-639 standards. The result is often pretty ugly: Java, for instance, allows for the ISO-639-1 codes, and for tools that are truly about languages, such as the OmegaT CAT tool, this breaks much of their functionality.

Stub

This article is very much a stub. Its intention is to provide a start that is seriously in need of expansion. Please help by expanding this article.
