From Wikipedia, the free encyclopedia
|Standard||Name (Codes for the representation of names of languages – ...)||First edition||Current||No. in list|
|ISO 639-1||Part 1: Alpha-2 code||1967 (as ISO 639)||2002||184|
|ISO 639-2||Part 2: Alpha-3 code||1998||1998||>450|
|ISO 639-3||Part 3: Alpha-3 code for comprehensive coverage of languages||2007||2007||7704 + local range|
|ISO 639-4||Part 4: Implementation guidelines and general principles for language coding||2010-07-16||2010-07-16||(not a list)|
|ISO 639-5||Part 5: Alpha-3 code for language families and groups||2008-05-15||2008-05-15||114|
|ISO 639-6||Part 6: Alpha-4 representation for comprehensive coverage of language variants||2009-11-17||2009-11-17||21,000+|
Each part of the standard is maintained by a maintenance agency, which adds codes and changes the status of codes when needed.
Types (for individual languages):
Bibliographic and terminology codes
The first four columns give the codes of one representative element for each type of relation between the parts of ISO 639, and the # column gives the number of elements of that type. For example, four elements have a code in Part 1, have a B/T code pair in Part 2, and are macrolanguages per Part 3. One representative of these four elements is "Persian" [fas].
|ISO 639-1||ISO 639-2||ISO 639-3||ISO 639-5||#||Description of example|
|en||eng||eng||(-)||132||185 in Part 1, subtract all special cases for Part 1 codes, 185-2-25-17-4-2-1-1-1=132|
|nb||nob||nob||(-)||2||individual language, belongs to macrolanguage (nor), same code in Part 2 and has a code in Part 1. The two codes are: nob, nno|
|ar||ara||ara (M)||(-)||25||Part 3 macro, 55 macro total, subtract special cases, 55-24-4-1-1=25|
|de||ger/deu (B/T)||deu||(-)||15||22 elements where B and T differ. Subtract special cases, 22-1-4-2=15.|
|cs||cze/ces (B/T)||ces||(-)||1||Element with differing B/T codes where the letters of the Part 1 code are not the first two letters of the T code.|
|fa||per/fas (B/T)||fas (M)||(-)||4||Part 3 macro; the four T codes are: fas, msa, sqi, zho|
|hr||scr/hrv (B/T)||hrv||(-)||2||Part 2 B deprecated, the two T codes are: hrv, srp. Deprecated 2008-06-28.|
|no ("M")||nor ("M")||nor (M)||(-)||1||Part 3 macro and containing languages have codes in Part 1, nor: nno, nob; no: nn, nb|
|bh||bih||(-)||?||1||Bihari (bih) is marked as collective despite having an ISO 639-1 code, which should only be for individual languages. The reason is that some individual Bihari languages received an ISO 639-2 code, which makes Bihari a language family for the purposes of ISO 639-2, but a single language for the purposes of ISO 639-1. The individual languages are: bho, mai, mag|
|sh||(-)||hbs (M)||(-)||1||Part 3 macro, ISO 639-1 code deprecated, no part 2 code|
|(bh)||bho||bho||(-)||3||individual language code in Part 2 and Part 3, does not belong to a macrolanguage, in Part 1 covered by a code whose Part 2 equivalent is a collective. The three codes are: bho, mai, mag|
|(bh)||(bih)||sck||(-)|| ||individual language with no code in Part 2, does not belong to a macrolanguage, in Part 1 covered by a code whose Part 2 equivalent is a collective.|
|(-)||car||car||car|| ||individual language in Part 2 and Part 3, but also included in Part 5 as a family|
|(-)||ast||ast||(-)|| ||individual language in Part 2 and Part 3, no code in Part 1|
|(-)||bal||bal (M)||(-)||24||individual language in Part 2 and macro in Part 3, no code in Part 1|
|(-)||mis||mis||?||1||special code: uncoded language|
|(-)||mul||mul||?||1||special code: multilingual content|
|(-)||und||und||?||1||special code: undetermined|
|(-)||zxx||zxx||?||1||special code: added 2006-01-11 to declare the absence of linguistic information|
|(-)||qaa||qaa||?||520||reserved for local use, range is qaa ... qtz|
|(-)||aus||(-)||aus|| ||regular group in Part 2|
|(-)||afa||(-)||afa|| ||In Part 2 a rest group, i.e. same code but different languages included. In Part 2 "afa" refers to an Afro-Asiatic language that does not have an individual-language identifier in Part 2, and that does not fall into the rest groups "ber - Berber (Other)", "cus - Cushitic (Other)", or "sem - Semitic (Other)", all of which are Afro-Asiatic language groups.|
|(ar)||(ara "M")||arb||(-)|| ||individual language, belongs to macrolanguage (ara), in Part 2 covered by the macrolanguage code, in Part 1 also covered|
|(-)||(nic "R")||aaa||(-)|| ||in Part 2 best covered by a rest group, "Niger-Kordofanian (Other)"|
|(-)||(-)||(-)||sqj|| ||group not coded in Part 2|
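The B/T rows in the table above can be read as a small lookup from bibliographic (B) to terminology (T) codes. The following is a minimal sketch in Python that uses only the four pairs shown in the table, not the complete Part 2 list. The names `B_TO_T` and `to_terminology` are hypothetical and introduced here purely for illustration.

```python
# Bibliographic (B) -> terminology (T) pairs taken from the table rows above.
# Assumption: this covers only the example rows, not every B/T pair in ISO 639-2.
B_TO_T = {
    "ger": "deu",  # German
    "cze": "ces",  # Czech
    "per": "fas",  # Persian
    "scr": "hrv",  # Croatian (B code deprecated 2008-06-28)
}


def to_terminology(code: str) -> str:
    """Return the T code for a known B code; any other code passes through unchanged."""
    code = code.lower()
    return B_TO_T.get(code, code)


print(to_terminology("ger"))  # deu
print(to_terminology("eng"))  # eng (B and T codes coincide)
```

Passing unknown codes through unchanged reflects the fact that, for most elements, the B and T codes are identical.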
"Alpha-2" codes (for codes composed of 2 letters of the ISO basic Latin alphabet) are used in ISO 639-1. When codes for a wider range of languages were desired, more than 2 letter combinations could cover (a maximum of 262 = 676), ISO 639-2 was developed using Alpha-3 codes (though the latter was formally published first).
"Alpha-3" codes (for codes composed of 3 letters of the ISO basic Latin alphabet) are used in ISO 639-2, ISO 639-3, and ISO 639-5. The number of languages and language groups that can be so represented is 263 = 17,576.
The common use of Alpha-3 codes by three parts of ISO 639 requires some coordination within a larger system.
Part 2 defines four special codes (mis, mul, und, zxx) and a reserved range qaa-qtz (20 × 26 = 520 codes), and has 23 double entries (the B/T codes). This sums to 520 + 23 + 4 = 547 codes that cannot be used in Part 3 to represent languages or in Part 5 to represent language families or groups. The remainder is 17,576 - 547 = 17,029.
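The arithmetic above can be checked with a short Python sketch. It enumerates the local-use range qaa-qtz and uses the four special codes listed in the table (mis, mul, und, zxx). The count of 23 B/T double entries is taken from the text as given rather than derived, and all variable names are illustrative.

```python
import string

letters = string.ascii_lowercase  # the 26 letters a-z

alpha3_total = len(letters) ** 3  # all three-letter combinations

# Local-use range qaa..qtz: 'q', then a..t (20 letters), then a..z (26 letters).
local_use = [
    "q" + second + third
    for second in letters[: letters.index("t") + 1]
    for third in letters
]

special_codes = ["mis", "mul", "und", "zxx"]
bt_double_entries = 23  # figure stated in the text, not computed here

reserved = len(local_use) + bt_double_entries + len(special_codes)
available = alpha3_total - reserved

print(alpha3_total)                                 # 17576
print(len(local_use), local_use[0], local_use[-1])  # 520 qaa qtz
print(reserved, available)                          # 547 17029
```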
There are somewhere around six or seven thousand languages on Earth today, so those 17,029 codes are adequate to assign a unique code to each language, although some languages may end up with arbitrary codes that sound nothing like the traditional name(s) of that language.
"Alpha-4" codes (codes composed of 4 letters of the ISO basic Latin alphabet) are proposed for use in ISO 639-6. The upper limit for the number of languages and dialects that can be represented is 26⁴ = 456,976.