Posted on Wednesday, March 20, 02013 by Karin Wiecha
The Ethnologue is a comprehensive language catalogue used as a reference work by linguists all over the world. It was first published in 1951 by the Summer Institute of Linguistics (SIL) and provides information on all known living languages and on languages that have become extinct since 1951. The Ethnologue provides statistical data on the world's languages, including native speaker populations, literacy rates, the regions where the languages are spoken, an assessment of their vitality and other basic information. This data is a useful reference point for language projects of all kinds. The data set as a whole is important infrastructure that is also used by the Rosetta Project. Some of the language metadata in the Rosetta Collection at the Internet Archive, such as the three-letter language identifier codes, is taken from the Ethnologue. The 17th edition of the Ethnologue has just been released online, where it can be browsed not only by linguists and researchers but by anyone interested in the languages of the world.
The Ethnologue is updated with a new edition approximately every four years to represent our best knowledge about the languages of the world. Altogether the new edition features nearly 60,000 updates and corrections. With each new edition the database is not only updated but also expanded: the 17th edition provides statistics for 7,105 known languages, 196 more than the previous edition. Still, this huge database makes no claim to completeness.
Where do all these new languages come from? Determining what constitutes a distinct language is not a straightforward task. Sometimes what we thought were dialects of a single language are reclassified as separate languages when it turns out that they are not mutually intelligible. Cultural identities and politics can also occasionally play a role in deciding where to draw the line. Determining whether a language is extinct can be an equally difficult task. In the new edition of the Ethnologue, 188 languages have been reclassified from extinct to “dormant”, because they still have symbolic value for their former speech communities and either offer the potential for revitalization or are already being actively revitalized. From time to time previously unknown languages are also discovered, as in the very recent announcement of Hawai’i Sign Language. Researchers report these findings to the constantly growing database of the Ethnologue.
With the new Ethnologue edition the website was also given a new, more interactive design, which allows you to browse languages not only via the search function but also by clicking on a world map. For many countries there are language maps available that show the regions in which certain languages are spoken. Two other new features that might be interesting for language enthusiasts are the Ethnoblog and the Language of the Day feature. Every day a language is highlighted on the website with a link to its individual language page. The language pages provide the most important information on each individual language, including the language status and its position in the language cloud, two new metrics in this version of the Ethnologue.
The language status is measured with the Expanded Graded Intergenerational Disruption Scale (EGIDS), which assigns each language a level ranging from 0, International (e.g. English), to 10, Extinct. This scale is an expansion of the eight-level GIDS scale developed by linguist Joshua Fishman in 1991. GIDS was developed to determine the vitality of endangered languages, while EGIDS is applicable to all languages, including world languages and extinct languages, which makes it possible to assign a status to every language in the Ethnologue’s comprehensive database.
The language cloud is a visualization of the vitality of the world’s languages. It combines the EGIDS level of a given language with its number of first-language speakers to show its degree of endangerment relative to all other languages in the world. Each of the 7,105 languages listed in the Ethnologue is represented by a dot. Languages that have many native speakers and are widely used are positioned in the upper left corner, while the languages in the lower right corner are extinct or severely endangered, with very few speakers, if any. Every language page features a version of the language cloud with the language’s individual position highlighted (see image).
Mindiri (a language of Papua New Guinea and Language of the Day for March 20, 02013) in the language cloud
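For readers who like to see the idea in code, here is a rough sketch of how a dot in such a cloud could be placed. It is not the Ethnologue's own method: the function name `cloud_position`, the logarithmic speaker axis and the sample inputs are all assumptions made for illustration. The only things taken from the description above are that the EGIDS level runs from 0 to 10 and that large, widely used languages end up in the upper left.

```python
import math

def cloud_position(egids_level, first_language_speakers):
    """Return (x, y) for one language's dot.

    x grows with the EGIDS level, so extinct languages sit on the right.
    y grows with the speaker count, so large languages sit at the top.
    The log scale is an assumption of this sketch, not taken from the Ethnologue.
    """
    if not 0 <= egids_level <= 10:
        raise ValueError("EGIDS levels run from 0 to 10")
    if first_language_speakers < 0:
        raise ValueError("speaker count cannot be negative")
    # log10(1 + n) keeps languages with no speakers at y = 0.
    y = math.log10(1 + first_language_speakers)
    return float(egids_level), y

# Hypothetical inputs, chosen only to show the two corners of the cloud.
widely_used = cloud_position(0, 300_000_000)
extinct = cloud_position(10, 0)
print(widely_used, extinct)
```

With these made-up inputs the first dot lands far to the left and high up, and the second lands at the far right on the baseline, which matches the layout described above.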
The Ethnologue also provides the ISO codes for all the listed languages. ISO 639 is an internationally recognized coding system for languages. SIL has been the official Registration Authority for the third and most extensive part of the code set, known as ISO 639-3, since 2007. Language names alone do not suffice as unique identifiers, since some languages have multiple names while some names are shared by several different languages. The ISO codes ensure that every language is identifiable by its own three-letter code.
An interesting side note: there are ISO codes not only for the known living languages, but also for extinct and constructed ones. Esperanto is an artificial language, but has 2 million speakers worldwide according to the Ethnologue. The ISO code for Esperanto is epo. Klingon, another constructed language, might not have as many speakers, but there is an ISO code for it: tlh. Old English is not included in the Ethnologue because it died out centuries ago, but it still has an ISO code (ang).
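To make the point about codes as identifiers concrete, here is a minimal sketch of a lookup keyed by three-letter code. The table holds only the three codes mentioned in this post, and the names `ISO_639_3`, `is_wellformed` and `language_name` are made up for the example; a real lookup would load the full ISO 639-3 code tables instead.

```python
import re
from typing import Optional

# Only the codes quoted in this post; not the full ISO 639-3 code set.
ISO_639_3 = {
    "epo": "Esperanto",
    "tlh": "Klingon",
    "ang": "Old English",
}

def is_wellformed(code: str) -> bool:
    """Check the shape only: exactly three lowercase ASCII letters."""
    return re.fullmatch(r"[a-z]{3}", code) is not None

def language_name(code: str) -> Optional[str]:
    """Return the name for a code in the sample table, or None if it is absent."""
    if not is_wellformed(code):
        raise ValueError(f"not a three-letter code: {code!r}")
    return ISO_639_3.get(code)

print(language_name("epo"))
print(language_name("ang"))
```

Because the code is the key, it does not matter how many alternative names a language has or how many languages share a name: each code points to exactly one entry.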
Do you know the ISO-code for the language or languages you speak? Why don’t you look it up in the new edition of the Ethnologue!