16 October 2009

Linguistic Factoid No. 15: Language Families

There are about 6,000 languages around the world. Some of these are spoken by a huge number of people, while others are spoken by just a handful of people and are predicted to die out within the next century. Many of these languages are related to one another, and given enough time and effort, one could sit down and work out how they relate to each other and which ones are close to each other.

Once upon a time, a linguist noticed that English and Sanskrit had many similar words, and upon further scrutiny, it was found that the two languages are indeed related. Thus, the world came to recognize a language family called Indo-European. This language family extends from the Celtic languages in Ireland and Great Britain to the languages found in the Indian subcontinent.

How does one establish a language family? Well, people can examine how similar or different the languages' vocabularies are. In particular, they can take a core vocabulary and see how much the words for the same meanings vary from language to language. The words in this core vocabulary should be ones that are thought to be highly resistant to borrowing, such as kin terms and body parts. One can also examine the syntax or word order of the languages. The more similarities there are, the easier it is to establish that the languages belong to a common language family.
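To make the core-vocabulary idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the five-word list, the diacritic-free romanizations, and the names `CORE_VOCAB`, `word_similarity`, and `lexical_similarity` are all made up for this example. It also swaps in plain string similarity from Python's standard `difflib` module as a crude stand-in for what linguists actually do, which is to look for regular sound correspondences between languages.

```python
from difflib import SequenceMatcher

# Hypothetical toy word list: kin terms and body parts, romanized without
# diacritics. A real study would use a much longer standard list and
# phonetic transcriptions rather than spellings.
CORE_VOCAB = {
    "English": {"mother": "mother", "father": "father", "hand": "hand",
                "foot": "foot", "nose": "nose"},
    "German": {"mother": "mutter", "father": "vater", "hand": "hand",
               "foot": "fuss", "nose": "nase"},
    "Japanese": {"mother": "haha", "father": "chichi", "hand": "te",
                 "foot": "ashi", "nose": "hana"},
}


def word_similarity(a, b):
    """Return a 0..1 similarity score between two spellings."""
    return SequenceMatcher(None, a, b).ratio()


def lexical_similarity(lang_a, lang_b, vocab=CORE_VOCAB):
    """Average word similarity over the meanings both languages have."""
    words_a, words_b = vocab[lang_a], vocab[lang_b]
    shared = sorted(set(words_a) & set(words_b))
    scores = [word_similarity(words_a[m], words_b[m]) for m in shared]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    for pair in [("English", "German"), ("English", "Japanese")]:
        print(pair, round(lexical_similarity(*pair), 2))
```

On this toy list, English scores noticeably closer to German than to Japanese, which is the kind of signal one would then follow up with a proper comparison. A high score by itself proves nothing, since spellings can look alike by chance or because of borrowing.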

So, what is the largest language family? Well, counting by number of languages, it is a family called Niger-Congo, found in sub-Saharan Africa. This language family contains about 1,400 languages, extending from Senegal to South Africa. The second-largest language family is Austronesian, which is posited to have originated on the island of Taiwan, and is now found across a wide swath of the globe, from Madagascar in the Indian Ocean to Easter Island in the South Pacific. There are about 1,200 languages in this family, fewer than in Niger-Congo, but the area it covers is very large.

Of course, there are language families that are smaller than these two. And there are also languages that do not belong to any language family. Theorists have claimed that Japanese is one such language, but this is not firmly established yet. Other languages, such as Basque (found in Spain and France) and Burushaski (found in Pakistan), are clearer cases of language isolates, since they are evidently unrelated to the languages that surround them.

  1. Very interesting factoids! To what language family does Filipino/Tagalog belong?

  2. Toe,

    Tagalog belongs to Austronesian. The Austronesian language family has several branches, and within the family, there is plenty of variation. The branch that covers most of the Philippine languages is known for being verb-initial in word order, but this is not true for other Austronesian languages. Malay and Indonesian, for example, are verb-medial languages. What is common to the whole family, however, is the fact that there are multiple voice constructions.

  3. To me, Japanese doesn't seem that unique. The writing is borrowed from the Chinese (kanji) and I can read and understand some words thanks to my Mandarin.

    Interesting fact about the language families. I wouldn't have guessed the largest was in Africa.

  4. Zhu,

    Ah, but classifying a language into a family has nothing to do with the script at all. True, the Japanese borrowed Chinese characters (that's why I can "read" Chinese, since I know what the characters mean), but that's just their writing system. Grouping Japanese with Chinese on that basis would be equivalent to saying that all languages that use the Roman alphabet are in one language family.

  5. Very interesting. I didn't know about the African language group either. How many of those languages still survive, and did they have an advanced grammar/writing? Time to go to Wikipedia! :)

  6. Priyank,

    Well, here's a better site that would help you: Ethnologue. Click the link for Niger-Congo, which is the largest language family on earth, and you can see its distribution, how many speakers each language has, and where they are located.