Chapter 1: Consonant Inventories

by Ian Maddieson

1. Introduction

This chapter and the next few chapters will look at various aspects of the complexity of the sound resources used in the world’s languages and examine how this complexity is distributed geographically. The first aspect to be examined is the size of the set of consonants used in the language, usually referred to as the consonant inventory. This is one element of what is called the phonology of the language.

Values of Map 1A. Consonant Inventories

Value              Representation
Small              89
Moderately small   122
Average            201
Moderately large   94
Large              57
Total              563

It is usually possible to agree for any given language on a set of elements which are considered to be the speech sounds used in that language. The most important consideration in deciding on this set is to find groups of words which sound different from each other by the smallest degree sufficient to make them distinct words of the language. For example, the English one-syllable words pin, tin, kin, fin, thin, sin, shin are part of a set which differ by beginning in different ways; dim, din, ding, did, dig, dish are part of a set which differ by ending in different ways; and pin, pen, pan, pun, pain, pine, pawn are part of a set which differ in the middle of the syllable. From a series of such comparisons a list of candidate speech sounds for the language will emerge. Generally the set of those which can appear at the beginnings and ends of syllables will be unlike those which can occur in the middle, hence a distinction is made between consonants (sounds typically occurring at the syllable margins) and vowels (sounds typically occurring in the syllable centers). In this chapter only consonants will be discussed.
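The comparison procedure described above can be sketched in a few lines of code. This is only an illustration, not the procedure used for the survey: it assumes the words have already been written as sequences of sound segments, and the rough transcriptions in `WORDS` are simplified assumptions chosen for the example. The helper name `minimal_sets` is likewise hypothetical.

```python
from collections import defaultdict

# Assumed, simplified segment-by-segment transcriptions of some of the
# example words (one tuple element per speech sound).
WORDS = {
    "pin": ("p", "ɪ", "n"),
    "tin": ("t", "ɪ", "n"),
    "kin": ("k", "ɪ", "n"),
    "fin": ("f", "ɪ", "n"),
    "thin": ("θ", "ɪ", "n"),
    "sin": ("s", "ɪ", "n"),
    "shin": ("ʃ", "ɪ", "n"),
    "pen": ("p", "ɛ", "n"),
    "pan": ("p", "æ", "n"),
    "pun": ("p", "ʌ", "n"),
    "dim": ("d", "ɪ", "m"),
    "din": ("d", "ɪ", "n"),
    "ding": ("d", "ɪ", "ŋ"),
}


def minimal_sets(words):
    """Group words that are identical except for the sound in one position.

    Returns a dict mapping (position, frame) to the set of sounds found in
    that position, where frame is the word with that position blanked out.
    Only frames with at least two contrasting sounds are kept.
    """
    frames = defaultdict(set)
    for segments in words.values():
        for i, sound in enumerate(segments):
            frame = segments[:i] + ("_",) + segments[i + 1:]
            frames[(i, frame)].add(sound)
    return {key: sounds for key, sounds in frames.items() if len(sounds) > 1}


sets = minimal_sets(WORDS)
initial = sets[(0, ("_", "ɪ", "n"))]  # sounds contrasting before "-in"
medial = sets[(1, ("p", "_", "n"))]   # sounds contrasting in "p_n"
final = sets[(2, ("d", "ɪ", "_"))]    # sounds contrasting after "di-"
```

Each set collects candidate speech sounds for one position. Deciding which candidates in different positions count as the same consonant is a separate step that this sketch does not attempt.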

Several further decisions must be made, such as which consonants in different positions should be considered to be the same as each other. For example, speakers of English generally consider that words such as pip, tit, kick, bib, did, gig begin and end with the same consonant even though there are some easily recognizable differences between the sounds at the beginning and those at the end. It is also necessary to resolve questions about whether certain beginnings or endings of syllables should be considered to be one sound or a sequence of two or more sounds when analyzed from the point of view of the structure of the particular language. For example, the English word chip begins in a way that is similar to the beginning of tip followed by the beginning of ship (compare saying grey chip and great ship), and the English word quick begins in a way that is similar to the beginning of kick followed by the beginning of wick (compare saying lie quick and like wick). These syllable beginnings would both be noted in a phonetic transcription with two symbols, as /tʃ/ and /kw/ respectively. However, when we consider the possibilities of finding related sequences in English, a difference between the two becomes apparent. Nothing except /t/ can precede /ʃ/ at the beginning of an English syllable, whereas other sounds can precede /w/, as in twin, swim, dwell, thwart. Also several other sounds can follow /k/, as in click, crick, suggesting that /k/ and /w/ in the /kw/ sequence are independent elements. Although words like trip, twin might suggest independence of the parts /t/ and /ʃ/ in chip, the sequences /tw, tr/ are not similar to /tʃ/ in an important way. This is because no English syllable can end with /tw, tr/ (or with /kw, kl, kr/), whereas syllables can end with /tʃ/, as in rich, pitch, kitsch. These considerations suggest that /tʃ/ is behaving like a single consonant in English, whereas /kw/ is a sequence of two separate consonants.
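The distributional reasoning in this paragraph can be made concrete with a small sketch. It reduces the argument to two of the checks mentioned (which sounds can precede the second element at the start of a syllable, and whether the pair can end a syllable), so it is a simplification rather than a full analysis. The onset and coda transcriptions are assumptions made for illustration, and the function names are hypothetical.

```python
# Assumed, simplified transcriptions of the example words:
# onset = consonants before the vowel, coda = consonants after it.
ONSETS = {
    "chip": ("t", "ʃ"), "tip": ("t",), "ship": ("ʃ",),
    "quick": ("k", "w"), "kick": ("k",), "wick": ("w",),
    "twin": ("t", "w"), "swim": ("s", "w"), "dwell": ("d", "w"),
    "thwart": ("θ", "w"), "click": ("k", "l"), "crick": ("k", "r"),
    "trip": ("t", "r"),
}
CODAS = {
    "rich": ("t", "ʃ"), "pitch": ("t", "ʃ"), "kitsch": ("t", "ʃ"),
    "kick": ("k",), "chip": ("p",), "twin": ("n",), "swim": ("m",),
}


def first_elements_before(second, onsets):
    """Sounds that precede `second` in two-consonant syllable beginnings."""
    return {o[0] for o in onsets.values() if len(o) == 2 and o[1] == second}


def occurs_in_coda(pair, codas):
    """True if the two-sound sequence `pair` can end a syllable."""
    return pair in codas.values()


def looks_like_single_consonant(pair, onsets, codas):
    """Heuristic: the second element has no other partner at the start of a
    syllable, and the pair can also close a syllable."""
    only_partner = first_elements_before(pair[1], onsets) == {pair[0]}
    return only_partner and occurs_in_coda(pair, codas)


w_partners = first_elements_before("w", ONSETS)
tsh_is_unit = looks_like_single_consonant(("t", "ʃ"), ONSETS, CODAS)
kw_is_unit = looks_like_single_consonant(("k", "w"), ONSETS, CODAS)
```

On this toy word list the heuristic treats /tʃ/ as unit-like and /kw/ as a sequence, in line with the conclusion drawn in the text. A real analysis would need a much larger vocabulary and more criteria.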

When such decisions have been made, a list of the consonants used in the language can be compiled and the total of distinct ones added up. For English, there is general agreement that the consonant inventory contains 24 consonants, though some linguists might decide there are one or two more or fewer than this. In the survey of 562 languages reported here a strong effort has been made to apply consistent criteria in determining the consonant inventory size. This sometimes leads to some difference from the conclusions in published descriptions of the languages concerned. For most languages relatively straightforward decisions can be reached, but others are more problematic. A difficult choice often concerns whether to include consonants found only in words borrowed from other languages; generally those sounds introduced just in the last few generations as the result of the spread of world languages such as English, Spanish, Russian, Mandarin, and Modern Standard Arabic have been excluded.

The range of resulting inventories extends from a low of 6 consonants to a high of 122. Rotokas (West Bougainville; Papua New Guinea) has only six consonants. These might be represented in a simplified transcription with the letters /p, t, k, b, d, g/, although the pronunciations heard in different word positions cover a considerably wider range of sounds than these letters suggest. !Xóõ (Southern Khoisan; Botswana) has 122 consonants, mainly because it has a very large number of different click sounds with which a word may begin. The more typical consonant inventory size is in the low twenties, with the mean for the 562 languages being 22.7, the modal value 22 and the median 21. Consonant inventories close to this size (22 ± 3) have been categorized as average, and the remainder divided into the categories small (from 6 to 14 consonants), moderately small (15-18), moderately large (26-33), and large (34 or more consonants). As Figure 1 illustrates, the particular cut-off values for the categories were chosen so as to approximate a histogram with a normal distribution, although there are somewhat more languages with inventories smaller than the band defined as “average” than with larger than average inventories.
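The category boundaries given above translate directly into a small classification function. The following is a minimal sketch: the function name `classify_inventory` is hypothetical, and treating sizes below 6 as "small" is an assumption, since 6 is simply the smallest inventory in the sample.

```python
def classify_inventory(n_consonants):
    """Map a consonant count to the size category used in this chapter."""
    if n_consonants < 1:
        raise ValueError("a consonant inventory needs at least one consonant")
    if n_consonants <= 14:
        # The text defines "small" as 6 to 14; counts below 6 do not occur
        # in the sample and are grouped here by assumption.
        return "small"
    if n_consonants <= 18:
        return "moderately small"   # 15-18
    if n_consonants <= 25:
        return "average"            # 22 ± 3, i.e. 19-25
    if n_consonants <= 33:
        return "moderately large"   # 26-33
    return "large"                  # 34 or more


rotokas = classify_inventory(6)     # Rotokas: six consonants
english = classify_inventory(24)    # English: 24 by general agreement
xoo = classify_inventory(122)       # !Xóõ: 122 consonants
```

With these boundaries Rotokas falls in the small category, English in the average band, and !Xóõ in the large category.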


Figure 1: Histogram of languages in the sample according to categories of consonant inventory size
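A rough text version of such a histogram can be rebuilt from the category counts listed in the table of map values. The sketch below uses only those counts. The names `COUNTS` and `text_histogram` and the bar scale are illustrative choices, not part of the original figure.

```python
# Category counts taken from the table of values for Map 1A.
COUNTS = {
    "Small": 89,
    "Moderately small": 122,
    "Average": 201,
    "Moderately large": 94,
    "Large": 57,
}

total = sum(COUNTS.values())
below_average = COUNTS["Small"] + COUNTS["Moderately small"]
above_average = COUNTS["Moderately large"] + COUNTS["Large"]


def text_histogram(counts, scale=5):
    """Return one line per category, with one '#' per `scale` languages."""
    lines = []
    for label, n in counts.items():
        bar = "#" * round(n / scale)
        lines.append(f"{label:<17}{bar} {n}")
    return "\n".join(lines)


print(text_histogram(COUNTS))
print(f"below average: {below_average}, above average: {above_average}")
```

The two sums make the asymmetry noted in the text explicit: the categories below the average band together contain more languages than the two above it.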

2. Geographical distribution

Languages with average size consonant inventories are found in most areas of the world, suggesting that this size is truly representative of something typical for spoken human languages. The languages with larger or smaller inventories, on the other hand, display quite marked regional disparities in their distribution.

Those with smaller than average consonant inventories predominate in the Pacific region (including New Guinea), in South America and in the eastern part of North America, with particular concentrations of “small” inventories in New Guinea and the Amazon basin. The degree of typological similarity with respect to consonant inventory size between the languages of New Guinea and Australia is intriguing. The received idea is that the population ancestral to speakers of today’s Australian languages reached the continent when New Guinea and Australia were connected by dry land in the now partly-submerged landmass known to geologists as the Sahul shelf. Since the landbridge linking New Guinea and Australia was severed around 7000 years ago, contact between Australian and New Guinea peoples is believed to have been strictly limited except in the immediate region of the Torres Straits. Could this similarity represent the conservation of a trait common to languages spoken long ago when the lands were joined?

Those with larger than average consonant inventories are particularly strongly represented in Africa, especially south of the equator, as well as in an area in the heart of the Eurasian landmass, but are most spectacularly concentrated in the northwest of North America. The languages in this latter area belong to a number of different language families with no demonstrable genealogical relationship, including Eskimo-Aleut, Na-Dene, Salishan, Tsimshianic and Wakashan, among others. There is no evidence that the predominance of large consonant inventories in this area is a consequence of direct borrowing of words between these languages, although cultural contacts between the peoples concerned are in many cases intense and deep-rooted. The situation is clearly different in one part of the African zone where large consonant inventories occur. Several Bantu languages (part of the larger Niger-Congo family) in the southern part of the continent, such as Zulu and Yeyi, are known to have enlarged their consonant inventory by borrowing clicks and other sounds which they did not previously use from languages of the Khoisan group, which already had many consonants (see, for example, Louw 1975).

3. Theoretical issues

Mapping the size of consonant inventories prepares the way to investigate two connected issues. The first concerns how complexity of different aspects of the sound patterns of languages is related. All human languages are capable of expressing the range of human needs; it might therefore be assumed that they would be similar in their level of complexity. We have seen that by one simple measure of their phonological complexity, the size of the consonant inventory, languages cover quite a wide range. But complexity in one aspect might be balanced out by simplicity in another, so that in aggregate all languages are similarly complex. If this is so, mapping different aspects of phonological complexity should tend to show inverse relationships between one aspect and another in level of complexity. If this is not found, it is reasonable to conclude that languages are not constrained to be similar in this particular way, but that languages with quite different levels of complexity function just as well as each other. Several of the maps that follow will contribute to considering this question, by mapping properties of the vowel inventory (chapter 2), the syllable structure (chapter 12) and the presence and complexity of tone systems (chapter 13).

The second issue concerns the hypothesis that there is an overall relationship between the size of a consonant inventory and the kind of consonants it includes. According to the “size principle” (Lindblom and Maddieson 1988) smaller consonant inventories will tend to contain only those consonants which are in various ways inherently simpler (perhaps because they involve smaller movements to pronounce them, or are easier for a listener to distinguish from other sounds). Consonants which are inherently more complex will be found in larger inventories. If this hypothesis is correct then the geographical distribution of inherently complex consonants should mirror the distribution of larger consonant inventories. In three of the following chapters, 6, 7 and 19, the occurrence of some selected classes of complex consonants will be mapped as a test of this hypothesis as well as for the inherent interest of seeing the distribution concerned.