Theories Explaining the Emergence and Origins of Spoken Language

Critically examine the evidence of and hypotheses about when, where, and, perhaps, how spoken language first emerged in humans

Linguists have drawn on varying hypotheses and evidence to explain the evolution of spoken language, largely because language evolution is a vast and loosely defined concept. No account of the origin of language has been conclusive enough to be universally accepted by linguists. Each researcher offers his or her own evidence and attempts to support a claim about the emergence of spoken language using geological and ecological evidence. Some scientific evidence indicates that language evolution has not stopped and that it is a progressive phenomenon. However, other researchers argue that language evolution has stopped, and that what is being experienced currently is cultural change which influences language and should not be interpreted as language evolution. Even if language evolution has stopped, as human biological evolution is sometimes said to have stopped, it is undoubtedly true that language evolved somehow. When, where and how language emerged remains unclear, however, and researchers differ in how they conceptualize its origin.

This paper therefore critically examines various hypotheses and evidence about the origin of spoken language: when, where, and perhaps how it first emerged in humans. Like human beings themselves, spoken language has an origin. Language is complex, and it must have originated from an earlier pre-linguistic system of communication among primates and then evolved over time. Linguists have attempted to trace this origin, but researchers arrive at different explanations. For instance, continuity theories hold that language, as a complex phenomenon, originated from an earlier pre-linguistic system, while discontinuity theories suggest that language must have emerged suddenly as human beings evolved.

Some theories also view language as a genetically encoded faculty, while other theories argue that language is culturally determined through social interaction. These theoretical and conceptual differences have resulted in varying hypotheses and evidence concerning the origin of spoken language. This paper therefore examines the evidence and hypotheses from a neutral standpoint in order to assess where, when and how spoken language originated.

Origins of Spoken Language

Some researchers have noted that spoken language first emerged from primate communication. Arcadi (2000) suggests that primates such as apes produce graded signals that show their emotional states. They typically vocalize only in response to emotional states such as pain and pleasure, so their vocalizations are not faked. However, research shows that primates use the same brain regions that humans use in speech. As human beings evolved into early Homo about 2.5-0.8 million years ago, their language capabilities improved beyond those of the ape family.

Evidence from biological evolution suggests that bipedalism, which emerged with the Australopithecines about 3.5 million years ago, possibly changed the human skull and led to a more L-shaped vocal tract. This shape, together with the position of the larynx in the neck, made possible the sounds produced by human beings in their early spoken language (Aronoff and Rees-Miller, 2001). This evidence therefore indicates that spoken language first emerged with early Homo. However, this language, also referred to as proto-language, was not well developed, and most researchers describe it as a primitive mode of communication which lacked developed syntax, proper grammar and an adequate vocabulary (Heine and Kuteva, 2007).

The emergence of this first spoken language was a stage in evolution between primate communication and the fully developed human language in use today. Bickerton (2005) suggests that the earliest spoken language, in the form of proto-language, emerged with the first appearance of Homo. This first appearance was occasioned by behavioral adaptations to the scavenging niche occupied by Homo habilis (Bickerton, 2005). It is at this stage that anatomical features such as the L-shaped vocal tract developed and human beings began to use spoken language as a mode of communication.

Bickerton (2005) further analyzes the brain and the vocal tract as important elements in explaining the origin of human spoken language. He notes that paleoanthropologists have used the anatomy of fossils to assess evidence of language evolution. On this account, neural restructuring of the brain in early hominids resulted in the development of the first spoken language.

According to Bickerton (2005), paleoanthropological fossil evidence indicates that the anatomical regions responsible for vocal speech in humans first emerged at the hominid stage of evolution, and it was at this time that the first spoken language emerged. The evidence cited is the reconstruction of the Neandertal larynx and the Tabun mandible, which was depicted in connection with the Gibraltar skull by Negus in 1949 (Bickerton, 2005). This reconstruction was linked to the Neandertal, which is described as a morphological intermediary between the gorilla and modern humans. However, critics suggest that the drawings of the skull are too schematic and inaccurate to be consistent with the morphology of the fossils. The evidence is therefore viewed as vague and as failing to represent the development of human language adequately (Arcadi, 2000).

Speculative Theories

One line of argument about where, how and when spoken language emerged, presented by Burling (2005), is based on the speculative theories of the historical linguist Max Muller.

One of the theories developed by Muller is the Bow-Wow or Cuckoo theory, which suggests that early words emerged from human imitation of birds’ and beasts’ cries.

The pooh-pooh theory, on the other hand, suggests that the first words emerged as exclamations and interjections triggered by emotional reactions such as pleasure, pain and surprise (Burling, 2005).

The speculative Ding-dong theory further holds that spoken language was a result of man’s echoing of natural vibrations.

Yo-he-ho is another of Muller’s speculative theories. It claims that spoken language emerged from early human efforts to synchronize muscular effort. As a result, sounds like “heave” alternated with sounds like “ho”, leading to the emergence of spoken language.

Finally, Sir Richard Paget added another speculative theory to Muller’s list: the ta-ta theory, which claims that the first spoken language emerged as human beings made tongue movements that followed manual gestures, resulting in audible words.

Today, most scholars dismiss the above theories as irrelevant and naïve (Bickerton, 2005). The theories narrowly tie the human perception of sounds to their meanings, and so assume that spoken language evolved automatically and changed with time. Modern science suggests that symbols and signs can be deceptive and unreliable. The question of reliability was ignored by theories such as those of Darwin and Muller. Spoken language is easily manipulated and can be faked. As a result, language requires high levels of mutual trust in order to become established. Theories about language origins should therefore be able to explain why human beings began to trust signals while other animals do not.

The Mother Tongue Hypothesis

To address the above problem of deception and reliability in spoken language, the “mother tongue” hypothesis was developed in 2004. Fitch (2004) uses the Darwinian principle of kin selection to explain the origin of language. The kin selection principle holds that genetic interests converge among related individuals. According to Fitch (2004), the mother tongue evolved as communication between a mother and her biological offspring. Because the interests of relatives tend to coincide for genetic reasons, they give rise to mutual trust and reduce the unreliability of signals. Spoken language built on such signals then becomes acceptable and trustworthy and begins to evolve for the first time. Critics of this theory point out that kin selection is not exhibited by human beings only. Botha and Knight (2009) observe that apes and other animals also share genes. It is therefore not clear why only human beings speak.

Obligatory Reciprocal Altruism

Another hypothesis that attempts to explain the origin of spoken language is the ‘obligatory reciprocal altruism’ hypothesis. It is drawn from the Darwinian principle of reciprocal altruism. The hypothesis holds that the evolution of language requires a very high level of honesty. This is based on the maxim ‘if you scratch my back, I will scratch yours’ (Cheney and Seyfarth, 2005). The theory generally asserts that for spoken language to evolve, the early primates must have adhered to honesty and moral regulation.

This theory has been criticized. Some critics point out that it fails to explain when, where and how obligatory reciprocal altruism emerged and was enforced. Knight (2006) also suggests that language does not function on the basis of obligatory altruism. Human beings do not pass information only to those expected to provide valuable information in return but also to anyone else who will listen. This is attributed to the human tendency to want to show the world that one has access to socially relevant information.

Gossip Grooming Hypothesis

Another hypothesis aimed at explaining the origin of language is the gossip grooming hypothesis (Corballis, 2002). According to this theory, gossip serves the same purpose for human beings that manual grooming serves for other primates.

The theory suggests that spoken language started to evolve when human beings began to live in larger social groups. As these groups grew, manual grooming became too time consuming to sustain. As a result, human beings invented a more efficient way of grooming: vocal grooming. To maintain good relationships with the members of a social circle, a person now had to groom them with low-cost words. This allowed many people to be serviced at the same time while keeping the hands free for other duties. This led to the emergence of spoken language, in the form of gossip. This theory has also faced criticism.

Critics suggest that the very cheapness of words would have prevented spoken language from conveying the level of commitment signalled by the manual grooming it replaced (Knight, 2010). Other critics claim that the theory does not explain the transition from vocal grooming to the cognitively demanding use of syntactic language.

Ritual/Speech Co-Evolution Theory

Some scholars also use the ritual/speech co-evolution theory to explain when and how spoken language emerged. Enfield (2010) disputes the very concept of an origin of language. On this view, language is an expression of human culture rather than a separate adaptation (Knight, 2010). Enfield (2010) and Knight (2010) criticize theories which explain language origins independently of human culture. They assert that such theories tend to describe problems without offering solutions to them. Knight (2010) contends that spoken language cannot work without a system of social institutions. As evidence, he notes that apes do not communicate through language in the wild, since there is no social mechanism there that could drive such communication.

This theory explains that spoken words are cheap and highly unreliable, unlike primate vocalizations, which are costly and very difficult to fake. Language, being part of social convention, therefore does not evolve. Instead, its unreliable nature means that it works only when a speaker has established a reputation for trustworthiness within a given society, where culture can be understood as collective endorsement within that society. According to this theory, therefore, explaining the origin of language entails explaining the origin of human culture as a whole.

Critics of this theory assert that it rejects language as an object of study in the natural sciences (Chomsky, 2005). In his own theory, Chomsky (2011) suggests that language emerged instantly and in perfect form during the biological evolution of human beings. This theory has in turn been criticized by Chomsky’s own critics, who claim that only non-existent phenomena are capable of emerging miraculously in such a manner (Knight, 2008).

Gestural Theory

Gestural theory provides evidence to support its claim that human spoken language evolved from gestures. One piece of evidence is that gestures and spoken language depend on the same class of neural systems. The theory observes that the cortical regions responsible for mouth and hand movements are located next to each other, so that they yield related actions (Newman, 2002). The theory also offers the evidence that non-human primates can use gestures similar to those of human beings to communicate with each other. For instance, when apes beg for something they extend their hands in much the same way as human beings do. Research has supported this theory by finding that signed and spoken languages depend on similar neural systems. For instance, patients who use sign language show disorders similar to those of patients who use spoken language.

Gestural theory further suggests that spoken language first emerged when our ancestors started to use more tools in their daily activities, leaving their hands so occupied with work that gesturing was no longer possible (Corballis, 2002). This created the need for a vocal language with which human beings could communicate, and it ended the use of gestures for the same purpose. The same theory suggests that the need to communicate even when speakers cannot see each other also contributed to the emergence of spoken language.

Put the Baby Down

Another theory which explains when and where the use of vocal language emerged is the ‘put the baby down’ theory. This theory dates spoken language back to the time of the Hominins (Falk, 2004). In this period, interactions between Hominin mothers and infants triggered a sequence of events which led to the development of the first spoken languages. The theory holds that, unlike apes, human infants could not cling to their mothers’ backs because they lacked fur. Human mothers could therefore not carry their children as they moved about and instead put them down. As a result, the children often felt insecure and kept crying, prompting their mothers to devise ways of assuring the babies that they had not been abandoned. These adaptive mechanisms involved a combination of facial expressions, touching, body language, tickling, emotional calls and caressing. These activities possibly led to the first emergence of spoken language. Critics argue that the theory does well to explain the emergence of infant-directed language, referred to as motherese, but does not address the more difficult problem of the emergence of syntactic language.


From this analysis, it is clear that many theories and hypotheses have been offered to explain how, where and when spoken language first emerged. Some researchers trace this emergence to the time of the Hominids or early Homo, while others trace it to the time of the Neandertals. Although these studies vary in their approaches, they reach broadly similar, though not identical, results. The theories differ substantially in conceptualization but all rely on similar theoretical generalizations. For instance, this analysis indicates that most scholars, researchers and linguists agree that spoken language evolved either concurrently with human evolution or separately as a natural human attribute. In either case, evidence from research indicates that spoken language emerged in a period between the age of the ancient gorillas and the current period of modern humans. During this period there were changes that occasioned the emergence of the first spoken language, such as the need for human mothers to put their children down as they moved around, the adoption of gestures from the apes, the emergence of gossip grooming, and the growth of social institutions, among others.

