→ Main article: Language origin and development
Language as an "instinct to learn"
Darwin already distinguished between the biological capacity that allows humans to acquire language and specific languages as such. Modern cognitive biology adopts this theoretical distinction. Babies have an instinct to babble, but they need to learn language. Thus, for the ethologist Peter Marler, as for Darwin, language was not an instinct as such; rather, "language is an instinct to learn, the expression of which implies that both biological and external conditions are met." It is this "instinct" to learn language that the biological, evolutionary study of the language faculty addresses. An important language-related gene discovered in this context is FOXP2, a phylogenetically ancient transcription factor that plays a role in flexible oral-motor vocal control. FOXP2 underwent a crucial mutation in the human genus at least 400,000 years ago, as inferred from the fact that Neanderthals carried the same allele. A set of four characteristic genes has also been identified for simple aspects of syntax.
Anatomy
Until around 2010, the prevailing view on the evolution of the human language ability was that anatomically modern humans (Homo sapiens) differed from apes in their ability to speak. According to this view, richly varied speech only became possible through anatomical changes in the course of human phylogeny. It is unknown how developed the ability to speak was in Homo erectus, the common ancestor of Neanderthals and Homo sapiens. It is likewise unknown how "advanced" the morphological and functional potential for differentiated linguistic communication was during the transition from Homo erectus to early anatomically modern humans. The enlargement of the pharyngeal cavity (as a resonating chamber), the lowering of the larynx, and the arching of the palate, all of which had already begun in Homo erectus, were seen as enabling greater freedom of movement for the tongue. Through the interaction of the pharyngeal cavity, the oral and nasal cavities, the soft palate, the lips and the tongue, the fundamental tone produced by the vocal cords can then be modulated into vowels and consonants. Skull finds indicate that the arching of the palate and the lowering of the larynx were complete about 100,000 years ago.
In the Kebara Cave near Haifa in Israel, a hyoid bone was found in the skeleton of a Neanderthal man that was about 60,000 years old, suggesting that this individual was capable of spoken language. Anthropologists from Durham suggest that Neanderthal ancestors were able to speak more than 300,000 years ago. They compared the size of the "canalis nervi hypoglossi," an opening in the base of the skull, in skulls of modern humans and in various fossils. Through this opening runs the hypoglossal nerve, by which the brain controls the movement of the tongue. According to these anthropologists, a large hypoglossal nerve is a prerequisite for differentiated speech. They found that the canalis nervi hypoglossi was about the same size in Neanderthals as in modern humans. In pre-humans of the genus Australopithecus, who lived about two million years ago, it is much smaller.
More recent research shows that the lowering of the larynx is not a uniquely human characteristic but occurs in many other animals, for example in red deer and wapiti. In addition, the dynamic reconfigurability of the vocal tract, which had previously been denied for animals, has now been confirmed in empirical studies, for example in many mammals such as dogs, goats and seals, and also in alligators. Because these species belong to phylogenetically distinct lineages, the lowering of the larynx is assumed to be an evolutionarily early feature. One reason for it, for example in deer, may be sexual selection for lower-pitched vocalization. The ability to learn song is also inherent in birds. These findings imply, first, that the vocal tract was sufficiently flexible for complex language development at any point in primate evolution and, second, that fossils of human ancestors provide little evidence of language ability. The evolutionary prerequisites for language are now sought more in neural control mechanisms and less in the anatomy of the vocal tract.
Neural prerequisites
Whereas language used to be treated as a monolithic entity, cognitive biology today breaks the cognitive prerequisites of language down into separable components and analyzes them comparatively in different animal lineages. The following are regarded as prerequisites for the evolution of language: social intelligence, imitation, sensitivity to eye contact, the ability to follow another's gaze in space, and theory of mind. These mechanisms are core elements of animal social behavior. The human ability to share thoughts socially allows cultures to accumulate knowledge in ways that would not be possible without language. Preliminary stages of language have been explored empirically in recent years. According to current research, there is no linear evolutionary progression toward language as animal lineages become more closely related to humans. In tests, birds show cognitive prerequisites similar to those found in primates.
Protolanguage models
Research on language evolution explores models of protolanguages. A protolanguage in this sense differs from a primordial language: it refers to alternative forms of communication (models) from which a primordial language, if one existed, could first have emerged. Three models are distinguished: the lexical model, the gestural model and the musical model. Each model should account for three components of language: signals, syntax and semantics. These components can be considered key evolutionary innovations that have evolved since humans split from the last common ancestor.
The lexical protolanguage contained spoken words. Syntax came later as an innovation. Its emergence, especially with regard to multiple semantic hierarchies, is unclear, as are the cognitive mechanisms that allow the meaning of an ambiguous word to be interpreted unambiguously from its linguistic context. The situation-dependent alarm calls of the vervet monkey (southern green monkey) can be considered an example of a primordial state of lexical protolanguage, but the calls are not learned in the sense of language learning. The lexical protolanguage also lacks the property of intentional information transmission.
The gestural model assumes that language arose from pointing gestures. Sign language today can be a complete language with syntax and semantics. Great apes are better at pointing gestures than at vocal language. The question then becomes why gestural communication was superseded by spoken language.
The musical model goes back to Charles Darwin, who assumed that bird song and language shared a common evolutionary root. Darwin already recognized the multiple components of language. The model is again gaining recognition, but it cannot explain the emergence of semantics within melodies. Music with instruments can be traced back about 40,000 years in Homo sapiens.
All three of the above models may have homologous or convergent origins: in the first case a protoform arose once, in the second case several times independently.