Daniel Kies
Department of English
College of DuPage
Modern English Grammar, English 2126
Language Development in Children
An infant crying in the night:
An infant crying for the light:
And with no language but a cry.
Tennyson
INTRODUCTORY1
It's late, and a tired father steps quietly into the baby's room to make sure the child is comfortable in her crib. To the father's surprise, delight, and frustration, the ten-month-old baby springs up, eyes sparkling in the night light, and says, with all the playful enthusiasm only a child can manage, "Hi-eee!"
Yet even before the child has uttered that first word, a long process of growth and language development has already started:
- biological growth, necessary for the neuromuscular coordination of the vocal organs,
- social growth, necessary to use language as a tool of social interaction,
- psychological growth, necessary for the child to organize and adapt to the environment, and
- linguistic growth, necessary to master the sounds and their associated meanings in the language.
The process through which children learn their first language has fascinated people for centuries. Campbell and Grieve (1982) write about several gruesome historical episodes in which unethical members of royalty from different cultures around the world exploited their power in vain attempts to discover the origins of language in children (see also Stam 1976).
As unethical (and unscientific) as those studies were, they do bear witness to the enormous desire people have felt to study children's language development as a way to discover and illuminate human nature. In addition to serving as a way in which humans can learn about themselves, a child's language development is also
- a chance to explore the psychology of infancy,
- a chance to learn more of what humans are as adults by learning how children grow,
- a chance to learn about the nature of human communication and language,
- a chance for researchers to see the interaction of biology, sociology, psychology, and linguistics, and
- a chance for parents and educators to enrich the lives of the children they care for.
One of the earliest scientific studies to record the language development of a child was that of the German biologist Tiedemann (1787), who was interested in beginning a collection of data about language development in normal children. Interest in language development intensified with the publication of Darwin's theory of evolution, and Darwin (1877) himself contributed to the study of language development in children, as did another biologist, Taine (1877). When the German physiologist Preyer (1882) published a detailed descriptive work carefully recording the first three years of his son's development, the modern descriptive, scientific study of language development had begun, continuing with important work by Shinn (1893), Sully (1895), Stern (1924), and Leopold (1939-49) up to the current "explosion" of literature in the last twenty years.
This chapter outlines the course of language development in children and discusses several theories that try to explain the facts about child language in a coherent way. The chapter concludes by describing some of the implications of language development for parents, educators, caretakers, and researchers.
PRECONDITIONS FOR LANGUAGE LEARNING
Although children begin to vocalize and then verbalize at different ages and at different rates, most children will learn their first language, a highly complex and abstract symbol system, without conscious instruction on the part of their parents or caretakers and without obvious signs of even making the effort, let alone experiencing any difficulty in doing so. However, before learning can begin, children must be ready to learn; that is, they must be biologically, socially, and psychologically mature enough to undertake the task.
Biological Preconditions
Linguists do not agree on exactly how biological factors affect language learning, but most do agree with Lenneberg (1964) that human beings possess a capacity to learn language that is specific to this species and no other. Lenneberg also suggested that language might be expected from the evolutionary process humans have undergone and that the basis for language might be transmitted genetically.
As part of genetically endowed language abilities, Lenneberg (1967) hypothesized a "critical period" during which language learning proceeds with unmatched ease. A child's early years are especially crucial for language development, Lenneberg argued, because that is the period before the two hemispheres of the human brain lateralize and specialize in function. As partial proof, Lenneberg discussed cases where children in bilingual communities were able to learn two languages fluently and without obvious signs of effort before the age of (about) twelve, whereas learning a second language after the age of twelve becomes enormously difficult for most people.
Exactly which language capacities are genetically given is an open, and hotly debated, question. Some linguists are so impressed by the speed and uniformity with which children all over the world learn the complex and abstract system of language that they are convinced that the parameters of what can be a human language are biologically determined (see for example Lightfoot 1982). McNeill (1970: 2-3) has even argued that the notion of 'sentence' is inherited:
"The facts of language acquisition could not be as they are unless the concept of a sentence is available to children at the start of their learning. The concept of a sentence is the main guiding principle in a child's attempts to organize and interpret the linguistic evidence that fluent speakers make available to him."
These ideas are one part of the 'nativist' position discussed later in this chapter. There are not sufficient data to state conclusively the contribution of biology to human language, but all linguists acknowledge that biology does play some role.
As Patterson and Holts write in their chapter about language development in the gorilla Koko, neurologists know that the infant brain is only forty percent developed at birth. The brain will not achieve its final shape for two years, and many interconnections within the brain will not be complete until the child reaches seven years of age. Some neurologists insist, therefore, that the infant who struggles to gurgle and babble is not attempting to articulate speech sounds because the child has not attained enough neuromuscular, biological, maturity to control the vocal organs before the age of six months.
Similarly, many neurolinguists would argue that children's brains are biologically too immature to comprehend several grammatical concepts commonly used in languages around the world. Concepts like plurals, auxiliary verbs, inflectional endings, and temporal words will develop in all languages in stages, stages that reflect the biological maturation of the child's brain. The fact that those stages of language development are identical and predictable in all languages further suggests that there are strong biological preconditions for learning language.
Social Preconditions
All humans (children too) use language as a tool of social interaction. Therefore, children must have the opportunity to interact socially for language to develop adequately. Moskowitz (1978: 122) reported the following story highlighting the importance of social interaction to the development of human speech:
"A boy with normal hearing but with deaf parents who communicated by the American Sign Language was exposed to television every day so that he would learn English. Because the child was asthmatic and was confined to his home he interacted only with people at home where his family and all their visitors communicated in sign language. By the age of three he was fluent in sign language but neither understood nor spoke English."
Despite the fact that the boy was in an environment rich with well-formed spoken English, he could not learn English without the chance to interact with real people in that language. Although the television set could offer sufficient examples of English, could tell him stories, or ask him questions, it could not socialize with the boy, could not react to his vocalizations, could not respond to his questions. It seems that in order to learn their first language children must be in an environment that allows them to communicate socially in that language.
Two early studies by McCarthy (1943) and Davis (1937) demonstrated the importance of early and frequent social contact in language learning. In fact, more recent investigations by Cazden (1972) indicated that a very specific type of verbal and social interaction may be necessary for language development.
McCarthy and Davis, in two separate studies conducted in the 1930s, studied differences in language development among twins, children with brothers and sisters who were not twins, and "only" children. They reported that "only" children are more verbal (they spoke more and were more linguistically sophisticated) than both of the other groups, and that children with siblings who are not twins are more verbal than twins, although all groups understood language equally well. The studies seem to suggest that social and linguistic development go hand in hand: children who associate more with people who are least like themselves (adults and non-twin siblings) experience greater social distance and therefore need more elaborately developed ("adult-like") language to communicate.
Cazden's experimental studies provide further evidence of the importance of social interaction of a particular type. Children in Cazden's study were divided into three groups. The first group received forty minutes a day of extensive and deliberate expansions of their telegraphic two and three word sentences. For this group, the adults repeated the child's "simplified" sentence in full sentence form. For example, when the child would utter "All gone milk," the adult might expand the sentence as "Yes, your milk is all gone" or "Adam's milk is all gone." The second group of children received an equal amount of time focusing on the children's language, but instead of expanding the children's sentences into full "adult" sentence patterns, the adults were asked to expand on the children's ideas by continuing the conversation with a related sentence, but not a repetition of the children's sentences. So for the second group, "All gone milk" might be expanded as "Do you want some more?" or "Let's put the milk away then." The third group of children received no special treatment at all.
Cazden discovered that, contrary to most people's expectations, the second group performed better on all measures of language development than the first group. The second group had received responses that expanded their ideas, introducing new grammatical elements, extending meanings, and elaborating on relationships between ideas; the first group had received fully grammatical expansions of their two or three word sentences. Cazden reasoned that "semantic expansion proved to be slightly more helpful than grammatical expansion" (p. 123) for several reasons. In the second group there was a richness of meaning, focusing on the child's ideas rather than on the form of the child's sentences. McNeill (1970) also suggested that the adult attempts to expand the child's telegraphic sentences may have been inaccurate at least part of the time, thereby misleading or interfering with the child's language development. Cazden's research and the other studies mentioned above suggest that there is a social precondition for language learning: children need a social environment in which they have the opportunity to interact meaningfully with their caretakers, who attend more to the children's ideas and less to the children's grammatical structures.
Psychological Preconditions
Many psychologists view language as an intellectual response, and from the point of view of one very influential psychologist (Jean Piaget), intellectual responses are not inherited. Instead, children inherit a tendency to organize their intellectual processes and to adapt to their environment (Ginsburg and Opper 1969: 17). The theoretical framework hypothesized by Piaget suggests that there are two basic inherited psychological tendencies: organization and adaptation. Organization is "the tendency common to all forms of life to integrate structures, which may be physical or psychological, into higher-order systems or structures" (Ginsburg and Opper 1969: 18). The process of organization from the Piagetian point of view explains several facts about language development in children; for example it is common for children to learn a new word ending and then to use that ending too generally, creating words that follow the overall organizational principle the child is using but do not occur in the language the children are learning. Thus, the child learns that -ed added to the end of a verb means 'past time' and overgeneralizes the pattern to create words like putted, catched, or runned. Similarly, the child learns that the /z/ sound occurs frequently on pronouns of possession like his, hers, or ours and overgeneralizes the use of that sound to create a word like mines.
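The overgeneralization described above can be pictured as a single rule applied without exceptions. The short Python sketch below is only an illustration: the function name and the consonant-doubling spelling convention are assumptions introduced here, not part of Piaget's account or of any study cited in this chapter.

```python
VOWELS = set("aeiou")

def overgeneralized_past(verb: str) -> str:
    """Apply the regular '-ed' pattern to any verb, ignoring irregular forms."""
    v = verb.lower()
    # Assumed spelling convention (for illustration only): a three-letter
    # consonant-vowel-consonant verb doubles its final consonant before -ed.
    if (
        len(v) == 3
        and v[-1] not in VOWELS
        and v[-1] not in "wxy"
        and v[-2] in VOWELS
        and v[-3] not in VOWELS
    ):
        v += v[-1]
    return v + "ed"

# The child's rule has no list of exceptions, so irregular verbs are
# treated exactly like regular ones.
for verb in ("put", "catch", "run", "walk"):
    print(verb, "->", overgeneralized_past(verb))
```

The rule yields the correct form for a regular verb like walk and the child's forms putted, catched, and runned for irregular verbs, which is the sense in which the child's errors follow an organizing principle rather than occurring at random.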
Adaptation to the environment takes place through the two complementary processes of assimilation and accommodation. "Broadly speaking, assimilation describes the capacity of the organism to handle new situations and new problems with its present stock of mechanisms; accommodation describes the process of changing through which the organism becomes able to manage situations that are at first too difficult for it" (Baldwin 1967: 176). Through accommodation the child is able to assimilate increasingly complex and novel situations until an increasingly mature system evolves. Adaptation through those two processes as described by Piaget might explain how new linguistic structures are incorporated into the child's existing organizational pattern for language and how the child revises linguistic systems to fit the new linguistic evidence the child gets by using their language in social interaction. Thus, many psychologists reason that two psychological tendencies in all humans are necessary preconditions for the development of language in children: organization and adaptation.
Even though nowadays it is common to read about the intense debate between those linguists who believe that language is inherited and those psychologists who believe that language is learned, notice here that the different preconditions can be complementary: the biological factors that seem to suggest that humans inherit some specific language learning capacity work nicely alongside the psychological tendencies children bring to learning language in a social environment.
Although this is a chapter about language development, it should be noted that communication (a central force that drives the development of language) begins before children utter their first words. Children employ the face, body movement, cries and other preverbal vocalizations to communicate their needs, desires, and moods to those around them.
The Face
Even very young children find the face fascinating to watch. From at least the age of two weeks they will fixate on a face in preference to other visual stimuli, especially if the face is moving as in speech (Carpenter 1974; Sherrod 1979). In fact, Goren, Sarty, and Wu (1975) claimed that newborn infants only nine minutes old attend more to a moving schematic (cartoon) face than to moving rearranged faces containing the same features in a different orientation; however, Barrera and Maurer (1981a, b) were unable to observe any evidence of selective attention given to the configuration of the face in children less than two months old.
Meltzoff and Moore (1977; 1983a and b) have made even more dramatic claims: they report that infants as young as two or three days old will attempt to imitate the facial gestures made by an adult (for example, lip rounding, tongue protrusion, or mouth opening). Hayes and Watson (1981), however, were unable to replicate those findings. If Meltzoff and Moore are correct, their studies suggest that even very young children have enough facial neuromuscular control to make expressive gestures of attention, as parents have often reported.
Body Movement and Gesture
Also according to Meltzoff and Moore (1977; 1983a, b), newborn children not only imitate facial expressions but will also attempt to imitate rudimentary manual gestures. As the child matures and gains motor control, the twelve-month-old infant will raise its arms as a gestural request to be picked up, will hold out an open palm as a request to be given an object, will wave goodbye, and so on (Escalona 1973; Greenfield and Smith 1976; Clark and Clark 1977; Lock 1978 and 1980).
Bates (1976) makes interesting connections between the development of language and the child's gestural communication. Bates regards gestures like those described above as primitive communicative acts ('speech acts') performing the same social functions of request and assertion that utterances later satisfy. A child pointing with a finger is commonly making an assertion ("There's my ..."), and the open hand is usually a request ("Can I have the ...?"). For Greenfield and Smith (1976), Bates (1976), and Lock (1980), the child learns 'speech acts' at the gestural stage. The child learns that intentions (assertions, requests, commands, etc.) can be conveyed as signals, and the child learns to combine signals and use nonverbal vocalizations to communicate a message (a child's whimper + a pointing finger at a cookie = "I want that cookie"). Lock hypothesized that the combinations of preverbal vocalizations and gestures prefigure the coming messages in simple one and two word "sentences." As the child's vocabulary develops, the vocalizations accompanying the gestures are increasingly verbal until the child's language is sufficiently sophisticated to perform speech acts through speech alone. One should remember, though, that gestures remain an important part of human communication at all stages of development (McNeill 1985; Perry et al. 1988); as proof, watch the hands of any adult who is asked to give road directions.
A child's first words are often complemented by manual gesture (Leopold 1949; Goldin-Meadow and Morford 1985). Halliday (1975) studied the early language development of his son Nigel, attending closely to the social functions of language and communication. Many of Nigel's first communicative acts were combinations of a gesture and a word, but unlike "bye-bye" while waving, the word and the gesture expressed different meaningful elements of a single message. For example, Nigel would utter his version of the word "star" while shaking his head, meaning "I can't see the star." Similarly, he said "dabi" (his version of the name Dvorak) while beating time with his hand, meaning "I want the Dvorak record on." Only later in Nigel's linguistic development did words combine to express different elements of a single message (such as "more meat," "two book," or "play rao Bartok," meaning "Let's play at being lions with the Bartok record on").
Parents also recognize the communicative value of other body gestures. Parents often identify the flailing of arms as a sign of hunger or discomfort, and an infant's rigid as opposed to limp body is a clear signal of distress to any adult.
Crying, Cooing, and Babbling
The child's cries and other preverbal vocalizations, of course, are among the most obvious early communicative acts: parents report that they are usually able to distinguish between cries of hunger, pain, fatigue, etc. One study (Muller, Hollien, and Murry 1974), however, demonstrated the importance of social context for parents to identify their child's cries accurately. Muller, Hollien, and Murry played tapes of cries for parents with the context removed (such as knowing how long it had been since the last feeding), and the parents were not able to identify the causes of the cries. Such results should not be unexpected, however: context is crucial for humans to interpret almost any message accurately. For example, "Fire!" can express quite different messages in a crowded theater or on a battlefield.
In about the child's eighth to twelfth week of life, a vowel-like gurgle or squeal (cooing) occurs regularly when the child is talked to or nodded at. The baby can sustain cooing for fifteen to twenty seconds (Lenneberg 1967). Cooing sounds are generally similar acoustically to vowels produced when the tongue is low and back in the oral cavity and are often produced with 'rounding' (that is, rounding of the lips). Indeed, the term cooing might well arise from the frequent presence of the /u/ vowel sound. The relationship between crying and cooing is not well known, although Kaplan and Kaplan (1971) have observed a transition from full crying through a kind of "fake crying," which temporally precedes cooing in the second month.
At the age of six months or so, children in all cultures begin to babble, that is, to produce long sequences of consonants and vowels. Though babbling is far from true language, it does resemble language in a number of important aspects:
- It is articulated independently of physical needs or desires like food; it is articulated for its social value (for the baby to draw attention to him- or herself) and for sheer pleasure.
- It seems to have the rudiments of syllable structures found in the adult language; in a babbling sequence like gon-gog for example, one can see some syllable structure.
- In longer babbling sequences, one hears intonational patterns of the adult language.
The resemblance to adult speech stops here since there is no evidence that the child has discovered the meaning potential in his or her speech sounds.
The relationship between babbling and later speech is also not well known. Many psychologists and linguists agree, however, that babbling serves at least two functions (Kaplan and Kaplan 1971).
First, babbling serves as practice for later speech. This is, of course, the most obvious and intuitive explanation since the fine neuromuscular control needed later for speech is extensively practiced in the babbling stage. Indeed, the babbling child produces a lot of sound and a greater variety of sounds than is actually needed in the adult language. For example, American children, like children all over the world, are heard producing clicks and trilled r's when they babble, even though English does not usually use those sounds to make meaning.
Second, babbling (like cooing) provides a social reward. When children babble, their parents attend to them closely and encourage them to continue talking. Cooing and especially babbling are the first experiences a child has with the social rewards of speech. One can see the importance of the social function of babbling in children who have been severely neglected during this stage. Though they begin to babble at the same age as other children, those severely neglected children will stop, if not encouraged by some caretaker, and their language development is usually irreparably damaged (see Fromkin et al. 1974 for an excellent discussion of one sadly neglected child).
One of the many unexplained mysteries of child language development is why babbling occurs at more or less the same time in all children, since simple observational evidence shows that children babble to practice their later speech at very different rates and the encouragement children receive for babbling comes unequally. Some neurologists and psychologists (as mentioned earlier) hypothesize a link between language development and biological maturation. They propose therefore that babbling occurs automatically when the relevant structures in the brain reach a critical level of maturation. If all humans grow at roughly the same rate, then children around the world will begin to babble at roughly the same age. In fact, Lenneberg (1967) discovered that babies who were prevented from any vocalizations by disease or medical procedures hindering their vocalizing would begin to babble spontaneously when they reached six months of age and their medical condition improved enough to allow vocalization. Lenneberg concluded that previous practice at vocalization was not necessary for the onset of babbling and that biological maturity was a crucial factor in babbling.
When babbling begins, the nonsense syllables children create develop through a regular progression. Children first produce vowels and then later combine consonants and vowels. (In the list below, V stands for any vowel sound and C stands for any consonant sound.) The progression looks like this:
- V (For example, eee, ooo, uuu)
- CV (Children do not produce the VC pattern yet since syllables that end in a consonant are more difficult to articulate. Indeed, some human languages do not use the VC pattern at all as a possible syllable structure, but all languages have the CV pattern. Examples of this pattern include ta, di, da, ma)
- VCV, VC (Children usually develop the VCV pattern before the VC pattern since consonant sounds are easier to articulate between vowels. Consonants are relatively more difficult to articulate at the ends of syllables. Examples of these patterns are idi, aba, um, ab)
- CVCV (This repetitious pattern is called 'reduplication' when the same consonant and vowel sounds are repeated in consecutive syllables, for example baba, gigi, tutu.)
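The C and V notation in the progression above can be made concrete with a small Python sketch that reduces a babbled sequence to its consonant-vowel skeleton. This is only an illustration under stated assumptions: the function names are hypothetical, and the sketch treats written letters as stand-ins for sounds (with a run of vowel letters such as eee counted as one sustained vowel), which a real phonetic transcription would not do.

```python
VOWEL_LETTERS = set("aeiou")  # assumption: letters approximate speech sounds

def cv_skeleton(babble: str) -> str:
    """Reduce a written babble such as 'idi' to a pattern such as 'VCV'."""
    skeleton = []
    for ch in babble.lower():
        if not ch.isalpha():
            continue  # ignore hyphens, spaces, and other marks
        symbol = "V" if ch in VOWEL_LETTERS else "C"
        # A run of vowel letters ("eee") is treated as one sustained vowel.
        if symbol == "V" and skeleton and skeleton[-1] == "V":
            continue
        skeleton.append(symbol)
    return "".join(skeleton)

def is_reduplicated(babble: str) -> bool:
    """True when the babble is one CV syllable said twice, as in 'baba'."""
    half, remainder = divmod(len(babble), 2)
    if remainder != 0 or half == 0:
        return False
    first, second = babble[:half], babble[half:]
    return first == second and cv_skeleton(first) == "CV"

for sample in ("eee", "ta", "idi", "um", "baba"):
    print(sample, cv_skeleton(sample), is_reduplicated(sample))
```

Applied to the examples in the list, the sketch assigns eee to V, ta to CV, idi to VCV, um to VC, and baba to CVCV, and it flags only baba as a reduplication.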
As mentioned above, babbling children produce a wide variety of sounds, not all of which are used in the adult language. A widely held belief is that babbling children produce all the speech sounds, including all the sounds used in the language they hear and all other human languages. However, babbling infants do not articulate all the speech sounds they hear in the adult language. For example, children who hear English around them seldom (if ever) utter the th sounds, the sounds in words like this and three. Likewise, the range of non-English sounds is not as great as is often claimed, although those babbling children certainly will produce a variety of non-English sounds. Babbling infants in an English speaking environment, therefore, must gradually develop a full range of English sounds and eliminate specific non-English ones (Irwin and Chen 1946; Oller et al. 1976). It does appear, however, that the range of speech sounds produced by infants at the beginning of the babbling stage is the same for babies in all linguistic environments.
There is more to learn about the sounds of language than just the articulation of consonants and vowels. Children must also master the differences in pitch, intonation, and stress that contribute to the meaning of what children say. In the cooing stage, children will often develop characteristic patterns of falling and rising pitch on their vowel-like utterances as in uuu or aaa. By late in the babbling stage, however, children learn intonation patterns that resemble the rising-falling pattern of declaratives and the level-rising pattern of interrogatives in English, for example:
It was Liz who ate the last apple. (The rise-fall intonation pattern of English declaratives)
When did Dave say that he was coming? (The level-rise intonation pattern of English interrogatives)
Babbling children seem to have mastered the intonation contours of the adult language and apply them to their nonsense syllables. Consider these examples from a child at eight months:
tu i ba o ba ma (Similar to the intonation pattern of English declaratives)
ba dow m-mu ga (Similar to the intonation pattern of English interrogatives)
Also by the late babbling stage, children will use stress patterns distinctively. Stress is called distinctive in English because it contributes significantly to the meaning of words. The different meanings of some words in English, for example, are distinguished only by distinctions in stress. The words convict, transfer, and repeat are a few of many words that are used as verbs when stress is on the final syllable (conVICT, transFER, rePEAT) and are used as nouns when stress is on the initial syllable (CONvict, TRANSfer, REpeat). Similarly, babbling children vary stress on their reduplicated nonsense syllables, as in this sequence from that same eight-month-old child: BAba baBA.
These features of prosody (pitch, intonation, and stress) contribute significantly to the early communicative abilities of children. Several instances have been observed where the "first words" of a child have been identified on the basis of intonation patterns, rather than their segmental features (consonants and vowels). For example, one child used the word buhrahp at ten months as an all purpose word for play: depending on context and intonation, it could mean ball or rattle when uttered with rising intonation or it could mean come close and play with me when uttered with falling intonation. Another child used MAma (with stress on the first syllable) to call for his mother, while maMA (with stress on the second syllable) called his father (Engel 1973). On the basis of evidence like this, some linguists hypothesize that prosodic features are among the first elements of speech to convey meaning.
All children pass through this series of fixed 'stages' as they develop language. The age at which each stage begins varies considerably from one child to the next, and the time between stages may be greater for some children than for others, but the relative order of the stages remains fixed for all children. Consequently, it is possible to divide the process of language development into a sequence of approximate phases, remembering always that there is no clear division between stages in real children, that the stages always overlap, and that the chronological age of the child is only a very rough guide to the stage of language development.
The following diagram (adapted from Lenneberg 1967) is a highly oversimplified chronology of the early communicative acts discussed in this section:
Age | Motor Development | Vocalization and Communication
Birth | Eye movement and facial control | Crying; body movements; facial expressions
12 weeks | Supports head when prone; weight rests on elbows; hands mostly open; no grasp reflex | Diminished crying; smiles and makes vowel-like, pitch modulated gurgling sound (cooing) when spoken to
16 weeks | Plays with rattle; head self-supported; tonic neck reflex subsiding | Responds to human sounds more definitely; turns head; eyes search for speaker; some occasional chuckling sound
20 weeks | Sits with props | Vowel-like cooing is interspersed with consonant sounds
6 months | Sitting; bends forward and uses hands for support; bears weight when held in standing position, but cannot stand even while holding on; reaches with one hand; no thumb use yet to help grasp; releases toy when given another | Cooing changing into babbling resembling one syllable utterances; neither vowel nor consonant sounds recur in any fixed pattern; most common utterances sound like ma, mu, da, and di
8 months | Stands holding on; grasps using thumb; picks up small objects with thumb and finger tips | Reduplication (continuous repetition of a syllable) becomes frequent; intonation patterns become distinct; pitch changes in utterances can signal emotion
10 months | Creeps efficiently; takes side-steps holding on; pulls to a standing position | Vocalization often mixed with sound-play (gurgling, bubble blowing); appears to imitate sounds unsuccessfully; differentiates between words heard
12 months | Walks when held by one hand; walks on hands and feet with knees in air; mouthing of objects almost stopped; seats self on floor | More frequent identical repetition of a sequence of sounds; first words (mamma, dadda); understands simple commands (Point to your eyes)
Proto-language
Most of what this chapter has described so far might be called prelinguistic, in the sense that most of the communicative acts outlined so far do not evidence any linguistic system through which the child communicates by associating sounds with meaning. Yet, in the discussions of gesture and babbling above, one can see the earliest beginnings of "first words." However, the exact process by which a child moves from gesture and babbling into language is not well known at all. It is known that the appearance of "first words" occurs near the child's first birthday, that babbling activity decreases tremendously at the appearance of first words, that the syllable structure of the first words resembles the syllable structure of babbling, and that children use their first words both as names of individual people and items in their environment (mamma, dada, goggy meaning 'doggy') and as whole questions, statements, or commands (bah'o, one child's pronunciation of 'bottle', was used by her in all three ways: 'Where is my bottle?', 'I see my bottle', and 'Get me a bottle').
Halliday (1975) offers an interesting hypothesis about the process through which children develop from a prelinguistic to a truly linguistic system in three phases:
Phase I (Proto-language)
Phase II (Transition)
Phase III (Developing the Adult Language).
That first phase encompasses the earliest signs that the child has both a system for expressing meaning through sound and a communicative function to be served through those sounds. Consider again the examples of buhrahp (= 'ball' or 'rattle' with rising tone) and MAma (= 'mother'). Both buhrahp and MAma are expressions in those children's proto-languages since they are part of a system (buhrahp with falling tone means something else, remember), and the expressions serve a communicative function for the child (a request for a toy, for example). Halliday describes this proto-language as the child's own creation, a language that is only marginally related to the adult language the child hears around him or her. As the child grows, however, the child's own created language cannot serve his/her growing communicative needs, and at that point (Phase II and then III) the child abandons his/her creation and moves into the adult language.
That move, though, is far from easy. In discussing the tremendous decrease in babbling and the decrease in the child's repertoire of sounds in early speech, Jespersen (1925: 106) characterizes the difference between the easy articulation of babbling and the difficult articulation of speech as the difference between "play" and "plan":

It is strange that among an infant's sounds one can often detect sounds, for instance k, g, h, and uvular r, which the child will find difficult in producing afterwards when they occur in real words ... The explanation lies probably in the difference between doing a thing in play or without a plan, when it is immaterial which movement (sound) is made, and doing the same thing in fixed intention when this sound, and this sound only, is required ....

(More about Halliday's hypothesis can be found later in the sub-section Social and Communicative Explanations of language development.)
This section moves beyond children's early communicative acts and proto-language to consider how children make meaningful sound with "fixed intention": the development of true language. However, there are several points that one must remember in the sub-sections to follow. First, one must remember that children develop their language at all levels simultaneously. The discussion below might leave the false impression that children develop language piecemeal, concentrating on sounds, then vocabulary, then grammar, etc. That is not true.
Second, one must remember that children learn real language, language that is not broken down into neatly discrete, easy to identify segments, like consonants and vowels, words and their prefixes and suffixes, phrases, clauses, and sentences. In real language, speech is not individual sounds, but rather a continuous stream of sound, not only in words but also in phrases, clauses, and sentences. (As proof, one should listen to an ordinary conversation in a language that is completely foreign, perhaps Thai, Vietnamese, or Spanish, and try to identify the individual words. One will find this exercise maddeningly frustrating.) The discussion below might foster the common, but erroneous, impression that language is easily and automatically divided into its component parts. That is not true.
Thirdly, one must remember that the categories and concepts linguists use to describe real language (especially children's language) are very indeterminate: for instance, there are no clear boundaries marking where the vowel /u/ ends and the vowel /o/ begins. There are not even clear boundaries in grammar: tired in a sentence like I am tired is both verb-like in that it has a verb ending -ed and is in a verb position in the sentence and also adjective-like in that it can take a modifier like very (I am very tired) as do other adjectives (as in very big) but not verbs (as in very run). A word like tired exists in a grey, fuzzy area between words that function clearly as adjectives and words that function clearly as verbs. The discussion below might foster another common, but erroneous, impression that linguistic concepts are always well defined and unchangeable. That is not true, especially in children's language.
Finally, one must remember that people (children included) understand much more language than they can produce. That is, one's passive comprehension exceeds one's active production. Consider vocabulary, for example. Many people understand a word like bulbous when they hear or read it, though they may never have used (or never will use) the word themselves. The discussion below might leave the impression that children's comprehension of language is as limited as their production. That is not true.
Sound
Ferguson and Farwell (1975) noted that, as children make a deliberate effort to master the sounds and sound patterns of the adult language, they work simultaneously at mastering the sound system of the language as a whole and the sound patterns of individual words. At the first word stage, Velten (1943) found that the entire sound inventory of his daughter Joan consisted of /p/, /t/, /f/, /s/, /m/, /n/, /w/, /u/, and /a/. However, there was great variation in the actual articulation of Joan's speech sounds. For example, /a/ had three distinct articulations: the vowels of pat, paint, and papa. Also, [b] did occur in Joan's speech, but the sound was never used to distinguish one word as meaningfully different from another (that is, as linguists would say, [b] and [p] were not in contrastive distribution). Thus, for Joan, the utterances [bat] and [pat] would "sound the same" and refer to the same toy. Velten noticed, actually, that the sounds were in complementary distribution: [p] occurred at the end of words and [b] occurred at the beginning or middle of words. The sounds were never used to distinguish one word from another; they worked like distinctly different articulations of /p/.
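The idea of complementary distribution can be made concrete with a small sketch. The Python below is only an illustration: the function names and the toy word list are invented for this example (they are not Velten's data), and each letter is assumed to stand for exactly one speech sound. Two sounds count as being in complementary distribution here if the word positions in which they occur never overlap.

```python
def positions(phone, words):
    """Return the word positions ('initial', 'medial', 'final') where phone occurs.

    Assumption: each character in a word is one speech sound.
    A one-sound word is counted as 'final'.
    """
    found = set()
    for word in words:
        for i, segment in enumerate(word):
            if segment != phone:
                continue
            if i == len(word) - 1:
                found.add("final")
            elif i == 0:
                found.add("initial")
            else:
                found.add("medial")
    return found


def in_complementary_distribution(a, b, words):
    """True if both sounds occur and never share a word position."""
    pos_a, pos_b = positions(a, words), positions(b, words)
    return bool(pos_a) and bool(pos_b) and pos_a.isdisjoint(pos_b)


# Hypothetical child forms, invented to mirror the pattern described above:
# [b] only at the beginning or middle of words, [p] only at the end.
toy_words = ["bat", "bap", "abap", "map"]

print(positions("b", toy_words))
print(positions("p", toy_words))
print(in_complementary_distribution("b", "p", toy_words))
```

In this toy list [b] and [p] never share a position, so the check succeeds, whereas [p] and [t] both occur word-finally and so would not pass it.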
An explicit hypothesis about the order in which children learn the speech sounds of their first words was developed by Roman Jakobson in 1941 (first published in English as Jakobson 1968). Jakobson proposed several principles governing language development, but the most interesting were the proposals based on universal characteristics of human languages around the world. By ranking which sounds and which characteristics of sounds were most common in the world's languages, Jakobson proposed that children learn their speech sounds by a series of contrasts:
- the first contrast they discover is between consonants and vowels;
- then they contrast consonants created by air flow through the nasal cavity (like /m/) with consonants created by air flow through the oral cavity (/p/);
- they next contrast consonants produced with the lips, labial consonants (/p/), with those produced at the alveolar ridge, the ridge in the top of one's mouth immediately behind the upper teeth (like /t/);
- they then learn to distinguish consonants created by completely stopping the flow of air through the oral cavity (stops like /t/) from consonants which restrict the flow of air but do not stop it (fricatives like /s/); and so on until the child learns all the relevant contrasts in the adult language.
Diagrammatically and more fully, Jakobson's proposals look like this:
[Figure: diagram of Jakobson's proposed sequence of contrasts]
And so on through all the distinctive features in the sounds of the adult language. Jakobson postulated that universal contrasts are learned first. After those, contrasts are learned by the laws of "irreversible solidarity"; that is, when one contrast implies the existence of another, the implied contrast is learned first. For example, some languages of the world have consonants produced in the front of the mouth (like /p/, /t/, /m/) without having consonants produced at the back of the mouth (like /k/, /g/, /h/); however, no language has back consonants without having front consonants. Thus, the presence of back consonants implies the existence of front consonants. Therefore, Jakobson predicted that front consonants should precede back consonants in development. Indeed, the /p/ and /m/ sounds do appear in children's speech before /k/ or /h/. Furthermore, sounds that are relatively rare in the languages of the world should be learned last in the child's development (like the /æ/ vowel in bash, the th sounds in that, or the liquid sounds /r/ and /l/). It is not surprising, then, that one two year old uttered das pwesiden waygan when trying to say "That's President Reagan."
Jakobson's hypothesis also suggests that children will not learn any contrast without first learning all the contrasts that are presupposed by it. The ability to articulate the contrast between a stop /t/ and a fricative /s/ implies the ability to articulate the contrasts between a labial /p/ versus an alveolar /t/, a nasal /m/ versus an oral /p/, and a consonant /p/ versus a vowel /a/. One contrast implies mastering all the others before it.
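The implicational logic of the hypothesis can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the list holds just the four contrasts named in the text, in the order Jakobson proposed them, and the names JAKOBSON_ORDER, presupposed, and follows_prediction are invented for this sketch.

```python
# The four contrasts named in the text, in Jakobson's proposed order.
JAKOBSON_ORDER = [
    "consonant-vowel",
    "nasal-oral",
    "labial-alveolar",
    "stop-fricative",
]


def presupposed(contrast):
    """Return every contrast that the given contrast presupposes."""
    return JAKOBSON_ORDER[:JAKOBSON_ORDER.index(contrast)]


def follows_prediction(acquired):
    """True if each contrast is acquired only after all it presupposes.

    `acquired` lists a child's contrasts in the order they were learned.
    """
    seen = set()
    for contrast in acquired:
        if any(earlier not in seen for earlier in presupposed(contrast)):
            return False
        seen.add(contrast)
    return True


print(presupposed("stop-fricative"))
print(follows_prediction(JAKOBSON_ORDER))
```

A child who learned the same four contrasts in any other order would fail this check, which is the kind of detailed prediction that makes the hypothesis testable.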
Jakobson's hypothesis has proven difficult to test. In its large design, the hypothesis seems borne out by observational evidence. However, in its details, Jakobson's hypothesis has not been supported completely. Velten's daughter Joan, for example, did learn the first four contrasts Jakobson predicted, but in a different order (consonants-vowels, labial-alveolar, fricative-stop, and oral-nasal). Remarkably, of the dozen or so contrasts said to be distinctive in English, it is precisely these four that developed first. Like Velten, Braine (1971) also found a different ordering of the contrasts in the development of his son's speech. Jakobson's hypothesis implies that glides (sounds like /w/ and /y/) are learned late, but Leopold's (1947) daughter learned them quite early. Ervin-Tripp (1966: 68-9) summarized the evidence:
- the vowel-consonant contrasts are probably the earliest,
- a stop-continuant contrast (/p/ as opposed to /f/ or /m/) is very early, with the continuant being either a fricative (/f/) or a nasal (/m/),
- at the beginnings of words, stops precede fricatives (a pattern that is in accordance with most languages, where stops occur word initially more often than fricatives do),
- if there are two consonants distinguished only by the place in the mouth where they are articulated, one will be articulated at the lips (labial), the other at the alveolus (alveolar),
- children will learn to contrast consonants by the place in the mouth where they are articulated (labial vs. alveolar) before they learn to contrast consonants by 'voicing' (the vibration of the vocal cords that produces the distinctive difference in sound between /p/ and /b/),
- children will learn stops (/p/, /t/, /k/ for example) and fricatives (/f/, /s/, and the sh sound for example) before they will learn 'liquids' (like /l/ and /r/) or 'affricates' (consonants that combine the articulation of stops and fricatives like the ch and /j/ sounds in church and judge),
- in Russian and French, children learn /l/ before /r/,
- children learn to contrast vowels made high and low in the oral cavity (like /i/ and /a/) before they learn to contrast vowels that are articulated in the front of the mouth versus the back of the mouth (like /i/ and /o/),
- children will learn oral vowels before nasalized vowels,
- children learn consonant clusters or blends late, and
- children learn to distinguish different consonants first at the beginning of words, then in the middle of words, and then at the end of words.

One should also note that diphthongs (vowels articulated by beginning with one vowel sound and moving the tongue to another position to end with a second vowel sound, like the oy in boy) are rarely used by young children.
As one can see, most of Ervin-Tripp's observations support Jakobson's hypothesis. However, Peng (1984) offers one of the most critical assessments of Jakobson's hypothesis. Peng shows that many of the factual errors noted above can be traced to a large number of unjustified assumptions Jakobson makes about infants' speech sounds and to problems in his method of study. Peng points out, after all, that Jakobson's research in 1941 concentrated on universal traits of language, not on child language development.
Also, some of the difficulty with testing the hypothesis arises from the instability (indeterminacy) of the child's language itself. First, it is a curious fact of language development that children appear to "unlearn" something that they had mastered earlier. For example, one child could pronounce the word pretty perfectly between ten and twelve months of age only to mispronounce it as bidi just a few months later (Leopold 1949). Secondly, children sometimes appear to avoid producing certain sounds, preferring words with sounds they have mastered (Leopold 1947; Ferguson and Farwell 1975; Ferguson 1978). Finally, children seem to prefer words with a CV or CVCV shape, imposing a specific pattern on their early words regardless of the adult models. Some children almost exclusively attempt words with the CVCV pattern, as in bidi ("pretty"), gogi ("doggy"), mama, dada, and baby. Those three examples of indeterminacy show linguists that children's language production can be very selective, creating difficulties when testing any hypothesis.
Jakobson's hypothesis is an impressive attempt to explain the emergence of speech sounds, but there is still much about language development that the hypothesis does not even attempt to explain. In addition to studying the child's early speech sounds from the point of view of linguistic universals, linguists have also tried to describe and explain the regular, predictable patterns that make children's speech systematically different from an adult's. For instance, consider the following conversations between two children (at about eighteen months) and their fathers:
- Father: what do you want for a treat?
- Liz: tate ('cake')
- Father: Nana (= familiar name for grandmother) doesn't have any cake ... do you want a cookie? or a brownie?
- Liz: (points) bownie
- Father: what do you say?
- Liz: pease ('please')
- Father: you should have some milk with that (begins to pour into cup)
- Liz: top! ('stop') ... (points) tawpi ('coffee')
- Father: you can't drink coffee!
Consider the next conversation (adapted from Halliday 1975: 89) also:
- Father: what did you see yesterday?
- Nigel: ka ('car')
- Father: yes, and you went for a ride in a car, didn't you? and what did you see up there (pointing)?
- Nigel: ta ('tower')
- Father: yes, you saw a tower ... and what did you pick in the garden?
- Nigel: daydi ('daisy')
In the short conversations above, one sees examples of the following systematic differences between the child's and the adult's speech (compiled from Velten 1943; Jakobson 1968; Edwards 1971; Olmsted 1971; Smith 1973; Ingram 1971, 1974, 1976; and Ferguson 1978):
- Voicelessness. Children often do not 'voice' (add vibration of the vocal cords) consonants at the ends of words, for example,
bad → bat,
bag → back.
- Stopping. Children often employ a stop rather than a fricative at the beginning of a word, for example,
fat → tat,
ship → pip.
- Gliding. Children often use a 'glide' (/w/ or /y/) rather than a liquid (/r/ or /l/) systematically: /w/ occurs instead of /r/ and /y/ instead of /l/, for example,
rabbit → wabbit,
president → pwesiden,
lip → yip,
lollipop → yoyipop.
- Front articulation. Children systematically choose consonants that are articulated farther forward in their mouths, for example,
key → tee,
go → doh,
thick → fick,
ship → sip,
rabbit → wabbit.
Notice that the systematicity is greater than first appears: the consonants the children use match the adult consonants in terms of voicing and manner of articulation (stops or fricatives); the systematic difference is the child's use of consonants articulated more toward the front of the mouth. Diagrammatically, this looks like:
Point of articulation, from FRONT OF MOUTH to BACK OF MOUTH: lips, teeth, alveolus, palate, velum
/f/ < th sound
/t/ < /k/
/d/ < /g/
/s/ < sh sound
/w/ < /r/
- "One-C-at-a-time." Children often use single consonants rather than consonant clusters, for example,
spoon → poon,
play → pay,
stop → top.
It is almost as if they work at "one-consonant-at-a-time."
- Epenthesis. Children occasionally will use a vowel to separate the two consonants of a cluster, for example,
blue → bahlue,
please → pahlease.
Epenthesis is the use of a vowel where vowels usually do not occur. Notice that this is another systematic way for children to use single consonants rather than consonant clusters. Children will also employ epenthesis at the ends of words, for example,
pig → piga.
In effect, epenthesis here moves the consonant out of word-final position. See Word-final consonants next.
- Word-final consonants. Children occasionally do not articulate consonants at the ends of words, for example,
bad, bat, bath, bag, and back → baa.
- Assimilation. Children will use consonants which are phonetically similar. Assimilation is any process which makes sounds more alike, for example,
lamb → nam.
The /n/ is more like the /m/ of lamb than the /l/ is since /n/ and /m/ are both nasal consonants, forcing air to flow through the nasal, rather than the oral, cavity.
- Reduplication. As mentioned earlier, children will use repeated consonant and vowel sequences in consecutive syllables, a phenomenon known as reduplication, for example,
daddy → dada,
bottle → baba,
water → wawa.
- "Stressed-syllables-only". Many children articulate only the stressed syllables of polysyllabic words, for example,
banana → nana,
potato → tato.
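Two of the processes in the list above, gliding and "One-C-at-a-time," are regular enough to be sketched as simple rewrite rules. The Python below is only a rough illustration: it works on ordinary spelling rather than on phonetic transcription, the function names are invented for this sketch, and it makes no claim about how children actually produce these forms.

```python
VOWELS = set("aeiou")


def glide(word):
    """Gliding: replace the liquids r and l with the glides w and y."""
    return word.translate(str.maketrans({"r": "w", "l": "y"}))


def reduce_initial_cluster(word):
    """'One-C-at-a-time': keep one consonant of a word-initial cluster.

    Assumption: an initial s is dropped before another consonant;
    otherwise the second consonant of the cluster is dropped.
    """
    if len(word) < 2 or word[0] in VOWELS or word[1] in VOWELS:
        return word                  # no initial two-consonant cluster
    if word[0] == "s":
        return word[1:]              # spoon -> poon, stop -> top
    return word[0] + word[2:]        # play -> pay


for adult_form in ["rabbit", "lip"]:
    print(adult_form, "->", glide(adult_form))

for adult_form in ["spoon", "play", "stop"]:
    print(adult_form, "->", reduce_initial_cluster(adult_form))
```

Applied to words from the conversations above, reduce_initial_cluster also turns brownie into bownie and please into pease. Because the rules operate on letters, spellings such as the double l of lollipop or the digraphs th and sh are not handled correctly; a serious treatment would start from phonetic transcription.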
The explanations for those systematic differences in pronunciation fall into two large groups: perceptual and articulatory. Perceptual explanations posit that children cannot perceive all of the differences that are significant to the adult; therefore, they articulate words as they perceive them. This seems false since children regularly reject imitations of their own speech, as in this exchange:
- Father: ask Julie what we had for dinner
- D: what did you have for dinner, Julie?
- Julie: fiss ('fish')
- D: you had fiss (heavy stress, rising tone) for dinner!
- Julie: no, not fiss, fiss!
Julie's last remark demonstrates that she perceives the difference between her articulation and the adult's, but even when she scolds D and attempts to correct his mocking imitation of her pronunciation, she does not articulate the sh sound. Children regularly respond negatively to others' attempts to imitate (or correct) their speech:

... a child asked if he could come along on a trip to the "mewwy-go-wound." An older child teasing him, said "David wants to go to the mewwy-go-wound." "No," said David firmly, "you don't say it wight" (Maccoby and Bee 1965: 367).

Clinical and experimental studies reveal considerable variation in the development of children's abilities to perceive speech sounds (Garnica 1973; Edwards 1974). Additionally, there is considerable variation in the sequence of perceptual development. The evidence does show, however, that perception precedes production, often by a considerable interval. During that interval, there is no production of speech sounds that children can accurately discriminate (Morse 1979; Miller and Eimas 1983).
The alternative explanation is an articulatory one. Possibly children simplify their linguistic systems by actively working with fewer contrasts than they can perceive to ease the burden on their short term memories and simplify the motor coordination necessary to articulate certain sounds (Oller 1974). Or perhaps children have not yet matured enough to master all of the fine motor control of their articulatory organs (Glucksberg and Danks 1975). As children mature, their articulation approximates ever more closely the adult system as they develop a fairly sizeable productive vocabulary with which they can relate words to each other and to the adult system (Ingram 1976).
There is a common, but quite possibly incorrect, suggestion by many linguists that children have the adult forms of words "in the back of their heads," as it were, when they attempt to speak their first words. Under this hypothesis, the child has stored a set of words in his/her mind, and the systematic pronunciation differences outlined above serve as the "rules" of pronunciation to convert the "underlying" mental words into speech (see Smith 1973). Such a hypothesis is difficult to test, and the data are not conclusive. Other linguists caution against assuming that the correct adult pronunciation always underlies a child's mispronunciation (Waterson 1981).
Most of the research summarized so far concentrates on the first two or three years of life. One should remember, though, that the development of a child's sound system is far from complete by that age. Pronunciation of some speech sounds, such as the th, sh, and r sounds, continues to trouble many children until the early school years. Also one should remember that individual children vary in their development considerably. It is said that Albert Einstein did not speak his first word until the age of three.
Vocabulary
All of those sounds and sound processes do not occur in a vacuum, of course. Their purpose is to make meaning through words. First words are significant for parents since they represent a crucial milestone in the intellectual and social development of their child. For linguists, first words are significant since they represent the child's move into the tri-level structure of adult language.
Before the first words, children represent the content of their message through their expression directly. The content of the message (the meaning) is equivalent to the expression (the sound), and in early communicative acts, that is all there is: meaning and sound (Halliday 1975; Peters 1983):
uh, uh + pointing, outstretched hand = 'I want that'
EXPRESSION = CONTENT

There is no intermediary level of form. Form (a level of organization consisting of a grammar and a vocabulary) is the layer of language that mediates between expression (sound) and content (meaning). This intermediary level is commonly called 'wording' (that is, a vocabulary in a grammatical structure) existing between meaning and sound. First words add that intermediary level to the picture above, a level necessary for the transition the child must make into the adult system. As their grammar and vocabulary develop, children are able to convey nuances of meaning, to choose forms that focus and emphasize certain elements of meaning in their messages:
gimme | please, uh,uh, wanna, gimme, mine | 'I want that'
the EXPRESSION | chosen from a set of possible FORMs | to emphasize various parts of message in the CONTENT

Learning a vocabulary is a remarkable achievement not only because it marks the child's first steps into the adult language but also because of its vastness and complexity. Children will learn their first word between nine and twelve months, and they will have a vocabulary of ten to twenty words by fifteen months. Children will average fifty to one hundred words by twenty months and will command an active vocabulary of two to three thousand words by six years of age. Children's vocabularies will grow as they do, and as adults they will master an active, working vocabulary of fifteen thousand words. Studies have shown that the passive, recognition vocabulary of high school graduates ranges from thirty to one hundred thousand words. Assuming that a child learns only the average of fifteen thousand words, he/she is learning at a rate of about four (to ten) new words per day.
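The daily rate is simple arithmetic: the number of words learned divided by the number of days available for learning them. The text does not say over how many years the fifteen thousand words are acquired, so the ten-year span in the sketch below is purely an assumption, chosen because it reproduces the figure of about four words per day; the function name is likewise invented for illustration.

```python
def words_per_day(total_words, years):
    """Average number of new words per day over the given span of years."""
    return total_words / (years * 365)


# Assumption (not stated in the text): the 15,000 words are learned
# over roughly ten years.
rate = words_per_day(15_000, 10)
print(round(rate, 1))  # → 4.1
```

A longer learning span lowers the rate and a larger target vocabulary raises it, which may be one reason a range of figures is usually given.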
Furthermore, the complexity of knowledge associated with learning a word is impressive. To learn a word, a child must learn
- its sound pattern,
- its referential meanings (what the word refers to in the real world),
- its use with prefixes and suffixes (un + help + ful = unhelpful),
- its function in the grammar of the language (for example, run will function as a verb in I'm running but not as an adverb in I am very runly), and
- its collocational restrictions (the sets of words that frequently occur with one another; for example, a child asked to say the first word that comes to mind on hearing the word throw will respond ball, not toss or the fight or a fit, and if asked to define hole will say a hole in the ground (Cazden 1972: 72). Throw and ball collocate more frequently with each other, as do hole and ground, and children learn that).

As the child grows, so grows the child's knowledge of words to include
- its register restrictions (how different words are picked for different social occasions; for example, the child may understand that throw up is the expression used at home, but vomit is used at the doctor's office),
- its idiomatic uses (for example, a word like black can be used in blackboard even if the board is green, or a phrase like take advantage of works as a single unit, the meaning of which does not come from take + advantage + of),
- its metaphorical extensions (for example, a word like crown in "Jack fell down and broke his crown" is a metaphorical extension of the meaning of crown to refer to the head),
- its dialectal restrictions (for example, a word like soda in one place means pop in another or a word like bad can mean 'good' in some social dialects),
- its range of synonyms, antonyms, and hyponyms (for example, synonyms relate words like pretty with beautiful, antonyms relate hot with cold, and hyponyms relate words like red, green, blue to the word color), and
- its graphic form (what the word looks like and the related spelling rules if the child becomes literate).
Though learning a vocabulary may seem like a laborious effort of trial and error on the child's part, it really is more systematic than it at first appears. While it is not possible to speak of developmental stages as in the sections on sound above or grammar below, there are two regular and predictable patterns of learning a child experiences as he/she develops new words.
First, children's definitions develop along regular patterns. Many proud parents, eager to show off their one year old who has just learned her first words (mamma and dadda), turn red-faced as they discover that the baby starts calling out to every male in sight "Dadda!" As they develop definitions, children will often begin by identifying a particularly salient feature (say maleness) and attach that feature as the meaning of the word dadda, a process known as overextension (Clark 1973). Eventually, the child will learn that her meaning of dadda is overextended and restrict the meaning to a single male, her father.
Similarly, in a related phenomenon, children often build networks of words related by a sequence of features. For instance, one child, on learning the word doggy, shifted the meaning to squirrels, then to a soft blanket, and even to his uncle who crawled on hands and knees while playing with him, all on different occasions. The meaning seemed to shift from object to object, failing to identify any single class of objects (like animals of the species canine), shifting between different perceptually salient features, like four-leggedness, then softness, then objects or people that move themselves. Such networks of words are known as complexive concepts (Bowerman 1978). This interconnecting networking of word features was noted earlier by several psychologists and linguists (Vygotsky 1962; Brown 1965; Bloom 1973), but they all suggested that those interconnecting networks between unrelated words were short-lived, primitive stages of word meaning, abandoned later as the child learns more about the defining features of each word in the network. Bowerman (1978), however, points out that complexive categories of meaning persist in the child's developing language, and Rosch and Mervis (1975) have suggested that such loosely associated networks of meaning are typical of many adult word meanings.
Furthermore, children will also fail to apply a word that they know to other objects in that word's class, a process called underextension. For example, young children between two and three often use the word animal as an adult would use the word mammal. They will agree that dogs, cats, or horses are animals but deny that birds, insects, fish, or people are.
Children's early definitions fall into those three categories for several reasons. As mentioned above, part of knowing vocabulary is understanding its hierarchical patterns (hyponyms) that relate very general vocabulary, like the words creature, animal, mammal to more specific vocabulary, like dog, beagle, and Snoopy. Hyponyms can be displayed like this:
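[Figure: hierarchy running from the most general term to the most specific: creature, animal, mammal, dog, beagle, Snoopy]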
It seems that children learn the "middle" levels of such hierarchies first (Anglin 1977), reflecting the naming practices of parents (Brown 1958, 1965). Parents are very practical in naming items for young children, giving them just as much specificity as they need. Speaking with adults, parents will use words like Ford, Chevy, Dodge, etc., but speaking to a child, they will simply use car. Likewise, parents are more likely to use flower (rather than plant or chrysanthemum), dog (rather than animal or terrier), and so on. Thus, children's vocabularies are simultaneously overextended and underextended when they learn a new word since they also must learn the relationships that exist between words.
The second regular pattern in the development of children's vocabulary is the order in which they learn their first words, and the discussion of hyponyms above illustrates part of how children order the development of their vocabularies. Yet another clue to the order of development was given by the psychologist Katherine Nelson (1973), who studied the first fifty words of eighteen children. (Most children learn at least fifty words before putting words together.) Nelson discovered considerable consistency in the order in which children learn their first fifty words: using a six part classification scheme, she observed that children learn nouns earlier than action words and meaningful, content words before grammatical function words. Nelson's complete scheme is
| Category | Share of first fifty words | Examples |
| --- | --- | --- |
| general nominals | 51% | ball, dog, snow |
| specific nominals | 14% | mommy, daddy, Snoopy |
| action words | 14% | give, byebye, up |
| modifiers | 9% | red, dirty, outside, mine |
| personal/social | 8% | no, yes, please |
| function words | 4% | what, for, on |

The discrepancy in favor of nominals may be due to the relative difficulty in understanding and processing action words since action words are abstract, relational words expressing dynamic concepts while nominals can function as names representing static, concrete entities in the child's environment. Additionally, very young children seem to favor words that have some meaningful "weight" to them like ball, mommy, up rather than words that serve to express some social or grammatical function like mine, no, on.
Most children reach the fifty word threshold quickly and begin combining two words (at about twenty months, remember), and at that point there is an explosion of grammatical structure that parallels the explosion of vocabulary.
Grammar
Learning the grammatical structures of language is no less a remarkable achievement than learning the vocabulary is. By about eighteen to twenty months, the average child is creating his/her first two word utterances, and by twenty-five months, two word utterances make up the majority of the child's speech. When the child is three years old, on average, he/she is able to create three and four word utterances, and as the child grows, the grammar grows too, ever increasing in the depth of its complexity and the breadth of its variety. Indeed, like vocabulary, the development of grammar need never end since people can continue to learn new grammatical patterns as they learn new styles of speech and writing, new ways to express themselves with flair and emphasis. Many grammatical structures, particularly those involving coordination and subordination, are not fully mastered until adulthood (Kies 1985 and 1990).
Yet, as mentioned above, age is a very unreliable measure of language development. Different children, months apart in age, could both be using two word utterances. Therefore, Brown (1973) devised a measure of grammatical development in children independent of chronological or mental age, the notion of Mean Length of Utterance (MLU).
MLU approximates the average length of the child's utterances, measured in morphemes (the smallest meaningful components of words). For example, unhelpful consists of three morphemes, un + help + ful, a base morpheme help and two lexical morphemes un and ful. Each morpheme contributes a meaningful component to the whole word. Dogs consists of two morphemes, the base morpheme dog and the grammatical morpheme -s, meaning 'more than one'. Brown carefully devised a series of rules for collecting utterances to ensure that only meaningful morphemes are counted; for instance, if a child says doggies for every canine he/she sees, singly or in groups, there is no evidence to say that there are two morphemes in the child's word, and that word is counted as one morpheme.
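The calculation described above can be sketched in a few lines of Python. This is only an illustration, not Brown's full counting procedure: it assumes the utterances have already been segmented by hand into the morphemes the child is credited with, and the function name and the sample utterances are hypothetical.

```python
def mean_length_of_utterance(utterances):
    """Return the mean number of morphemes per utterance.

    Each utterance is a list of words, and each word is a list of the
    morphemes credited to the child (the segmentation is done by hand).
    """
    if not utterances:
        raise ValueError("need at least one utterance")
    total_morphemes = sum(len(word) for utterance in utterances for word in utterance)
    return total_morphemes / len(utterances)


# Hypothetical sample, not data from any study.
sample = [
    [["dog", "s"], ["run"]],   # "dogs run": dog + -s + run = 3 morphemes
    [["mommy"], ["sock"]],     # "mommy sock": 2 morphemes
    [["doggies"]],             # unanalyzed "doggies" counts as 1 morpheme
    [["daddy"], ["go"]],       # "daddy go": 2 morphemes
]

mlu = mean_length_of_utterance(sample)
print(mlu)  # 2.0
```

Note that the word doggies is entered as a single morpheme here, following the rule that an ending is counted only when there is evidence the child uses it meaningfully.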
Brown defined Stage I of grammar development as the period from the appearance of the child's first utterances to an MLU of 2.0. Brown simply defined the successive Stages II, III, IV, and V as 0.5 increments of the child's MLU from 2.0 on.
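As a rough sketch, the mapping from MLU to stage can be written as a simple lookup. The treatment of the boundaries is an assumption made for illustration: each stage is taken to include its lower bound and exclude its upper bound, and any MLU of 4.0 or more is labelled "beyond V" because the text does not describe later stages. The function and constant names are hypothetical.

```python
# Upper MLU bound (exclusive, by assumption) for each of Brown's stages.
STAGE_UPPER_BOUNDS = [
    (2.0, "I"),
    (2.5, "II"),
    (3.0, "III"),
    (3.5, "IV"),
    (4.0, "V"),
]


def brown_stage(mlu):
    """Map an MLU value onto Stages I-V in 0.5 increments from 2.0 on."""
    if mlu < 1.0:
        # Every utterance contains at least one morpheme.
        raise ValueError("MLU cannot be below 1.0")
    for upper_bound, stage in STAGE_UPPER_BOUNDS:
        if mlu < upper_bound:
            return stage
    return "beyond V"


print(brown_stage(1.4))   # I
print(brown_stage(2.75))  # III
```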
Stage I: The Holophrastic Stage
At the beginning of Stage I (about ten to twelve months of age), the child speaks one word at a time, but by the end (about eighteen to twenty-four months), uses predominantly two word utterances. The one word utterances from the child early in this Stage are often called holophrases since the child expresses the meaning of an entire phrase, clause, or sentence in the one word he/she utters. As mentioned above, the early holophrase consists primarily of nouns and verbs (words denoting more concrete physical and motor operations perhaps), while adjectives (words denoting more abstract attributes) are learned later.
From the perspective of the adult, there is enormous ambiguity in the child's holophrases. The child who says milk, for instance, may be requesting ('I want some milk'), announcing ('I see some milk'), or reporting that he/she has just spilled the milk. This ambiguity is lessened by the social and situational context, and adults use the context of the speech situation to interpret the utterance. Thus, the child who toddles into the kitchen after playing outside saying milk probably is requesting ('I want some milk') and the adult will respond accordingly. This ambiguity is characteristic of Stage II as well as Stage I.
When a child can only say one word at a time to express what he/she wishes, which word will be chosen? There seem to be three answers, each valid from a different perspective. From a static perspective, simply listing all the overextended words in the child's Stage I speech, it seems that children speak about items of their environment that are perceptually salient (appealing to their senses). Clark (1973) could group overextended single word utterances by categories like movement, shape, size, sound, taste, and texture. From the dynamic perspective of dialog, a different picture of one word utterances appears. For most children, the crucial criterion for choosing a word in dialog is informativeness (Greenfield and Smith 1976). The child will utter the one word that introduces the greatest degree of new information into the speech situation, information that is not known through either earlier dialog or the context of speaking. Consider this dialog between Nigel (age eighteen to nineteen months) and his parents:
- Father: you went on a train yesterday
- Nigel: too-too, bah-bah (= 'when I got off, the train went away and I waved to it')
- Father: and you said bye-bye to the train
- Nigel: ahn (= 'another')
- Father: and you saw another train?
- Nigel: (long list of things seen then:) wla ('flag')
- Mother: oh, you saw some flags?
- Nigel: (holding out palm) gra ('gravel')
- Mother: and you had some gravel
- Nigel: (touching palm, lips rounded, very quiet) ooh
- Father: and you hurt your hand with the gravel?
- Mother: no, that was with the stick, the one with prickles on
- Nigel: blah ('blood')
- Mother: and there was blood on it, yes
- (adapted from Halliday 1975: 90)
Notice how Nigel drives the conversation forward by choosing words conveying a relatively high degree of new information.
Another factor influencing word choice at this stage is the child's attempt to express multiword utterances by selecting one word at a time and uttering them sequentially throughout a dialog (Bloom 1973), as in this exchange between Liz at nineteen months and her mother:
- Mother: (phone rings) I have to put you down to answer the phone
- Liz: no
- Mother: (answers phone)
- Liz: (tugging on hand) dow ('down')
- Mother: (continues phone conversation; holds child's hand; looks at child) shh!
- Liz: up, up, up
It is not difficult, Bloom suggests, to suppose that the child is uttering the multiword message no down; up 'Don't put me down; I want up!' one word at a time, spreading the message through the interaction. This same strategy will be used at the two word stage to express concepts that otherwise might need a complex clause:
- Liz (at twenty-two months): bring home
- Father: what should I bring home?
- Liz: read it ('Bring home the book I want so I can read it')
Stage II: The Joining Stage
At Stage II, children are producing two and three word utterances and are between eighteen and twenty-four months old, although there is great individual variation. Stage II often starts with two holophrases uttered in rapid succession: baby and, after a short pause, car, as the child points to a toy. Soon, however, the child is uttering the two words as a single intonation unit.
In fact, prosody is a clear signal that the child has entered Stage II. Holophrases are articulated with an equal degree of stress on each word, as in DADDY, GO; by contrast, multiword (Stage II) utterances are articulated with main stress falling only on the word that conveys the highest degree of new information, as in daddy GO. Thus by Stage II, children have learned one part of prosody found in the adult language: stress can make a meaningful difference in speech and usually falls on that element of the message that carries the greatest degree of new information. BABY car, with stress on the first word, expresses 'possessor + possession', while baby CAR expresses an 'agent + object' relationship. (An agent is the conscious initiator of action, the actor.) Indeed, one could rank the elements of Stage II speech according to their relative degree of information value and their relative degree of stress to see the correlation between the two:
From higher information value and higher degree of stress (top) to lower information value and lower degree of stress (bottom):
- new or contrastive information: MAMA go [not Dad]
- location: baby go HOME
- possession: BABY car
- noun object: push CAR
- action: baby GO
- pronoun object: KISS 'em
- agent: daddy KISS
At the top of the list are utterances like MAMA go as an answer to a question "Who went to the store?" MAMA represents the highest degree of new information in the utterance since the verb go was already available from the context, and MAMA is in contrast with others in the child's environment who may have gone to the store. At the bottom of the list, the pronoun object 'em conveys a low degree of new information. As in the adult language, pronouns in general refer to items already known to the conversationalists, either through previous dialog ("See puppies. Kiss 'em") or through the context of speaking (child walks into the room with a puppy and states "Kiss 'em"). In such an utterance, the main stress falls on the verb. Similarly, agents are usually known from the context of speaking. Thus, daddy in "daddy kiss" will convey a low degree of new information, since daddy is known contextually, and the main stress will fall on the verb kiss.
One rarely finds grammatical function words in Stage II speech (words like prepositions, pronouns, conjunctions, and articles), except for the first person singular pronoun in the objective case, me. If pronouns do appear in Stage II speech, they are more frequently objectives (me, him, her, them) than subjectives (I, he, she, they). Also, grammatical word endings do not occur at Stage II (word endings like the -ing, -ed, -s endings on verbs or the -s, -'s endings on nouns). If those endings do occur at this stage, there is no evidence that the child knows their functions; instead, the child seems to treat them as part of the word itself.
There are two major approaches to the study of a child's grammatical development starting in Stage II: structural and functional. Early structural studies revealed that some words always appeared in a fixed position. The majority of fixed words occur in the first position of a two word utterance, the remainder always in the second position. Examples of those fixed words are that, there, allgone, my, dirty, and more. Those fixed words were labelled pivot words (Braine 1963) because they serve as a fulcrum, a point of departure, for the child's utterances. Dozens of open class words, frequently nouns at this stage, follow to form the two word utterance. The words of the open class (but never the pivot class) may occur together or alone as holophrases. A schematic rule illustrating the pivot-open grammar of a child's two word utterances looks like
Sentence → P1 + O, or O + P2, or O + O, or O
where P1 represents pivot words that occur in first position only, P2 represents pivot words occurring in second position only, and O represents the open class words. The rule states that the child's sentence has one of only four possible structures.
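To make the claim concrete, the sketch below checks whether an utterance fits a pivot-open grammar. It assumes the four permitted structures are P1 + O, O + P2, O + O, and O alone, which is how the description above reads. The assignment of particular words to P1 and P2 is hypothetical and chosen only for illustration, and any word not listed as a pivot is treated as an open class word.

```python
# Hypothetical pivot classes: the text lists these words as pivots but
# does not say which position each one occupies.
P1 = {"that", "there", "allgone", "my", "more"}  # first position only
P2 = {"dirty"}                                   # second position only
PIVOTS = P1 | P2


def fits_pivot_grammar(utterance):
    """Return True if the utterance is P1 + O, O + P2, O + O, or O alone."""
    words = utterance.lower().split()
    if len(words) == 1:
        # Only open class words may stand alone as holophrases.
        return words[0] not in PIVOTS
    if len(words) == 2:
        first, second = words
        first_is_open = first not in PIVOTS
        second_is_open = second not in PIVOTS
        return (
            (first in P1 and second_is_open)        # P1 + O
            or (first_is_open and second in P2)     # O + P2
            or (first_is_open and second_is_open)   # O + O
        )
    return False


print(fits_pivot_grammar("more milk"))   # True  (P1 + O)
print(fits_pivot_grammar("mommy sock"))  # True  (O + O)
print(fits_pivot_grammar("milk more"))   # False (P1 word in second position)
```

A grammar of this kind makes testable predictions: a pivot word standing alone, a pivot in the wrong position, or two pivots together should never occur.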
Structural studies of children's early grammar created considerable excitement in the 1960s since those analyses suggested that children's early utterances are not random groupings of words. They also suggested that children are not imitating the adult speech they hear around them. Finally, they also seemed to suggest that language learning follows a universal design: just as all children go through a babbling stage and a holophrastic stage, so too do they go through a stage where their speech is constrained by the pivot-open grammar.
Continuing research quickly demonstrated, however, that structural descriptions (and the pivot grammar above) were of limited value. First, Bowerman (1973) discovered that children do use pivot words alone as holophrases, that they shift pivot words from first to second position (and vice versa), and that they produce P + P utterances. In essence, adding those structures to the possible structures already outlined above, pivot grammars simply say
Sentence → Word + (Word)
where the parentheses indicate that the second word is optional. Such a structural description is worthless; it says nothing new.
Secondly, structural descriptions cannot capture the meaning of the expression, only its grammatical shape. Therefore, several researchers began to explore the child's language from a functional perspective. Studying the functions and uses served by the child's utterances could be the key to understanding how the child is developing grammar (form) to express meaning (content).
Bloom (1970), for example, noticed that one child said mommy sock in two very different contexts, with two very different meanings ('possessor + possession' as in 'mommy's sock' and 'agent + object' as in 'mommy wears a sock'). A pivot grammar assigns the same structural description to both uses of mommy sock, missing the meaningful, functional differences between them. Likewise, expressions such as mommy chair and big bird have the same structural description as mommy sock, but the pivot grammar description misses the 'agent + location' meaning of 'mommy is sitting in the chair' and the 'attribute + object' meaning of 'a bird that is big'.
Following along functional lines, Brown (1973) found that seventy percent of the utterances in late Stage I and Stage II could be described by a small set of functional relationships between words:
- 'agent + action' baby kiss
- 'action + object' pull car
- 'agent + object' daddy ball
- 'action + location' sit chair
- 'object + location' cup table
- 'possessor + possession' mommy sock
- 'object + attribute' car red
- 'demonstrative + object' there car
In a cross-linguistic study of functional relationships, Slobin (1971) found that children of approximately the same age from six different languages (English, German, Russian, Finnish, Luo, and Samoan) expressed similar kinds of meanings at Stage II: utterances were used
- to locate or name objects and people there book;
- to request, demand, or indicate a desire for people, objects, or events more milk;
- to negate or indicate refusal or rejection no wash;
- to express situations or events kitty go;
- to indicate possession mama dress;
- to describe doggy big;
- to question with both wh-questions and yes/no questions where ball?, daddy go?
In the first three functions in Brown's list and in the fourth function in Slobin's list, one can see the child's earliest attempts to code the functional categories 'agent', 'action', and 'object' into grammatical categories of subject, verb, and object.
Halliday (1975) provided the most detailed study of language development from a functional point of view. His son Nigel's earliest language expressed seven functions:
- the instrumental 'I want' (the child seeks satisfaction of material needs);
- the regulatory 'do as I tell you' (the child regulates the behavior of others);
- the interactional 'me and you' (the child interacts with others);
- the personal 'here I come' (the child expresses personal feelings, interests, pleasure, disgust);
- the heuristic 'tell me why' (the child seeks to name things);
- the imaginative 'let's pretend' (the child creates a personal environment); and
- the informative 'I've got something to tell you' (the child communicates information).
Those functions themselves have a developmental course. Between nine and sixteen months, children employ the instrumental, regulatory, interactional, and personal functions. The heuristic and imaginative functions appear between sixteen and eighteen months, and the informative is added around twenty-two months. Initially, children's utterances express one function, one meaning at a time, but as they develop grammar (including vocabulary) and engage in dialog, they learn to use language to convey several functions, several meanings simultaneously.
Stage III: The Combining Stage
The age at which children make the shift from Stage II to III varies greatly. By two years of age, some children are well into Stage III, but others will use two word utterances exclusively to age three (and sometimes beyond).
At Stage III the structure of questions and negations evolves. Stage II questions are articulated by rising intonation or by beginning a sentence with a wh word, mommy pinch finger? or where me sleep? Stage III witnesses the emergence of grammatical morphemes like auxiliary verbs do and should (see the discussion below). The results can be seen in late Stage III questions, did mommy pinch finger? and where I should sleep? Likewise, Stage II negations simply add the negative words no, not, never to the beginning of the utterance, no sit there or no dog bite. By Stage III, the negatives too have evolved, there no squirrels or dog no bite.
As one can see, children begin to master word order at Stage III, although they still have much to learn. For example, children commonly interpret the first noun phrase in an utterance as the 'agent' and the second noun phrase as 'object' (receiver of the action). If any sentence were to disrupt the order 'agent + object', one would expect that children at this stage would misinterpret it. Not surprisingly, that is exactly what happens: 'passive' sentences in English express the 'object' before the 'agent' as in Liz was followed by the dog, where dog is 'agent' (the actor, the initiator of action) and Liz is the 'object'. Such passive sentences are routinely misinterpreted as 'Liz followed the dog' by children at this stage, even if they understood some passive sentences correctly at an earlier stage.
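The word-order strategy just described can be illustrated with a small sketch. It is a deliberate simplification, not a model of comprehension: a sentence is reduced to its first noun phrase, its action, and its second noun phrase, and the function names are hypothetical.

```python
def first_noun_is_agent(first_np, action, second_np):
    """Word-order strategy: the first noun phrase is read as the agent."""
    return {"agent": first_np, "action": action, "object": second_np}


def adult_reading(first_np, action, second_np, passive):
    """Adult reading: a passive sentence names the object before the agent."""
    if passive:
        return {"agent": second_np, "action": action, "object": first_np}
    return {"agent": first_np, "action": action, "object": second_np}


# "Liz was followed by the dog"
child = first_noun_is_agent("Liz", "follow", "dog")
adult = adult_reading("Liz", "follow", "dog", passive=True)

print(child["agent"])  # Liz  ('Liz followed the dog')
print(adult["agent"])  # dog  ('the dog followed Liz')
```

For an active sentence the two readings coincide, which is why the strategy serves the child well most of the time and fails only when the 'agent + object' order is disrupted.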
At the beginning of Stage II, children express 'agent + action', 'action + object', and 'action + location'. At the three word stage, children can express 'agent + action + object' and 'agent + action + location'. At the four word stage, children can express 'agent + action + object + location'. It is as if some developmental limitation, on short term memory perhaps, has been lifted so that the full form, implicit in the early joining stage, becomes explicit at the combining stage. Bloom (1970) and Brown (1973) hypothesized that children's grammar develops either by combining the joining stage functional relations, as in:
(agent + action) + (action + object) → 'agent + action + object'
(baby throw) + (throw ball) → baby throw ball
or by expanding one functional relation into a new functional relation, as in the following, where a single functional relation, 'object', expands to express a new functional relationship, a new meaning, of 'possessor + possession':
![]()
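Both hypothesized routes can be sketched as operations on sequences of terms. This is only an illustration of the two ideas, with hypothetical function names; the expansion example throw ball → throw daddy ball is invented here to show the mechanism and is not taken from the studies cited.

```python
def combine(first, second):
    """Join two relations that share a term: (A, B) + (B, C) -> (A, B, C)."""
    if first[-1] != second[0]:
        raise ValueError("the two relations do not share a term")
    return first + second[1:]


def expand(relation, term, replacement):
    """Replace one term of a relation with a new, larger relation."""
    position = relation.index(term)
    return relation[:position] + replacement + relation[position + 1:]


# Combining two joining stage relations.
print(combine(("agent", "action"), ("action", "object")))
# ('agent', 'action', 'object')
print(combine(("baby", "throw"), ("throw", "ball")))
# ('baby', 'throw', 'ball')

# Expanding 'object' into 'possessor + possession' (hypothetical example).
print(expand(("action", "object"), "object", ("possessor", "possession")))
# ('action', 'possessor', 'possession')
print(expand(("throw", "ball"), "ball", ("daddy", "ball")))
# ('throw', 'daddy', 'ball')
```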
Yet despite these significant grammatical developments, the child's language at Stages II and III is often described as "telegraphic speech" since so many of the words commonly omitted in telegrams are not expressed regularly, for instance put dolly table or there mommy shoe. In fact, grammatical function words (auxiliaries, articles, prepositions, etc) and word endings (plurals, possessives, progressives, etc) are just beginning to appear systematically.
Brown (1973) studied the order in which children learn fourteen grammatical markers that first appear systematically in Stage III speech: endings of the verb and noun, articles, prepositions, and auxiliary verbs. The patterns of development are remarkably consistent from child to child and even show consistency across languages. The list below ranks the grammatical morphemes in the average order in which they are learned (there is some small variation in this order from one child to the next, though the individual differences are slight):
- Morpheme
- 1. present progressive -ing
- 2-3. in, on
- 4. plural -s
- 5. past irregular
- 6. possessive -'s
- 7. uncontractible copula is
- 8. articles
- 9. past regu