Menn and Matthei (1992) The “two-lexicon” account of child phonology (Part 2)

In the previous post, I described Menn and Matthei’s assessment of progress on the two-lexicon model. They highlight several advantages of the model, but also note problems, including the apparent competition between children’s “selection rules” (or rules specific to the output lexicon), as well as non-deterministic cross-word patterns. To combat these and other problems, MM suggest that the formalism of the two-lexicon model migrate from a generative perspective to a more connectionist one. At this point, they make a very handy list of the key generalizations they would like to capture with a revised, connectionist two-lexicon model, or with any model of child speech production for that matter. I have restated them here, while keeping MM’s original groupings.

Reduction of Information

  1. Children recognize more words than they can say
  2. Children recognize more phonemic contrasts than they can realize in speech
  3. Early productions tend to cluster together in terms of phonetic properties
  4. Early productions also tend to contain a limited set of phonetic elements


  1. Children’s productions appear to be simplified (compared to adult forms) and often appear systematic (many words share a pattern)

Inertia of the System

  1. Early, frequently produced words may retain a high level of fidelity, resulting in “phonological idioms” compared to more recently acquired production forms
  2. Changes in systematic productions tend to happen to newly acquired words; more established words are more resistant to change

We could also add to this list MM’s frequent observation that imitated production forms tend to be much more like adult forms.

To provide a general feel for a connectionist model of early speech production, MM lay out the “initial settings” for such a model. With respect to connections, MM posit simultaneous and sequential connections. Simultaneous connections link the speech modalities of motor commands, auditory percepts, and kinesthetic sensation (of one’s own productions). The three modalities, motor/auditory/kinesthetic or MAK, must be wired together efficiently by learning. Sequential connections are within-modality connections that represent change over time. So, a simultaneous connection might link together the feeling, action plan, and acoustic record of a [b], while sequential acoustic connections might link together the [b] burst to the following formants of an [a] vowel in the syllable [ba]. Although MM do not make this explicit, it appears that sequences of connections also represent stored forms, or words.
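To make the two connection types concrete, here is a minimal sketch of how they might be laid out for the syllable [ba]. The node format, the function name `build_mak_links`, and the choice to treat each segment as a single time step are my own assumptions for illustration; MM do not give an implementation.

```python
from itertools import combinations

# The three modalities MM group together as MAK.
MODALITIES = ("motor", "auditory", "kinesthetic")

def build_mak_links(segments):
    """Return (simultaneous, sequential) link sets for a list of segments.

    A node is a (modality, time_step, segment) tuple. This representation
    is a hypothetical simplification, not MM's own formalism.
    """
    simultaneous = set()
    sequential = set()
    for t, seg in enumerate(segments):
        nodes = [(m, t, seg) for m in MODALITIES]
        # Simultaneous links: across modalities, same moment in time.
        for a, b in combinations(nodes, 2):
            simultaneous.add((a, b))
        # Sequential links: within one modality, from this segment to the next.
        if t + 1 < len(segments):
            for m in MODALITIES:
                sequential.add(((m, t, seg), (m, t + 1, segments[t + 1])))
    return simultaneous, sequential

simultaneous, sequential = build_mak_links(["b", "a"])
```

In this toy layout, the link from the auditory [b] node to the auditory [a] node plays the role of the burst-to-formants connection described above, and a stored word would correspond to a chain of such sequential links.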

Next, MM lay out a series of what I will call linking mechanisms. First, sequential auditory patterns can be stored and learned by attention to adult speech. Second, there is an internal feedback loop, which MM relate to babbling. This loop has a basic predictive property that allows the model to guess how a sequential motor pattern might sound, and thereby to modify the pattern and observe whether the result is the same or different (essentially a supervised learning component provided by the stored, “correct” adult forms). Third, imitation will result in links between stored adult-produced auditory sequences and the child’s own MAK sequences. Fourth, stored adult sequences will be associated with real-world states (meanings), which then leads to associations between the child’s own MAK sequences and real-world states.

MM give a fair amount of attention to the idea that adults might assist in the development of a child’s MAK sequences. The basic idea is that an adult mimics the phonetic properties of a child’s utterance (absolute pitch, formant values, etc.). Here’s an explanatory quote: “A purely sound-based imitation of the child by the adult…will produce links between the child’s internal MAK associations and the sound of the adult’s voice, the child’s innate normalization abilities should be enhanced.”

Once normalization is established (although I’m not sure why it needs to be established first in this proposal), the child might seek to produce words in a more adult-like fashion. MM propose that social factors like semantically contingent responding by parents (Snow, 1977) could provide such a mechanism. MM conclude by saying that their connectionist model is not fully developed, and that many attractive qualities of the old two-lexicon model, like the selection rules, have been replaced by vaguer concepts. However, they believe that the absolute boundaries of the input and output lexicons in the original model simply do not serve us, and we should abandon them.

My primary concern with the connectionist model that MM propose is that it seems to completely abandon the original problem that the two-lexicon model addresses. Looking back at their list of key generalizations, I would single out two, but the connectionist model does not clearly address either. First, how is it that children can recognize more words/sounds than they can produce? Second, why are children’s early productions both simplified and systematic?

It’s difficult to see how the proposed connectionist model makes headway on these problems. In fact, it seems as if they have been replaced with several other problems in the study of child speech. The discussion of speech normalization is a perfect example. Given general agreement that toddlers have a good understanding of the perceptual form of their native language, this problem could be assumed to be solved at the time that production begins. For example, I know of no evidence that children ever attempt to imitate the absolute values of any acoustic property of adult forms, which seems to be a major problem if we want to address normalization.

To conclude, I generally see the box-and-arrow iteration of the two-lexicon model as being preferable, if only for specificity. Although I agree with MM that the box-and-arrow model could be replaced advantageously by a connectionist model, the advantages are simply not clear enough here. In the future, I will present a more recent attempt at a connectionist network by Menn and colleagues, which may address the perception-production disparity more directly.



Snow, C. E. (1977). The development of conversation between mothers and babies. Journal of Child Language, 4, 1-13.

Menn and Matthei (1992) The “two-lexicon” account of child phonology (Part 1)

Menn and Matthei (hereafter MM) begin with some information about the historical development of the two-lexicon model. They quote a paper by Ferguson, Peizer, and Weeks (1973), who noted a general human tendency to know more words than are typically said. That is, both children and adults know words that they rarely or never say. Thus, there seems to be a set of lexical representations for which the details of production are either murky or nonexistent, and we might hypothesize a split between input and output representations (Ingram, 1974), in other words, two separate lexicons.

So long as there is consistency in children’s pronunciations, however, separate lexicons are unnecessary. If there is a regular mapping between the input representation (presumed to be identical to the adult forms) and the output representation, then a set of rewrite rules that captures the mapping is sufficient, and no output lexicon is needed. However, children are rarely consistent, and MM provide the example of two words (“down” and “stone”) that move in and out of a nasal harmony rule: They start out with no harmony ([dawn] and [don], resp.); the harmony rule then applies to other words (/binz/ –> [minz] and /dæns/ –> [næns]); finally, the harmony rule overtakes “down” and “stone”. With inconsistent mapping across similar words, rewrite rules are not helpful, or at least require arbitrary exceptions. Granted, two-lexicon models must also have lexical exceptions, but there are other advantages.
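As a rough illustration of the one-lexicon setup, here is a minimal sketch of a nasal harmony rewrite rule with a list of arbitrary exceptions. The rule as coded (an initial b or d becomes the matching nasal when a nasal occurs later in the word) is my own simplification, and the function names and ASCII-style transcriptions are assumptions for illustration only.

```python
NASALS = set("mn")
STOP_TO_NASAL = {"b": "m", "d": "n"}

def nasal_harmony(form):
    """Toy rewrite rule: an initial voiced stop becomes the corresponding
    nasal if a nasal appears later in the word."""
    first, rest = form[0], form[1:]
    if first in STOP_TO_NASAL and any(seg in NASALS for seg in rest):
        return STOP_TO_NASAL[first] + rest
    return form

def one_lexicon_output(form, exceptions=frozenset()):
    """Apply the rule to an adult-like underlying form unless the form
    is listed as an arbitrary exception."""
    if form in exceptions:
        return form
    return nasal_harmony(form)

# Stage where harmony applies to "beans" and "dance" but not yet to "down".
stage_outputs = [one_lexicon_output(f, exceptions={"dawn"})
                 for f in ("binz", "dæns", "dawn")]
```

The only way to model the stage where “down” resists harmony is to put it on the exception list, and the only way to model the later stage is to take it off again. Nothing in the rule itself predicts either step.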

One of these advantages is that arbitrary exceptions in a one-lexicon system lead to more serious problems. The example is from Smith (1973) as interpreted by Macken (1980). The data comes from the child, Amahl, who displayed a pattern of velar harmony (/tr^k/ –> [kr^k]). Eventually, the pattern gave way to accurate production of alveolars, but one word, “took”, persisted as a regressive idiom, [gUk].

Macken assumes that this is possible because Amahl must have learned /gUk/ as the underlying form. Thus, when the harmony rule disappeared, /gUk/ would still surface as if harmony applied. As MM point out, however, this assumes that the child perceives “took” as /gUk/, which would lead us to expect that Amahl would not understand “took” as produced correctly. This seems highly unlikely, especially given our present-day understanding of children’s perceptual abilities. Furthermore, the example above with “down” and “stone” resisting a nasal harmony rule does not make sense if we assume exceptions are cases where the child has learned his own productions as underlying forms. At the very least, it would suggest that the underlying forms of words where nasal harmony does apply are perceived as if they had initial nasals. That defeats the advantage of the one-lexicon model, however, where we assume child and adult underlying forms are the same.

An output lexicon is helpful in this case because it provides a space for pronunciation representations that may be linked by a rule that operates across words or by arbitrary connections between input and output forms. Just as importantly, the output lexicon still allows children to be able to accurately perceive those words. That is, the output lexicon provides a storage facility for consistent or variable output representations while allowing for stable and accurate perception.
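The division of labor can be sketched as two lookup tables. This is only a toy under stated assumptions: the dictionary names, the `recognize` and `produce` functions, and the simplified velar harmony rule are hypothetical, and the transcriptions follow the ASCII notation used above.

```python
# Input lexicon: adult-like forms used for perception.
INPUT_LEXICON = {"truck": "tr^k", "took": "tUk"}
# Output lexicon: stored production forms (here, a single regressive idiom).
OUTPUT_LEXICON = {"took": "gUk"}

def velar_harmony(form):
    """Toy rule: an initial t becomes k when the word ends in k."""
    if form.startswith("t") and form.endswith("k"):
        return "k" + form[1:]
    return form

def identity(form):
    """Stands in for the later stage, after the harmony rule is gone."""
    return form

def recognize(heard_form):
    """Perception consults only the input lexicon."""
    for word, form in INPUT_LEXICON.items():
        if form == heard_form:
            return word
    return None

def produce(word, rule):
    """Production uses a stored output form if there is one; otherwise it
    applies the currently active rule to the input form."""
    if word in OUTPUT_LEXICON:
        return OUTPUT_LEXICON[word]
    return rule(INPUT_LEXICON[word])
```

With the harmony rule active, `produce("truck", velar_harmony)` gives the harmonized form. Once the rule is replaced by `identity`, “truck” surfaces accurately while “took” still comes out as the stored idiom, and `recognize("tUk")` continues to find “took” because perception never touches the output lexicon.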

Despite the advantages, MM detail several problems they see with the two-lexicon model. First, it appears that selection rules—or the rules that lead to childlike forms in the output lexicon—sometimes operate over two words. This is problematic, however, if we take up the very standard assumption that combining words is done by the syntax and word combinations do not exist in the lexicon.

Another problem is that selection rules may sometimes be in competition with one another for a given word. MM give the example of productions by the child Daniel (also discussed by Menn in previous papers, I believe) of “boot” and “boat”, which are variably produced as [bup-dut] and [bop-dot] respectively. Thus, there appear to be separate labial harmony and alveolar harmony rules that compete in terms of realization of the same word. MM point out that there isn’t any sort of formalism in the two-lexicon model that allows for rule competition.
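The competition problem is easy to see in a toy sketch. The two rules below are my own simplified stand-ins for labial and alveolar harmony (they only handle b-initial, t-final forms), and the function names are hypothetical.

```python
def labial_harmony(form):
    """Toy rule: a final t takes on the labial place of an initial b."""
    if form[0] == "b" and form[-1] == "t":
        return form[:-1] + "p"
    return form

def alveolar_harmony(form):
    """Toy rule: an initial b takes on the alveolar place of a final t."""
    if form[0] == "b" and form[-1] == "t":
        return "d" + form[1:]
    return form

def candidate_outputs(form, rules):
    """Collect every distinct output that some rule yields for this form."""
    return sorted({rule(form) for rule in rules})

boot_candidates = candidate_outputs("but", [labial_harmony, alveolar_harmony])
boat_candidates = candidate_outputs("bot", [labial_harmony, alveolar_harmony])
```

Both rules match the same input, so each word ends up with two candidate outputs, and nothing in the sketch (or, as MM note, in the two-lexicon formalism) says which one should win on a given occasion.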

Other problems are given through the examination of daily changes in a couple of diary studies. For example, a child Jacob exhibited something like a vowel convergence, where [i] was produced like [ε]. So “tea” is first produced as [di] and then as [dεi]. “Key” was produced first as [ki], then as [xiε], and finally as [xε]. At the same time, words with a mid front vowel switched between a low and high specification: “tape” was produced with both [i] and [e]. Ultimately, MM conclude that these similar words must be influencing each other in terms of production, but in a very unruly way. Similar cases are given for stress placement on two-syllable words beginning with [k] and over-application of the plural/3rd singular/possessive morpheme.

I’ll stop here for now. My next post will summarize what MM want to explain and then review the connectionist model that MM propose as a revised two-lexicon system.



Ferguson, C. A., Peizer, D. B., & Weeks, T. A. (1973). Model-and-replica phonological grammar of a child’s first words. Lingua, 31, 35-65.

Ingram, D. (1974). Phonological rules in young children. Journal of Child Language, 1, 49-64.

Macken, M. A. (1980). The child’s lexical representation: The ‘puzzle-puddle-pickle’ evidence. Journal of Linguistics, 16, 1-17.

Smith, N. V. (1973). The Acquisition of Phonology: A Case Study. Cambridge: Cambridge University Press.