Hewlett begins his discussion of dual lexicon models with the basic premise that, if children have accurate perception but inaccurate production, then “there is not just a single, modality-independent lexicon in which phonological representations are stored” (p. 28). Hewlett lists several advantages to this basic framework. First, lexical avoidance (Schwartz & Leonard, 1982) is easily explained. Second, the “rules” like fronting and gliding that apply to child speech do not need to occur in real time. In many ways, this is helpful for explaining why the rules apply to environments, rather than to particular words. Exceptions abound, however! These exceptions include regressive idioms, where a child produces a word incorrectly even though similar words are generally produced correctly; and progressive idioms, where a child produces one word correctly when similar words are produced incorrectly. The problem with idioms is where Hewlett strikes out on his own, proposing a revised dual lexicon model.
It seems likely that reproducing the box-and-arrow model from the chapter would be a violation of copyright, so I will do my best to provide verbal descriptions for now. There are four key boxes in the model (clockwise from upper left): the input lexicon, the output lexicon, a motor processor, and a motor programmer. The input lexicon is where incoming acoustic signals are matched to stored lexical items. Hewlett states explicitly that, “The input lexicon contains perceptual representations in terms of auditory-perceptual features.”
Realization rules link the input lexicon to the output lexicon, which contains articulatory representations. From there, an articulatory representation can be sent to the motor processor, where a motor plan is assembled using syllabic units. There is an alternative route, however, going through the motor programmer. If a realization rule does not exist, or if there is cause to eschew the realization rule, then the perceptual representation is sent to the motor programmer, where a motor representation is built from scratch. From there, it can either go directly to the motor processing component for implementation, or it can go to the output lexicon for storage, or probably both. Additional levels of production mechanism follow motor processing, including a segmental level of motor processing (which is acquired after the onset of speech), a motor execution level where muscle contractions are planned, and finally the signal sent to the vocal tract, representing the actual articulations.
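Since I can't reproduce the diagram, here is a rough Python sketch of the two routes described above, as I read them. Everything in it is my own invention for illustration: the class name `DualLexicon`, the method names, the `use_rule` and `store` switches, and the use of plain strings as stand-ins for perceptual and articulatory representations. None of it comes from Hewlett's chapter, and the later stages (segmental motor processing, motor execution) are left out.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple


@dataclass
class DualLexicon:
    # word -> perceptual representation (toy strings, not real auditory features)
    input_lexicon: Dict[str, str] = field(default_factory=dict)
    # word -> stored articulatory representation
    output_lexicon: Dict[str, str] = field(default_factory=dict)
    # hypothetical realization rule mapping perceptual to articulatory forms
    realization_rule: Optional[Callable[[str], str]] = None

    def motor_programmer(self, perceptual: str) -> str:
        # Placeholder for building a motor representation "from scratch".
        return f"programmed({perceptual})"

    def motor_processor(self, articulatory: str) -> str:
        # Placeholder for assembling a motor plan from the articulatory form.
        return f"plan[{articulatory}]"

    def produce(self, word: str, use_rule: bool = True,
                store: bool = True) -> Tuple[str, str]:
        perceptual = self.input_lexicon[word]
        if use_rule and self.realization_rule is not None:
            # Route 1: realization rule links input entry to output entry.
            articulatory = self.realization_rule(perceptual)
            self.output_lexicon[word] = articulatory
            route = "lexical"
        else:
            # Route 2: no rule (or the rule is bypassed), so the perceptual
            # form goes to the motor programmer instead.
            articulatory = self.motor_programmer(perceptual)
            if store:
                # Optional storage of the new form in the output lexicon.
                self.output_lexicon[word] = articulatory
            route = "programmer"
        return route, self.motor_processor(articulatory)


# Toy velar fronting rule: /k/ is realized as /t/.
child = DualLexicon(input_lexicon={"key": "ki"},
                    realization_rule=lambda p: p.replace("k", "t"))
print(child.produce("key"))                  # ('lexical', 'plan[ti]')
print(child.produce("key", use_rule=False))  # ('programmer', 'plan[programmed(ki)]')
```

The `store` flag is my way of capturing the "or probably both" in the description: the programmer's output always goes on to the motor processor here, and is optionally written to the output lexicon as well.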
How well does Hewlett’s model handle the data discussed in my last post? First, lexical avoidance is explained by postulating an entry in the input lexicon that has no corresponding motor plan (Hewlett is unclear here, but I think he means there is no corresponding entry in the output lexicon). Realization rules in which sound contrasts are neutralized (fronting, gliding, etc.) are the result of multiple input lexicon entries being mapped to the same output entry. Improvement in speech accuracy over time is handled by various forms of feedback, including the revision of output lexicon forms by passing input forms through the motor programmer.
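To make the first two points concrete, here is a toy Python sketch of neutralization as a many-to-one mapping, and of lexical avoidance as an input entry with no output counterpart. The transcriptions are crude stand-in strings, the `glide` function and the `avoided` set are hypothetical names of mine, and the choice of which word is avoided is arbitrary. This is my reading of the idea, not Hewlett's formalism.

```python
from collections import defaultdict


def glide(perceptual: str) -> str:
    """Toy gliding rule: /r/ is realized as /w/ (stand-in symbols only)."""
    return perceptual.replace("r", "w")


# Input lexicon: the perceptual forms stay distinct.
input_lexicon = {"rock": "rak", "walk": "wak", "key": "ki"}

# Hypothetical avoided word: known perceptually, but never given an output entry.
avoided = {"key"}

# Output lexicon: built by applying the realization rule to non-avoided words.
output_lexicon = {word: glide(form)
                  for word, form in input_lexicon.items()
                  if word not in avoided}

# Group words by output form to see which contrasts have been neutralized.
merged = defaultdict(list)
for word, form in output_lexicon.items():
    merged[form].append(word)

print(dict(merged))             # {'wak': ['rock', 'walk']}
print("key" in output_lexicon)  # False
```

Two distinct input entries end up sharing one output form, while the avoided word has an input entry and nothing on the output side.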
There are many positive aspects of Hewlett’s model, and it does improve on the model proposed by Kiparsky and Menn (1977). However, the empirical coverage of the model is still quite limited. Here are a few examples. First, although Hewlett is careful to point out how important phonology is for explaining paradigmatic phonological rules, his model does not include a robust phonological grammar. The input and output lexicons are connected by an arrow, but this obscures what a difficult relationship this must be. How, for example, are output lexical items merged when they remain distinct in the input lexicon (e.g., when the words ‘rock’ and ‘walk’ are pronounced identically, or when /r/ and /w/ are pronounced identically, in general)? What mechanism is responsible for the merger? Notice that previous generative approaches are not helpful here because part of the challenge is to show how the input lexicon–including words like ‘rock’ and ‘walk’–links to the output lexicon–where ‘rock’ and ‘walk’ become merged. Grammars which do not split the lexicon into input and output components are therefore shielded from this problem. Progressive and regressive idioms are also unexplained by the single arrow between the input and output lexicons. The model has no way of explaining why some words might not follow an otherwise consistent grammatical pattern.
Second, how do articulatory representations develop? Consider how a child comes to produce their first word. Based on Hewlett’s model, we can reasonably assume that the child has an accurate perceptual representation of the word in their input lexicon. How is that word then matched up to any motor representation? Presumably, babbling plays some role in the developmental process, but this is not discussed outside of input from the motor programmer. We might look to work by Guenther to solve this problem (e.g., Guenther, 2006), but Hewlett leaves the process unspecified.
Finally, Lise Menn consistently mentions the importance of explaining why speech accuracy improves during imitation, but Hewlett’s model is not specific enough to account for this fact.
Overall, Hewlett’s chapter provides an outstanding review of much of the work on child speech production and phonology up to 1990. His model offers several advances compared to similar models proposed by Menn (Kiparsky & Menn, 1977; Menn, 1983), but many facts about speech development remain unexplained.