Tag Archives: Gathercole

Baddeley’s working memory may not play a role in speech production

Gathercole, S., & Baddeley, A. (1993). Working memory and language (pp. 161-175). Hillsdale, NJ: Lawrence Erlbaum.

Gathercole and Baddeley argue that working memory might play two roles in speech production.

Working memory in a between-processes role.
Working memory might serve as a buffer for output. If speech production is a stage-based process, with different levels of representation of language content (message, function, position, sound), working memory may hold the output from processing at one level so that it can be picked up as input for processing at another.

Is the phonological loop identical with a motor planning buffer? Probably not. In studies of speech-onset timing, there is little difference between simple reaction time experiments (where a participant is told a target word to say, and then presented with a cue to say it) and experiments where participants are asked to fill the interval between target word and cue with nonsense syllables. If the phonological loop were a working space for motor planning, rehearsing the nonsense syllables should impede the production of the target word. It is possible that we construct rather lengthy motor programs (above the word or stress group level) by combining subprograms in some working space, but that space may not be the phonological loop.

Some correlational evidence in neurophysiology might speak to a connection between the phonological loop and a motor planning buffer: e.g. Broca’s aphasics show deficits in fluent speech and in short-term memory. Importantly, though, there seems to be a quite demonstrable dissociation: there are patients with relatively profound loss of short-term memory skills who nonetheless retain fluent speech planning.

While the phonological loop may not be necessary as a buffer for output from speech planning systems, it may well be that the same processes that underlie articulation also underlie subvocal rehearsal (repeating things to oneself), and so might show correlated effects from damage or interference.

Working memory in a within-processes role.
Alternatively, or perhaps additionally, working memory might serve a _within-processes_ role. For instance, “to achieve a positional level representation, the phonological specifications of the main lexical items must then be retrieved, a planning frame for the sentence constructed, the words inserted into the planning frame, and the appropriate affixes and function words assessed and inserted.” All of this activity requires interactivity between the cognitive “thing” that will eventually be the utterance, and all this stuff in long-term memory (meanings of words, syntactic structures appropriate to the language, etc.). Working memory may serve as the workshop for all of this cognitive processing. The central executive may be most at work here.

Loading the central executive with tasks (like remembering 6 digits while also trying to form interesting, grammatical sentences) results in speech that is more stereotyped and predictable than when the central executive is less loaded. The effect is small, and present only in relatively difficult conditions (e.g. a 6-digit task) rather than in easier conditions that might be handled solely by the phonological loop (e.g. a 3-digit task). More work needs to be done here.

Developmental trajectories.
Though the phonological loop may not be synonymous with the motor buffer for speech production in adults, it is possible that it plays a greater role in the speech production of children. Similarly, the automaticity of much of adult speech, and its relative independence from other cognitively and physically demanding activity, may suggest a relatively minor role for the central executive in production. However, in children who haven’t yet had enough practice to automate the task of speaking, the central executive may be more involved, and speech less independent of other tasks.

My questions

  1. Would we expect deaf children (particularly of deaf parents) to have a “cheremological loop?” Would we expect differences (in function, capacity, etc.) in such a system? Would it be different still from the visuo-spatial sketchpad?
  2. Does the evidence for dissociation between perception/memory tasks and speech production, which Gathercole believes fails to support a meaningful connection between working memory processes and a buffer for motor planning activities, pose similar challenges for a connectionist model of production? A symbolic approach to mental functioning would seem quite amenable to this idea of dissociation: my idea of a “chair” need not be tied to any particular acquisition event or any particular type of behavioral output (including speech) related to chairs. In a connectionist model, wouldn’t we anticipate more of a connection? If my idea of “chair” is really a certain mental state of shared activation across input, output, and hidden nodes, wouldn’t we predict more of a relationship between perception, memory, and production?
  3. In the first paragraph of the “Speeded Speech Production” section, Gathercole discusses the simple reaction time and choice reaction time tests. This may be a problem with her paraphrase, rather than a design problem in the experiments, but in the simple reaction time experiment participants were “told” (one would believe via speech) the target word. In the choice reaction time experiment, participants read words visually. Could some of the syllable-length effects be due to the fact that participants who are told a word are given a pretty good model of how to articulate that word, while reading requires a bit more translation into an appropriate motor plan?

Gathercole & Baddeley: Introduction to working memory

Gathercole, S., & Baddeley, A. (1993). Working memory and language (pp. 1-12). Hillsdale, NJ: Lawrence Erlbaum.

Working memory, here, is “the short-term memory system, which is involved in the temporary processing and storage of information.” Baddeley’s model is a “resources” model (as opposed to a discrete slots model, or a decay model, or an interference model) of short-term memory.

The working memory model proposed is tripartite. The first component, a central executive, monitors two slave systems: a phonological loop and a visuo-spatial sketchpad. (This latter is little involved in speech perception, so I will ignore it here.)

The central executive is involved in selective control of action, planning, coordination of tasks, possibly consciousness. This executive might be a unitary process, or it might be several cooperative subprocesses. Tasks that require the inhibition of prepotent responses in favor of more novel responses would seem to involve the central executive.

The authors propose that the phonological loop has two processes. The first of these is a passive buffer of sorts that takes in phonological information from the environment. This buffer is subject to word length effects: the more syllables in a set of target words, the more difficult those targets are to remember, possibly because the ribbon of the loop is too short to capture them all. The buffer is also subject to articulatory suppression effects: when we are prevented from rehearsing subvocally, retention suffers.

The second process is an articulatory rehearsal process. This process is subject to phonological similarity effects: when target words share phonological characteristics, they are more difficult to remember. It is subject to irrelevant speech effects, too. When non-target words share similar phonology, independent of whether they share semantic or lexical similarity, they interfere more readily with retention of target words.
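To make the "ribbon" reading of the word length effect concrete, here is a minimal sketch. It assumes the loop holds a fixed duration of speech and that every word in a list takes the same time to say. The function name `predicted_span` and the 2000 ms loop length are my own illustrative assumptions, not figures from the chapter.

```python
def predicted_span(loop_ms: int, ms_per_word: int) -> int:
    """Toy model: how many whole words fit on a loop of fixed duration.

    Both arguments are hypothetical values in milliseconds; neither
    comes from Gathercole & Baddeley.
    """
    if loop_ms < 0:
        raise ValueError("loop_ms must be non-negative")
    if ms_per_word <= 0:
        raise ValueError("ms_per_word must be positive")
    # Integer division keeps only the words that fit completely.
    return loop_ms // ms_per_word


# Assumed 2000 ms loop: short words (500 ms each) vs. long words (1000 ms each).
short_word_span = predicted_span(2000, 500)
long_word_span = predicted_span(2000, 1000)
```

Under these assumptions the short words give a span of 4 and the long words a span of 2. The sketch only captures the idea that span shrinks as words take longer to articulate; it says nothing about decay, rehearsal, or interference.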

Some of my questions

  1. Do fast talkers routinely test better on working memory than slow talkers do?
  2. The authors offer up as evidence of capacity limits for the phonological loop that longer words (more syllables and/or more time necessary to articulate) result in lower recall scores. Is word length confounded with corpus frequency? Do the effects remain when controlling for distributional differences? [2b] In visual attention/working memory research, we see that sometimes what’s touted as resource-driven, qualitative effects (e.g. lower memory span for complex objects than for simple objects, as in Alvarez & Cavanagh, 2004) can be explained in favor of a simple, relatively high-fidelity “slot” model, where the object is stored well (complex or not), but where participants are just really bad at making comparisons (Awh, Barton, & Vogel, 2007). I wonder if something similar might occur in the phonological loop?
  3. “The probability of losing a phonological feature which discriminates the item from other members of the memory set will be greatest when the number of discriminating features is smallest.” Two questions about this:
    1. If we’re considering counts of features, this would seem to make sense. However, what about a different dimension, like temporal extent of features? Would we expect a length-limited “tape” in the phonological loop to record with high-fidelity features that have relatively short duration, and perhaps to suffer when important information spans time longer than the tape?
    2. In aggregate, it might be true that items in a set discriminated by n features are less likely to be remembered than items discriminated by n + 1. However, shouldn’t we expect that features enjoy different weighting, and that this cannot be a simple, linear, additive model? For some reason, I’m thinking of the Family Guy use of the utterance “Cool Whip” with the initial consonant in whip oddly aspirated. (Here’s where my ignorance of linguistics starts to show.) In English, aspirated consonants are not generally contrastive (right?), but might we sometimes expect violations of the expectations laid down by the distribution of our experience to be quite marked, and notable even if it’s just a single feature? Similarly, if one lives in a society where post-vocalic /r/ is some marker of group membership, or status, or whatever, might we expect this feature to carry more weight than some other, more culturally neutral, single feature?
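One way to make the third question concrete is a toy model. It is my own sketch, not the authors' model. It assumes each discriminating feature is lost independently, and that an item becomes confusable with its neighbors only when every discriminating feature is lost. The function name `confusion_probability` and all the loss probabilities are hypothetical.

```python
import math


def confusion_probability(loss_probs: list[float]) -> float:
    """Probability that ALL discriminating features of an item are lost.

    loss_probs holds one assumed, independent loss probability per
    discriminating feature. An empty list means the item has nothing
    to set it apart, so the result is 1.0.
    """
    for p in loss_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each loss probability must be in [0, 1]")
    # Independence assumption: the joint loss is the product of the parts.
    return math.prod(loss_probs)


# Equal weighting: more discriminating features means less confusion.
one_feature = confusion_probability([0.5])
two_features = confusion_probability([0.5, 0.5])

# Unequal weighting: a single, highly salient feature (rarely lost).
one_salient_feature = confusion_probability([0.1])
```

With these made-up numbers, one fragile feature gives 0.5, two fragile features give 0.25, and a single salient feature gives 0.1. So even in this simple sketch, counting features alone is not enough once features differ in how easily they are lost: one robust feature can protect an item better than two fragile ones.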