Grasping the nettle: The importance of perception work in listening comprehension
by Richard Cauldwell
Suggestion 4: The Post-Listening phase: the importance of handling speech

What is involved in 'handling' fast speech? When we invite learners to do a reading task, we ask them to inspect sequences of words of varying sizes (paragraphs, clauses, phrases) for evidence to help them complete the tasks we have set. The same should be true for listening tasks: we should ask learners to inspect sequences of words (in speech units of different sizes) for answers to the tasks. However, there are important differences between reading and listening tasks. With the written language, perception is not an issue: the words occur and remain for inspection on the page. With the spoken language, the words are not available for inspection in the same way: they are available only for inspection in the short-term memory of the learners, and here perception is an issue. Perception - particularly the ability to hold sounds in short-term memory long enough to inspect them for meaning - is a skill that is a pre-requisite for understanding.

One feature of any post-listening phase, therefore, should be to give learners the experience of handling sequences of speech while inspecting them for clues to understanding. The learners need to re-hear the crucial answer-bearing moments of a recording and to spend time with them (privately, or discussing with a partner what they hear). This must be done before the learners see the written transcript, so that the ears are doing the work, not the eyes.

It is vital, therefore, that the points chosen to be the focus of the listening task should be both central to the 'meaning' of the recording and challenging in terms of perception. One way of doing this is to select, using software such as 'Motormouth' (Cauldwell & Batchelor, 1999), those parts of the recording which are both fast and 'meaningful'.

At some stage (after an appropriate amount of 'ear-handling') learners should see the written transcript so that they can get feedback on the accuracy or waywardness of their perceptions. This is the point in the listening class when we have the opportunity of actually teaching listening (which Field 1998 argues for): we can help the learners bridge the gap between the known and the unknown, but paradoxically it is the part of a listening comprehension class that is most often omitted, or to which least time is devoted.

Then comes the second vital stage in handling speech, the one that made my learners turn from being 'unhappy' to being 'happy'. This stage involves the learners imitating short, fast, challenging extracts of the recording at the same time and the same speed as the speaker. The teacher chooses an extract and first asks learners to look at a written version and to say it repeatedly to themselves, gradually increasing the speed at which they say it. The teacher then plays the selected extract repeatedly (by skilful use of the rewind button) and the learners try to imitate as accurately as possible the features of the original.

Such extracts should not be long: the longest sequence of words I use for such work lasts just over two seconds and is spoken at 408 words per minute, with two prominent syllables in the places indicated by upper-case letters:

this is ONE i'm going to be looking at in slightly more DEtail in fact
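The relationship between the figures quoted for this extract can be checked with a little arithmetic. The sketch below is illustrative only: it assumes that words are counted as whitespace-separated items (so that "i'm" counts as one word), which may not be how the original measurement was made, and the function name is hypothetical.

```python
# Illustrative arithmetic only; duration_seconds is a hypothetical helper.
def duration_seconds(text: str, words_per_minute: float) -> float:
    """Return how long `text` would last at the given speech rate."""
    # Assumption: one word per whitespace-separated item ("i'm" = 1 word).
    word_count = len(text.split())
    return word_count * 60.0 / words_per_minute


extract = "this is ONE i'm going to be looking at in slightly more DEtail in fact"
word_count = len(extract.split())
seconds = duration_seconds(extract, 408)

# Show the word count and the implied duration, rounded to two decimals.
print(word_count, round(seconds, 2))
```

On this counting, the extract contains 15 words, and 15 words at 408 words per minute last about 2.2 seconds, which agrees with 'just over two seconds'.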

My (advanced level) learners find it an exciting challenge to handle speech in this way and to be able to match native speaker speeds. I believe it is important to give learners at all levels practice in handling fast speech in the two ways outlined in this section: handling by ear - repeated listening to the fastest meaning-bearing extracts; and handling by speaking - imitating the features of the fastest extracts. It is important to refrain from looking at written versions of the extract too early (ear-handling should precede eye-handling), but it is equally important to inspect written versions of the extracts at some stage.


There was a time when listening comprehension did involve perception exercises (Field 1998), but they have generally disappeared, a fact that Brown describes as 'a quite extraordinary case of throwing the baby out with the bath water' (1990: 145). The emphasis in recent years has been on viewing listening as an activity which serves other goals. For example, White (1987, cited in Anderson & Lynch 1988: 66) found teachers valued listening materials for reasons such as 'good for starting discussions', 'amusing' and 'consolidates language'. Nowhere in White's list of reasons is there recognition of the characteristics unique to fast speech, or of the necessity that listening activities should have listening goals.

The major suggestions, therefore, are that we should expand the post-listening phase, and that we should abandon the follow-up phase where this takes the focus away from improving listening. In order to do this we need to provide teachers with the skills of observing and explaining the features of fast speech, and provide them with a methodology which helps their learners become comfortable handling and understanding fast speech. Our ignorance of the features of fast speech and our confusion of goals with methodology have resulted in our avoiding the nettle of fast speech. We need to be bold and grasp this nettle to help our learners become better listeners.


Born in Dublin and educated in England, Richard has taught English in France, Hong Kong, and Japan. Between May 1990 and September 2001, he worked at The University of Birmingham's (UK) English for International Students Unit (EISU). He now works freelance, continuing his research and applying its results in teacher training and classroom materials. His research and teaching centre on spontaneous speech, which he attempts to analyse on its own terms – in all its continually varying, stream-like, real-time, contextual glory.

Richard's web site, which contains his research & articles, can be found at:

And he can be contacted at:

