Grasping the nettle: The importance of perception work in listening comprehension
by Richard Cauldwell

A common complaint from learners on first visiting an English-speaking country is that their listening skills cannot cope with fast spontaneous speech. Four inadequacies in the teaching of listening lead to this complaint: we rely too much on first language research findings; we neglect perception; we give learners easy and enjoyable, rather than challenging tasks; we use listening activities to serve other language-learning goals. I propose four things: that teachers themselves engage in classroom research in second language listening; that teachers should be provided with the skills of observing and explaining the features of fast speech; that teachers should be prepared for students to be challenged (even frustrated) in the early parts of a listening lesson; that the post-listening phase should be expanded to include aural and oral 'handling' of crucial fast extracts from recordings to improve students' perception skills.

Learners, teachers, teacher trainers and university researchers have been stung by casual contact with the nettle of fast spontaneous speech, and have tended to avoid further contact. The legacy of this avoidance includes four problems for the effective teaching of listening. I shall first describe the four problems and then suggest ways in which we can improve our teaching of listening. In doing so, I shall make reference to the standard listening comprehension class, with four phases: warm-up, while-listening, post-listening, and follow-up.

Listening comprehension methodology of the last two decades has been characterised by systematic avoidance of the painful fact that fast spontaneous speech is difficult for learners. We avoid confronting this fact in four ways: we place too much faith in first language research; we rely on, but refuse to develop, learners' perception skills; we focus on what learners can manage, rather than on what they have to master; and we favour follow-up activities such as discussions and writing tasks rather than teaching listening.

Problem 1: Too much faith in first language research

Fourteen years ago, Anderson and Lynch (1988: 21) noted that there was very little research into listening in a second language. Because of this gap in research, applied linguists, textbook writers, and teacher trainers have gone to research in first language listening for guidance. As a result, listening comprehension exercises are greatly (and in my view inappropriately) influenced by what is known about successful first language listening.

First language research has established that successful listening is characterised by:

• listening for a purpose
• making predictions based on contextual information
• making guesses when things aren't clear
• inferring what is meant where necessary
• not listening ('straining') for every word

(adapted from Brown 1990: 148)

Teacher trainers and textbook writers have made appropriate use of some of these findings, and inappropriate use of others. In particular they have taken the last of these points ('they don't listen for every word') and have made it an article of faith. They advocate 'top-down' activities and urge the avoidance of any activity which could be characterised as 'bottom-up'. Of course, we should be careful about this particular issue: we don't want learners to strain so much to hear every word that they cannot understand anything. In my view though, it is a mistake to abandon, as we have, bottom-up activities which introduce learners to the essential characteristics of speech.

From first language research comes the teacher's standard advice in a listening lesson: 'You won't be able to understand every word, and you don't need to'. I find this explanation illogical. The 'reasoning' goes something like this:

1. non-natives don't understand
2. natives understand without paying attention to every word
3. therefore, in order to understand, non-natives should not try to pay attention to every word

The first statement describes the problem which all listening classes address in some way; the second is a research finding; the third is the false deduction. It is not reasonable to deduce from the first two statements that 'improvement in listening skills follows from not trying to pay attention to every word'. In acting (as we do) on this illogical deduction, we confuse goals and methodology: we require learners to simulate the goal of native listener behaviour instead of teaching learners how to acquire progressively native-like abilities in perception and understanding. We have made the mistake of allowing the goal to become the method: we should recognise that the skill of understanding without attending to every word is a goal to be reached, not a means of getting there.

Adopting the goal-as-method procedure conveniently allows us to ignore the fact that native speaker listeners have great advantages over non-natives, particularly in terms of perceptual ability; it allows us to avoid grasping the nettle of fast speech. Activities which encourage bottom-up processing, which target learners' perceptual abilities, have become taboo.

Problem 2: Too much hope in listening out for 'stresses'

Listening exercises are also characterised by the hope which often appears in the following words of encouragement: 'Just listen to the stresses, they'll be in the most important words, then you'll understand'.

There are three problems with this view: first, 'important' words such as negatives are often unstressed, and so-called 'unimportant' grammatical words such as prepositions and pronouns are often stressed; second, research indicates that it is difficult to pick out stressed words in a language which is not your own (cf. Roach, 1982); third, the concept of stress is loosely defined and fails to distinguish between word-level stress and stresses associated with higher order phenomena such as tone units.

Problem 3: Too much help

Although many listening comprehension recordings boast that they are 'natural', few of them are truly so. Many (though not all) are scripted and artificially slow. The reasons for this can be found in statements such as the following from Penny Ur:

Students may learn best from listening to speech which, while not entirely authentic, is an approximation to the real thing, and is planned to take into account the learners' level of ability and particular difficulties. (Ur, 1984: 23)

I myself find nothing wrong in what Penny Ur says here but I would argue that listening comprehension materials are often over-charitable in leaning towards 'the learners' level of ability' and not taking account of the level of ability required to understand spontaneous fast speech. The gap between the learners' level and the target level (fast spontaneous speech) is a gap that we as teachers and materials writers must help learners bridge. But we cannot help them bridge this gap if we continue with our charitable focus on what learners can manage at their current level.

In recent years, listening materials in main course textbooks at upper-intermediate and advanced levels have featured spontaneous speech, and this move is a good one. However, the methodology (crudely, give the answers, and move on) has remained much the same, and teachers are not trained to explain what the features of fast spontaneous speech are.

We have to help learners cope with speech which is above their current level, and to arrive at a description of 'above current level', we need a description of the topmost level - a description of the features of 'difficult' (fast spontaneous) speech. We need such a description for use in teaching so that we can have an equal focus on both where our learners are, and where they have to get to: this description should form part of teacher training - it should be part of every teacher's tool-kit.

Problem 4: Rushing to the follow-up

We offer too little help in the post-listening phase. My impression is (and this is backed up by research by Field 1998) that of the four phases of a listening lesson it is the post-listening phase which has the least amount of time devoted to it. The first - warm-up - phase (with contextualisation and personalisation) and the fourth - follow-up - phase (often a discussion or writing task) have the most time devoted to them. It is at this point that avoidance is at its most obvious and its worst, and the reasons for it can be found in the standard training of communicative language teachers.

Our training predisposes us to obey a communicative imperative which demands rapid movement to the next activity to keep the variety, interest, and motivation high: we are anxious to see and hear learners enjoying social interaction in English. We prefer this high level of social 'buzz' to staying with learners and helping them through the difficulties of a recording, when there might be silent private struggles to perceive and understand the acoustic blur of speech.



In the description of problems a number of themes emerged: research into L2 listening, teacher training, grasping the nettle, and methodology.

Suggestion 1: Research into L2 listening in the classroom

Fry(1) (personal communication) advises, where circumstances permit, allowing learners to control the tape-recorder so that they can work on, and re-hear, those passages of the recording that they have problems with. Fry's experience is with classes of adult learners of English: he divides the class up into small groups and, after having done the warm-up phase and set the listening task, he gives each group a tape-recorder and the tape, and leaves it to the group to control the tape-recorder. He reports being very surprised at what they found easy, and what they found difficult, in listening.

My experience of working with learners with computer controlled access to recordings (reported in part in Cauldwell, 1996) is also one in which I learned a great deal about their powers and weaknesses in perception and understanding. It brought home to me the fact that their difficulties lay in what were for me 'surprising' places.

So there are two benefits to allowing learners to control the tape-recorder: they can focus on their own needs; and for the teachers it amounts to research into second language listening - teachers discover where gaps in understanding and perception lie.

Suggestion 2: A fast speech phonology

Teachers should be trained in 'observing' speech, and particularly the authentic speech that now is a feature of many listening comprehension and general textbooks. This training does not currently take place. The training they get is in the area of fixed position phonology for the teaching of pronunciation. This training is typically concerned with the articulation of minimal pairs of consonants and vowels so that teachers can explain to learners how they can improve their pronunciation.

But these current approaches to 'phonology for pronunciation' do not give adequate preparation for dealing with the features of authentic fast speech, not even in the areas where they might be thought to do so: elision, assimilation, sentence stress, and intonation. The 'rules of speech' presented in such materials are derived from introspection concerning how decontextualised written sentences might be read aloud. These 'rules of speech' are inadequate to account for what happens in fast spontaneous speech.

There is therefore a need for a 'fast speech phonology' which prepares teachers to observe and explain the variability of fast speech. A major element of this training would be to encourage teachers to rid their minds of the expectations and rules they have inherited from fixed position phonology. As for what else might be included, Field (1998: 13) suggests features such as 'hesitations, stuttering, false starts, and long, loosely structured sentences'. To this list one can add all the features of speech described in Brazil (1994; 1997) - prominences, tone units of different sizes, tones, pitch height. One can also add the differences between citation and running forms of words, turn taking, accent, voice quality, and the effects of speed on speech.

Suggestion 3: Grasping the nettle

Learners will claim that fast speech is too difficult for them, and teachers will naturally feel tempted to give them easier, slower, scripted materials that they feel comfortable with. If this solution is adopted, however, learners will be under-prepared to cope with the fast spontaneous speech that will come their way when they meet native speakers of English.

It is necessary to allow learners to feel challenged, and it may be necessary for them to feel frustrated by the demands of the listening task. I took a survey of one class of seven advanced learners of English (teachers of English from Japan) at the moment when they were deeply immersed in a difficult recording, and attempting to answer questions relating to the recording. I asked them to score their feelings on a five-point scale with 'A' as 'happy' and 'E' as 'unhappy'. Some time later, after doing the post-listening exercises, I asked them to make judgments on the same scale. The results are shown in Table 1.

Table 1 Survey of learners' feelings before and after post-listening activities

  Happy               Unhappy

Table 1 shows that there was a major shift in feeling between the end of the while-listening phase and the end of the post-listening phase: learners moved from being broadly 'unhappy' to broadly 'happy'. (The means by which this change was brought about will be described in the next section.) Here, the important point is that teachers must be prepared for periods of learner frustration, and must have the methodological training and knowledge base to help learners through periods of discomfort and frustration to increasingly sophisticated levels of perception and understanding. If the goal is to help learners become better listeners, it is vital that they learn to be comfortable handling fast speech.

(1) John Fry of the British Council, Hong Kong


Suggestion 4: The Post-Listening phase: the importance of handling speech

What is involved in 'handling' fast speech? When we invite learners to do a reading task, we ask them to inspect sequences of words of varying sizes (paragraphs, clauses, phrases) for evidence to help them complete the tasks we have set. The same should be true for listening tasks: we should ask learners to inspect sequences of words (in speech units of different sizes) for answers to the tasks. However, there are important differences between reading and listening tasks. With the written language, perception is not an issue: the words occur and remain for inspection on the page. With the spoken language, the words are not available for inspection in the same way: they are available for inspection only in the short-term memory of the learners, and here perception is an issue. Perception - particularly the ability to hold sounds in short-term memory long enough to inspect them for meaning - is a skill that is a prerequisite for understanding.

One feature of any post-listening phase, therefore, is to give learners the experience of handling sequences of speech while inspecting them for clues to understanding. It is therefore necessary for the learners to re-hear and spend time (this may be private, or in discussion with a partner about what they hear) with the crucial answer-bearing moments of a recording, and this must be done before the learners see the written transcript, so that the ears are doing the work, not the eyes.

It is vital, therefore, that the points chosen to be the focus of the listening task should be both central to the 'meaning' of the recording, and challenging in terms of perception. One way of doing this is to select those parts of the recording which are both fast (identified using software such as 'Motormouth' (Cauldwell & Batchelor, 1999)) and 'meaningful'.

At some stage (after an appropriate amount of 'ear-handling') learners should see the written transcript so that they can get feedback on the accuracy or waywardness of their perceptions. This is the point in the listening class when we have the opportunity of actually teaching listening (which Field 1998 argues for): we can help the learners bridge the gap between the known and the unknown, but paradoxically it is the part of a listening comprehension class that is most often omitted, or to which least time is devoted.

Then comes the second vital stage in handling speech, the one that made my learners turn from being 'unhappy' to being 'happy'. This stage involves the learners imitating short, fast, challenging extracts of the recording at the same time and the same speed as the speaker. The teacher chooses an extract and first asks learners to look at a written version and to say it repeatedly to themselves, gradually increasing the speed at which they say it. The teacher then plays the selected extract repeatedly (by skilful use of the rewind button) and the learners try to imitate as accurately as possible the features of the original.

Such extracts should not be long: the longest sequence of words I use for such work lasts just over two seconds and is spoken at 408 words per minute, with two prominent syllables in the places indicated by upper-case letters:

this is ONE i'm going to be looking at in slightly more DEtail in fact
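
The figures quoted for this extract can be checked with some simple arithmetic. The Python sketch below is only an illustration: it assumes that words are counted by splitting on spaces, so that the contraction "i'm" counts as one word (the counting rule is an assumption, and the variable names are purely illustrative).

```python
# Rough check of the duration implied by the quoted speech rate.
# Assumption: words are whitespace-separated, so "i'm" counts as one word.
extract = "this is ONE i'm going to be looking at in slightly more DEtail in fact"
rate_wpm = 408  # speech rate quoted in the text, in words per minute

word_count = len(extract.split())

# duration (seconds) = number of words / (words per minute) * 60
duration_s = word_count / rate_wpm * 60

print(f"{word_count} words at {rate_wpm} wpm last about {duration_s:.2f} seconds")
```

On this counting rule the extract has 15 words, which at 408 words per minute gives roughly 2.2 seconds, in line with 'just over two seconds'.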

My (advanced level) learners find it an exciting challenge to handle speech in this way, to be able to match native speaker speeds, and I believe it is important to give learners at all levels practice of handling fast speech in the two ways outlined in this section: handling by ear - repeated listening to the fastest meaning-bearing extracts; and handling by speaking - imitating the features of the fastest extracts. It is important to refrain from looking at written versions of the extract too early (ear-handling should precede eye-handling), but it is equally important to inspect written versions of the extracts at some stage.


There was a time when listening comprehension did involve perception exercises (Field 1998), but they have generally disappeared, a fact that Brown describes as 'a quite extraordinary case of throwing the baby out with the bath water' (1990: 145). The emphasis in recent years has been on viewing listening as an activity which serves other goals. For example, White (1987, cited in Anderson & Lynch 1988: 66) found teachers valued listening materials for reasons such as 'good for starting discussions', 'amusing' and 'consolidates language'. Nowhere in White's list of reasons is there recognition of the characteristics unique to fast speech, or of the necessity that listening activities should have listening goals.

The major suggestions, therefore, are that we should expand the post-listening phase, and we should abandon the follow-up phase where this takes the focus away from improving listening. In order to do this we need to provide teachers with the skills of observing and explaining the features of fast speech, and provide them with a methodology which helps their learners become comfortable handling and understanding fast speech. Our ignorance of the features of fast speech and our confusion of goals with methodology have resulted in our avoiding the nettle of fast speech. We need to be bold and grasp this nettle to help our learners become better listeners.


Anderson, A. and Lynch, T. (1988) Listening. Oxford: Oxford University Press.
Brazil, D. (1994) Pronunciation for Advanced Learners of English. Cambridge: Cambridge University Press.
Brazil, D. (1997) The Communicative Value of Intonation in English. [Second Edition]. Cambridge: Cambridge University Press.
Brown, G. (1990) Listening to Spoken English. [Second Edition]. Harlow: Longman.
Cauldwell, R.T. (1996) 'Direct Encounters with Fast Speech on CD Audio to Teach Listening'. System 24/4: 521-528.
Cauldwell, R.T. and Batchelor, T. (1999) 'Mr. Motormouth'. [Multimedia Toolbook software under development]. Birmingham: The University of Birmingham.
Field, J. (1998) 'The Changing Face of Listening'. English Teaching Professional 6: 12-14.
Roach, P. (1982) On the Distinction between 'Stress-timed' and 'Syllable-timed' languages. In D. Crystal (ed.) Linguistic controversies, Essays in linguistic theory and practice. (73-79). London: Edward Arnold.
Ur, P. (1984) Teaching Listening Comprehension. Cambridge: Cambridge University Press.
White, G. (1987) 'The Teaching of Listening Comprehension to Learners of English as a Foreign Language: A Survey'. Unpublished M. Litt. Dissertation, The University of Edinburgh.


Born in Dublin, educated in England, Richard has taught English in France, Hong Kong, and Japan. Between May 1990 and September 2001, he worked at The University of Birmingham's (UK) English for International Students Unit (EISU). He now works free-lance, continuing his research, and applying the results of this research in teacher training and classroom materials. His research and teaching centre on spontaneous speech, which he attempts to analyse on its own terms – in all its continually varying, stream-like, real-time, contextual glory.

Richard's web site, which contains his research & articles, can be found at:

And he can be contacted at:
