Critiquing Qualitative Research Articles
Mark Firth
Methodology: instruments

Is evidence for validity and reliability clearly presented and adequate? Whilst it is generally accepted in qualitative research that absolute validity cannot be achieved, maximizing validity should always remain a goal of the researcher. ‘Validity then, should always be seen as a matter of degree rather than as an absolute state’ (Gronlund, 1981). The decision to use a combination of data collection instruments was a useful way of attempting to understand what kinds of difficulties students were having. It eliminated the possibility of mono-method bias, that is, the threat to validity that arises where a construct is measured by only one means (Trochim, 1999). As a result, a reasonable level of depth and scope regarding learner difficulties could be attained. If the interviews, diary entries and verbalizations were carried out honestly and authentically, then these instruments could be seen as some of the most valid and practical for ascertaining what invisible, cognitive problems are occurring.

Reliability in quantitative research is often thought of as consisting of stability, equivalence and internal consistency (Cohen et al., 2000: p.117). Within the qualitative paradigms, however, the positivist canons of reliability are viewed as debatable at the very least and, at most, irrelevant. This does not in any way detract from the pursuit of reliable and accurate research; rather, it throws light back on to the concerns of validity: how effectively the researcher has captured the truth about the way a culture is behaving at a given moment, and the ramifications of this knowledge. For this reason ‘reliability’ is often recast by qualitative researchers (Guba & Lincoln, 1985) as ‘dependability’. The article by Goh refers to few measures taken to ensure dependability. For example, we do not know how accustomed the students were to writing reflective journals, nor do we have any details about how the small group interviews were conducted. Further reliability issues are discussed below.

Is there a clear description of the instrument and how it was used? The article does not explicitly give details about what the students were asked to write about in their listening diaries. We can assume from references in the data analysis that transcriptions of the interviews were made, but the methods section does not state clearly enough the procedures that the researcher planned to undertake.

Similarly, what was to be done with the data, i.e. the way in which labeling was done before categorization, is not mentioned until later in the article. Qualitative research is often seen as a process of going backwards and forwards between the data and the design as the data reveals concepts to the researcher. In this case, however, given the pre-operationalized nature of the design, the researcher clearly envisaged what she was going to do and could have detailed it here. Finally, no description is given of the verbalisation procedure developed by Ericson & Simon (1987, 1993). If these principles for collecting verbal data could be verified as reliable, the results and interpretations of this research could likewise be viewed as more stable.

Is there a clear description of how the instrument was administered? The lack of details concerning the administration of the diaries, interviews and verbalization process could be seen as the greatest weakness of this study. It would have been better to have a full description of who conducted them and the duration of each instrument. When reading this article, questions that come to mind which need answering include:

  • What instructions and guidelines were students given for completing diaries?
  • What was the nature of the listening texts? Were they monologues, dialogues, varieties of tapes, videos and live conversations? What was the degree of difficulty?
  • How much were students required to write and were they given any assistance in spelling, wording or writing the diaries?
  • Who conducted the small group interviews and were they trained to do so?
  • Was any credit given to students for participating in the interviews or verbalisation procedure?
  • How many interview and verbalisation sessions took place, and how much data did they produce?

Is it likely that subjects would fake their responses? One of the strengths of this study is the likely reliability of the responses, given that its aim is to understand learner difficulties. In other words, there is no immediate reason why respondents would feel a need to ‘perform’ or to give politically desirable information, as there is a perceived mutual benefit for researcher and participant alike.

Are interviewers and observers trained? This is critical for ensuring the consistency and reliability of data. Punch (ibid:p.175) contends that interviewers require training to maintain a common approach, so that their questions and techniques are in line with each other. Since we don’t know how many interviewers there were, we cannot know about their competencies.

Methodology: procedures

Are there any clear weaknesses in the design of the study? Paradoxically, the strength of the study could be viewed as one of its weaknesses. For all intents and purposes, gathering reflective data in order to superimpose a cognitive processing classification model denies the data the chance to speak for itself in any other way. It would be interesting to use a different coding system and to avoid setting any preconceived criteria for classification. This does not threaten the validity of the study in any way but rather indicates the extent to which a holistic approach was not undertaken.

Are the procedures for collecting information described fully? It would have been helpful to have more detailed information about what kinds of questions made up the ‘semi-structured interviews’. We can only suppose that the questions were not overly suggestive and did not lead student responses. Even so, for replicability in future studies, or simply for reference, it would have been beneficial for professionals to know the kinds of questions they can ask learners who are having difficulties.

Is it likely that the researcher is biased? In this article Goh mentions some safeguards which she took against bias, namely the use of colleagues to triangulate classifications of learner problems. While the use of transcripts removes the possibility of observer bias affecting the results, some misinterpretation of the data in the coding is still possible. This stems from the fact that the learners were required to examine their beliefs in a highly metacognitive manner, which requires extensive training, as contended by Kohonen, Jaatinen & Lehtovaara (2001) and as recommended by Goh herself in the discussion section of the study.

Ironically, the area of study in which this investigation takes place (understanding meaning in a second language) is also the very concern in interpreting the data. The study of comprehension processes is the side of psycholinguistics which looks at discourse analysis or, put simply, how we make sense of texts (Finch, 2000: p.199). According to Widdowson (1993), schema theory operates on two levels: systemic (the phonological, morphological and syntactic components of language) and schematic (our background knowledge). For comprehension of language, a match is required between the encoded systemic text and our own schematic knowledge.

In sum, difficulties may have arisen at any of the following stages of communication:

  • When students were asked to explain the difficulties they had in listening, they may not have fully understood what they were being asked.
  • When students voiced their various difficulties, they may easily have had trouble verbalizing such higher-order issues in a second language.
  • When the researcher analyzed the data, we rely heavily upon her schema (and perhaps that of one or two colleagues) for accurate interpretation.

It may have been useful to allow learners to report their difficulties in both Chinese and English for a richer conceptualization. We are also not told whether the comments that were made in Chinese were translated and recorded; after all, these may have been the most indicative verbalizations of the listening problems faced.

