Methodology Illustrations For Translated Tests

The Minnesota Multiphasic Personality Inventory-2 (MMPI-2) has 150 translations and is available in 22 of these languages for assessment of psychopathology. Butcher et al. (1998) outlined procedures to attain internal relevance of the MMPI-2 for a culture, as well as cross-cultural equivalence. Translation was conducted so as to promote linguistic equivalence, using multiple bilingual translators who were required to have lived in a country for 5 years or to have demonstrated equivalent experience. Further, either at least two translators independently translated from English to the target language, or a committee of professionals translated independently and then discussed the items until the best and most socially appropriate versions were selected for inclusion. Back translation of test items was repeated until all problematic items were satisfactorily translated. Following back translation, the test publisher used an independent center to evaluate each translation for accuracy and readability. Bilingual test-retest studies were conducted, similar to those used to determine the equivalence of alternative forms of a test or reliability in a test construction design. Statistical analyses were conducted to evaluate item equivalence, translation equivalence, and measurement equivalence. Factor analysis was used to examine construct validity. Criterion validity of the translated test in the new culture was studied to determine whether the test operated clinically in the target culture in the same manner as in the United States. Normative validity was also assessed to determine whether American norms could be used or adapted, or whether new norms for the target culture had to be developed.

A Spanish version of another scale, the Mattis Dementia Rating Scale, has recently been developed for use with Spanish-speaking elderly (Arnold, Cuellar, & Guzman, 1998). This Spanish revision for use across cultures involved adaptation of the test for linguistic equivalence, using several of the procedures discussed above. Linguistic equivalence was established using a translation-back translation procedure. Internal consistency of the adaptation was studied, along with normative equivalence and clinical equivalence of the Spanish version. Clinical equivalence was investigated in terms of the utility of the revised instrument in differentiating impaired and nonimpaired individuals.
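Internal consistency of an adapted scale is commonly summarized with Cronbach's alpha. The source does not say which coefficient Arnold, Cuellar, and Guzman used, so the following is only a minimal sketch of how two language versions might be compared. The function name and all item scores are hypothetical illustrations, not data from the study.

```python
from statistics import pvariance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across respondents.
    item_vars = [pvariance([r[i] for r in rows]) for i in range(k)]
    # Variance of the summed scale score across respondents.
    total_var = pvariance([sum(r) for r in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical item scores (5 respondents x 3 items) for each language version.
english_items = [[2, 3, 3], [4, 4, 5], [1, 2, 1], [5, 4, 5], [3, 3, 4]]
spanish_items = [[3, 3, 4], [4, 4, 4], [2, 3, 3], [4, 3, 5], [3, 4, 3]]

print(round(cronbach_alpha(english_items), 2))
print(round(cronbach_alpha(spanish_items), 2))
```

A markedly lower coefficient for one version would suggest that its items hang together less well in that language group, which is one signal that the adaptation needs further work.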

In another study, several tests from the Halstead-Reitan neuropsychological test battery were investigated in three linguistically distinct samples: English, bilingual English-Spanish, and Spanish (Arnold et al., 1994). These groups had been composed based on level of Mexican American acculturation to produce culturally distinctive groups. In addition to showing significant group differences based on acculturation level, results suggested that adjustments based on acculturation level could prove clinically useful by improving diagnostic classification and accuracy of information provided by some of the instruments, particularly the Category Test.

A Spanish language version of the Strong-Campbell Interest Inventory (SCII-S) was developed and evaluated using back translation, bilingual field testing, and independent expert opinion. The construct validity of the SCII-S was then explored by administering both the English and Spanish versions of the Strong-Campbell to one bilingual group of high school students. A confirmatory factor analysis was then applied to determine whether a common factor structure existed for the two versions of the test, with subsequent documentation of both convergent and divergent validity for method and trait factors measured by this test (Fouad, Cudeck, & Hansen, 1984).

In another example, development of a patient satisfaction scale (Hayes & Baker, 1998) began with the translation-back translation method. Additionally, a decentering technique, with adjustments to the English version, the Spanish version, or both, was applied in order to produce linguistically equivalent versions for cross-cultural use.

Decentering involves enhancement of the readability of both the original and translated instrument by adjusting both as needed (Werner & Campbell, 1970). Reliability and validity coefficients were produced for both English and Spanish versions of the decentered scale. Item and scale score distributions were analyzed using two methods of response dichotomization to compare for differences. Results showed both versions to be reasonably reliable and valid, although the Spanish version was less so. Problems with the Spanish version of the scale appeared to be associated with response tendencies observed for Spanish-speaking individuals, who tended to respond "good" to items more frequently than their English-speaking counterparts. The important issue of culturally produced response tendencies is discussed in more depth in the prior section focusing on technical equivalence. The MMPI, for example, has been shown to display L-scale elevations that may reflect cultural, rather than clinical, test-taking variables (Montgomery & Orozco, 1985). Response format and dichotomization of responses were viewed as important areas to address in developing similar test translations (Hayes & Baker, 1998).
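The effect of response dichotomization on group comparisons can be illustrated with a small sketch. The source does not specify the two dichotomization methods used, so the five-point coding (1 = poor to 5 = excellent), the two cut points, and the ratings below are all assumptions chosen only to show how a tendency to answer "good" can change the apparent difference between groups.

```python
def favorable_rate(ratings, threshold):
    """Share of ratings at or above the cut point (dichotomized as 'favorable')."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    return sum(r >= threshold for r in ratings) / len(ratings)


# Hypothetical ratings on an assumed scale: 1=poor, 2=fair, 3=good,
# 4=very good, 5=excellent. The Spanish group clusters on 3 ("good").
english = [5, 4, 5, 3, 4, 2, 5, 4]
spanish = [3, 3, 4, 3, 5, 3, 3, 4]

# Assumed method A: "good" or better counts as favorable.
# Assumed method B: "very good" or better counts as favorable.
for label, cut in (("good or better", 3), ("very good or better", 4)):
    print(label, favorable_rate(english, cut), favorable_rate(spanish, cut))
```

With these made-up ratings, the Spanish group looks more satisfied under the first cut point and less satisfied under the second, which is why the choice of cut point matters when a response category is used differently across language groups.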

Translation of the Job Descriptive Index (Campbell, 1970), described as one of the most carefully developed measures of job satisfaction, began with a target-language translation, followed by back translation to the source language. Measurement equivalence was established with three different groups, item bias analysis was completed, and relational equivalence was found (Drasgow & Hulin, 1987). Translation effects were observed in comparative analysis: when one culture's population possessed bilingual abilities, translation effects were observed within the culture. Greater differences were found across cultures where test takers spoke the same language (Hulin, 1987).

In a unique study to construct a dementia screening measure for use in two groups with different cultural and linguistic identities, Hall et al. (1993) attempted to develop an instrument independent of culture and language for use with consumers and informants. The approach is described as similar to that used for the neuropsychological battery developed by the World Health Organization (WHO; Maj et al., 1991). This study promoted the use of harmonization (WHO, 1990, as reported in Maj et al., 1991), which requires that the instrument be consistent with the cultural, linguistic, and educational norms of the targeted cultural group. Prior to the typical translation-back translation protocol, the study began with identification of the cognitive and behavioral dimensions to be measured. The dimensions were selected to be consistent with current diagnostic criteria for dementia. The relevance of each dimension to the target culture was reviewed and discussed by an interdisciplinary, culturally competent team, and draft questions were constructed for the target culture. The translation-back translation process was conducted by two independent translators and the interdisciplinary team, who reviewed each item so as to promote harmonization. A pretest was conducted in the two culturally distinct groups to assess acceptability, reliability, and validity. Pretest data were also analyzed both for discriminant function and to determine a cutoff score for dementia. A subsequent community survey and clinical assessments were completed to determine, again, comparability, reliability, and validity, and to estimate prevalence rates, sensitivity, and specificity of the new instrument.
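Sensitivity and specificity at a given cutoff score can be computed by comparing screening results with clinical diagnoses. The sketch below is illustrative only: the function name, the rule that scores at or below the cutoff count as screen-positive, and the scores and diagnoses are hypothetical and are not taken from Hall et al. (1993).

```python
def screen_accuracy(scores, has_dementia, cutoff):
    """Return (sensitivity, specificity), treating score <= cutoff as screen-positive."""
    tp = fn = tn = fp = 0
    for score, diagnosed in zip(scores, has_dementia):
        positive = score <= cutoff  # assumption: lower scores indicate impairment
        if diagnosed and positive:
            tp += 1
        elif diagnosed:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diagnosed and one non-diagnosed case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical screening scores paired with clinical diagnoses.
scores = [10, 12, 20, 25, 18, 30]
diagnoses = [True, True, False, False, True, False]

sensitivity, specificity = screen_accuracy(scores, diagnoses, cutoff=15)
print(round(sensitivity, 2), round(specificity, 2))
```

Running the same calculation separately in each cultural group, and at several candidate cutoffs, is one simple way to check whether a single cutoff performs comparably across groups.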
