Guidelines of the International Test Commission for Adapting Tests (van de Vijver & Leung, 1997, and Hambleton, 1999)

The initial guidelines relate to the testing context, as follows.

1. Effects of cultural differences that are not relevant or important to the main purposes of the study should be minimized to the extent possible.

2. The amount of overlap in the constructs in the populations of interest should be assessed.

The following guidelines relate to test translation or test adaptation.

3. Instrument developers/publishers should ensure that the translation/adaptation process takes full account of linguistic and cultural differences among the populations for whom the translated/adapted versions of the instrument are intended.

4. Instrument developers/publishers should provide evidence that the language used in the directions, rubrics, and items themselves as well as in the handbook [is] appropriate for all cultural and language populations for whom the instrument is intended.

5. Instrument developers/publishers should provide evidence that the testing techniques, item formats, test conventions, and procedures are familiar to all intended populations.

6. Instrument developers/publishers should provide evidence that item content and stimulus materials are familiar to all intended populations.

7. Instrument developers/publishers should implement systematic judgmental evidence, both linguistic and psychological, to improve the accuracy of the translation/adaptation process and compile evidence on the equivalence of all language versions.

8. Instrument developers/publishers should ensure that the data collection design permits the use of appropriate statistical techniques to establish item equivalence between the different language versions of the instrument.

9. Instrument developers/publishers should apply appropriate statistical techniques to (a) establish the equivalence of the different versions of the instrument and (b) identify problematic components or aspects of the instrument which may be inadequate to one or more of the intended populations.

10. Instrument developers/publishers should provide information on the evaluation of validity in all target populations for whom the translated/adapted versions are intended.

11. Instrument developers/publishers should provide statistical evidence of the equivalence of questions for all intended populations.

12. Nonequivalent questions between versions intended for different populations should not be used in preparing a common scale or in comparing these populations. However, they may be useful in enhancing content validity of scores reported for each population separately. [emphasis in original]
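
Guidelines 9 and 11 call for "appropriate statistical techniques" without naming one. As an illustration only, the sketch below applies one widely used option, the Mantel-Haenszel procedure for detecting differential item functioning, to a single item. The guidelines do not prescribe this method. The function names, the "ref"/"focal" group labels, and the record layout are hypothetical choices made for this example, and the matching score is assumed to be each examinee's total test score.

```python
import math
from collections import defaultdict


def mantel_haenszel_odds_ratio(records):
    """Common odds ratio for one item across matched score strata.

    records: iterable of (group, matching_score, correct), where group is
    "ref" or "focal" (hypothetical labels) and correct is 0 or 1.
    """
    # One 2x2 table per stratum, stored as
    # [ref correct, ref incorrect, focal correct, focal incorrect].
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for group, score, correct in records:
        if group not in ("ref", "focal"):
            raise ValueError(f"unknown group: {group!r}")
        index = (0 if group == "ref" else 2) + (0 if correct else 1)
        tables[score][index] += 1

    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables.values():
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("odds ratio undefined: no informative strata")
    return numerator / denominator


def mh_d_dif(alpha):
    """Rescale the odds ratio to the ETS delta metric (MH D-DIF)."""
    return -2.35 * math.log(alpha)
```

An odds ratio near 1 (MH D-DIF near 0) suggests that examinees of equal overall score in the two language groups perform similarly on the item. Values far from 1 flag the item for the kind of review described in guideline 12.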

The following guidelines relate to test administration.

13. Instrument developers and administrators should try to anticipate the types of problems that can be expected and take appropriate actions to remedy these problems through the preparation of appropriate materials and instructions.

14. Instrument administrators should be sensitive to a number of factors related to the stimulus materials, administration procedures, and response modes that can moderate the validity of the inferences drawn from the scores.

15. Those aspects of the environment that influence the administration of an instrument should be made as similar as possible across populations for whom the instrument is intended.

16. Instrument administration instructions should be in the source and target languages to minimize the influence of unwanted sources of variation across populations.

17. The instrument manual should specify all aspects of the instrument and its administration that require scrutiny in the application of the instrument in a new cultural context.

18. The administration should be unobtrusive, and the examiner-examinee interaction should be minimized. Explicit rules that are described in the manual for the instrument should be followed.

The final grouping of guidelines relates to documentation that is suggested or required of the test publisher or user.

19. When an instrument is translated/adapted for use in another population, documentation of the changes should be provided, along with evidence of the equivalence.

20. Score differences among samples of populations administered the instrument should not be taken at face value. The researcher has the responsibility to substantiate the differences with other empirical evidence. [emphasis in original]

21. Comparisons across populations can only be made at the level of invariance that has been established for the scale on which scores are reported.

22. The instrument developer should provide specific information on the ways in which the sociocultural and ecological contexts of the populations might affect performance on the instrument and should suggest procedures to account for these effects in the interpretation of results.

