A third important distinction concerns the type of method considered. Generally, one can distinguish between interchangeable methods and methods that are structurally different (Kenny, 1995). Interchangeable methods cannot be distinguished with respect to psychological criteria. An example could be students who are randomly chosen from the classes of different teachers to provide a rating of the teaching quality. In this case, there is no structural difference between the students. All students have more or less the same access to the teacher's behavior, and it does not really matter who rates the teacher. Interchangeable and randomly selected raters are typically used if one is interested in (a) measuring a trait (e.g., teaching ability) and (b) estimating the precision with which this trait can be measured on the basis of multiple ratings (convergent validity).
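
The precision mentioned in (b) can be made concrete with a small numerical sketch. The text does not name a specific index; the intraclass correlation from a one-way random-effects analysis of variance is assumed here as one common choice for interchangeable raters. The function name `icc_one_way` and the ratings are hypothetical and serve only as an illustration.

```python
from statistics import mean


def icc_one_way(ratings):
    """Return (single-rater ICC, ICC of the mean rating).

    `ratings` is a list of rows, one row per teacher, each holding the
    ratings of k interchangeable raters. Assumes a balanced design
    (same k for every teacher) and a one-way random-effects model.
    """
    n = len(ratings)
    k = len(ratings[0])
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a balanced design with n >= 2 and k >= 2")

    row_means = [mean(row) for row in ratings]
    grand_mean = mean(x for row in ratings for x in row)

    # Between-teacher and within-teacher mean squares.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))

    # Reliability of one rating and of the mean of k ratings.
    # Not guarded against ms_between == 0 (no variation between teachers).
    icc_single = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    icc_average = (ms_between - ms_within) / ms_between
    return icc_single, icc_average


# Hypothetical data: 4 teachers, each rated by 3 randomly chosen students.
example_ratings = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 2, 3],
]
icc_single, icc_average = icc_one_way(example_ratings)
print(f"single rating: {icc_single:.3f}, mean of 3 ratings: {icc_average:.3f}")
```

Because the raters are interchangeable, averaging over them is meaningful, and the reliability of the mean rating exceeds that of a single rating.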

The situation is quite different if one asks, for example, the teacher him- or herself, a student, and the principal of the school to rate the teaching quality of the teacher. In this case, the three raters are structurally different because they are not randomly chosen from the same set of possible raters. Whereas in the case of multiple student ratings it is reasonable to assume that the different students have the same access to the teacher's behavior, this assumption does not hold for the teacher, the principal, and the student. Because the raters have different perspectives, it might be more interesting to contrast the ratings and to explain the differences between them. For example, it would be interesting to find out why the ratings of the principal and the student might differ from the self-report. Whereas the mean value of randomly chosen students is a reasonable measure of the teacher's quality of teaching (as the average opinion of the students), this is not necessarily the case for structurally different raters if one does not know why the ratings differ. If the principal has never visited the teacher while he or she was teaching, one might hesitate to define the teaching quality as the mean of the three ratings. Nevertheless, it would be interesting to analyze why the principal's view differs from the teacher's and the student's view to learn more about principals' subjective theories of teaching quality. In a similar vein, it would be interesting to examine the differences between the views of the teacher and the student rather than simply aggregating the two ratings to diminish method specificity.

The distinction between randomly selected (interchangeable) and structurally different raters is quite similar to the distinction between random and fixed factors in the analysis of variance (e.g., Hays, 1994). In the case of random factors, the different groups (methods) of a factor are considered to be randomly chosen from a population, and the researcher aims to estimate the variance attributable to the factor. In the fixed-effects model of analysis of variance, the aim is to analyze the effects of the specific groups and to contrast them. Hence, the distinction between interchangeable and structurally different methods has consequences for the choice of a methodological approach.
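
The analogy can be written out for a simple one-way layout. The notation below is an assumption introduced for illustration and is not taken from the source: $Y_{ij}$ denotes observation $i$ obtained with method $j$, $\mu$ the grand mean, $\alpha_j$ the effect of method $j$, and $\varepsilon_{ij}$ a residual.

```latex
\begin{align*}
  % Common model equation for both cases
  Y_{ij} &= \mu + \alpha_j + \varepsilon_{ij} \\[4pt]
  % Random factor (interchangeable methods): levels are sampled,
  % and the quantity of interest is a variance component
  \text{random factor:} \quad
    & \mathrm{E}(\alpha_j) = 0, \quad \mathrm{Var}(\alpha_j) = \sigma_\alpha^2,
    \quad \text{target: } \sigma_\alpha^2 \\[4pt]
  % Fixed factor (structurally different methods): levels are the
  % specific methods of interest, and the targets are contrasts
  \text{fixed factor:} \quad
    & \sum_j \alpha_j = 0,
    \quad \text{target: } \alpha_j - \alpha_{j'} \ \ (j \neq j')
\end{align*}
```

In the random case the individual $\alpha_j$ are of no interest in themselves, which corresponds to raters for whom "it does not really matter who rates the teacher." In the fixed case each $\alpha_j$ belongs to an identifiable method, such as the self-report or the principal's rating, so differences between specific methods become the object of analysis.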

