ficulty of the Rasch model is 0.36, so that the task difficulty is very much the same in both models. This does not have to be the case for all items. If the estimates from both models were approximately the same, they would be positioned on the diagonal in Figure 18.3. This would mean that no interactions between content and cognitive components exist.

In fact, Figure 18.3 shows that the congruence of the task difficulties from both models is not very high. As expected from the difference in the log likelihoods, the item parameters of the Rasch model and those calculated from the item components of the LLTM differ substantially from each other. The hypothesis of a simple additive model of content and cognitive components does not hold for the German PISA science test.

However, the reliability of the tests of both (unidimensional) models does not reflect the lack of fit of the main effects model. The reliability, estimated as the ratio of latent and WLE variance, is .812 for the Rasch model and .777 for the LLTM. The superiority of the Rasch model is better reflected by its higher variance (1.09) than that of the LLTM (.81).

In Section 2, the distinction between difficulty and ability models was introduced. According to this distinction, both models considered so far (the LLTM and the Rasch model) are difficulty models. With regard to the present data, the impact of the methods on the response probabilities certainly cannot be modeled by a difficulty model. Perhaps an ability model based on the same distinction between content and cognitive components is more appropriate for these data. The assumption that the methods refer to their own latent dimensions is rather plausible for the PISA science test example because what is called cognitive components may also be considered as cognitive competencies, that is, as trait variables.

This leads us to the multidimensional multi-method Model (8), where latent variables are assumed for each method and difficulty parameters for each content. The model may be considered a multidimensional generalization of the LLTM, and it has a log likelihood of -14,138, which is notably better than the log likelihood of the ordinary LLTM. A total of 38 independent parameters have to be estimated: 10 content parameters, 7 latent variances, and 21 latent covariances.
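The parameter count quoted above follows from simple arithmetic: a model with k latent dimensions has k variances and k(k - 1)/2 distinct covariances. The following is a minimal sketch of that bookkeeping; the function name `count_parameters` and its arguments are illustrative assumptions, not part of any estimation software used in the chapter.

```python
def count_parameters(n_item_params: int, n_dims: int) -> dict:
    """Count independent parameters of a multidimensional model.

    n_item_params: number of difficulty parameters (here: content parameters)
    n_dims:        number of latent dimensions (here: cognitive components)
    """
    n_variances = n_dims                          # one variance per dimension
    n_covariances = n_dims * (n_dims - 1) // 2    # distinct off-diagonal pairs
    return {
        "item_params": n_item_params,
        "variances": n_variances,
        "covariances": n_covariances,
        "total": n_item_params + n_variances + n_covariances,
    }


# Seven-dimensional LLTM from the text: 10 content parameters, 7 dimensions.
counts = count_parameters(n_item_params=10, n_dims=7)
print(counts)  # total is 38, with 7 variances and 21 covariances
```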

The difficulty parameters of the items (content) are strongly related in both the unidimensional and the seven-dimensional models (r = .98). Figure 18.4 shows the latent variances and the WLE variances of Model (8), which were estimated along with the seven-dimensional LLTM. As can be seen, the latent variances range from a small value (0.235 for "mental model") to relatively large values for "describing the phenomenon" (1.239) and "dealing with numbers" (1.769). In contrast, the WLE variances do not have this broad range, but vary from 0.858 ("mental model") to 1.869 ("dealing with numbers"). As a consequence, the reliabilities of the seven competencies also vary considerably.

Table 18.7 presents the reliability estimates, measured as the ratio of the latent variance to the WLE variance, for the first four models discussed in this chapter. At the top of the table the reliabilities of the unidimensional LLTM (second column) and of the Rasch model (third column) are presented. The lower portion of the table shows the reliabilities of the seven cognitive competencies for the seven-dimensional LLTM (second column) and the seven-dimensional Rasch model (third column). Some of the reliabilities are very low (e.g., for "mental models," seven-dimensional LLTM), which may be due to a floor effect, that is, the mental model tasks only have a mean solution probability of 0.17. Nevertheless, for most of the competencies, the reliabilities are quite high, which confirms that a multidimensional approach for analyzing the data is more appropriate than a unidimensional approach.

Whether each trait requires its own parameter or whether a smaller number of dimensions would suffice to describe the data are questions that have not been subjected to model fit tests but have been answered by an exploratory principal components analysis. The first two principal components explain 88.6 percent of the variance. This is a strong indicator that two dimensions might suffice. The structure and interpretation of this two-factor space is very similar to that of the next model; therefore, a separate presentation of the results is not provided here.
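The reliability measure used in this section, the ratio of latent variance to WLE variance, can be illustrated with the two pairs of variances quoted above for Model (8). The sketch below is only a worked calculation under that definition; the helper name `reliability` is an assumption for illustration, and the resulting values are computed here from the quoted variances rather than copied from Table 18.7.

```python
def reliability(latent_variance: float, wle_variance: float) -> float:
    """Reliability as the ratio of latent variance to WLE variance."""
    if wle_variance <= 0:
        raise ValueError("WLE variance must be positive")
    return latent_variance / wle_variance


# (latent variance, WLE variance) as quoted in the text for Model (8)
variances = {
    "mental model": (0.235, 0.858),
    "dealing with numbers": (1.769, 1.869),
}

for name, (latent, wle) in variances.items():
    print(f"{name}: {reliability(latent, wle):.3f}")
# mental model: 0.274
# dealing with numbers: 0.946
```

The small latent variance of "mental model" relative to its WLE variance is what drives the low ratio, which is consistent with the floor effect discussed above.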

The next step of the analysis is motivated by the question of whether the fit of the seven-dimensional LLTM can be improved further if the assumption of a main effect of content on the task difficulty is dropped. The resulting model is the multidimensional Rasch model without a decomposition of the task parameters. The model has 70 item parameters (instead of 10) and, like the seven-dimensional LLTM, 7 latent variances and 21 covariances. Then the total number of independent
