Time Is Either Ignored or Arbitrary

The fourth characteristic of general research in I/O psychology is that our data are usually cross-sectional and static. As in other areas of psychology, it is difficult to obtain longitudinal data sets. As a result, process theory about how variables should be causally related is relegated to the introduction and discussion sections of papers, whereas the method and results sections describe cross-sectional data collection. Designs that rely on such static, between-person variance in measures are useful for a substantial but not unlimited range of questions.

Relying on static data collections forces researchers to make three assumptions: (a) within-person variance is either uninteresting, error, or will not address our theoretical questions; (b) measurement operations used to assess static or aggregated measures are immune from influences due to respondents' current standing on constructs or this extraneous variance is small, random, and can be relegated to the error term; and (c) we know the intervals across which we should aggregate recall measures and participants can accurately aggregate.

The first point, that within-person variance is not interesting, is important. It can best be addressed with the very data that our research designs do not typically collect: within-person data. Closer examination of many theories will likely reveal that they would be more completely addressed by analyzing both within- and between-person variance in their central constructs. For example, up to the early 1990s the field defined "affective reactions" to one's job as relatively static job satisfaction. Only recently has attention been focused on the idea that employees may not have stable levels of job satisfaction across time and that this dynamic variance is systematically related to important variables (Ilies & Judge, 2002; Weiss & Cropanzano, 1996).

The second assumption is more difficult to address because, even granting that dynamic variance may be random across individuals, it will not be randomly distributed across responses to the questionnaire. Suppose, for example, that commitment and job satisfaction are positively correlated over time. In periods when one is committed to the organization, one also has high levels of job satisfaction. If we measure both variables in only one time period and inspect only the between-persons correlation matrix, this within-persons correlation will inflate the size of the observed between-persons relationship. Brief and his colleagues demonstrated that state variables can influence responses on supposedly static instruments when they induced higher scores on a "static" job satisfaction instrument by elevating state mood with a gift of a cookie (Brief, Butcher, & Roberson, 1995).
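The inflation argument can be illustrated with a small simulation. This is a minimal sketch under assumptions that are ours, not the chapter's: each person's stable levels of commitment and satisfaction are drawn independently (so the true between-person correlation of the stable components is zero), a shared momentary state with unit variance shifts both scores on every occasion, and each score carries independent noise with a standard deviation of 0.5. All names and parameter values are hypothetical. Under these assumptions the population correlation at a single occasion is 1 / 2.25, or about 0.44, even though the stable components are unrelated, whereas the correlation between person means over many occasions shrinks toward zero.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate(n_persons=2000, n_occasions=50, seed=0):
    """Return (single-occasion r, r between person means over occasions)."""
    rng = random.Random(seed)
    single_c, single_s, mean_c, mean_s = [], [], [], []
    for _ in range(n_persons):
        # Stable (trait) levels, drawn independently: assumed, not estimated.
        trait_commitment = rng.gauss(0, 1)
        trait_satisfaction = rng.gauss(0, 1)
        c_scores, s_scores = [], []
        for _ in range(n_occasions):
            # A shared momentary state moves both constructs together.
            state = rng.gauss(0, 1)
            c_scores.append(trait_commitment + state + rng.gauss(0, 0.5))
            s_scores.append(trait_satisfaction + state + rng.gauss(0, 0.5))
        # Cross-sectional design: keep only the first occasion.
        single_c.append(c_scores[0])
        single_s.append(s_scores[0])
        # Repeated-measures design: aggregate within person.
        mean_c.append(sum(c_scores) / n_occasions)
        mean_s.append(sum(s_scores) / n_occasions)
    return pearson(single_c, single_s), pearson(mean_c, mean_s)


cross_sectional_r, person_mean_r = simulate()
print(f"single-occasion r: {cross_sectional_r:.2f}")
print(f"person-mean r:     {person_mean_r:.2f}")
```

The gap between the two correlations is the part of the cross-sectional relationship that comes from within-person covariation. A single-occasion design cannot separate it from the stable between-person relationship.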

Third, the appropriate interval across which to aggregate observations depends on an understanding of the rate of change of our constructs and a well-articulated theory of organizational and individual time. Such theory would specify intervals across which stability can be expected and the relative amounts of change expected across other, longer, intervals. For example, stability of job attitudes will depend on economic, political, organizational, and psychological processes. How many times does the boss need to engage in harassment or how often does it need to occur before attitudes change? Is once enough or must it become a pattern spaced over time? How fast do people change their evaluation of their job? How general are the factors that cause change over time? What are the temporal characteristics of the feedback from behaviors onto the attitudes that precipitated the behaviors? Without theories that provide answers to such questions, we use what seems intuitively appropriate. We are operating at the intersection of organizational and psychological time, and we have little guidance. So, we slice into an organization at one time point and ignore trajectories of variables that may alter our observations. These are issues that have received little attention in the literature beyond a few comprehensive theories (e.g., Naylor, Pritchard, & Ilgen, 1980) and recent studies that document rates of change following, for example, organizational entry (Chan & Schmitt, 2000). The result of this oversight is temporal misspecification. The time intervals across which measures are aggregated or recalled are arbitrary; they are often dictated neither by theoretical requirements nor by empirical data relevant to the appropriate length of time intervals in organizations or in the lives of organizational employees. Beyond a few examples, little thought is given to how fast or slow we might expect variables of interest to change or fluctuate across time. Applications of computational modeling in which rates of change are explicitly modeled based on differing sets of assumptions about underlying states and processes provide one avenue for studying temporal questions (Ilgen & Hulin, 2000).

The arbitrariness of time frame is evident from the wording of organizational surveys themselves. Subjects are often given an arbitrary time frame over which to integrate their experience for responding to our surveys (e.g., "in general how do you rate your . . ."). We have little good evidence about how individuals construct responses to such questions. Do they accurately recall their actual experiences, or do they use beliefs, implicit theories, stereotypes, and other heuristics to generate self-reports?

Evidence addressing this latter question exists for individuals making reports about how they feel or have felt. Studies that compare retrospective reports of affect (over the past few weeks) to actual reports taken during the same time period indicate that the two do not match (Thomas & Diener, 1990). Individuals fail to recall accurately their own affect because they are overly influenced by a variety of factors, including beliefs about particular situations (Arntz, van Eyck, & Heijmans, 1990;

Chapter 30


Herbert W. Marsh, Andrew J. Martin, and Kit-Tai Hau

In this chapter we begin with a brief overview of the construct validity approach that underpins our multimethod perspective on self-concept research. After briefly reviewing the theoretical basis for our self-concept research, we provide an overview of the different multimethod approaches used in this research program. We have, somewhat arbitrarily, divided this into four sections. First, we focus on a wide variety of applications of the multitrait-multimethod (MTMM) design, the traditional multimethod approach. Second, we briefly review some of our cross-cultural research where results from multiple countries are compared to evaluate the cross-cultural generalizability of our research. Third, we describe some additional analytic approaches that fit within our broader perspective of the multimethod approach. Finally, we explore some broader perspectives on the multimethod approach.


Psychology focuses on hypothetical constructs: unobservable, theoretical abstractions inferred indirectly on the basis of observable indicators of the construct. A critical issue is how well the observable indicators represent the hypothetical construct, that is, the extent to which the theoretical construct is well represented by the test scores, well defined, related to variables and conditions to which it is theoretically and logically connected, and unrelated to variables and conditions to which it is not theoretically and logically connected. Hence, evidence used to evaluate construct validity includes the content of measures, response processes by participants, internal structure in terms of consistency and factor structure, convergent and discriminant relations with other constructs, criterion-related validity, and validity generalization to relevant and similar situations or populations. To the extent that there are multiple indicators of each construct, it is typically possible to evaluate each indicator; discard or replace ineffective ones and assign appropriate weights to the others; evaluate and correct for measurement error; evaluate the internal structure of the indicators; and test for systematic, nonrandom sources of bias (e.g., method effects).
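As one concrete illustration of what multiple indicators make possible, the sketch below estimates internal consistency with Cronbach's alpha and then applies the classical correction for attenuation to an observed correlation. These two standard psychometric formulas are our choice of example; the chapter does not prescribe them, and the function names and the toy data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
    """
    k = len(responses[0])
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def correct_for_attenuation(r_xy, reliability_x, reliability_y):
    """Estimate the correlation between true scores from an observed correlation."""
    return r_xy / (reliability_x * reliability_y) ** 0.5


# Toy data (hypothetical): five respondents answering three items.
toy_responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
    [3, 3, 4],
]
alpha = cronbach_alpha(toy_responses)
print(f"alpha = {alpha:.2f}")

# An observed r of .30 between two scales that each have reliability .60
# corresponds to an estimated true-score correlation of .50.
print(correct_for_attenuation(0.30, 0.60, 0.60))
```

With a single indicator per construct, neither quantity can be estimated from the data at hand, which is the practical force of the argument for multiple indicators.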

In psychological research it is advisable to consider multiple outcome measures to test the construct validity of the outcome construct, rival hypotheses, and competing theories. For example, an intervention designed to enhance academic self-concept should have a stronger effect on academic self-concept than on physical self-concept. This provides a possible test of potential biases such as the Hawthorne effect, halo effects, or postgroup euphoria effects. Multiple outcome measures allow for tests of unintended outcomes (positive and negative). For example, interventions that enhance skills but lead to more negative self-concepts are likely to have very different implications from a program that increases both skill levels and the corresponding area of self-concept (see Marsh & Peart, 1988).

We would like to dedicate this chapter to D. Campbell and D. Fiske, who pioneered the multimethod approach with their development of multitrait-multimethod methodology that has been so central in our research. Our respect for their work and its influence on our research is shown in that Herbert W. Marsh is the person who has cited their classic work the most. We would also like to thank our many colleagues who have contributed to our self-concept research program. K.-T. Hau pursued this research, in part, while a Visiting Scholar at the SELF Research Centre (University of Western Sydney). The research was funded in part by grants from the Australian Research Council.

Particularly in nonexperimental research with variables that are not or cannot be experimentally manipulated, it is often desirable to have multiple indicators of the independent or mediating variables. Even in experimental and quasi-experimental studies, it is advisable to have multiple operationalizations of the experimentally manipulated variable. Thus, for example, Marsh and Peart (1988) compared competitive and cooperative interventions designed to enhance physical fitness. Although both interventions enhanced fitness, the cooperative intervention also enhanced physical self-concept, whereas the competitive intervention led to a reduction in physical self-concept relative both to pretest scores and to scores for a randomly assigned no-treatment control group. They argued that the short-term gains in physical fitness were likely to be undermined by declines in physical self-concept associated with the competitive intervention. Hence, construct validation is relevant to experimental as well as nonexperimental research.

It is also valuable to test the same hypothesis with different research methodologies. For example, the limitations and threats to the validity of interpretations are quite different in experimental, correlational, survey, action research, interview, and case study approaches. To the extent that there is a convergence in results from different research methodologies and samples, the construct validity of the interpretations is enhanced. Rather than argue about the relative merits of alternative methodologies, it makes more sense to recognize that no one methodological approach is inherently superior.

In conclusion, the critical ingredient underlying this cursory discussion of construct validity is the emphasis on multiple perspectives based on multiple methods. Good research involves the use of multiple indicators of each construct, multiple constructs and tests of their a priori relations, multiple outcome measures, multiple independent/manipulated variables, multiple methodological approaches, and multiple researchers with different methodological perspectives. In each case, the multiple perspectives provide a foundation for evaluating construct validity based on appropriate patterns of convergence and divergence and for refining measurement instruments, hypotheses, theory, and research agendas.


In their classic review of self-concept research, theory, and measurement, Shavelson, Hubner, and Stanton (1976) developed an influential multidimensional, hierarchical model of self-concept. Rather than emphasizing the shortcomings of existing self-concept research, Shavelson et al. contended that "our approach is constructive in that we (a) develop a definition of self-concept from existing definitions, (b) review some steps in validating a construct interpretation of a test score, and (c) apply these steps in examining five popularly used self-concept instruments" (p. 470). An ideal construct definition, they emphasized, should consist of the nomological network containing within-network and between-network components. The within-network portion pertains to specific features of the construct—its components, structure, and attributes and theoretical statements relating these features. Within-network studies test, for example, the dimensionality of self-concept to show that the construct has consistent, distinct multidimensional components (e.g., physical, social, and academic self-concept) using empirical techniques such as factor analysis or MTMM analysis. The between-network portion of the definition locates the construct in a broader conceptual space, establishing a logical, theoretically consistent pattern of relations between measures of self-concept and other constructs. Hence, as early as 1976, self-concept was developed along lines demanding multimethod approaches to support its validity.

Factor analysis played a contentious role in early self-concept research. Historically, most evaluations of the dimensionality of self-concept measures were exploratory factor analyses (e.g., see Marsh & Richards, 1988; also see Shavelson et al., 1976; Wylie, 1989) intended to "discover" the underlying factors based on responses to large pools of items that were not derived from an explicit theoretical model. Because of a combination of poorly designed instruments and reliance on exploratory factor analyses, items typically loaded on multiple factors, and observed factors were ambiguous in relation to a priori factors and not replicable in subsequent studies. Marsh and Hocevar (1985) provided one of the early applications of confirmatory factor analysis (CFA) to evaluate first- and higher-order factor self-concept structures in relation to responses to an instrument specifically constructed to test theoretical predictions from the Shavelson et al. (1976) model. The use of multiple indicators to measure a latent construct through the application of CFA and other appropriate statistical analyses is a standard starting point in a multimethod approach to construct validation.

Consistent with this construct validity perspective, Marsh (1993a; Marsh, Craven, & Debus, 1998) argued that theory, measurement, and empirical research are inexorably intertwined so that the neglect of one will undermine the others. From this perspective, Shavelson et al. (1976) provided a theoretical blueprint for constructing self-concept instruments, designing within-network studies of the proposed structure of self-concept, testing between-network hypotheses about relations with other constructs, and eventually rejecting and revising the original theory (Marsh & Hattie, 1996). This chapter examines a number of methods that have been pivotal to our evolving self-concept research program specifically and to the development of this construct as one of the most important constructs in educational psychology. We show, through presentation of multiple methods in self-concept research, that multimethod research offers the researcher enormous advantages that have the potential to substantially enhance the validity of findings within any research program.

Particularly in the last decade, there have been substantial advances in the methodological sophistication of self-concept research that have been stimulated in part by the development of stronger, multidimensional self-concept instruments. Here we briefly summarize some of the methodological approaches that have been particularly effective in answering some of the "big" questions emanating from our research program. Although presented in the context of self-concept research, the issues, challenges, and multimethod solutions should have broad applicability. We also emphasize that new and possibly more appropriate methodological approaches to many of these substantive issues are still evolving as is made clear from the wealth of material included in this book.


The MTMM design is the essence of multimethod research. It has been used widely in self-concept research to provide evidence of convergent and discriminant validity and is one of the criteria on which self-concept instruments are routinely evaluated (e.g., Byrne, 1996; Marsh & Hattie, 1996; Wylie, 1989). In the development of the MTMM approach, Campbell and Fiske (1959) advocated the assessment of construct validity by measuring multiple traits (T1, T2, etc.) with multiple methods (M1, M2, etc.). In self-concept research, the multiple traits typically represent multiple dimensions of self-concept. The term multiple methods was used very broadly by Campbell and Fiske to refer to multiple tests or instruments, multiple methods of assessment, multiple raters, or multiple occasions. Whereas the analytic procedures for evaluating MTMM data are appropriate for different types of multiple methods, the substantive interpretations differ depending on the nature of the multiple methods. Campbell and Fiske's paradigm is, perhaps, the most widely used construct validation design. Although their original guidelines are still widely used to evaluate MTMM data, important problems with their guidelines are well known (see reviews by Marsh, 1989, 1993b; Marsh & Grayson, 1995). More recently, researchers have used CFA approaches to evaluate MTMM data in relation to a prescribed taxonomy of MTMM models specifically designed to evaluate different aspects of convergent and discriminant validity (Marsh, 1989; Marsh & Grayson, 1995; Widaman, 1985). In this section, we begin with an overview of the CFA approach to MTMM data, describe some traditional applications of MTMM studies in self-concept research, and then explore some extensions to the logic of MTMM design and analyses to demonstrate its flexibility.
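To make the layout of an MTMM matrix concrete, the following sketch sorts the correlations among trait-method units into the three blocks that the original Campbell and Fiske style of inspection compares: convergent validities (same trait, different methods), heterotrait-monomethod correlations, and heterotrait-heteromethod correlations. It is only an illustration of that descriptive logic, not of the CFA approach the chapter goes on to discuss. The two traits, the two methods, and every correlation value are hypothetical and chosen purely for the example.

```python
from itertools import combinations

# Each measure is a (trait, method) pair; all labels are hypothetical.
measures = [
    ("academic", "self_report"),
    ("physical", "self_report"),
    ("academic", "teacher_rating"),
    ("physical", "teacher_rating"),
]

# Hypothetical correlations between pairs of measures (illustrative values only).
correlations = {
    (measures[0], measures[2]): 0.60,  # academic: self-report vs teacher rating
    (measures[1], measures[3]): 0.55,  # physical: self-report vs teacher rating
    (measures[0], measures[1]): 0.30,  # two traits, both self-report
    (measures[2], measures[3]): 0.35,  # two traits, both teacher rating
    (measures[0], measures[3]): 0.20,  # different trait and different method
    (measures[1], measures[2]): 0.15,  # different trait and different method
}


def classify(a, b):
    """Name the MTMM block that a pair of distinct measures belongs to."""
    same_trait, same_method = a[0] == b[0], a[1] == b[1]
    if same_trait and same_method:
        raise ValueError("a measure paired with itself is not an MTMM block")
    if same_trait:
        return "convergent"
    if same_method:
        return "heterotrait_monomethod"
    return "heterotrait_heteromethod"


def summarize(measures, correlations):
    """Average the correlations within each MTMM block."""
    blocks = {}
    for a, b in combinations(measures, 2):
        r = correlations.get((a, b), correlations.get((b, a)))
        blocks.setdefault(classify(a, b), []).append(r)
    return {name: sum(rs) / len(rs) for name, rs in blocks.items()}


summary = summarize(measures, correlations)
for name, mean_r in summary.items():
    print(f"{name}: mean r = {mean_r:.3f}")
```

In this style of reading, convergent validities that are substantial and larger than both heterotrait blocks are taken as support for convergent and discriminant validity, while heterotrait-monomethod correlations that exceed the heterotrait-heteromethod ones suggest method effects. As the text notes, these guidelines have well-known problems, which is part of the motivation for CFA-based MTMM models.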
