Confounding of Theories and Methods

Our multiple measures differ too little from one another and are drawn from a restricted universe of assessments. Indeed, some theories are supported not only by a single class of measurement operations but by a single measure of each construct.

A large body of research on theories documents correlations not among constructs, but among the particular measures that accompany a theory. For example, most studies that test Hackman and Oldham's (1976, 1980) theory of work design use the scale the authors published with the theory, the Job Diagnostic Survey. Without multiple scales that measure the constructs proposed by the theory, it is impossible to partition observed covariance into construct-relevant and scale-relevant variance. In essence, the scale becomes the construct (see also Idaszak, Bottom, & Drasgow, 1988; Idaszak & Drasgow, 1987).
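To make the partitioning problem concrete, consider a minimal measurement sketch (our illustration, not part of the original theory): suppose each observed score loads on the intended construct T and on a scale-specific method factor M, with factors and errors uncorrelated,

\[
X_1 = \lambda_1 T + \mu_1 M + E_1, \qquad X_2 = \lambda_2 T + \mu_2 M + E_2,
\]

so that

\[
\operatorname{Cov}(X_1, X_2) = \lambda_1 \lambda_2 \operatorname{Var}(T) + \mu_1 \mu_2 \operatorname{Var}(M).
\]

With a single scale per construct, every observed covariance mixes the construct term with the method term, and no amount of data from that one scale can separate them. Additional scales or methods supply the extra covariances needed to identify both terms, which is the logic of multitrait-multimethod designs.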

As another example, consider Herzberg's Two-Factor Theory (Herzberg, Mausner, & Snyderman, 1959). Herzberg proposed that the factors that cause job satisfaction are different from the factors that cause dissatisfaction. He derived his theory from field studies in which he asked employees to list the attributes of their jobs that made them satisfied and, separately, those that made them dissatisfied. Employees consistently listed different sources of satisfaction and dissatisfaction. Herzberg and his colleagues concluded that job satisfaction and job dissatisfaction were in fact two independent factors, not opposite poles of a single dimension.

The theory received much attention in the years that followed, but many researchers had trouble replicating Herzberg's results when they used any other research method. Eventually, research on the Two-Factor Theory put the matter to rest by showing that Herzberg's findings could be replicated only with his original item set (Schneider & Locke, 1971). The entire theory rested on the survey/interview method used to collect the data; alternative methods caused the predictions of the theory to fail.2

The problem seems to be structural in social science. Responsibility lies partly with researchers, but also with editors, who correctly require that published work rely on validated scales. Referees try to ensure that published results are based on prevalidated scales with demonstrated construct validity, so that results cannot be attributed to idiosyncrasies of the scale used. Unfortunately, this puts researchers in a bind: with limited time and space on questionnaires, they risk not being published if they use new or alternative measurement operations without extensive validation.

The issue is not the validity of many of our basic scales. Indeed, a large number of scales can be argued to have substantial construct validity. The issue is that even our validated measures contain substantial, albeit unknown, amounts of stable method and self-presentation variance that masquerades as construct variance. When no other scales are available or acceptable, or when the exigencies of publishing intervene, the researcher has little choice but to use the validated ones.
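A hypothetical calculation (our illustration, with invented numbers) shows how much can masquerade. Take two standardized scales measuring different constructs with the same method, each with a construct loading of .70, a method-factor loading of .40, and a true construct correlation of .30. Under the single-method-factor sketch above, the observed correlation is

\[
r_{\mathrm{obs}} = (.70)(.70)(.30) + (.40)(.40) = .147 + .160 \approx .31,
\]

so roughly half of the observed relation reflects the shared method rather than the constructs, even though each scale would look respectably valid on its own.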

The argument for multiple operationalizations of constructs is consistent with the logical positivist philosophy of science that held sway in psychology from the 1930s to the 1950s. That approach appeared to assume that measures are equivalent to the constructs they assess. However, to argue that this implies acceptance of single measures and single methods of operationalizing a construct distorts logical positivism. Bridgman, one of the founders of logical positivism, argued that operational definitions are without significance unless at least two methods of getting to the terminus are known. One could have been a dedicated positivist (e.g., Bridgman, 1927, 1945) and still not have fallen into the trap of relying on single operations of a construct. Even if a concept is synonymous with a set of operations, it does not follow that any single operation produces the concept. Defining a phenomenon by the operations that produced it has a specious precision, because it describes a single isolated event (Bridgman, 1927, p. 248), not a construct.

2The extensive debate among mood researchers outside I/O psychology about the "true" factor structure of mood reports also illustrates our point. Some researchers proposed that negative and positive moods are polar opposites (Russell, 1980), whereas others proposed that they are not opposites but independent dimensions (Watson & Tellegen, 1985). Only because different researchers used different items to measure mood for several years was it possible to have a theoretical dispute, apparently informed by data, about something so fundamental. Once measurement error in the Positive and Negative Affect Schedule (PANAS; Watson, Clark, & Tellegen, 1988) was corrected, evidence for independence diminished (Green, Goldman, & Salovey, 1993) and consensual structures began to emerge (Tellegen, Watson, & Clark, 1999).
