In the research literature on nonreactive or unobtrusive methods, questions of validity are frequently addressed implicitly but only occasionally explicitly (Campbell, 1957). Because highly nonreactive research, particularly that of Types 4 and 5, is more unconventional and is sometimes regarded primarily as "cute," spectacular, and (therefore?) less serious and influential, scientists favoring these methods often argue defensively. Some of the arguments that cast doubt on the validity of nonreactive measures refer to the fact that a substantial amount of Types 4 and 5 nonreactive research occurs in field settings, which afford less control than the laboratory. Compared with studies on, for instance, "(male) undergraduates in partial fulfillment of course requirements," representativeness may at first glance seem more likely to be given in the field. Because we as researchers are more accustomed to the laboratory, however, field research is scrutinized more closely with regard to validity criteria. Perhaps because we know that laboratory research fails to meet criteria such as representativeness of participant and setting samples, we are in greater danger of attributing this feature to field research too readily. Visitors in an art gallery are simply gone after having been "under investigation." Were they representative? If so (or if not), for which population? Furthermore, because we are only occasionally interested in studying psychological variables specific to the settings under investigation, we have to consider questions of ecological validity (Brunswik, 1956) both inside and outside the laboratory. Nonreactive field research is not ecologically valid per se: Are we, for example, able to enhance our knowledge about human aggression by observing interactions in traffic, a football stadium, a school yard, a court proceeding, or an experiment with a highly efficient cover story? Where are the opportunities for and limitations of generalizability?
Or is it more appropriate to select groups of situations in which behavioral variance can be explained instead of looking for broad generalizability? In the laboratory, our concern with external validity is directed toward whether it is acceptable to remove the context and reduce the naturally occurring complexity. This concern may be independent of whether the respective method is unobtrusive or not (see, e.g., Type 3 studies). In field research, which is more common when highly unobtrusive methods are applied, the question takes a different form: May we transfer the results of research in complex, naturally occurring situations to other contexts, or do we ignore the specifics of situations when interpreting the findings? At this point, both the methodological interrelations and the independence of the two distinctions, field versus laboratory setting and reactive versus nonreactive research, become evident.

Even though they are designed to avoid the validity problems that result from reactivity, nonreactive methods are by no means immune to threats to validity. Do worn carpets, garbage, or archival data on absenteeism indeed indicate what researchers attribute to them (Schweigert, 1998)? It is easy to imagine the effect of a change in the recording procedure on a time series of data. Imagine, for example, a change of criteria for recording norm-violating behavior. At first glance, the occurrence of specific behavior may appear to have changed, whereas in fact only the categorization has been modified, for instance, as the result of an administrative act (for an example, see the following text). The type of research discussed here must be particularly aware of these threats to validity. Occasionally, true predictors of variance are highly unobtrusive themselves: In analyzing time series on statutory rape, Linneweber (2000) identified a significant decline in violence in one region he investigated. In other areas over the same period, the number of complaints decreased only moderately. What made this region become less violent all of a sudden? A closer investigation revealed a simple cause: Because the opening hours of the police station had been reduced, the police were less able to respond quickly to complaints from the victims of violent crime, and it is known that the willingness to engage declines with temporal distance from critical events such as observing violence.
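The logic of such a recording artifact can be illustrated with a minimal sketch. All figures here are hypothetical and are not taken from Linneweber (2000): the true number of incidents is assumed to be constant, and only the assumed share of incidents that reaches the archive drops at a known point in time. The function names `recorded_counts` and `level_shift` are introduced purely for illustration.

```python
from statistics import mean

def recorded_counts(true_counts, report_rates):
    """Archived counts: true incidents multiplied by the share that gets recorded."""
    return [round(t * r) for t, r in zip(true_counts, report_rates)]

def level_shift(series, change_index):
    """Mean level after a known procedural change minus the mean level before it."""
    return mean(series[change_index:]) - mean(series[:change_index])

# Hypothetical data: the true number of incidents is constant over 8 periods.
true_counts = [100] * 8
# Assumed recording share: falls from 0.6 to 0.4 at period 4
# (e.g., after a change in recording criteria or in reporting opportunities).
report_rates = [0.6] * 4 + [0.4] * 4

archive = recorded_counts(true_counts, report_rates)
shift_recorded = level_shift(archive, 4)    # apparent change in the archival series
shift_true = level_shift(true_counts, 4)    # change in the underlying behavior

print("archival series:", archive)
print("apparent shift:", shift_recorded, "| true shift:", shift_true)
```

In this sketch the archival series shows a marked decline although the underlying behavior has not changed at all, which is why the recording procedure itself has to be checked before a level shift is interpreted substantively.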

In cross-cultural research, the danger of misinterpreting differences (or similarities) has been discussed extensively (Doucette-Gates, Brooks-Gunn, & Chase-Lansdale, 1998). Nonreactive researchers would be well advised to learn from this area. Basically, the advantage of reducing objectionable effects resulting from reactivity is paid for by reduced controllability. With respect to validity, this trade-off must be weighed carefully.

To estimate the risk of misinterpretation, it is advisable to carefully assess the convergent and discriminant validity of each specific nonreactive measure one considers using. An example of such an assessment is the study mentioned earlier by Cialdini and Baumann (1981), which compared the results of a nonreactive measurement technique with those of a standard interview procedure. They found that the results corresponded to a considerable degree. Most interestingly, this correspondence was qualified by the finding that "when the responses were laden with social desirability, attitudes measured by the interview technique were skewed in the socially desirable direction relative to those measured by the littering technique" (p. 254).
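A minimal sketch of such an assessment is given below. It assumes that convergent and discriminant validity are indexed by simple Pearson correlations, which is one common choice among several. The scores are invented for illustration and do not reproduce the data of Cialdini and Baumann (1981); the variable names are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation of two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical scores for six settings (illustration only).
nonreactive = [2, 4, 5, 7, 8, 10]   # trace-based measure of the target construct
interview = [3, 4, 6, 6, 9, 10]     # reactive measure of the same construct
unrelated = [5, 9, 4, 8, 5, 7]      # measure of a conceptually different construct

# Convergent validity: the two measures of the same construct should agree.
convergent_r = pearson(nonreactive, interview)
# Discriminant validity: the nonreactive measure should not track the other construct.
discriminant_r = pearson(nonreactive, unrelated)

print(f"convergent r = {convergent_r:.2f}, discriminant r = {discriminant_r:.2f}")
```

A high convergent and a low discriminant correlation would support the interpretation of the nonreactive measure. As the quoted finding shows, however, overall agreement can still conceal systematic divergence on socially desirable responses, so the comparison should also be made separately for such items.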

