Aggregating Data to Higher Levels of Analysis
Are you interested in looking at the relationships between variables at different
levels of analysis? Perhaps you are interested in measuring emergent social
phenomena or want to justify aggregating individual data to a higher level of
analysis. In any case, aggregation statistics (Rwg’s and
intraclass correlation coefficients, or ICCs) can be helpful and are commonly
employed in fields concerned with multi-level theories, such as organizational
and cross-cultural psychology. The following paragraphs briefly discuss
(1) the purpose and usefulness of aggregation statistics,
(2) the most widely used aggregation statistics and the interpretive information
they provide, and (3) a resource that further explains how these
various aggregation statistics are calculated.
Purpose and Usefulness
To begin, aggregation statistics can be used to test assumptions inherent in the
definition of particular constructs of interest and allow researchers to: (1) bolster
claims that a particular variable does in fact reflect a particular construct, and
(2) justify the aggregation of data from a lower level to a higher level of
analysis. As an example, cultural values are often theoretically defined as a
set of beliefs, motivations, and norms that are highly shared by a set of
individuals comprising a particular cultural group; consequently, individuals
are either aware of or have internalized these particular values and can
identify them or exhibit them, respectively, on measures designed to
assess them. However, while these values depend on and exist within
the minds of individuals, they cannot be altered or adjusted by single
individuals, and they exert a force on individual behavior that is socially
expected because they are so widely shared among members of a cultural group.
Therefore, cultural values practically and theoretically exist at a higher
level of analysis, separate from the level of the individual mind.
However, despite the fact that cultural variables theoretically exist beyond the
individual level, measurement of cultural variables necessarily takes place at
the individual level, as researchers have to administer surveys or experiments
to individual people. Thus, a common problem arises in such
research: the divergence between the level of measurement and the level of
analysis. So how do we bridge this gap and make the claim that our
individual-level measurements can be aggregated to represent a higher-level
cultural construct? Aggregation statistics (Rwg’s and ICCs) are the
tools that allow us to do so. But what are these aggregation statistics, and
how do we interpret them?
Rwg (within-group agreement)
Rwg is a measure of within-group agreement and is calculated for each particular
group of interest; in other words, a sample that consists of five groups would
necessitate five separate Rwg’s, with some groups exhibiting high
within-group agreement and others low within-group agreement. Rwg is
calculated by comparing the observed within-group variance to the variance
expected under some distribution of random responding. Typically, this expected
distribution is a uniform (or rectangular) distribution (where each possible
response is as likely as any other); however, other distributions are also
used for theoretical reasons, including resampling methods (Bliese, 2000). The
typical cut-off point for claiming within-group agreement in most of the
methodological and experimental literature is .70. Consequently,
groups that have an Rwg over this amount are considered to exhibit
within-group agreement on a particular variable of interest, justifying
aggregation of data to a higher level of analysis. For example, it would
justify the claim that Cultural Group A has a shared perception of
collectivism, allowing a researcher to aggregate individual-level scores into
an overall collectivism score for Cultural Group A as a whole. This may allow a
researcher not only to claim that Group A exhibits a shared cultural
value for some construct, but also to test the effect that this shared value has
on other individual-level behaviors, cognitions, etc.
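As a rough illustration of the comparison described above, the following Python sketch computes a single-item Rwg against a uniform null distribution. It assumes the commonly used single-item form, 1 minus the ratio of the observed within-group variance to the variance of a discrete uniform distribution over A response options, (A^2 - 1) / 12. The function name rwg, the five-point scale, and the ratings are invented for illustration and are not taken from Bliese (2000).

```python
from statistics import variance


def rwg(ratings, n_options):
    """Single-item Rwg for one group, using a uniform null distribution.

    ratings   -- individual responses from the members of one group
    n_options -- number of response options on the scale (assumed, e.g. 5)
    """
    # Variance of a discrete uniform distribution over n_options points.
    expected_var = (n_options ** 2 - 1) / 12.0
    # Observed within-group variance (sample variance, n - 1 denominator).
    observed_var = variance(ratings)
    return 1.0 - observed_var / expected_var


# Hypothetical collectivism ratings on a 1-5 scale for two cultural groups.
groups = {
    "A": [4, 4, 5, 4, 4],  # members cluster tightly around 4
    "B": [2, 4, 3, 4, 2],  # members are more spread out
}

# One Rwg per group, as described in the text.
rwg_by_group = {name: rwg(r, n_options=5) for name, r in groups.items()}

for name, value in rwg_by_group.items():
    verdict = "meets" if value >= 0.70 else "falls below"
    print(f"Group {name}: Rwg = {value:.2f} ({verdict} the .70 cut-off)")
```

With these invented ratings, Group A comes out at .90 and Group B at .50, so only Group A would clear the .70 cut-off. Values can fall below zero when the observed variance exceeds the uniform expectation; this sketch reports such values as they are.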
ICC1 (non-independence)
ICC1, an intraclass correlation coefficient, is typically used as a measure of
non-independence on a DV of interest; in other words, ICC1 allows researchers
to determine the degree to which variance on an outcome variable or DV of
interest is due to group membership. Specifically, it indicates the extent to
which group members are interchangeable. When used with a dependent variable, a
non-zero ICC1 informs the researcher that there are group differences, or
between-group variability, on some variable of interest. It should be noted
that ICC1 values above .30 are extremely rare; values commonly range from .05
to .20, with a median of .12 (Bliese, 2000). Thus, a non-zero ICC1 value is
useful for determining whether testing between-group differences (such as the
differences between groups on collectivism) is justified.
It is also important to note that ICC1 can be an indicator of reliability when used
with an independent (rather than dependent) variable of interest.
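One common way to estimate ICC1 is from the mean squares of a one-way ANOVA with group membership as the factor. The sketch below assumes equal group sizes and the formula (MSB - MSW) / (MSB + (k - 1) * MSW), where MSB and MSW are the between-group and within-group mean squares and k is the group size. The function names and the data are hypothetical and only meant to show the mechanics.

```python
from statistics import mean


def one_way_mean_squares(groups):
    """Return (MSB, MSW, k) from a one-way ANOVA on equally sized groups."""
    k = len(groups[0])
    if any(len(g) != k for g in groups):
        raise ValueError("this sketch assumes equal group sizes")
    n_groups = len(groups)
    grand_mean = mean([x for g in groups for x in g])
    group_means = [mean(g) for g in groups]
    # Between-group sum of squares: group means around the grand mean.
    ss_between = k * sum((m - grand_mean) ** 2 for m in group_means)
    # Within-group sum of squares: individuals around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    msb = ss_between / (n_groups - 1)
    msw = ss_within / (n_groups * (k - 1))
    return msb, msw, k


def icc1(groups):
    """Share of variance in the outcome attributable to group membership."""
    msb, msw, k = one_way_mean_squares(groups)
    return (msb - msw) / (msb + (k - 1) * msw)


# Hypothetical outcome scores for three groups of four people each.
toy_groups = [
    [3, 4, 3, 4],
    [4, 5, 4, 5],
    [2, 3, 2, 3],
]

print(f"ICC1 = {icc1(toy_groups):.3f}")
```

The toy groups were built to differ sharply, so the resulting ICC1 (about .73) is far above the range usually seen in field data; it is only meant to show the calculation.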
ICC2 (reliability)
ICC2, another intraclass correlation coefficient, is used as an indicator of the
reliability (or consistency) of group means. Because of how it is calculated,
ICC2 values are much higher than ICC1 values and are typically above .70 when
the reliability of the group means is high.
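To see why ICC2 tends to run higher than ICC1, it can help to look at the arithmetic. The sketch below assumes the one-way ANOVA form ICC2 = (MSB - MSW) / MSB and, for equally sized groups of size k, the equivalent relation ICC2 = k * ICC1 / (1 + (k - 1) * ICC1). The function names, mean squares, and group sizes are hypothetical.

```python
def icc2_from_mean_squares(msb, msw):
    """Reliability of group means from one-way ANOVA mean squares."""
    return (msb - msw) / msb


def icc2_from_icc1(icc1_value, group_size):
    """ICC2 implied by a given ICC1 and an (equal) group size."""
    return group_size * icc1_value / (1 + (group_size - 1) * icc1_value)


# Hypothetical mean squares: between-group 4.0, within-group 1/3.
print(f"ICC2 from mean squares: {icc2_from_mean_squares(4.0, 1.0 / 3.0):.3f}")

# The same ICC1 (.12, the median cited in the text) yields very different
# group-mean reliability depending on how many people are in each group.
for size in (5, 20):
    value = icc2_from_icc1(0.12, size)
    print(f"ICC1 = .12, group size {size}: ICC2 = {value:.3f}")
```

Under these assumptions, an ICC1 of .12 implies an ICC2 of roughly .41 with five members per group but roughly .73 with twenty, so group-mean reliability depends heavily on group size.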
Summary
In conclusion, all three of the aggregation statistics mentioned above are
distinct and often mutually reinforcing, providing both different and necessary
information for justifying aggregation of individual-level data to a higher
level. One may have high ICC2, or reliability, but low agreement if individuals
are proportionally consistent in their rating on a measure (one person
typically uses 1, 2, and 3, while another uses 5, 6, and 7) but do not exhibit ratings
that converge on a particular shared score. The reverse is also possible. In
all, both Rwg and ICC2 support the aggregation of individual scores
to a higher-level construct of interest, typically when both are high. While ICC1 is
not necessary for justifying aggregation per se, a low ICC1 (despite high reliability
and high within-group agreement) indicates low between-group variability.
This often undermines the most common research questions, which concern
differences between groups on a particular construct. Aggregation may therefore
become pointless from a practical standpoint, as a lack of variability makes it
impossible to detect any meaningful differences between groups (depending on
one’s question of interest, of course, this may not be a concern).
Finally, for the purposes of
calculating these various aggregation statistics and for a longer discussion concerning
their usefulness and application, please refer to the chapter “Within-Group
Agreement, Non-Independence, and Reliability: Implications for Data Aggregation
and Analysis” by Bliese (2000).
References
Bliese, P. D. (2000). Within-Group Agreement, Non-Independence, and Reliability:
Implications for Data Aggregation and Analysis. In K. J. Klein & S. W. J. Kozlowski
(Eds.), Multi-level theory, research, and methods in organizations (pp. 349-381).
San Francisco: Jossey-Bass.