One of the most common questions encountered by researchers is “how big does my sample need to be?” Unfortunately,
there are no hard and fast rules for determining an ideal sample size, although there
are some guidelines that can help researchers navigate this question.
Commonly,
researchers will conduct a “power analysis” to investigate how many
participants they would need to find a
relationship of a given expected size. Unless there is substantive prior
research in their area of interest, it can be difficult to know how large an
association between or among variables a researcher should expect. The size of the association -- or the effect size -- can vary from small (i.e., a subtle relationship,
which is common with many psychological and social processes) to large. Since
there is some guesswork as to the size of the effect of interest in the
population, power analyses necessarily only yield estimates of
the appropriate sample size. To the extent that a researcher overestimates the effect size in the population, they may recruit a sample that is too small to detect the effect of interest, wasting time and money on a study that cannot find what it is looking for. Conversely, a researcher who underestimates the effect size in the population may recruit a sample far larger than needed to detect the effect, wasting time and money on unnecessary participants. Despite these
drawbacks, power analysis can be a helpful guide to point researchers in the
right “ballpark” with respect to their sample size. To improve the usefulness of
power analysis, researchers should incorporate effect sizes calculated from
prior research into their estimate of effect size for their own research.
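As a concrete illustration, the sketch below runs a power analysis for an independent-samples t-test using Python's statsmodels library. The effect size (Cohen's d = 0.5), alpha level, and power target are illustrative assumptions, not recommendations.

```python
# Minimal power-analysis sketch; the effect size, alpha, and power
# values below are assumed for illustration only.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Suppose prior research suggests a medium effect (Cohen's d = 0.5).
# Solve for the per-group sample size needed to detect it with 80%
# power at the conventional alpha of .05.
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05,
                                   power=0.80, alternative='two-sided')
print(f"Participants needed per group: {n_per_group:.0f}")  # roughly 64
```

Rerunning the same calculation with a smaller assumed effect (d = 0.2) returns roughly 394 participants per group, which illustrates just how sensitive the answer is to the guessed effect size.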
The size of the population will also guide sample size. When a researcher is dealing with a small population (say, 50 people), a 30-person sample would provide very generalizable results. However, when a researcher is dealing with a much larger population (say, 5,000,000 people), a 30-person sample may be too small for adequate generalizability. Typically, then, the larger the population, the larger the sample a researcher might desire for the purposes of generalizability. The relationship between population size and sample size is not linear, however, so population size provides only a rough guideline for what sample size might be desirable.
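One standard way to see this nonlinearity is the finite population correction. The sketch below applies it to a baseline sample size; the baseline of roughly 384 comes from the conventional 95%-confidence, ±5%-margin calculation for a proportion and is used purely for illustration.

```python
# Sketch of the finite population correction, which shows why the
# required sample size does not grow linearly with population size.
def adjusted_sample_size(n0: float, population: int) -> float:
    """Shrink a baseline sample size n0 for a finite population."""
    return n0 / (1 + (n0 - 1) / population)

# Baseline n0 ~ 384 from the standard z^2 * p * (1 - p) / e^2 formula
# with z = 1.96 (95% confidence), p = 0.5, and e = 0.05 (5% margin).
n0 = (1.96**2 * 0.5 * 0.5) / 0.05**2

for N in (50, 500, 5_000, 5_000_000):
    print(f"Population {N:>9,}: sample of about {adjusted_sample_size(n0, N):.0f}")
```

Note how much more slowly the suggested sample grows than the population does: a hundred-thousand-fold increase in population, from 50 to 5,000,000, raises the suggested sample only from about 44 to about 384.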
When
determining sample size, a researcher must also consider
their research design and the associated costs of that design. Interviews and other qualitative research tend to be intensive in both time and money. Consequently, researchers
conducting qualitative research tend to work with smaller samples (30-100 participants), as
larger samples can be prohibitively expensive and difficult to acquire. Laboratory
experiments employing quantitative data can be somewhat time-consuming, as
participants must schedule times to come into the lab, so laboratory
experiments may generally employ mid-sized samples (100-250 participants). Finally,
survey research is very inexpensive to implement and takes very little time.
Consequently, large samples of over 500 participants are practical to
obtain through survey methodology.
Another
design consideration researchers need to wrestle with when determining sample
size is whether they are conducting a “between subjects” or a “within subjects”
design. In a “between subjects” design, only one data point for a given variable or relationship of interest is collected per participant. In a “within subjects” design, more than one data point for a given variable or relationship of interest is collected per participant. Because researchers
collect more data points per person in a within- than in a between-subjects design,
within-subjects designs require data to be collected from fewer participants overall relative to
between-subjects designs.
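A rough sense of this difference comes from comparing power calculations for an independent-samples t-test (between subjects) against a paired t-test (within subjects), as in the sketch below. The effect size of d = 0.5 is assumed, and the paired-test effect size is defined on difference scores (so it depends on the correlation between measurements); the comparison is illustrative only.

```python
# Rough between- vs. within-subjects comparison via statsmodels power
# calculations; d = 0.5 is an assumed, illustrative effect size.
from statsmodels.stats.power import TTestIndPower, TTestPower

d, alpha, power = 0.5, 0.05, 0.80

# Between subjects: independent-samples t-test, solved per group.
n_between = TTestIndPower().solve_power(effect_size=d, alpha=alpha, power=power)

# Within subjects: paired t-test on each participant's difference
# scores, so every participant contributes two data points.
n_within = TTestPower().solve_power(effect_size=d, alpha=alpha, power=power)

print(f"Between subjects: about {2 * n_between:.0f} participants in total")
print(f"Within subjects:  about {n_within:.0f} participants in total")
```

Under these assumptions the between-subjects design calls for roughly 128 participants in total, while the within-subjects design calls for roughly 33.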
In
addition to population size, cost, and design, researchers must also consider
their analytical plans when determining an appropriate sample size. The first
question researchers must wrestle with is the complexity of the statistical models they intend to assess. If a
researcher is testing a simple main-effects model, where only one outcome and one predictor are used, a relatively small sample is required. The
more effects are added to a statistical model of interest (e.g., additional
outcomes or predictors, interaction terms, etc.), the more participants will be
needed to obtain estimates of each effect. So, the more complex the statistical
model, the more participants should be sampled.
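One widely cited heuristic that captures this pattern for multiple regression is Green's (1991) rule of thumb, N ≥ 50 + 8k for k predictors. The toy sketch below simply tabulates that rule; it is a rough floor, not a substitute for a formal power analysis.

```python
# Toy tabulation of Green's (1991) regression rule of thumb,
# N >= 50 + 8k: each added predictor raises the minimum sample size.
def green_minimum_n(num_predictors: int) -> int:
    return 50 + 8 * num_predictors

for k in (1, 3, 5, 10):
    print(f"{k:>2} predictor(s): at least {green_minimum_n(k)} participants")
```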
Finally,
researchers must consider the statistical estimation technique they intend to
employ when determining their sample size. Statistics are calculated using one of many possible estimation
methods. Descriptive statistics, such as means and variances, are the least sophisticated and require the fewest participants. Similarly, some
estimation methods, such as Ordinary Least Squares, involve a relatively
straightforward equation where unknowns are acquired in one step. These
techniques are the basis of most common inferential statistics, including
linear regression, correlation, and so on, which are most commonly accessed through the statistical software programs SPSS, SAS, and Stata. According to the central limit theorem, the distribution of means estimated from repeated samples of the same population approaches a normal distribution as sample size grows, with 30 often cited as a practical threshold. Since most parametric OLS analyses rest on this normality assumption, smaller sample sizes might be sufficient for analyses of simple models using OLS estimation procedures or descriptive statistics.
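The central limit theorem claim is easy to check by simulation. The sketch below draws repeated samples from a deliberately skewed (exponential) population and shows that the distribution of the sample means becomes more symmetric as the sample size grows; the population and the sample sizes are arbitrary choices for illustration.

```python
# Quick central-limit-theorem check: means of samples drawn from a
# skewed (exponential) population look increasingly normal as n grows.
import numpy as np

rng = np.random.default_rng(0)

for n in (5, 30):
    # 10,000 samples of size n from an exponential population (mean = 1).
    means = rng.exponential(scale=1.0, size=(10_000, n)).mean(axis=1)
    # Skewness of the distribution of sample means; 0 means symmetric.
    skew = ((means - means.mean()) ** 3).mean() / means.std() ** 3
    print(f"n = {n:>2}: skewness of the sample means = {skew:.2f}")
```

The skewness of the sample means shrinks roughly in proportion to 1/sqrt(n), so by n = 30 the distribution is already far closer to normal than the heavily skewed population it came from.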
In contrast, many modern estimation procedures require the iterative maximization of a function. These procedures,
including maximum likelihood (ML) estimation and its variants, underlie
advanced modeling techniques such as Structural Equation Modeling, Mixed
Models, Generalized Estimating Equations, Item Response Theory, and Random
Coefficient Modeling, which are typically run through the statistical software
programs SPSS, SAS, Stata, Mplus, LISREL, AMOS, HLM, and so on. Iterative estimation procedures are sometimes called “large sample” procedures -- and for good reason. For some of these advanced models, parameter estimates will be biased if the sample size is below 100. So, for ML estimation techniques
and other iterative estimation procedures, a researcher will need to collect
samples of 100 participants, 250 participants, or in some cases, even more than
500 participants to estimate parameters of interest.
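To make the contrast with one-step OLS concrete, the sketch below fits a logistic regression by maximum likelihood using statsmodels; the optimizer reaches its estimates through repeated iterations rather than a single closed-form calculation. The simulated data, the sample size of 250, and the true coefficients are all illustrative assumptions.

```python
# Sketch of iterative ML estimation: statsmodels' Logit maximizes the
# log-likelihood numerically instead of solving a one-step equation.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 250  # an assumed "large sample" size for illustration

x = rng.normal(size=n)
X = sm.add_constant(x)                  # intercept plus one predictor
p = 1 / (1 + np.exp(-(0.5 + 1.0 * x)))  # assumed true logistic model
y = rng.binomial(1, p)

# disp=True prints the optimizer's progress; note the iteration count
# reported before convergence, unlike a one-step OLS fit.
result = sm.Logit(y, X).fit(disp=True)
print(result.params)
```

With far fewer than 100 cases, fits like this one can fail to converge or can return unstable, biased estimates, which is exactly why such procedures carry the “large sample” label.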
The
issue of determining sample size is obviously not an easy one to resolve. Researchers
must consider the size of their effect of interest, the size of their
population, the costs of research, their research design, their statistical
model, and the types of analyses they intend to use when deciding how large a sample they need to collect. Some of these considerations may offer conflicting advice about sample size. Ultimately, as with all
research, the researcher needs to weigh all of these factors and come up with
their “best answer” that minimizes cost and maximizes the likelihood that they
will find a significant result if one indeed exists in the population.