Tuesday, October 15, 2013

How Big Should My Sample Be?



One of the most common questions encountered by researchers is "how big does my sample need to be?" Unfortunately, there are no hard and fast rules for determining an ideal sample size, although there are some guidelines that can help researchers navigate this question.

Commonly, researchers will conduct a “power analysis” to investigate how many participants they would need to detect a relationship of a given expected size. Unless there is substantive prior research in their area of interest, it can be difficult to know how large an association between or among variables a researcher should expect. The size of the association -- or the effect size -- can vary from small (i.e., a subtle relationship, which is common with many psychological and social processes) to large. Since there is some guesswork as to the size of the effect in the population, power analyses necessarily yield only estimates of the appropriate sample size. To the extent that a researcher overestimates the population effect size, he/she may end up with a sample that is too small to detect the effect of interest; to the extent that a researcher underestimates it, he/she may end up with a sample far larger than necessary. Either way, time and money are wasted. Despite these drawbacks, power analysis can be a helpful guide that points researchers toward the right “ballpark” for their sample size. To improve its usefulness, researchers should incorporate effect sizes calculated from prior research into their estimate of the effect size for their own study.
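To make this concrete, here is a minimal power-analysis sketch in Python using the statsmodels package (my choice of tool for illustration; the post itself does not prescribe software, and the effect sizes, alpha, and power targets below are illustrative assumptions):

```python
# Minimal power-analysis sketch using statsmodels (illustrative values only).
# Effect sizes are Cohen's d for a two-group comparison.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

for d in (0.2, 0.5, 0.8):  # Cohen's benchmarks: small, medium, large
    n = analysis.solve_power(effect_size=d, alpha=0.05, power=0.80)
    print(f"Cohen's d = {d}: ~{n:.0f} participants per group")
```

With these settings, a small effect requires several hundred participants per group, while a large effect needs only a few dozen -- which is exactly why guessing the effect size wrong is so costly.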

The size of the population will also guide sample size. When a researcher is dealing with a small population (say, 50 people), a 30-person sample would provide very generalizable results. However, when a researcher is dealing with a larger population (say, 5,000,000 people), a 30-person sample may be too small for adequate generalizability. So, the larger the population, typically, the larger the sample a researcher might desire for the purposes of generalizability. However, the relationship between population size and sample size is not linear -- beyond a certain point, the required sample size levels off -- so population size only provides a rough guideline for what sample size might be desirable.
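The leveling-off can be seen in a standard sample-size formula for estimating a proportion (Cochran's formula with a finite population correction). This is a sketch under assumed values -- a 5% margin of error and 95% confidence, neither of which comes from the post:

```python
# Sketch: required sample size for estimating a proportion, using Cochran's
# formula with a finite population correction. Margin of error, confidence
# level, and p are illustrative assumptions.
import math

def required_n(population, margin=0.05, z=1.96, p=0.5):
    n0 = (z ** 2) * p * (1 - p) / margin ** 2        # infinite-population size
    return math.ceil(n0 / (1 + (n0 - 1) / population))  # finite correction

for N in (50, 500, 5_000, 5_000_000):
    print(f"Population {N:>9,}: sample of ~{required_n(N)}")
```

Under these assumptions, a population of 500 calls for roughly 220 people, but the requirement plateaus around 385 no matter how far beyond a few thousand the population grows.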

When determining sample size, a researcher must also consider their research design and the associated costs of that design. Interviews and qualitative research tend to be intensive in both time and money. Consequently, researchers conducting qualitative research tend to work with smaller samples (30-100 participants), as larger samples can be prohibitively expensive and difficult to acquire. Laboratory experiments employing quantitative data can be somewhat time-consuming, as participants must schedule times to come into the lab, so laboratory experiments generally employ mid-sized samples (100-250 participants). Finally, survey research is relatively inexpensive to implement and takes very little time, so large samples of over 500 participants are not impractical to obtain through survey methodology.

Another design consideration researchers need to wrestle with when determining sample size is whether they are conducting a “between subjects” or a “within subjects” design. In a “between subjects” design, only one data point for a given variable or relationship of interest is collected per participant. In a “within subjects” design, more than one data point is collected per participant. Because researchers collect more data points per person -- and because repeated measurements on the same person remove error variance due to stable individual differences -- within-subjects designs require fewer participants overall than between-subjects designs.
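The difference can be quantified with the same power-analysis tools as above. The sketch below compares an independent-samples (between-subjects) test with a paired (within-subjects) test, treating the effect size as comparable across the two designs purely for illustration -- in practice, the paired effect size depends on the correlation between repeated measurements:

```python
# Rough illustration: for the same nominal effect size, a within-subjects
# (paired) test needs fewer participants than a between-subjects test.
# Effect size, alpha, and power targets are illustrative assumptions.
from statsmodels.stats.power import TTestIndPower, TTestPower

d, alpha, power = 0.5, 0.05, 0.80

between = TTestIndPower().solve_power(effect_size=d, alpha=alpha, power=power)
within = TTestPower().solve_power(effect_size=d, alpha=alpha, power=power)

print(f"Between-subjects: ~{between:.0f} per group ({2 * between:.0f} total)")
print(f"Within-subjects:  ~{within:.0f} participants total")
```

Under these assumptions, the between-subjects design needs roughly four times as many total participants as the within-subjects design.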

In addition to population size, cost, and design, researchers must also consider their analytical plans when determining an appropriate sample size. The first question researchers must wrestle with is the complexity of the statistical models they intend to assess. If a researcher is testing a simple main-effects model, where only one outcome and one predictor are used, a relatively small sample is required. The more effects are added to the statistical model (e.g., additional outcomes or predictors, interaction terms, etc.), the more participants will be needed to obtain stable estimates of each effect. So, the more complex the statistical model, the more participants should be sampled.
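One widely cited heuristic for this "more predictors, more participants" principle is Green's (1991) rule of thumb for multiple regression. The sketch below simply encodes that rule; it is a rough guideline, not a substitute for a proper power analysis:

```python
# Green's (1991) rule of thumb for multiple regression:
# N >= 50 + 8k to test the overall model, where k is the number of predictors.
# A heuristic illustration only.
def green_minimum_n(num_predictors: int) -> int:
    return 50 + 8 * num_predictors

for k in (1, 3, 5, 10):
    print(f"{k} predictor(s): at least {green_minimum_n(k)} participants")
```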

Finally, researchers must consider the statistical estimation technique they intend to employ when determining their sample size. Statistics are calculated using one of many possible estimation methods. Descriptive statistics, such as means and variances, are the least sophisticated and require the fewest participants. Similarly, some estimation methods, such as Ordinary Least Squares (OLS), involve a relatively straightforward equation whose unknowns are obtained in a single step. These techniques are the basis of most common inferential statistics, including linear regression, correlation, and so on, which are most commonly accessed through the statistical software programs SPSS, SAS, and Stata. The central limit theorem states that the sampling distribution of the mean approaches a normal distribution as sample size increases, and a common rule of thumb holds that samples of 30 or more are often sufficient for this approximation. Since a normal distribution is assumed by most OLS-based parametric analyses, smaller sample sizes may suffice for analyses of simple models using OLS estimation or descriptive statistics.
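The "one step" nature of OLS is easy to see in code: the estimates come from a single closed-form solve of the normal equations, with no iteration. The data below are simulated purely for illustration:

```python
# OLS estimates are computed in a single closed-form step via the normal
# equations, beta = (X'X)^(-1) X'y. Simulated data for illustration only.
import numpy as np

rng = np.random.default_rng(0)
n = 30                                   # a small sample, per the rule of thumb
x = rng.normal(size=n)
y = 2.0 + 0.5 * x + rng.normal(size=n)   # true intercept 2.0, slope 0.5

X = np.column_stack([np.ones(n), x])     # add an intercept column
beta = np.linalg.lstsq(X, y, rcond=None)[0]  # one-step least-squares solve
print(f"intercept ~ {beta[0]:.2f}, slope ~ {beta[1]:.2f}")
```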

In contrast, many modern procedures require the iterative maximization of a function. These estimation procedures, including maximum likelihood (ML) estimation and its variants, underlie advanced modeling techniques such as Structural Equation Modeling, Mixed Models, Generalized Estimating Equations, Item Response Theory, and Random Coefficient Modeling, which are typically run through statistical software programs such as SPSS, SAS, Stata, Mplus, LISREL, AMOS, HLM, and so on. Iterative estimation procedures are sometimes called “large sample” procedures -- and for good reason. For some of these advanced models, parameter estimates can be biased if the sample size falls below 100. So, for ML and other iterative estimation techniques, a researcher may need to collect samples of 100, 250, or in some cases even more than 500 participants to estimate the parameters of interest.
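To contrast with the one-step OLS solve above, here is a sketch of ML estimation for a simple logistic regression, written by hand so the iteration is visible: a general-purpose optimizer repeatedly updates the parameters until the negative log-likelihood stops improving. The model and data are assumptions for illustration:

```python
# Maximum likelihood estimation is iterative: an optimizer repeatedly updates
# the parameters until the (negative) log-likelihood stops improving.
# Logistic regression on simulated data, purely for illustration.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n = 250                                   # a "large sample" size
x = rng.normal(size=n)
p = 1 / (1 + np.exp(-(0.5 + 1.2 * x)))    # true intercept 0.5, slope 1.2
y = rng.binomial(1, p)

def neg_log_likelihood(beta):
    eta = beta[0] + beta[1] * x
    return -np.sum(y * eta - np.log1p(np.exp(eta)))

result = minimize(neg_log_likelihood, x0=np.zeros(2), method="BFGS")
print(f"estimates: {result.x.round(2)}, iterations: {result.nit}")
```

With too few observations, this kind of iterative fit can converge to noticeably biased estimates, or fail to converge at all -- hence the "large sample" label.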
 
The issue of determining sample size is obviously not an easy one to resolve. Researchers must consider the size of their effect of interest, the size of their population, the costs of research, their research design, their statistical model, and the types of analyses they intend to use when deciding how large a sample they need to collect. Some of these deliberations may provide contradictory advice about sample size. Ultimately, as with all research, the researcher needs to weigh all of these factors and arrive at a “best answer” that minimizes cost and maximizes the likelihood of finding a significant result if one indeed exists in the population.
