Friday, November 22, 2013

Your Handy Guide to Fit Statistics in SEM

We've all been there... you've just completed your analyses and now have your output, representing not only your hopes and dreams of academic stardom, but also weeks of entering data, cleaning data, creating datasets, drawing models, scrapping models, reformatting datasets, crying to your adviser, writing syntax, cursing the stats gods, rewriting syntax... and so on.  You look down at your beautiful, hard-earned output with a tear in your eye and think, "What does it mean??"

Well, not to worry! DASAL is here to help (at least with the top portion of your output) with our handy guide to model fit statistics!

Model fit refers to how well the relations in your data are represented by a particular model (or, to be more "statistic-y": how well all observed covariance and variance in the data are explained by the entire model, including errors), and can be inferred from the model fit statistics generated by your statistical package of choice.  Although fit statistics have come under fire in recent years (Barrett, 2007), understanding and accurately reporting them is a crucial task for the diligent researcher.  What follows is a list of the most common fit statistics and their cutoff criteria for "good model fit".
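
As a running illustration, here is roughly how you would request these indices if you happened to fit your model with the lavaan package in R (other programs report the same statistics under similar labels).  This sketch uses lavaan's own built-in Holzinger and Swineford example data and three-factor model; swap in your own model and data.

  library(lavaan)

  # lavaan's built-in three-factor example model
  model <- '
    visual  =~ x1 + x2 + x3
    textual =~ x4 + x5 + x6
    speed   =~ x7 + x8 + x9
  '
  fit <- cfa(model, data = HolzingerSwineford1939)

  # Request the indices discussed below by name
  fitMeasures(fit, c("chisq", "df", "pvalue", "gfi", "agfi", "srmr", "aic",
                     "rmsea", "rmsea.ci.lower", "rmsea.ci.upper",
                     "cfi", "nfi", "nnfi"))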

Absolute Fit Indices
These indices provide a measure of how well an a priori model fits your sample data (McDonald & Ho, 2002) by comparing the observed and model-implied variance and covariance matrices. They can also be used to determine the best-fitting model among several.  In general, these indices improve as more parameters are added to the model.  Good fit indicates that most of the (co)variance was explained by the model, or that there wasn't much to explain!

Chi-Squared
This is the most traditional measure of model fit and "assesses the magnitude of discrepancy between the sample and the fitted covariance matrices" (Hu & Bentler, 1999).  Chi-square is very sensitive to sample size, and will generally indicate bad fit for larger samples.

Smaller values, relative to the degrees of freedom, indicate better fit.  A nonsignificant chi-square value indicates good model fit.

Goodness-of-Fit Index (GFI)
This is designed to be an R-squared-type index, reflecting the proportion of observed variance and covariance reproduced by the model.

Larger values indicate better fit. In general, values above .90 indicate good fit.

Standardized Root Mean Squared Residual (SRMR)
This index summarizes the standardized residuals, that is, the differences between the observed correlations and those implied by the model.

Smaller values indicate better fit. In general, values below .08 indicate good fit.

Parsimonious Fit Indices
These indices evaluate the overall discrepancy between the observed and model-implied variance and covariance matrices, but also take into account the simplicity of the model.  In general, parsimonious indices improve as more parameters with useful contributions are added to the model. Good fit indicates that a good amount of the (co)variance was explained by the model, relative to its parsimony.  This does not mean that the amount explained is good overall - for that information, you must look to absolute fit indices.

Akaike Information Criterion (AIC)
AIC values are only meaningful relative to those of other models fit to the same data, so AIC is typically used for model comparison.

Smaller values indicate better fit.

Root Mean Squared Error of Approximation (RMSEA)
This index expresses model misfit per degree of freedom, so it rewards parsimonious models. It is typically reported with a confidence interval, usually a 90% CI.

Smaller values indicate better fit. In general, values below .06 indicate good fit.

Adjusted Goodness-of-Fit Index (AGFI)
This is the GFI adjusted for the number of parameters estimated, so it rewards more parsimonious models.

Larger values indicate better fit. In general, values above .90 indicate good fit.

Incremental Fit Indices
These indices compare the model's absolute or parsimonious fit with that of a baseline model (usually the null model).  In general, larger values indicate better fit.  Good fit in this sense means that much of the (co)variance was explained by the tested model in comparison to the amount that would have been explained by the null model.

They include:

Comparative Fit Index (CFI)
In general, values above .95 indicate good fit.

Normed Fit Index (NFI)
In general, values above .90 indicate good fit.

Nonnormed Fit Index (NNFI; also known as the Tucker-Lewis Index, TLI)
In general, values above .95 indicate good fit.

*********
SO NOW WHAT?? With so many measures, which ones do we choose?  People generally look to the chi-square first.  However, in cases where the sample size is small, or when the chi-square is borderline, Hu and Bentler (1999) have proposed joint criteria as follows:

A model has good model fit if:

NNFI and CFI are greater than or equal to .96 AND SRMR is less than or equal to .09.

or

SRMR is less than or equal to .09 AND RMSEA is less than or equal to .06.
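
If you want to automate that check, here is a minimal sketch in R. It assumes a named vector of fit indices like the one lavaan's fitMeasures() returns (the names "nnfi", "cfi", "srmr", and "rmsea" are lavaan's); adjust the names for your own software.

  joint_fit_ok <- function(fits) {
    # Rule 1: NNFI and CFI >= .96 AND SRMR <= .09
    rule1 <- fits["nnfi"] >= .96 && fits["cfi"] >= .96 && fits["srmr"] <= .09
    # Rule 2: SRMR <= .09 AND RMSEA <= .06
    rule2 <- fits["srmr"] <= .09 && fits["rmsea"] <= .06
    isTRUE(rule1 || rule2)
  }

  # e.g., with a fitted lavaan model 'fit':
  # joint_fit_ok(fitMeasures(fit, c("nnfi", "cfi", "srmr", "rmsea")))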

But how do I interpret it?

If your model fits the data well, you may not say that you have confirmed your underlying theory as true, but rather that you fail to reject the proposed model as one viable representation of the true relations underlying the data.

Tuesday, November 19, 2013

General monotone model for data analysis

Summary

Use gemmR! The General Monotone Model in R, available on GitHub.

Background

Empirical data in the social sciences are rarely well-behaved. We often collect data that are skewed, have changing variance over the observed range, or are related to other observed variables non-linearly. Generally, the advice for dealing with these problems involves transforming data, dropping data, or ignoring violations of assumptions. In the first two cases, transformation and outlier deletion are justified by claiming that parametric statistics require well-behaved data that conform to their assumptions. In the last case, we are required to believe that parametric linear models are robust to violations of assumptions. At a minimum, one of these assertions must be wrong.

It is not just our data that conform poorly to our tests. The hypotheses we test tend to be underspecified relative to the results of parametric statistics. When we examine the effect of an intervention, for example, we assume that the intervened-upon objects will respond with higher or lower measurements on average, relative to some unaltered group. In general, we predict the order of the means but not the distance between them (there are some exceptions to this, but they are rare in experimental social science research). We then proceed to test our hypotheses by using data to estimate the expected difference between our groups on some outcome. Only then do we reduce the level of our inference back to a direction (one group does more or less of a thing) relevant to our original research question.

Order statistics and non-parametric approximations

For simple cases, order statistics already exist. Kendall's \( \tau \) is the ordinal equivalent of the Pearson correlation, while the Mann-Whitney \( U \) and Kruskal-Wallis tests correspond to Student's \( t \)-test and the one-way analysis of variance, respectively. With multiple predictors, however, parametric approximations are typically required. These approximations tend to estimate additional parameters to govern the non-linearities in the relationship between our predictors and outcome variable (generalized additive models are one example). Especially for sparse data, this can be intractable.
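
All of the simple cases above are already in base R. A quick sketch with simulated (made-up) data:

  set.seed(1)
  x  <- rnorm(60)
  y  <- exp(x + rnorm(60))   # monotonically related to x, but heavily skewed
  g2 <- gl(2, 30)            # two groups
  g3 <- gl(3, 20)            # three groups

  cor.test(x, y, method = "kendall")   # Kendall's tau
  wilcox.test(y ~ g2)                  # Mann-Whitney U
  kruskal.test(y ~ g3)                 # Kruskal-Wallis test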

Dougherty and Thomas (2012) proposed the General Monotone Model (GeMM) as a solution for this problem. GeMM is a genetic search-based, all-subsets ordinal regression algorithm that maximizes the rank correspondence between a model and criterion variable. GeMM basically follows three steps:

  1. Produce a random set of regression weights.
  2. Calculate a penalized \( \tau \) between the resulting model predictions and the criterion.
  3. Repeat many times and select the set of weights that provides the best ordinal fit.

The result is a set of coefficients that maximize ordinal fit between some set of predictors and an outcome. Weights of zero indicate that a given predictor does not explain enough paired comparisons at the model level (essentially an ordinal equivalent of squared error) to offset the penalty for including that predictor. Notably, this procedure applies whether you thought the linear model was robust to assumption violations or you intended to transform your data to meet those assumptions. With GeMM, there are fewer decisions to make: the results are theoretically invariant to monotonic transformation. This is a relatively complex set of calculations to program from scratch, however, and the only previous implementation of GeMM required extensive familiarization with a MATLAB script. Ideally, GeMM would be easily implemented by anyone, without extensive code interpretation.
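
To make those three steps concrete, here is a deliberately naive caricature in R: a pure random search with a rough BIC-style penalty, standing in for the genetic algorithm and the actual penalty gemmR uses. The data are simulated, and only x1 is (monotonically) related to the outcome.

  set.seed(42)
  n  <- 100
  x1 <- rnorm(n)
  x2 <- rnorm(n)
  y  <- exp(x1 + rnorm(n))   # monotone in x1 only; x2 is pure noise

  # Step 2: penalized ordinal fit for a given weight vector
  fit_tau <- function(w, penalty = log(n) / n) {
    if (all(w == 0)) return(-Inf)          # an empty model ranks nothing
    yhat <- drop(cbind(x1, x2) %*% w)
    cor(yhat, y, method = "kendall") - penalty * sum(w != 0)
  }

  # Steps 1 and 3: many random weight sets, keep the best penalized tau
  candidates <- replicate(2000, runif(2, -1, 1) * sample(0:1, 2, replace = TRUE))
  scores     <- apply(candidates, 2, fit_tau)
  candidates[, which.max(scores)]   # the weight for x2 should land at zero

Note that any positive weight on x1 alone earns the same \( \tau \): ordinal fit is free of scale, which is exactly why the results are invariant to monotonic transformation.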

Introducing gemmR

This led us to develop gemmR, a GeMM library for the R statistical language. gemmR improves on the previously available code in a number of ways:

  • gemmR is simple.
    • Fewer assumptions mean there are fewer diagnostic and cleanup operations to perform. Non-normal data and outliers are not problems to consider when using GeMM.
  • gemmR is standardized.
    • Calls to gemm use the existing R framework for specifying models (see the example just after this list). You no longer have to reshape your formatted data in order to run GeMM. If you can run a linear model in R, you can run GeMM.
  • gemmR is fast.
    • Searching through candidate regression coefficients using Kendall's \( \tau \) is a computationally expensive process and necessarily takes some time. Our R package uses a faster implementation of the \( \tau \) calculation, written in C++, to speed this process. We also rely on R's existing parallelization structure for the potential to go even faster.
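
For example, a hypothetical call (assuming the formula interface mirrors lm(); see the GitHub page below for installation and the exact argument names) would look like this:

  library(gemmR)

  # 'dat' is your own data frame; 'y', 'x1', 'x2' are placeholder names
  fit <- gemm(y ~ x1 + x2, data = dat)
  fit   # inspect the returned coefficients (print/summary methods may differ)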

You can find gemmR and instructions for installation on GitHub. A more thorough vignette and set of examples are currently in the works, but the package maintainer would be happy to answer any questions you might have.

Saturday, November 9, 2013

Building on Beta: Indicators of Predictor Importance



Researchers are rarely interested in knowing only whether a variable exhibits a significant relationship with another variable. The more interesting research questions, and answers, are often about the relative importance of a variable in predicting an outcome. Indicators of relative importance may rank order predictors’ contributions to an overall regression effect or partition the variance explained by individual predictors into unique and shared variance. The relative importance of predictors matters for theoretical development, because it encourages parsimony, and for practical decisions, because researchers and practitioners face limited resources.

A brief review of psychology publications that report multiple regression (MR) analyses would show that beta coefficients are clearly the most popular indicator of relative importance reported and interpreted. However, exclusive reliance on beta coefficients to interpret MR results, and predictor importance in particular, is often misguided, so below we briefly review additional indicators that can provide useful information.

Beta Coefficients
In multiple regression, beta coefficients (i.e., standardized regression coefficients) represent the expected change in the dependent variable (in standard deviation units) per standard deviation increase in the independent variable, holding all other independent variables constant. When independent variables are perfectly uncorrelated, one can estimate variable importance by squaring the beta weights. However, this becomes problematic when independent variables are correlated, as is often the case. In these situations, a given beta weight can reflect the explained variance it shares with other variables in the model, and it becomes difficult to disentangle a variable’s unique importance from the “extra credit” it gets from shared variance. As such, beta weights should be treated as an easily computed but preliminary indicator of a predictor’s contribution to a regression effect.
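
For reference, one simple way to obtain beta weights in R is to fit lm() to z-scored variables. This is a sketch; 'dat', 'y', and 'x1' through 'x3' are placeholders for your own data.

  zdat <- as.data.frame(scale(dat[c("y", "x1", "x2", "x3")]))
  fit  <- lm(y ~ x1 + x2 + x3, data = zdat)
  beta <- coef(fit)[-1]   # standardized coefficients (intercept is ~0 and dropped)
  beta^2                  # a rough importance guide only if predictors are uncorrelated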

Pratt Product Measure
The Pratt product measure is a relatively simple method of partitioning a model’s explained variance (R2) into non-overlapping segments. The Pratt measure multiplies a variable’s zero-order correlation with the dependent variable by its beta weight. The correlation captures the direct effect of the predictor in isolation from other predictors, while the beta weight reflects the predictor’s contribution accounting for all other predictors in the model.  If a given beta weight is inflated by shared variance, multiplying it by its corresponding (smaller) zero-order correlation corrects for that inflation. Summing the Pratt measures for all the variables in a model reproduces the R2, allowing for straightforward partitioning and ranking of variables; negative products (often a signal of suppression) complicate this interpretation.
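
Continuing the sketch above, the Pratt measures take only a few extra lines:

  r_xy  <- cor(zdat[c("x1", "x2", "x3")], zdat$y)[, 1]   # zero-order correlations
  pratt <- beta * r_xy                                   # Pratt product measures
  sum(pratt)                                             # reproduces the model R2
  summary(fit)$r.squared                                 # check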

Commonality Coefficients
Commonality analysis partitions R2 into variance that is unique to each independent variable (unique effects; e.g., X1…X3) and variance that is shared by all possible subsets of predictors (common effects; e.g., X1X2, X2X3, etc.). Partitioning variance in this way produces non-overlapping values that can be compared easily. Common effects, in particular, provide a great deal of information about the overlap between independent variables and how this overlap contributes to predicting the dependent variable. However, common effects can be difficult to interpret, especially as the number of variables increases in a model and commonalities reflect the combination of more than two variables.
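
With two predictors the partitioning can be done by hand from three R2 values (a sketch with placeholder names; packages such as yhat automate the bookkeeping for larger models):

  r2_full <- summary(lm(y ~ x1 + x2, data = dat))$r.squared
  r2_x1   <- summary(lm(y ~ x1,      data = dat))$r.squared
  r2_x2   <- summary(lm(y ~ x2,      data = dat))$r.squared

  unique_x1 <- r2_full - r2_x2                   # unique effect of x1
  unique_x2 <- r2_full - r2_x1                   # unique effect of x2
  common_12 <- r2_full - unique_x1 - unique_x2   # variance x1 and x2 share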

Relative Weights
Relative weight analysis (RWA) is a slightly more complex method of partitioning R2 between independent variables. When predictors in a MR are correlated, RWA uses a principal components-type decomposition to create a set of orthogonal variables that approximate the original predictors as closely as possible while remaining uncorrelated with one another. The dependent variable is regressed on these orthogonal variables in one analysis, and in a second analysis the original independent variables are regressed on them. A given relative weight is the sum, across the orthogonal variables, of the products of the squared coefficients from the two analyses. Dividing relative weights by R2 then allows for ranking of individual predictors’ contributions. Essentially, relative weights indicate a predictor’s contribution to explaining the dependent variable as a joint function of how highly related the independent variables are to the orthogonal variables and how highly related the orthogonal variables are to the dependent variable. As such, RWA partitions R2 while minimizing the influence of multicollinearity among predictors, a major strength of the method.
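
Here is a compact sketch of Johnson's (2000) computation from the correlation matrices, where X is a matrix of predictors and y is the outcome (both placeholders for your own data):

  rxx <- cor(X)
  rxy <- cor(X, y)

  eig    <- eigen(rxx)
  lambda <- eig$vectors %*% diag(sqrt(eig$values)) %*% t(eig$vectors)   # rxx^(1/2)

  beta_z  <- solve(lambda) %*% rxy    # regression of y on the orthogonal variables
  raw_wts <- lambda^2 %*% beta_z^2    # relative weights; these sum to R2
  rownames(raw_wts) <- colnames(X)    # label rows (assumes X has column names)
  raw_wts / sum(raw_wts)              # each predictor's share of R2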

Dominance Analysis
Dominance analysis compares the unique variance contributed by pairs of independent variables across all possible subsets of predictors. To assess complete dominance, the unique effect of a given independent variable when entered last in a multiple regression equation is compared to that of another independent variable across all subsets of a model. Complete dominance occurs when the unique effect is greater across all the models. Conditional dominance is determined by calculating the averages of all independent variables’ contributions to all possible model subsets and comparing those averages. Conditional dominance is shown when an independent variable contributes more to predicting the dependent variable, on average across all possible models, compared to another independent variable.
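
A sketch of the main ingredient, each predictor's incremental R2 over every subset of the other predictors, averaged; 'dat', 'y', and 'x1' through 'x3' are placeholders for your own data:

  preds <- c("x1", "x2", "x3")

  r2_of <- function(vars) {
    if (length(vars) == 0) return(0)
    summary(lm(reformulate(vars, response = "y"), data = dat))$r.squared
  }

  avg_contribution <- sapply(preds, function(p) {
    others  <- setdiff(preds, p)
    # all subsets of the remaining predictors, including the empty set
    subsets <- c(list(character(0)),
                 unlist(lapply(seq_along(others),
                               function(k) combn(others, k, simplify = FALSE)),
                        recursive = FALSE))
    mean(sapply(subsets, function(s) r2_of(c(s, p)) - r2_of(s)))
  })
  avg_contribution   # compare these averages across predictors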

References
Azen, R., & Budescu, D. V. (2003). The dominance analysis approach for comparing predictors in multiple regression. Psychological Methods, 8, 129-148.

Johnson, J. W. (2000). A heuristic method for estimating the relative weight of predictor variables in multiple regression. Multivariate Behavioral Research, 35, 1-19. 

Nimon, K. F., & Oswald, F. L. (in press). Understanding the results of multiple linear regression: Beyond standardized regression coefficients. Organizational Research Methods.

Nimon, K. F., & Reio, T. (2011). Regression commonality analysis: A technique for quantitative theory building. Human Resource Development Review, 10, 329-340. 
 


Friday, November 1, 2013

Come see DASAL at:

 We are pleased to announce that we will be presenting a poster at the first annual BRIDGES!

Are you interested in GEE and MLM?  Are you confused about when to use GEE and MLM?  Are you confused about what GEE and MLM refer to?  Come check out our poster and find out!

Generalized estimating equations (GEE) and multilevel models (MLM) are two statistical techniques often used to model complex data sets, such as longitudinal or nested data.  There has been some controversy in recent years about the utility of each and when to best use one versus the other. To help, we've disentangled these confusing acronyms in a clearly laid-out informational poster!

GEE and MLM: Which acronym is a better fit for my data?
Many types of data collection generate observations that violate the assumption of independence required by classical statistical models, including, for example, the collection of data from individuals working in the same team or the collection of data from the same individual over time. These violations can be handled in a variety of ways, but two popular methods are generalized estimating equations (GEE) and multilevel modeling (MLM). We discuss the relationship between these modeling approaches and describe situations in which each method might be more or less appropriate.

The poster session is from 5:30 to 7:00 in the Riggs Alumni Center, and there will be drinks and munchables!  More information can be found at: www.bridges.umd.edu.

We hope to see you there!