flowchart LR A(Predictor) -->|path C| B(Outcome)
Check structural relationships
Mediation
With mediation analyses, we want to check to what extent an predictor-outcome association is explained by their associations with a third variable, the ‘mediator’.
Path C refers to the association between the predictor and the outcome. This is the total effect.
In a mediation, we want to check whether the mediator explains this total effect. Specifically, we aim to calculate the indirect effect (i.e., effect through the mediator between predictor and outcome) and the direct effect (i.e., what remains of the total effect after controlling for the indirect effect). The indirect effect is the multiplication of path A and path B. The direct effect is path C*.
flowchart LR D(Predictor) -->|path A| E(Mediator) D(Predictor) -->|path C*| F(Outcome) E(Mediator) -->|path B| F(Outcome)
The lavaan
package allows us to extend this mediation analyses for multiple predictors, mediators and outcomes. In the current example, we want to check to what extent our condition effect (a dummy code for the conditions no choice versus choice) on the outcomes pleasure and interest, vitality, and intended persistence is explained by the mediators autonomy satisfaction and competence satisfaction.
flowchart LR A(Condition) --> |B1| B(Autonomy satisfaction) A(Condition) --> |B2| C(Competence satisfaction) A(Condition) --> |C1| D(Pleasure and interest) A(Condition) --> |C2| E(Vitality) A(Condition) --> |C3| F(Intended persistence) B(Autonomy satisfaction) --> |A1| D(Pleasure and interest) B(Autonomy satisfaction) --> |A3| E(Vitality) B(Autonomy satisfaction) --> |A5| F(Intended persistence) C(Competence satisfaction) --> |A2| D(Pleasure and interest) C(Competence satisfaction) --> |A4| E(Vitality) C(Competence satisfaction) --> |A6| F(Intended persistence)
In describing the model, we need to provide a unique label to each pathway, so we can use this in the calculation of the indirect and total effects.
library(lavaan)
<- "
SEM # predictor -> mediators (paths B)
Autonomy ~ B1*Condition.d
Competence ~ B2*Condition.d
# predictor + mediator --> outcome (paths A and C*)
Pleasure ~ A1*Autonomy + A2*Competence + C1*Condition.d
Vitality ~ A3*Autonomy + A4*Competence + C2*Condition.d
Intended_persistence ~ A5*Autonomy + A6*Competence + C3*Condition.d
# calculation of an indirect effect
B1A1 := B1*A1
B1A2 := B1*A3
B1A3 := B1*A5
B2A2 := B2*A2
B2A4 := B2*A4
B2A6 := B2*A6
# calculating total effect
totB1A1 := C1 + (B1*A1)
totB1A2 := C1 + (B1*A3)
totB1A3 := C1 + (B1*A5)
totB2A2 := C1 + (B2*A2)
totB2A4 := C1 + (B2*A4)
totB2A6 := C1 + (B2*A6)
"
<- sem(model = SEM, data = data)
fit
summary(fit,
fit.measures = TRUE,
standardize = TRUE,
rsquare = TRUE)
# useful when fit of model is not good:
modificationIndices(fit, sort.=TRUE, minimum.value=3)
summary(fit,
fit.measures = TRUE,
standardize = TRUE,
rsquare = TRUE)
lavaan 0.6.17 ended normally after 13 iterations
Estimator ML
Optimization method NLMINB
Number of model parameters 19
Used Total
Number of observations 99 104
Model Test User Model:
Test statistic 99.060
Degrees of freedom 1
P-value (Chi-square) 0.000
Model Test Baseline Model:
Test statistic 504.532
Degrees of freedom 15
P-value 0.000
User Model versus Baseline Model:
Comparative Fit Index (CFI) 0.800
Tucker-Lewis Index (TLI) -2.005
Loglikelihood and Information Criteria:
Loglikelihood user model (H0) -581.632
Loglikelihood unrestricted model (H1) -532.102
Akaike (AIC) 1201.264
Bayesian (BIC) 1250.572
Sample-size adjusted Bayesian (SABIC) 1190.569
Root Mean Square Error of Approximation:
RMSEA 0.995
90 Percent confidence interval - lower 0.835
90 Percent confidence interval - upper 1.166
P-value H_0: RMSEA <= 0.050 0.000
P-value H_0: RMSEA >= 0.080 1.000
Standardized Root Mean Square Residual:
SRMR 0.178
Parameter Estimates:
Standard errors Standard
Information Expected
Information saturated (h1) model Structured
Regressions:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
Autonomy ~
Conditn.d (B1) 1.193 0.160 7.432 0.000 1.193 0.598
Competence ~
Conditn.d (B2) 0.839 0.183 4.584 0.000 0.839 0.418
Pleasure ~
Autonomy (A1) 0.022 0.090 0.245 0.806 0.022 0.015
Competenc (A2) 0.897 0.079 11.409 0.000 0.897 0.615
Conditn.d (C1) 1.186 0.190 6.228 0.000 1.186 0.405
Vitality ~
Autonomy (A3) 0.110 0.095 1.154 0.248 0.110 0.082
Competenc (A4) 0.871 0.083 10.435 0.000 0.871 0.651
Conditn.d (C2) 0.673 0.202 3.329 0.001 0.673 0.251
Intended_persistence ~
Autonomy (A5) 0.158 0.110 1.433 0.152 0.158 0.142
Competenc (A6) 0.438 0.096 4.539 0.000 0.438 0.396
Conditn.d (C3) 0.518 0.234 2.218 0.027 0.518 0.234
Covariances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Pleasure ~~
.Vitality 0.185 0.057 3.248 0.001 0.185 0.345
.Intndd_prsstnc 0.210 0.066 3.197 0.001 0.210 0.339
.Vitality ~~
.Intndd_prsstnc 0.155 0.068 2.290 0.022 0.155 0.237
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Autonomy 0.634 0.090 7.036 0.000 0.634 0.642
.Competence 0.825 0.117 7.036 0.000 0.825 0.825
.Pleasure 0.505 0.072 7.036 0.000 0.505 0.237
.Vitality 0.569 0.081 7.036 0.000 0.569 0.318
.Intndd_prsstnc 0.759 0.108 7.036 0.000 0.759 0.623
R-Square:
Estimate
Autonomy 0.358
Competence 0.175
Pleasure 0.763
Vitality 0.682
Intndd_prsstnc 0.377
Defined Parameters:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
B1A1 0.026 0.107 0.245 0.806 0.026 0.009
B1A2 0.131 0.115 1.141 0.254 0.131 0.049
B1A3 0.188 0.134 1.407 0.160 0.188 0.085
B2A2 0.752 0.177 4.254 0.000 0.752 0.257
B2A4 0.731 0.174 4.197 0.000 0.731 0.273
B2A6 0.367 0.114 3.226 0.001 0.367 0.166
totB1A1 1.212 0.158 7.691 0.000 1.212 0.414
totB1A2 1.317 0.203 6.498 0.000 1.317 0.454
totB1A3 1.374 0.211 6.507 0.000 1.374 0.490
totB2A2 1.939 0.243 7.991 0.000 1.939 0.663
totB2A4 1.917 0.252 7.613 0.000 1.917 0.678
totB2A6 1.553 0.214 7.274 0.000 1.553 0.571
fitMeasures(fit, c("chisq", "df", "cfi", "rmsea", "srmr"))
chisq df cfi rmsea srmr
99.060 1.000 0.800 0.995 0.178
The output shows that:
The model has an acceptable fit
Pathways between predictor and mediator are significant. Also, mediators are significantly related to outcomes.
All total effects were significant, with direct effects still being significant and 3 indirect effects being significant (through the mediator Competence). Here, we can talk about a partial mediation.
A useful online tool to visualize structural equation model is https://app.diagrams.net/
Multilevel Mediation
We can also check for such a mediation model in a multilevel way. Currently, this is possible up until two levels. Here, we check for a simple example:
<- '
ML_SEM level:1
OUTCOME1 ~ VAR1 + VAR2
OUTCOME2 ~ VAR1 + VAR2
level:2
OUTCOME1 ~ VAR1 + VAR2
OUTCOME2 ~ VAR1 + VAR2
'
<- sem(model = ML_SEM,
fit data = dflong,
cluster = "ID", # group variable including dependent variance
optim.method = "em")
summary(fit,
fit.measures = TRUE,
standardize = TRUE,
rsquare = TRUE)
modificationIndices(fit, sort.=TRUE, minimum.value=3)
Cross-lagged panel model
Between-subject
flowchart LR A(Predictor X - time 1) -->|autoregression| B(Predictor X time - 2) A(Predictor X - time 1) -->|cross-lagged| C(Predictor Y - time 2) D(Predictor Y - time 1) -->B(Predictor X - time 2) D(Predictor Y - time 1) -->C(Predictor Y - time 2)
<-
model '
VAR1_T2 + VAR2_T2 ~ VAR1_T1 + VAR2_T1
'
<- sem(model, data = df)
fit summary(fit,
fit.measures = TRUE,
standardized = TRUE,
rsquare = TRUE)
modificationIndices(fit, sort.=TRUE, minimum.value=3)
Within-subject
Flournoy wrote an amazingly useful package to generate a syntax for a RI-CLPM in the riclpmr
package.
library(devtools)
# install_github('jflournoy/riclpmr') # in case you have not installed the package yet
library(riclpmr)
library(lavaan)
Select the variables from your wide dataset.
<- dfwide[,c("ID", # also important to include the grouping variable
data_riclpm 'VARIABLEX_1','VARIABLEX_2','VARIABLEX_3','VARIABLEX_4',
'VARIABLEY_1','VARIABLEY_2','VARIABLEY_3','VARIABLEY_4')]
# give different column names to make the output of the model more readable
colnames(data_riclpm) <- c("id",
'x1','x2','x3','x4',
'y1','y2','y3','y4')
<- data_riclpm[ , -c(1)] #remove ID
data_riclpm
# refer which columns belong to a specific variable
<- list(
var_groups x=c('x1','x2','x3','x4'),
y=c('y1','y2','y3','y4')
)
Just run the following code. Herein, a constrained and an unconstrained model are performed and compared (via ANOVA). Based on which model provides the best fit of the data, you can check the output of the model.
# construct contraint model
<- riclpmr::riclpm_text(var_groups,
model_text constrain_over_waves = TRUE,
constrain_ints = "free")
<- riclpmr::lavriclpm(riclpmModel = model_text,
fit_constraints data = data_riclpm,
missing = 'fiml',
meanstructure = T,
int.ov.free = T)
# construct unconstraint model
<- riclpmr::riclpm_text(var_groups,
model_text constrain_over_waves = FALSE,
constrain_ints = "free")
<- riclpmr::lavriclpm(riclpmModel = model_text,
fit_noconstraints data = data_riclpm,
missing = 'fiml',
meanstructure = T,
int.ov.free = T)
# run this to compare constraint and unconstraint model in terms of data fit
anova(fit_constraints,fit_noconstraints)
# Check the output of the chosenmodel
summary(fit_constraints,
fit.measures = TRUE,
standardize = TRUE,
rsquare = TRUE)