Content
Getting started
The first step on this LGM adventure is to make sure all necessary programs and packages are installed. If you do not have R and/or its user-friendly interface, RStudio, installed, please do so via this website.
Then, within R, you need to install and load the lavaan package, which you will use to fit LGMs.
Assignment 1: Install and load the lavaan package.
if(!require("pacman")){install.packages("pacman",repos = "http://cran.us.r-project.org")}
pacman::p_load(lavaan, tidyverse, here, reshape2)
Now download the WISC data from the shared folder. Make sure that your R environment is linked to the folder in which you saved the data by setting your working directory.
Assignment 2: Load the WISC data into R and explore which data are in the data file.
path <- "/Users/njudd/projects/LGC_Workshop/" #change to the LGC_Workshop folder
setwd(path)
wisc <- read.csv(paste0(path,"wisc.csv"))[,-1]
head(wisc) #first 6 rows
## ID Verbal_T6 Verbal_T7 Verbal_T9 Verbal_T11 Pspeed_T6 Pspeed_T7 Pspeed_T9
## 1 0 24.41964 26.97917 39.61310 55.63988 19.835937 22.96932 43.90245
## 2 1 12.44048 14.37500 21.91964 37.81250 5.898771 13.43888 18.28666
## 3 2 32.42560 33.51190 34.30060 50.17857 27.638263 45.02193 46.99432
## 4 3 22.69345 28.39286 42.15774 44.71726 33.158129 29.67703 45.97218
## 5 4 28.22917 37.81250 41.05655 70.95238 27.638263 44.41775 65.47530
## 6 5 16.05655 20.11905 38.02083 39.94048 8.445562 15.77845 26.98717
## Pspeed_T11 Total_6 Total_7 Total_9 Total_11 age_T6 sex race mo_edu
## 1 44.18684 22.127790 24.97424 41.75777 49.91336 5.833333 1 1 4
## 2 40.38395 9.169624 13.90694 20.10315 39.09822 5.916667 2 2 6
## 3 77.71613 30.031929 39.26692 40.64746 63.94735 6.333333 1 1 2
## 4 61.66409 27.925791 29.03495 44.06496 53.19067 6.333333 2 1 2
## 5 64.21768 27.933715 41.11512 53.26593 67.58503 6.166667 1 1 3
## 6 39.08200 12.251055 17.94875 32.50400 39.51124 5.666667 1 1 2
## mo_educat fa_edu fa_educat
## 1 0 4 0
## 2 0 5 0
## 3 2 3 1
## 4 2 2 2
## 5 1 3 1
## 6 2 2 2
colnames(wisc) #names of columns
## [1] "ID" "Verbal_T6" "Verbal_T7" "Verbal_T9" "Verbal_T11"
## [6] "Pspeed_T6" "Pspeed_T7" "Pspeed_T9" "Pspeed_T11" "Total_6"
## [11] "Total_7" "Total_9" "Total_11" "age_T6" "sex"
## [16] "race" "mo_edu" "mo_educat" "fa_edu" "fa_educat"
dim(wisc) #number of rows and columns
## [1] 204 20
sapply(wisc, class) #class of each column
## ID Verbal_T6 Verbal_T7 Verbal_T9 Verbal_T11 Pspeed_T6
## "integer" "numeric" "numeric" "numeric" "numeric" "numeric"
## Pspeed_T7 Pspeed_T9 Pspeed_T11 Total_6 Total_7 Total_9
## "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
## Total_11 age_T6 sex race mo_edu mo_educat
## "numeric" "numeric" "integer" "integer" "integer" "integer"
## fa_edu fa_educat
## "character" "character"
For now, we are only going to analyze data from the verbal subtest (indicated by a column name that starts with “Verbal”).
Assignment 3: Subset the data such that the dataset you work with only contains the id numbers and the scores on the verbal subtest across the four measurements. Your subset should thus contain five columns.
wisc_verbal <- wisc[,c("ID","Verbal_T6","Verbal_T7","Verbal_T9","Verbal_T11")]
As you may have noticed when exploring the data, they are in wide format, that is, they contain one row per participant, with the different measurements in separate columns. To fit models in lavaan, this wide format is required. For plotting, however, it is easier if the data are in long format, with one row per participant per measurement.
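As a minimal illustration of this reshaping step, here is a sketch using only base R's `reshape()` on a tiny made-up wide dataset (the tutorial solution below uses `tidyr::pivot_longer()` instead):

```r
# Toy wide data: one row per participant, one column per measurement (values made up)
wide <- data.frame(ID = 1:2,
                   Verbal_T6 = c(24.4, 12.4),
                   Verbal_T7 = c(27.0, 14.4))

# Base-R equivalent of pivoting longer: one row per participant per measurement
long <- reshape(wide, direction = "long",
                varying = c("Verbal_T6", "Verbal_T7"),
                v.names = "verbal", timevar = "wave",
                times = c("Verbal_T6", "Verbal_T7"), idvar = "ID")
nrow(long)  # 4 = 2 participants x 2 measurements
```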
Assignment 4: Reshape the data subset from wide to long format.
wisc_verbal_long <- wisc_verbal %>%
pivot_longer(!ID, names_to = "wave", values_to = "verbal") # one row per ID per wave
Now that we have prepared the data we want to model, let’s plot them!
Assignment 5: Plot the data with the four measurements on the x-axis, the score on the verbal subtest on the y-axis, and a line for each subject.
wisc_verbal_long$wave = factor(wisc_verbal_long$wave, levels=c("Verbal_T6","Verbal_T7","Verbal_T9","Verbal_T11"))
ggplot(wisc_verbal_long, aes(wave, verbal, group=ID, fill=ID, color=ID)) +
geom_point() +
geom_line() +
theme_classic(base_size = 15) + # adding a classic theme; https://ggplot2.tidyverse.org/reference/ggtheme.html
theme(legend.position = "none") + # getting rid of legend
labs(x = "Wave", y = "Score on Verbal Subtest")
To enable you to check whether your model implementation is correct in a later step, it is good to first formulate expectations based on the plotted data.
Assignment 6: Describe what you see. What is the average score at each of the four measurements? Do subjects in- or decrease across measurements? Are there individual differences in these effects?
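To put numbers on what the plot shows, you can compute the mean verbal score per wave. A sketch with a tiny made-up long dataset (on the real data you would run `aggregate(verbal ~ wave, data = wisc_verbal_long, FUN = mean)`):

```r
# Toy long data mimicking the structure of wisc_verbal_long (values made up)
toy_long <- data.frame(ID = rep(1:2, each = 2),
                       wave = rep(c("Verbal_T6", "Verbal_T7"), times = 2),
                       verbal = c(24, 27, 12, 14))

# Mean score per wave
aggregate(verbal ~ wave, data = toy_long, FUN = mean)
# Verbal_T6 mean = (24 + 12) / 2 = 18; Verbal_T7 mean = (27 + 14) / 2 = 20.5
```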
You’re ready to move on to do some model fitting now. Click here to go back to the top to move on to the next module.
Basic LGM in Lavaan
You are now going to actually fit an LGM in lavaan. We start simple.
Assignment 7: Start by creating an empty string that you call linear_growth_model. Then try to implement an LGM in which you estimate (1) intercepts for each of the four time points, and (2) a linear slope. See the slides and the cheat sheet at the top for examples and hints.
# Create LGM
linear_growth_model <- '
i =~ 1*Verbal_T6 + 1*Verbal_T7 + 1*Verbal_T9 + 1*Verbal_T11
s =~ 0*Verbal_T6 + 1*Verbal_T7 + 2*Verbal_T9 + 3*Verbal_T11'
Assignment 8: Fit the model you created in assignment 7 to the verbal subset data using the growth() function and plot results using summary().
The lavaan::growth() function is a wrapper (around lavaan::lavaan()) that simplifies the specification of growth models. See the help file (?lavaan::growth) for details, and ?lavaan::lavOptions for more options.
# Fit LGM
fit_linear_growth_model <- growth(linear_growth_model, data=wisc_verbal,missing='fiml')
# Output results
summary(fit_linear_growth_model, fit.measures = TRUE, rsquare = TRUE, standardized = TRUE)
## lavaan 0.6.17 ended normally after 65 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 9
##
## Number of observations 204
## Number of missing patterns 1
##
## Model Test User Model:
##
## Test statistic 100.756
## Degrees of freedom 5
## P-value (Chi-square) 0.000
##
## Model Test Baseline Model:
##
## Test statistic 585.906
## Degrees of freedom 6
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.835
## Tucker-Lewis Index (TLI) 0.802
##
## Robust Comparative Fit Index (CFI) 0.835
## Robust Tucker-Lewis Index (TLI) 0.802
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -2530.194
## Loglikelihood unrestricted model (H1) -2479.816
##
## Akaike (AIC) 5078.388
## Bayesian (BIC) 5108.251
## Sample-size adjusted Bayesian (SABIC) 5079.736
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.306
## 90 Percent confidence interval - lower 0.256
## 90 Percent confidence interval - upper 0.360
## P-value H_0: RMSEA <= 0.050 0.000
## P-value H_0: RMSEA >= 0.080 1.000
##
## Robust RMSEA 0.306
## 90 Percent confidence interval - lower 0.256
## 90 Percent confidence interval - upper 0.360
## P-value H_0: Robust RMSEA <= 0.050 0.000
## P-value H_0: Robust RMSEA >= 0.080 1.000
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.113
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Observed
## Observed information based on Hessian
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## i =~
## Verbal_T6 1.000 4.354 0.775
## Verbal_T7 1.000 4.354 0.681
## Verbal_T9 1.000 4.354 0.583
## Verbal_T11 1.000 4.354 0.417
## s =~
## Verbal_T6 0.000 0.000 0.000
## Verbal_T7 1.000 1.251 0.196
## Verbal_T9 2.000 2.502 0.335
## Verbal_T11 3.000 3.752 0.360
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## i ~~
## s 5.081 1.079 4.709 0.000 0.933 0.933
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## i 18.760 0.391 48.006 0.000 4.308 4.308
## s 7.291 0.192 38.007 0.000 5.829 5.829
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .Verbal_T6 12.600 2.175 5.793 0.000 12.600 0.399
## .Verbal_T7 10.213 1.463 6.982 0.000 10.213 0.250
## .Verbal_T9 10.243 1.941 5.277 0.000 10.243 0.184
## .Verbal_T11 45.410 5.781 7.855 0.000 45.410 0.417
## i 18.961 3.154 6.012 0.000 1.000 1.000
## s 1.565 0.658 2.379 0.017 1.000 1.000
##
## R-Square:
## Estimate
## Verbal_T6 0.601
## Verbal_T7 0.750
## Verbal_T9 0.816
## Verbal_T11 0.583
Assignment 9: How good is the model fit?
Assignment 10: What is the average verbal score at baseline? How does this compare to the expectations you formulated in assignment 6?
Assignment 11: What is the average change per wave?
Assignment 12: Are there individual differences in the score at baseline? And in how much individuals change?
Assignment 13: What does the score at baseline tell you about how much individuals change?
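A quick sanity check for Assignments 10 and 11: with an intercept mean of 18.760 and a slope mean of 7.291 (taken from the output above), the model-implied average trajectory follows directly from the slope loadings:

```r
# Model-implied average verbal score at each wave, using the estimates above
i_mean <- 18.760              # average score at baseline (slope loading 0)
s_mean <- 7.291               # average change per unit of the slope loadings
loadings <- c(0, 1, 2, 3)
implied_means <- i_mean + s_mean * loadings
round(implied_means, 2)       # 18.76 26.05 33.34 40.63
```

Comparing these against the observed wave means from your plot in Assignment 5 is a useful check on the linear model.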
Great! You have interpreted your first LGM output. Let’s make it a little more difficult in the next module.
Different Shapes of Growth
In the previous module, we modeled linear growth. Yet it is also possible to model non-linear growth in lavaan, such as a quadratic trajectory. To do so, you specify one additional latent variable in your model, the quadratic term, whose loadings are the squares of the slope loadings.
Assignment 14: Create a quadratic growth model, fit it to the verbal data, and output results.
# Create quadratic growth model
quad_growth_model <- 'i =~ 1*Verbal_T6 + 1*Verbal_T7 + 1*Verbal_T9 + 1*Verbal_T11
s =~ 0*Verbal_T6 + 1*Verbal_T7 + 2*Verbal_T9 + 3*Verbal_T11
q =~ 0*Verbal_T6 + 1*Verbal_T7 + 4*Verbal_T9 + 9*Verbal_T11'
# Fit model
fit_quad_growth_model <- growth(quad_growth_model, data=wisc_verbal,missing='fiml')
# Output results
summary(fit_quad_growth_model, fit.measures = TRUE, rsquare = TRUE, standardized = TRUE)
## lavaan 0.6.17 ended normally after 99 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 13
##
## Number of observations 204
## Number of missing patterns 1
##
## Model Test User Model:
##
## Test statistic 6.176
## Degrees of freedom 1
## P-value (Chi-square) 0.013
##
## Model Test Baseline Model:
##
## Test statistic 585.906
## Degrees of freedom 6
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.991
## Tucker-Lewis Index (TLI) 0.946
##
## Robust Comparative Fit Index (CFI) 0.991
## Robust Tucker-Lewis Index (TLI) 0.946
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -2482.904
## Loglikelihood unrestricted model (H1) -2479.816
##
## Akaike (AIC) 4991.808
## Bayesian (BIC) 5034.943
## Sample-size adjusted Bayesian (SABIC) 4993.755
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.159
## 90 Percent confidence interval - lower 0.059
## 90 Percent confidence interval - upper 0.289
## P-value H_0: RMSEA <= 0.050 0.039
## P-value H_0: RMSEA >= 0.080 0.910
##
## Robust RMSEA 0.159
## 90 Percent confidence interval - lower 0.059
## 90 Percent confidence interval - upper 0.289
## P-value H_0: Robust RMSEA <= 0.050 0.039
## P-value H_0: Robust RMSEA >= 0.080 0.910
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.023
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Observed
## Observed information based on Hessian
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## i =~
## Verbal_T6 1.000 4.883 0.843
## Verbal_T7 1.000 4.883 0.800
## Verbal_T9 1.000 4.883 0.668
## Verbal_T11 1.000 4.883 0.459
## s =~
## Verbal_T6 0.000 NA NA
## Verbal_T7 1.000 NA NA
## Verbal_T9 2.000 NA NA
## Verbal_T11 3.000 NA NA
## q =~
## Verbal_T6 0.000 NA NA
## Verbal_T7 1.000 NA NA
## Verbal_T9 4.000 NA NA
## Verbal_T11 9.000 NA NA
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## i ~~
## s -0.564 6.937 -0.081 0.935 -0.064 -0.064
## q 2.014 1.811 1.112 0.266 0.738 0.738
## s ~~
## q 1.518 1.719 0.883 0.377 1.500 1.500
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## i 19.697 0.409 48.124 0.000 4.033 4.033
## s 4.051 0.354 11.439 0.000 NA NA
## q 1.284 0.130 9.861 0.000 NA NA
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .Verbal_T6 9.730 6.209 1.567 0.117 9.730 0.290
## .Verbal_T7 11.059 2.297 4.814 0.000 11.059 0.297
## .Verbal_T9 9.542 2.677 3.564 0.000 9.542 0.179
## .Verbal_T11 29.417 11.518 2.554 0.011 29.417 0.260
## i 23.848 6.558 3.636 0.000 1.000 1.000
## s -3.277 7.053 -0.465 0.642 NA NA
## q -0.312 0.656 -0.476 0.634 NA NA
##
## R-Square:
## Estimate
## Verbal_T6 0.710
## Verbal_T7 0.703
## Verbal_T9 0.821
## Verbal_T11 0.740
Assignment 15: How good is the model fit?
Assignment 16: What is the average verbal score at baseline? Does this estimate differ from the estimate in assignment 10? And from your expectations?
Assignment 17: What is the shape of change across measurements?
Assignment 18: You have an error message. What does it mean? Can you see where the issue is in the output?
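A hint for Assignment 18: the `NA` standard errors arise because the estimated latent (co)variance matrix is inadmissible. With a fitted lavaan object you could inspect it directly via `lavInspect(fit_quad_growth_model, "cov.lv")`; here is a sketch that rebuilds it from the estimates in the output above to make the problem visible:

```r
# Latent covariance matrix of i, s, q, assembled from the output above
Psi <- matrix(c(23.848, -0.564,  2.014,
                -0.564, -3.277,  1.518,
                 2.014,  1.518, -0.312),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("i", "s", "q"), c("i", "s", "q")))
diag(Psi)           # the slope and quadratic variances are negative
eigen(Psi)$values   # a proper covariance matrix would have no negative eigenvalues
```

Negative variance estimates like these (a so-called Heywood case) usually signal that the model is too complex for the data.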
It is also possible to model non-linear growth in lavaan with no hypothesis on the shape. To do so, you fix the loadings of the first and last measurements but freely estimate the middle ones.
Assignment 19: Create a non-linear (basis) growth model, fit it to the verbal data, and output results. Then compare the model fit with the previous models.
# Create non-linear growth model
basis_growth_model <- 'i =~ 1*Verbal_T6 + 1*Verbal_T7 + 1*Verbal_T9 + 1*Verbal_T11
s =~ 0*Verbal_T6 + Verbal_T7 + Verbal_T9 + 1*Verbal_T11'
# Fit model
fit_basis_growth_model <- growth(basis_growth_model, data=wisc_verbal,missing='fiml')
# Output results
summary(fit_basis_growth_model, fit.measures = TRUE, rsquare = TRUE, standardized = TRUE)
## lavaan 0.6.17 ended normally after 109 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 11
##
## Number of observations 204
## Number of missing patterns 1
##
## Model Test User Model:
##
## Test statistic 5.893
## Degrees of freedom 3
## P-value (Chi-square) 0.117
##
## Model Test Baseline Model:
##
## Test statistic 585.906
## Degrees of freedom 6
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.995
## Tucker-Lewis Index (TLI) 0.990
##
## Robust Comparative Fit Index (CFI) 0.995
## Robust Tucker-Lewis Index (TLI) 0.990
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -2482.763
## Loglikelihood unrestricted model (H1) -2479.816
##
## Akaike (AIC) 4987.525
## Bayesian (BIC) 5024.024
## Sample-size adjusted Bayesian (SABIC) 4989.173
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.069
## 90 Percent confidence interval - lower 0.000
## 90 Percent confidence interval - upper 0.151
## P-value H_0: RMSEA <= 0.050 0.275
## P-value H_0: RMSEA >= 0.080 0.491
##
## Robust RMSEA 0.069
## 90 Percent confidence interval - lower 0.000
## 90 Percent confidence interval - upper 0.151
## P-value H_0: Robust RMSEA <= 0.050 0.275
## P-value H_0: Robust RMSEA >= 0.080 0.491
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.043
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Observed
## Observed information based on Hessian
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## i =~
## Verbal_T6 1.000 4.610 0.825
## Verbal_T7 1.000 4.610 0.731
## Verbal_T9 1.000 4.610 0.620
## Verbal_T11 1.000 4.610 0.446
## s =~
## Verbal_T6 0.000 0.000 0.000
## Verbal_T7 0.237 0.012 20.223 0.000 1.300 0.206
## Verbal_T9 0.536 0.012 43.295 0.000 2.937 0.395
## Verbal_T11 1.000 5.484 0.531
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## i ~~
## s 15.072 3.133 4.811 0.000 0.596 0.596
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## i 19.634 0.391 50.221 0.000 4.259 4.259
## s 24.180 0.565 42.773 0.000 4.409 4.409
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .Verbal_T6 9.974 1.645 6.063 0.000 9.974 0.319
## .Verbal_T7 9.704 1.307 7.422 0.000 9.704 0.244
## .Verbal_T9 9.217 1.513 6.093 0.000 9.217 0.167
## .Verbal_T11 25.340 4.317 5.870 0.000 25.340 0.237
## i 21.255 2.898 7.335 0.000 1.000 1.000
## s 30.076 6.788 4.430 0.000 1.000 1.000
##
## R-Square:
## Estimate
## Verbal_T6 0.681
## Verbal_T7 0.756
## Verbal_T9 0.833
## Verbal_T11 0.763
Assignment 20: Perform a statistical test to compare model fit. Does the linear or non-linear model fit better?
# Compare model fit
anova(fit_basis_growth_model,fit_linear_growth_model)
##
## Chi-Squared Difference Test
##
## Df AIC BIC Chisq Chisq diff RMSEA Df diff
## fit_basis_growth_model 3 4987.5 5024.0 5.8934
## fit_linear_growth_model 5 5078.4 5108.3 100.7562 94.863 0.47708 2
## Pr(>Chisq)
## fit_basis_growth_model
## fit_linear_growth_model < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# The basis model fits significantly better than the linear model
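The AIC values reported in the three summaries above tell the same story (a quick tabulation of values copied from the output; lower is better):

```r
# AIC values copied from the three model summaries above
aic <- c(linear = 5078.388, quadratic = 4991.808, basis = 4987.525)
sort(aic - min(aic))  # difference from the best-fitting model; the basis model wins
```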
Congrats! You can now choose the best shape for your trajectory. Let’s add predictors in the next module.
Predictors and outcomes
One may be interested in what predicts baseline scores and/or change. To assess this, one can add predictors to the growth model. One hypothesis could be that the mother's level of education predicts the development of verbal comprehension.
Assignment 21: Add educational level of the mother (mo_edu in the data file) as predictor of baseline scores and change in the non-linear model.
# Specify model
basis_growth_model_cov <- '
i =~ 1*Verbal_T6 + 1*Verbal_T7 + 1*Verbal_T9 + 1*Verbal_T11
s =~ 0*Verbal_T6 + Verbal_T7 + Verbal_T9 + 1*Verbal_T11
s~mo_edu
i~mo_edu
'
Assignment 22: Fit the model from assignment 21 to the verbal data and output results. Does mother’s education predict baseline scores? And what about change across measurements?
# Fit model
fit_basis_growth_model_cov <- growth(basis_growth_model_cov, data=wisc,missing='fiml')
# Output results
summary(fit_basis_growth_model_cov, fit.measures = TRUE, rsquare = TRUE, standardized = TRUE)
## lavaan 0.6.17 ended normally after 118 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 13
##
## Number of observations 204
## Number of missing patterns 1
##
## Model Test User Model:
##
## Test statistic 6.498
## Degrees of freedom 5
## P-value (Chi-square) 0.261
##
## Model Test Baseline Model:
##
## Test statistic 650.266
## Degrees of freedom 10
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.998
## Tucker-Lewis Index (TLI) 0.995
##
## Robust Comparative Fit Index (CFI) 0.998
## Robust Tucker-Lewis Index (TLI) 0.995
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -2450.885
## Loglikelihood unrestricted model (H1) -2447.636
##
## Akaike (AIC) 4927.770
## Bayesian (BIC) 4970.906
## Sample-size adjusted Bayesian (SABIC) 4929.718
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.038
## 90 Percent confidence interval - lower 0.000
## 90 Percent confidence interval - upper 0.110
## P-value H_0: RMSEA <= 0.050 0.520
## P-value H_0: RMSEA >= 0.080 0.210
##
## Robust RMSEA 0.038
## 90 Percent confidence interval - lower 0.000
## 90 Percent confidence interval - upper 0.110
## P-value H_0: Robust RMSEA <= 0.050 0.520
## P-value H_0: Robust RMSEA >= 0.080 0.210
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.038
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Observed
## Observed information based on Hessian
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## i =~
## Verbal_T6 1.000 4.615 0.826
## Verbal_T7 1.000 4.615 0.732
## Verbal_T9 1.000 4.615 0.619
## Verbal_T11 1.000 4.615 0.447
## s =~
## Verbal_T6 0.000 0.000 0.000
## Verbal_T7 0.237 0.012 20.312 0.000 1.306 0.207
## Verbal_T9 0.535 0.012 43.171 0.000 2.949 0.396
## Verbal_T11 1.000 5.508 0.534
##
## Regressions:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## s ~
## mo_edu -1.724 0.414 -4.165 0.000 -0.313 -0.394
## i ~
## mo_edu -1.943 0.259 -7.503 0.000 -0.421 -0.531
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .i ~~
## .s 9.676 2.721 3.556 0.000 0.489 0.489
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .i 26.302 0.958 27.446 0.000 5.700 5.700
## .s 30.098 1.527 19.710 0.000 5.464 5.464
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .Verbal_T6 9.923 1.622 6.117 0.000 9.923 0.318
## .Verbal_T7 9.607 1.281 7.500 0.000 9.607 0.242
## .Verbal_T9 9.443 1.501 6.291 0.000 9.443 0.170
## .Verbal_T11 24.956 4.288 5.820 0.000 24.956 0.234
## .i 15.298 2.309 6.624 0.000 0.718 0.718
## .s 25.619 6.352 4.033 0.000 0.844 0.844
##
## R-Square:
## Estimate
## Verbal_T6 0.682
## Verbal_T7 0.758
## Verbal_T9 0.830
## Verbal_T11 0.766
## i 0.282
## s 0.156
Assignment 23: Add processing speed at 11 as an outcome of changes in verbal comprehension. In other words, test if the slopes of verbal change predict the level of processing speed at 11.
# Specify model
basis_growth_model_covO <- '
i =~ 1*Verbal_T6 + 1*Verbal_T7 + 1*Verbal_T9 + 1*Verbal_T11
s =~ 0*Verbal_T6 + Verbal_T7 + Verbal_T9 + 1*Verbal_T11
Pspeed_T11~s
Pspeed_T11~1
'
# Fit model
fit_basis_growth_model_covO <- growth(basis_growth_model_covO, data=wisc,missing='fiml')
# Output results
summary(fit_basis_growth_model_covO, fit.measures = TRUE, rsquare = TRUE, standardized = TRUE)
## lavaan 0.6.17 ended normally after 142 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 14
##
## Number of observations 204
## Number of missing patterns 1
##
## Model Test User Model:
##
## Test statistic 14.016
## Degrees of freedom 6
## P-value (Chi-square) 0.029
##
## Model Test Baseline Model:
##
## Test statistic 685.769
## Degrees of freedom 10
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.988
## Tucker-Lewis Index (TLI) 0.980
##
## Robust Comparative Fit Index (CFI) 0.988
## Robust Tucker-Lewis Index (TLI) 0.980
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -3240.780
## Loglikelihood unrestricted model (H1) -3233.772
##
## Akaike (AIC) 6509.560
## Bayesian (BIC) 6556.014
## Sample-size adjusted Bayesian (SABIC) 6511.657
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.081
## 90 Percent confidence interval - lower 0.024
## 90 Percent confidence interval - upper 0.137
## P-value H_0: RMSEA <= 0.050 0.151
## P-value H_0: RMSEA >= 0.080 0.566
##
## Robust RMSEA 0.081
## 90 Percent confidence interval - lower 0.024
## 90 Percent confidence interval - upper 0.137
## P-value H_0: Robust RMSEA <= 0.050 0.151
## P-value H_0: Robust RMSEA >= 0.080 0.566
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.043
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Observed
## Observed information based on Hessian
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## i =~
## Verbal_T6 1.000 4.476 0.804
## Verbal_T7 1.000 4.476 0.714
## Verbal_T9 1.000 4.476 0.598
## Verbal_T11 1.000 4.476 0.434
## s =~
## Verbal_T6 0.000 0.000 0.000
## Verbal_T7 0.237 0.012 19.886 0.000 1.220 0.195
## Verbal_T9 0.534 0.013 42.125 0.000 2.752 0.367
## Verbal_T11 1.000 5.157 0.500
##
## Regressions:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## Pspeed_T11 ~
## s 1.683 0.219 7.690 0.000 8.680 0.697
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## i ~~
## s 17.195 2.513 6.842 0.000 0.745 0.745
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .Pspeed_T11 10.214 5.368 1.903 0.057 10.214 0.820
## i 19.648 0.390 50.431 0.000 4.389 4.389
## s 24.194 0.554 43.672 0.000 4.691 4.691
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .Verbal_T6 10.997 1.451 7.578 0.000 10.997 0.354
## .Verbal_T7 9.658 1.299 7.436 0.000 9.658 0.246
## .Verbal_T9 10.154 1.529 6.640 0.000 10.154 0.181
## .Verbal_T11 25.254 3.694 6.836 0.000 25.254 0.238
## .Pspeed_T11 79.657 10.996 7.244 0.000 79.657 0.514
## i 20.036 2.643 7.580 0.000 1.000 1.000
## s 26.598 5.701 4.666 0.000 1.000 1.000
##
## R-Square:
## Estimate
## Verbal_T6 0.646
## Verbal_T7 0.754
## Verbal_T9 0.819
## Verbal_T11 0.762
## Pspeed_T11 0.486
Assignment 24: Perform the same steps as in assignments 21 and 22, but now with processing speed at baseline as the predictor. Does processing speed relate to verbal baseline scores? And to change?
# Specify model
basis_growth_model_covP <- '
i =~ 1*Verbal_T6 + 1*Verbal_T7 + 1*Verbal_T9 + 1*Verbal_T11
s =~ 0*Verbal_T6 + Verbal_T7 + Verbal_T9 + 1*Verbal_T11
s~Pspeed_T6
i~Pspeed_T6
i~~s
'
# Fit model
fit_basis_growth_model_covP <- growth(basis_growth_model_covP, data=wisc,missing='fiml')
# Output results
summary(fit_basis_growth_model_covP, fit.measures = TRUE, rsquare = TRUE, standardized = TRUE)
## lavaan 0.6.17 ended normally after 92 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 13
##
## Number of observations 204
## Number of missing patterns 1
##
## Model Test User Model:
##
## Test statistic 8.737
## Degrees of freedom 5
## P-value (Chi-square) 0.120
##
## Model Test Baseline Model:
##
## Test statistic 715.505
## Degrees of freedom 10
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.995
## Tucker-Lewis Index (TLI) 0.989
##
## Robust Comparative Fit Index (CFI) 0.995
## Robust Tucker-Lewis Index (TLI) 0.989
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -2419.385
## Loglikelihood unrestricted model (H1) -2415.016
##
## Akaike (AIC) 4864.769
## Bayesian (BIC) 4907.905
## Sample-size adjusted Bayesian (SABIC) 4866.717
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.061
## 90 Percent confidence interval - lower 0.000
## 90 Percent confidence interval - upper 0.126
## P-value H_0: RMSEA <= 0.050 0.331
## P-value H_0: RMSEA >= 0.080 0.367
##
## Robust RMSEA 0.061
## 90 Percent confidence interval - lower 0.000
## 90 Percent confidence interval - upper 0.126
## P-value H_0: Robust RMSEA <= 0.050 0.331
## P-value H_0: Robust RMSEA >= 0.080 0.367
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.044
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Observed
## Observed information based on Hessian
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## i =~
## Verbal_T6 1.000 4.644 0.834
## Verbal_T7 1.000 4.644 0.732
## Verbal_T9 1.000 4.644 0.622
## Verbal_T11 1.000 4.644 0.451
## s =~
## Verbal_T6 0.000 0.000 0.000
## Verbal_T7 0.237 0.012 20.308 0.000 1.325 0.209
## Verbal_T9 0.535 0.012 43.446 0.000 2.990 0.401
## Verbal_T11 1.000 5.588 0.542
##
## Regressions:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## s ~
## Pspeed_T6 0.349 0.061 5.724 0.000 0.062 0.521
## i ~
## Pspeed_T6 0.385 0.035 10.834 0.000 0.083 0.690
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .i ~~
## .s 5.323 2.473 2.152 0.031 0.332 0.332
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .i 12.720 0.711 17.893 0.000 2.739 2.739
## .s 17.907 1.209 14.808 0.000 3.204 3.204
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .Verbal_T6 9.436 1.592 5.926 0.000 9.436 0.304
## .Verbal_T7 9.950 1.290 7.711 0.000 9.950 0.247
## .Verbal_T9 9.531 1.473 6.470 0.000 9.531 0.171
## .Verbal_T11 24.122 4.217 5.720 0.000 24.122 0.227
## .i 11.302 1.895 5.964 0.000 0.524 0.524
## .s 22.767 6.034 3.773 0.000 0.729 0.729
##
## R-Square:
## Estimate
## Verbal_T6 0.696
## Verbal_T7 0.753
## Verbal_T9 0.829
## Verbal_T11 0.773
## i 0.476
## s 0.271
Extra
Time-invariant predictors explain individual differences in intercepts and slopes. They are often measured at baseline (e.g., family income) or are person-specific characteristics whose value is constant over time (e.g., biological sex, country of origin). In the previous assignments, for instance, the mother's level of education and processing speed at age 6 are time-invariant predictors. Time-varying predictors, in contrast, predict the outcome at each time point separately; in our example, this requires measurements at T6, T7, T9, and T11.
Assignment 25: Use processing speed as a time-varying predictor of the verbal measurement at each time point. What are the intercept and slope of the verbal measures now? Does processing speed predict verbal measures the same way across time points?
# Specify model
basis_growth_model_tvp <- '
i =~ 1*Verbal_T6 + 1*Verbal_T7 + 1*Verbal_T9 + 1*Verbal_T11
s =~ 0*Verbal_T6 + Verbal_T7 + Verbal_T9 + 1*Verbal_T11
Verbal_T6~Pspeed_T6
Verbal_T7~Pspeed_T7
Verbal_T9~Pspeed_T9
Verbal_T11~Pspeed_T11
'
# Fit LGM
fit_basis_growth_model_tvp <- growth(basis_growth_model_tvp, data=wisc,missing='fiml')
# Output results
summary(fit_basis_growth_model_tvp, fit.measures = TRUE, rsquare = TRUE, standardized = TRUE)
## lavaan 0.6.17 ended normally after 96 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 15
##
## Number of observations 204
## Number of missing patterns 1
##
## Model Test User Model:
##
## Test statistic 90.277
## Degrees of freedom 15
## P-value (Chi-square) 0.000
##
## Model Test Baseline Model:
##
## Test statistic 754.793
## Degrees of freedom 22
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.897
## Tucker-Lewis Index (TLI) 0.849
##
## Robust Comparative Fit Index (CFI) 0.897
## Robust Tucker-Lewis Index (TLI) 0.849
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -2440.511
## Loglikelihood unrestricted model (H1) -2395.373
##
## Akaike (AIC) 4911.022
## Bayesian (BIC) 4960.794
## Sample-size adjusted Bayesian (SABIC) 4913.269
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.157
## 90 Percent confidence interval - lower 0.127
## 90 Percent confidence interval - upper 0.189
## P-value H_0: RMSEA <= 0.050 0.000
## P-value H_0: RMSEA >= 0.080 1.000
##
## Robust RMSEA 0.157
## 90 Percent confidence interval - lower 0.127
## 90 Percent confidence interval - upper 0.189
## P-value H_0: Robust RMSEA <= 0.050 0.000
## P-value H_0: Robust RMSEA >= 0.080 1.000
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.194
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Observed
## Observed information based on Hessian
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## i =~
## Verbal_T6 1.000 3.628 0.710
## Verbal_T7 1.000 3.628 0.627
## Verbal_T9 1.000 3.628 0.535
## Verbal_T11 1.000 3.628 0.386
## s =~
## Verbal_T6 0.000 0.000 0.000
## Verbal_T7 0.297 0.071 4.209 0.000 1.337 0.231
## Verbal_T9 0.703 0.108 6.531 0.000 3.161 0.466
## Verbal_T11 1.000 4.498 0.479
##
## Regressions:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## Verbal_T6 ~
## Pspeed_T6 0.243 0.038 6.467 0.000 0.243 0.397
## Verbal_T7 ~
## Pspeed_T7 0.230 0.034 6.853 0.000 0.230 0.397
## Verbal_T9 ~
## Pspeed_T9 0.220 0.036 6.130 0.000 0.220 0.332
## Verbal_T11 ~
## Pspeed_T11 0.319 0.039 8.233 0.000 0.319 0.423
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## i ~~
## s 5.873 2.723 2.157 0.031 0.360 0.360
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## i 15.271 0.754 20.245 0.000 4.209 4.209
## s 12.341 1.994 6.190 0.000 2.744 2.744
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .Verbal_T6 8.814 1.705 5.170 0.000 8.814 0.338
## .Verbal_T7 9.806 1.311 7.478 0.000 9.806 0.292
## .Verbal_T9 9.583 1.978 4.846 0.000 9.583 0.208
## .Verbal_T11 27.351 4.282 6.387 0.000 27.351 0.310
## i 13.165 2.347 5.609 0.000 1.000 1.000
## s 20.235 6.197 3.265 0.001 1.000 1.000
##
## R-Square:
## Estimate
## Verbal_T6 0.662
## Verbal_T7 0.708
## Verbal_T9 0.792
## Verbal_T11 0.690
You’re ready to add a bit more complexity to those models. Click here to go back to the top to move on to the next module.
Uneven time intervals
Sometimes data collection does not happen every year, so the intervals between two measurements may be uneven. In the wisc data, the children were measured at ages 6, 7, 9 and 11.
Assignment 26: Perform the same steps as in assignments 7, 8 and 14, but take the uneven time intervals into account in your loadings. Compare the fits of the three models. Do you get the same results as in the previous assignments?
# Specify linear model
linear_growth_model_uneven <- ' i =~ 1*Verbal_T6 + 1*Verbal_T7 + 1*Verbal_T9 + 1*Verbal_T11
s =~ 6*Verbal_T6 + 7*Verbal_T7 + 9*Verbal_T9 + 11*Verbal_T11'
# Fit LGM
fit_linear_growth_model_uneven <- growth(linear_growth_model_uneven, data=wisc_verbal,missing='fiml')
# Output results
summary(fit_linear_growth_model_uneven, fit.measures = TRUE, rsquare = TRUE, standardized = TRUE)
## lavaan 0.6.17 ended normally after 51 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 9
##
## Number of observations 204
## Number of missing patterns 1
##
## Model Test User Model:
##
## Test statistic 54.917
## Degrees of freedom 5
## P-value (Chi-square) 0.000
##
## Model Test Baseline Model:
##
## Test statistic 585.906
## Degrees of freedom 6
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.914
## Tucker-Lewis Index (TLI) 0.897
##
## Robust Comparative Fit Index (CFI) 0.914
## Robust Tucker-Lewis Index (TLI) 0.897
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -2507.274
## Loglikelihood unrestricted model (H1) -2479.816
##
## Akaike (AIC) 5032.548
## Bayesian (BIC) 5062.411
## Sample-size adjusted Bayesian (SABIC) 5033.897
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.221
## 90 Percent confidence interval - lower 0.171
## 90 Percent confidence interval - upper 0.276
## P-value H_0: RMSEA <= 0.050 0.000
## P-value H_0: RMSEA >= 0.080 1.000
##
## Robust RMSEA 0.221
## 90 Percent confidence interval - lower 0.171
## 90 Percent confidence interval - upper 0.276
## P-value H_0: Robust RMSEA <= 0.050 0.000
## P-value H_0: Robust RMSEA >= 0.080 1.000
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.079
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Observed
## Observed information based on Hessian
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## i =~
## Verbal_T6 1.000 4.252 0.754
## Verbal_T7 1.000 4.252 0.679
## Verbal_T9 1.000 4.252 0.557
## Verbal_T11 1.000 4.252 0.415
## s =~
## Verbal_T6 6.000 5.774 1.024
## Verbal_T7 7.000 6.736 1.075
## Verbal_T9 9.000 8.660 1.134
## Verbal_T11 11.000 10.585 1.034
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## i ~~
## s -2.512 1.558 -1.612 0.107 -0.614 -0.614
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## i -7.326 0.723 -10.128 0.000 -1.723 -1.723
## s 4.542 0.109 41.728 0.000 4.721 4.721
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .Verbal_T6 10.504 1.731 6.069 0.000 10.504 0.331
## .Verbal_T7 10.972 1.486 7.384 0.000 10.972 0.280
## .Verbal_T9 10.452 1.830 5.713 0.000 10.452 0.179
## .Verbal_T11 29.870 4.672 6.394 0.000 29.870 0.285
## i 18.076 11.774 1.535 0.125 1.000 1.000
## s 0.926 0.240 3.852 0.000 1.000 1.000
##
## R-Square:
## Estimate
## Verbal_T6 0.669
## Verbal_T7 0.720
## Verbal_T9 0.821
## Verbal_T11 0.715
# Specify quadratic model
quad_growth_model_uneven <- ' i =~ 1*Verbal_T6 + 1*Verbal_T7 + 1*Verbal_T9 + 1*Verbal_T11
s =~ 6*Verbal_T6 + 7*Verbal_T7 + 9*Verbal_T9 + 11*Verbal_T11
q =~ 36*Verbal_T6 + 49*Verbal_T7 + 81*Verbal_T9 + 121*Verbal_T11'
# Fit LGM
fit_quad_growth_model_uneven <- growth(quad_growth_model_uneven, data=wisc_verbal,missing='fiml')
# Output results
summary(fit_quad_growth_model_uneven, fit.measures = TRUE, rsquare = TRUE, standardized = TRUE)
## lavaan 0.6.17 ended normally after 100 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 13
##
## Number of observations 204
## Number of missing patterns 1
##
## Model Test User Model:
##
## Test statistic 39.104
## Degrees of freedom 1
## P-value (Chi-square) 0.000
##
## Model Test Baseline Model:
##
## Test statistic 585.906
## Degrees of freedom 6
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.934
## Tucker-Lewis Index (TLI) 0.606
##
## Robust Comparative Fit Index (CFI) 0.934
## Robust Tucker-Lewis Index (TLI) 0.606
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -2499.368
## Loglikelihood unrestricted model (H1) -2479.816
##
## Akaike (AIC) 5024.736
## Bayesian (BIC) 5067.871
## Sample-size adjusted Bayesian (SABIC) 5026.684
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.432
## 90 Percent confidence interval - lower 0.323
## 90 Percent confidence interval - upper 0.553
## P-value H_0: RMSEA <= 0.050 0.000
## P-value H_0: RMSEA >= 0.080 1.000
##
## Robust RMSEA 0.432
## 90 Percent confidence interval - lower 0.323
## 90 Percent confidence interval - upper 0.553
## P-value H_0: Robust RMSEA <= 0.050 0.000
## P-value H_0: Robust RMSEA >= 0.080 1.000
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.064
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Observed
## Observed information based on Hessian
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## i =~
## Verbal_T6 1.000 NA NA
## Verbal_T7 1.000 NA NA
## Verbal_T9 1.000 NA NA
## Verbal_T11 1.000 NA NA
## s =~
## Verbal_T6 6.000 NA NA
## Verbal_T7 7.000 NA NA
## Verbal_T9 9.000 NA NA
## Verbal_T11 11.000 NA NA
## q =~
## Verbal_T6 36.000 NA NA
## Verbal_T7 49.000 NA NA
## Verbal_T9 81.000 NA NA
## Verbal_T11 121.000 NA NA
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## i ~~
## s 193.194 104.551 1.848 0.065 1.008 1.008
## q -11.954 6.434 -1.858 0.063 -1.006 -1.006
## s ~~
## q 3.244 1.706 1.902 0.057 1.006 1.006
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## i -0.608 3.449 -0.176 0.860 NA NA
## s 2.842 0.851 3.338 0.001 NA NA
## q 0.102 0.052 1.967 0.049 NA NA
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .Verbal_T6 11.332 4.050 2.798 0.005 11.332 0.335
## .Verbal_T7 12.827 2.401 5.342 0.000 12.827 0.334
## .Verbal_T9 10.707 4.073 2.628 0.009 10.707 0.199
## .Verbal_T11 42.441 19.671 2.157 0.031 42.441 0.373
## i -706.425 428.026 -1.650 0.099 NA NA
## s -51.974 26.584 -1.955 0.051 NA NA
## q -0.200 0.114 -1.760 0.078 NA NA
##
## R-Square:
## Estimate
## Verbal_T6 0.665
## Verbal_T7 0.666
## Verbal_T9 0.801
## Verbal_T11 0.627
#You should get the warning message "In lav_object_post_check(object) : lavaan WARNING: some estimated lv variances are negative"
#This signals an improper (Heywood) solution: the negative latent variances and out-of-range standardized covariances mean the quadratic model's estimates should not be interpreted
# Compare linear model fit between the model not taking into account the uneven intervals and the one taking into account the uneven intervals
anova(fit_linear_growth_model_uneven,fit_linear_growth_model)
##
## Chi-Squared Difference Test
##
## Df AIC BIC Chisq Chisq diff RMSEA
## fit_linear_growth_model_uneven 5 5032.5 5062.4 54.916
## fit_linear_growth_model 5 5078.4 5108.3 100.756 45.84 0
## Df diff Pr(>Chisq)
## fit_linear_growth_model_uneven
## fit_linear_growth_model 0
#You cannot compare these models with a chi-squared difference test: they are not nested and have the same degrees of freedom (Df diff = 0)
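Because both linear models are fitted to the same data and variables, you can still compare them descriptively with information criteria. A sketch, assuming both fitted objects from the earlier assignments are in your workspace:

```r
# lower AIC/BIC is better; both models use the same data,
# so the information criteria are directly comparable
AIC(fit_linear_growth_model_uneven, fit_linear_growth_model)
BIC(fit_linear_growth_model_uneven, fit_linear_growth_model)
```

Going by the AIC values reported above (5032.5 vs. 5078.4), the model that respects the uneven intervals is preferred.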
# Compare model fit between linear trajectory with uneven intervals and the basis model
anova(fit_linear_growth_model_uneven,fit_basis_growth_model)
##
## Chi-Squared Difference Test
##
## Df AIC BIC Chisq Chisq diff RMSEA
## fit_basis_growth_model 3 4987.5 5024.0 5.8934
## fit_linear_growth_model_uneven 5 5032.5 5062.4 54.9165 49.023 0.33949
## Df diff Pr(>Chisq)
## fit_basis_growth_model
## fit_linear_growth_model_uneven 2 2.263e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#Here it looks like the basis model is preferred.
As a general recommendation: when time intervals are unequal, it is better to use the measured time points in your slope loadings.
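Note that with raw ages (6, 7, 9, 11) as slope loadings, the intercept is the model-implied score at age 0, which is why its estimated mean above is negative. A common alternative, sketched here and not part of the original assignment, keeps the uneven spacing but centers time at the first wave, so the intercept becomes the expected score at age 6:

```r
# ages 6, 7, 9, 11 recoded as 0, 1, 3, 5 (years since the first measurement)
linear_growth_model_uneven_c <- '
i =~ 1*Verbal_T6 + 1*Verbal_T7 + 1*Verbal_T9 + 1*Verbal_T11
s =~ 0*Verbal_T6 + 1*Verbal_T7 + 3*Verbal_T9 + 5*Verbal_T11'
fit_linear_growth_model_uneven_c <- growth(linear_growth_model_uneven_c,
                                           data = wisc_verbal, missing = 'fiml')
```

Overall model fit is identical to the age-scaled version; only the location, and hence the mean and variance, of the intercept changes.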
Intercept and slope relations
Now that you know how to estimate the trajectory of one variable, you can estimate the trajectories of two variables simultaneously and see how they relate.
Assignment 27: Create two non-linear (basis) growth models, one for verbal and one for processing speed. Correlate the changes of the two metrics. Are their slopes correlated?
# Specify model
basis_growth_model_cor_ver_pro <- '
i_verbal =~ 1*Verbal_T6 + 1*Verbal_T7 + 1*Verbal_T9 + 1*Verbal_T11
s_verbal =~ 0*Verbal_T6 + Verbal_T7 + Verbal_T9 + 1*Verbal_T11
i_processpeed =~ 1*Pspeed_T6 + 1*Pspeed_T7 + 1*Pspeed_T9 + 1*Pspeed_T11
s_processpeed =~ 0*Pspeed_T6 + Pspeed_T7 + Pspeed_T9 + 1*Pspeed_T11
s_verbal ~~ s_processpeed'
# Fit LGM
fit_basis_growth_model_cor_ver_pro <- growth(basis_growth_model_cor_ver_pro, data=wisc,missing='fiml')
# Output results
summary(fit_basis_growth_model_cor_ver_pro, fit.measures = TRUE, rsquare = TRUE, standardized = TRUE)
## lavaan 0.6.17 ended normally after 211 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 26
##
## Number of observations 204
## Number of missing patterns 1
##
## Model Test User Model:
##
## Test statistic 29.305
## Degrees of freedom 18
## P-value (Chi-square) 0.045
##
## Model Test Baseline Model:
##
## Test statistic 1423.083
## Degrees of freedom 28
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.992
## Tucker-Lewis Index (TLI) 0.987
##
## Robust Comparative Fit Index (CFI) 0.992
## Robust Tucker-Lewis Index (TLI) 0.987
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -5124.285
## Loglikelihood unrestricted model (H1) -5109.632
##
## Akaike (AIC) 10300.570
## Bayesian (BIC) 10386.841
## Sample-size adjusted Bayesian (SABIC) 10304.465
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.055
## 90 Percent confidence interval - lower 0.009
## 90 Percent confidence interval - upper 0.091
## P-value H_0: RMSEA <= 0.050 0.367
## P-value H_0: RMSEA >= 0.080 0.137
##
## Robust RMSEA 0.055
## 90 Percent confidence interval - lower 0.009
## 90 Percent confidence interval - upper 0.091
## P-value H_0: Robust RMSEA <= 0.050 0.367
## P-value H_0: Robust RMSEA >= 0.080 0.137
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.048
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Observed
## Observed information based on Hessian
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## i_verbal =~
## Verbal_T6 1.000 4.637 0.832
## Verbal_T7 1.000 4.637 0.734
## Verbal_T9 1.000 4.637 0.617
## Verbal_T11 1.000 4.637 0.451
## s_verbal =~
## Verbal_T6 0.000 0.000 0.000
## Verbal_T7 0.237 0.012 20.495 0.000 1.349 0.213
## Verbal_T9 0.533 0.012 43.283 0.000 3.037 0.404
## Verbal_T11 1.000 5.694 0.554
## i_processpeed =~
## Pspeed_T6 1.000 7.604 0.902
## Pspeed_T7 1.000 7.604 0.799
## Pspeed_T9 1.000 7.604 0.713
## Pspeed_T11 1.000 7.604 0.617
## s_processpeed =~
## Pspeed_T6 0.000 0.000 0.000
## Pspeed_T7 0.298 0.011 26.220 0.000 1.841 0.194
## Pspeed_T9 0.648 0.012 53.375 0.000 4.005 0.376
## Pspeed_T11 1.000 6.183 0.502
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## s_verbal ~~
## s_processpeed 16.813 4.806 3.499 0.000 0.478 0.478
## i_verbal ~~
## s_verbal 14.606 3.112 4.694 0.000 0.553 0.553
## i_processpeed 26.537 3.572 7.430 0.000 0.753 0.753
## s_processpeed 2.520 3.154 0.799 0.424 0.088 0.088
## s_verbal ~~
## i_processpeed 25.796 4.875 5.291 0.000 0.596 0.596
## i_processpeed ~~
## s_processpeed 14.974 5.451 2.747 0.006 0.319 0.319
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## i_verbal 19.642 0.390 50.360 0.000 4.236 4.236
## s_verbal 24.200 0.561 43.138 0.000 4.250 4.250
## i_processpeed 17.949 0.590 30.419 0.000 2.360 2.360
## s_processpeed 32.986 0.615 53.609 0.000 5.335 5.335
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .Verbal_T6 9.574 1.581 6.055 0.000 9.574 0.308
## .Verbal_T7 9.698 1.269 7.643 0.000 9.698 0.243
## .Verbal_T9 10.149 1.491 6.806 0.000 10.149 0.180
## .Verbal_T11 22.419 4.039 5.551 0.000 22.419 0.212
## .Pspeed_T6 13.286 2.911 4.565 0.000 13.286 0.187
## .Pspeed_T7 20.338 2.534 8.026 0.000 20.338 0.225
## .Pspeed_T9 20.430 2.945 6.937 0.000 20.430 0.180
## .Pspeed_T11 25.840 5.200 4.969 0.000 25.840 0.170
## i_verbal 21.502 2.893 7.432 0.000 1.000 1.000
## s_verbal 32.425 6.990 4.639 0.000 1.000 1.000
## i_processpeed 57.821 6.889 8.393 0.000 1.000 1.000
## s_processpeed 38.226 8.968 4.263 0.000 1.000 1.000
##
## R-Square:
## Estimate
## Verbal_T6 0.692
## Verbal_T7 0.757
## Verbal_T9 0.820
## Verbal_T11 0.788
## Pspeed_T6 0.813
## Pspeed_T7 0.775
## Pspeed_T9 0.820
## Pspeed_T11 0.830
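To answer the assignment question directly, you can pull just the slope-slope row out of the standardized solution instead of reading the full summary. A sketch, using the fitted object above:

```r
# correlation between verbal change and processing-speed change
std <- standardizedSolution(fit_basis_growth_model_cor_ver_pro)
std[std$lhs == "s_verbal" & std$op == "~~" & std$rhs == "s_processpeed", ]
```

The standardized covariance of 0.478 (p < .001) shows the slopes are positively correlated: children who gain more in processing speed also tend to gain more in verbal ability.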
Assignment 28: Within that model, explore how one metric’s baseline level predicts change in the other. How do they predict each other?
# Specify model
basis_growth_model_pred_ver_pro <- '
i_verbal =~ 1*Verbal_T6 + 1*Verbal_T7 + 1*Verbal_T9 + 1*Verbal_T11
s_verbal =~ 0*Verbal_T6 + Verbal_T7 + Verbal_T9 + 1*Verbal_T11
i_processpeed =~ 1*Pspeed_T6 + 1*Pspeed_T7 + 1*Pspeed_T9 + 1*Pspeed_T11
s_processpeed =~ 0*Pspeed_T6 + Pspeed_T7 + Pspeed_T9 + 1*Pspeed_T11
s_verbal ~ i_processpeed
s_processpeed ~ i_verbal'
# Fit LGM
fit_basis_growth_model_pred_ver_pro <- growth(basis_growth_model_pred_ver_pro, data=wisc,missing='fiml')
# Output results
summary(fit_basis_growth_model_pred_ver_pro, fit.measures = TRUE, rsquare = TRUE, standardized = TRUE)
## lavaan 0.6.17 ended normally after 175 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 24
##
## Number of observations 204
## Number of missing patterns 1
##
## Model Test User Model:
##
## Test statistic 38.158
## Degrees of freedom 20
## P-value (Chi-square) 0.008
##
## Model Test Baseline Model:
##
## Test statistic 1423.083
## Degrees of freedom 28
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.987
## Tucker-Lewis Index (TLI) 0.982
##
## Robust Comparative Fit Index (CFI) 0.987
## Robust Tucker-Lewis Index (TLI) 0.982
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -5128.711
## Loglikelihood unrestricted model (H1) -5109.632
##
## Akaike (AIC) 10305.422
## Bayesian (BIC) 10385.057
## Sample-size adjusted Bayesian (SABIC) 10309.018
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.067
## 90 Percent confidence interval - lower 0.033
## 90 Percent confidence interval - upper 0.099
## P-value H_0: RMSEA <= 0.050 0.181
## P-value H_0: RMSEA >= 0.080 0.268
##
## Robust RMSEA 0.067
## 90 Percent confidence interval - lower 0.033
## 90 Percent confidence interval - upper 0.099
## P-value H_0: Robust RMSEA <= 0.050 0.181
## P-value H_0: Robust RMSEA >= 0.080 0.268
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.055
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Observed
## Observed information based on Hessian
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## i_verbal =~
## Verbal_T6 1.000 4.809 0.856
## Verbal_T7 1.000 4.809 0.760
## Verbal_T9 1.000 4.809 0.647
## Verbal_T11 1.000 4.809 0.477
## s_verbal =~
## Verbal_T6 0.000 0.000 0.000
## Verbal_T7 0.238 0.011 21.065 0.000 1.421 0.225
## Verbal_T9 0.534 0.012 43.708 0.000 3.189 0.429
## Verbal_T11 1.000 5.977 0.593
## i_processpeed =~
## Pspeed_T6 1.000 7.861 0.929
## Pspeed_T7 1.000 7.861 0.831
## Pspeed_T9 1.000 7.861 0.756
## Pspeed_T11 1.000 7.861 0.662
## s_processpeed =~
## Pspeed_T6 0.000 0.000 0.000
## Pspeed_T7 0.299 0.011 26.953 0.000 2.078 0.220
## Pspeed_T9 0.648 0.012 53.898 0.000 4.495 0.432
## Pspeed_T11 1.000 6.940 0.585
##
## Regressions:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## s_verbal ~
## i_processpeed 0.408 0.071 5.782 0.000 0.537 0.537
## s_processpeed ~
## i_verbal 0.143 0.142 1.013 0.311 0.099 0.099
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## i_verbal ~~
## i_processpeed 26.674 3.649 7.310 0.000 0.706 0.706
## .s_verbal ~~
## .s_processpeed 10.954 5.052 2.168 0.030 0.315 0.315
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## i_verbal 19.630 0.393 49.926 0.000 4.082 4.082
## .s_verbal 16.877 1.368 12.333 0.000 2.823 2.823
## i_processpeed 17.944 0.592 30.305 0.000 2.283 2.283
## .s_processpeed 30.167 2.843 10.612 0.000 4.347 4.347
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .Verbal_T6 8.445 1.407 6.000 0.000 8.445 0.267
## .Verbal_T7 9.715 1.287 7.547 0.000 9.715 0.243
## .Verbal_T9 10.375 1.493 6.951 0.000 10.375 0.188
## .Verbal_T11 21.001 3.922 5.355 0.000 21.001 0.207
## .Pspeed_T6 9.777 2.479 3.943 0.000 9.777 0.137
## .Pspeed_T7 21.117 2.640 8.000 0.000 21.117 0.236
## .Pspeed_T9 21.159 3.017 7.014 0.000 21.159 0.196
## .Pspeed_T11 23.241 5.106 4.552 0.000 23.241 0.165
## i_verbal 23.129 2.847 8.125 0.000 1.000 1.000
## .s_verbal 25.426 5.629 4.517 0.000 0.712 0.712
## i_processpeed 61.801 6.889 8.971 0.000 1.000 1.000
## .s_processpeed 47.685 8.194 5.820 0.000 0.990 0.990
##
## R-Square:
## Estimate
## Verbal_T6 0.733
## Verbal_T7 0.757
## Verbal_T9 0.812
## Verbal_T11 0.793
## Pspeed_T6 0.863
## Pspeed_T7 0.764
## Pspeed_T9 0.804
## Pspeed_T11 0.835
## s_verbal 0.288
## s_processpeed 0.010
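Here too the two regression paths can be extracted directly rather than read out of the full summary. A sketch, using the fitted object above:

```r
# keep only the structural regressions (op == "~")
pe <- parameterEstimates(fit_basis_growth_model_pred_ver_pro, standardized = TRUE)
pe[pe$op == "~", c("lhs", "op", "rhs", "est", "se", "pvalue", "std.all")]
```

Baseline processing speed significantly predicts verbal change (std. est. 0.537, p < .001), but baseline verbal ability does not significantly predict processing-speed change (std. est. 0.099, p = .311).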
Let’s finish with simulating data. Click here to go back to the top to move on to the next module.
LEAP data [extra]
Here we will fit an LGM to synthetic LEAP data!
Assignment 29: Load LEAP data & try to make a plot
HINT: You might need the melt function from the package reshape2!
pacman::p_load(reshape2, ggrain)
leap_synthdata <- read.csv(paste0(path,"leap_synthdata.csv"))
# I thought this was a subject ID... it clearly is not
leap_synthdata[leap_synthdata$subjects == 949903300427,]
## subjects t1_diagnosis t1_timepoint t1_sex t1_css_total t1_sa_css
## 61 949903300427 Control T1 Male NA NA
## 90 949903300427 Control T1 Male NA NA
## 124 949903300427 ASD T1 Male 6 6
## 471 949903300427 ASD T1 Male 2 1
## 582 949903300427 Control T1 Female NA NA
## 597 949903300427 Control T1 Male NA NA
## t1_rrb_css t1_srs_rawscore_self t1_rbs_total t1_ssp_total t1_handedness
## 61 NA NA NA NA <NA>
## 90 NA NA NA NA <NA>
## 124 8 77 NA 100 <NA>
## 471 6 NA 14 139 Right(3)
## 582 NA NA 1 NA Right(3)
## 597 NA NA NA NA <NA>
## t2_diagnosis t2_timepoint t2_sex t2_css_total t2_sa_css t2_rrb_css
## 61 Control T2 Male NA NA NA
## 90 Control T2 Male NA NA NA
## 124 ASD T2 Male NA NA NA
## 471 ASD T2 Male 7 7 7
## 582 Control T2 Female NA NA NA
## 597 Control T2 Male NA NA NA
## t2_srs_rawscore_self t2_rbs_total t2_ssp_total t2_handedness t3_diagnosis
## 61 NA NA NA <NA> <NA>
## 90 NA NA NA <NA> <NA>
## 124 NA NA NA <NA> ASD
## 471 NA 6 171 Right(3) ASD
## 582 NA 0 NA Right(3) Control
## 597 NA NA NA <NA> <NA>
## t3_timepoint t3_sex t3_css_total t3_sa_css t3_rrb_css t3_srs_rawscore_self
## 61 <NA> <NA> NA NA NA NA
## 90 <NA> <NA> NA NA NA NA
## 124 T3 Male 5 7 1 44
## 471 T3 Male 1 1 1 37
## 582 T3 Female NA NA NA NA
## 597 <NA> <NA> NA NA NA NA
## t3_rbs_total t3_ssp_total t3_handedness t1_ageyrs t2_ageyrs t3_ageyrs
## 61 NA NA <NA> 21.541945 23.00983 NA
## 90 NA NA <NA> 10.532062 NA NA
## 124 2 159 <NA> 19.633160 NA NA
## 471 4 188 Right(3) 24.040020 NA NA
## 582 NA NA Right(3) 11.974636 12.89433 NA
## 597 NA NA <NA> 9.216495 NA NA
## t1_fsiq t2_fsiq t3_fsiq t1_site t2_site t3_site
## 61 61.92220 62.17697 NA KCL KCL <NA>
## 90 75.81032 76.09765 NA KCL KCL <NA>
## 124 46.10777 45.85930 NA KCL KCL <NA>
## 471 117.49570 117.18867 NA Rome <NA> <NA>
## 582 122.18080 122.29568 NA Mannheim Mannheim <NA>
## 597 102.14100 101.90749 NA KCL KCL <NA>
# now we make the data frame long with melt on ssp
leap_synthdata_long <- melt(leap_synthdata, id.vars = "subjects", measure.vars = colnames(leap_synthdata)[str_detect(colnames(leap_synthdata), "ssp")])
# plotting the data
ggplot(leap_synthdata_long, aes(variable, value, group = subjects, color = subjects)) +
geom_point() +
geom_line() +
theme_classic(base_size = 15) +
theme(legend.position = "none") +
labs(x = "Wave", y = "Score on ssp")
Assignment 30: Make a latent growth curve model with ssp
fit_lgm_ssp<- '
i_ssp =~ 1*t1_ssp_total + 1*t2_ssp_total + 1*t3_ssp_total
s_ssp =~ 0*t1_ssp_total + 1*t2_ssp_total + 2*t3_ssp_total
t1_ssp_total~~ervar1*t1_ssp_total
t2_ssp_total~~ervar2*t2_ssp_total
t3_ssp_total~~ervar3*t3_ssp_total
'
fit_fit_lgm_ssp <- growth(fit_lgm_ssp, data=leap_synthdata,missing='fiml',estimator='mlr')
# Output results
summary(fit_fit_lgm_ssp, fit.measures=TRUE, standardized=TRUE, rsquare=TRUE,ci=T)
## lavaan 0.6.17 ended normally after 164 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 8
##
## Used Total
## Number of observations 458 768
## Number of missing patterns 7
##
## Model Test User Model:
## Standard Scaled
## Test Statistic 0.647 0.665
## Degrees of freedom 1 1
## P-value (Chi-square) 0.421 0.415
## Scaling correction factor 0.973
## Yuan-Bentler correction (Mplus variant)
##
## Model Test Baseline Model:
##
## Test statistic 201.505 194.337
## Degrees of freedom 3 3
## P-value 0.000 0.000
## Scaling correction factor 1.037
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 1.000 1.000
## Tucker-Lewis Index (TLI) 1.005 1.005
##
## Robust Comparative Fit Index (CFI) 1.000
## Robust Tucker-Lewis Index (TLI) 1.005
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -3454.537 -3454.537
## Scaling correction factor 1.107
## for the MLR correction
## Loglikelihood unrestricted model (H1) -3454.213 -3454.213
## Scaling correction factor 1.092
## for the MLR correction
##
## Akaike (AIC) 6925.073 6925.073
## Bayesian (BIC) 6958.088 6958.088
## Sample-size adjusted Bayesian (SABIC) 6932.699 6932.699
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.000 0.000
## 90 Percent confidence interval - lower 0.000 0.000
## 90 Percent confidence interval - upper 0.114 0.000
## P-value H_0: RMSEA <= 0.050 0.635 0.624
## P-value H_0: RMSEA >= 0.080 0.176 0.185
##
## Robust RMSEA 0.000
## 90 Percent confidence interval - lower 0.000
## 90 Percent confidence interval - upper 0.206
## P-value H_0: Robust RMSEA <= 0.050 0.493
## P-value H_0: Robust RMSEA >= 0.080 0.406
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.012 0.012
##
## Parameter Estimates:
##
## Standard errors Sandwich
## Information bread Observed
## Observed information based on Hessian
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) ci.lower ci.upper
## i_ssp =~
## t1_ssp_total 1.000 1.000 1.000
## t2_ssp_total 1.000 1.000 1.000
## t3_ssp_total 1.000 1.000 1.000
## s_ssp =~
## t1_ssp_total 0.000 0.000 0.000
## t2_ssp_total 1.000 1.000 1.000
## t3_ssp_total 2.000 2.000 2.000
## Std.lv Std.all
##
## 28.248 0.945
## 28.248 1.018
## 28.248 1.220
##
## 0.000 0.000
## 10.564 0.381
## 21.128 0.913
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) ci.lower ci.upper
## i_ssp ~~
## s_ssp -188.578 62.315 -3.026 0.002 -310.714 -66.442
## Std.lv Std.all
##
## -0.632 -0.632
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|) ci.lower ci.upper
## i_ssp 152.407 1.479 103.070 0.000 149.509 155.305
## s_ssp 4.988 0.898 5.557 0.000 3.229 6.748
## Std.lv Std.all
## 5.395 5.395
## 0.472 0.472
##
## Variances:
## Estimate Std.Err z-value P(>|z|) ci.lower ci.upper
## .t1_ssp_ (erv1) 94.982 88.258 1.076 0.282 -77.999 267.964
## .t2_ssp_ (erv2) 237.819 38.717 6.142 0.000 161.935 313.703
## .t3_ssp_ (erv3) 45.856 64.824 0.707 0.479 -81.197 172.908
## i_ssp 797.961 99.885 7.989 0.000 602.191 993.731
## s_ssp 111.596 41.507 2.689 0.007 30.243 192.949
## Std.lv Std.all
## 94.982 0.106
## 237.819 0.309
## 45.856 0.086
## 1.000 1.000
## 1.000 1.000
##
## R-Square:
## Estimate
## t1_ssp_total 0.894
## t2_ssp_total 0.691
## t3_ssp_total 0.914
Assignment 31: Choose an exogenous variable and see how it impacts the change in ssp!
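No worked solution is shown for this assignment; here is one possible sketch, using baseline full-scale IQ (`t1_fsiq`, which is present in the synthetic data) as the example exogenous variable. Any time-invariant covariate could be swapped in.

```r
# regress both the intercept and the slope of ssp on baseline IQ
lgm_ssp_fsiq <- '
i_ssp =~ 1*t1_ssp_total + 1*t2_ssp_total + 1*t3_ssp_total
s_ssp =~ 0*t1_ssp_total + 1*t2_ssp_total + 2*t3_ssp_total
i_ssp ~ t1_fsiq
s_ssp ~ t1_fsiq
'
fit_lgm_ssp_fsiq <- growth(lgm_ssp_fsiq, data = leap_synthdata,
                           missing = 'fiml', estimator = 'mlr')
summary(fit_lgm_ssp_fsiq, fit.measures = TRUE, standardized = TRUE)
```

A significant `s_ssp ~ t1_fsiq` path would indicate that baseline IQ predicts the rate of change in ssp.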
Simulate your data [extra]
Data simulation is a useful tool for testing your models and doing power analyses.
Assignment 32: Simulate a linear growth model with 4 waves. The intercept should have a mean of 100 and a variance of 15, and the slope a mean of 5 and a variance of 10. The intercept-slope covariance should be -3.
# Specify linear model
sim_lgm_4waves<- '
i =~ 1*t1 + 1*t2 + 1*t3 + 1*t4
s =~ 0*t1 + 1*t2 + 2*t3 + 3*t4
i~100*1 ##<## this specifies the intercept, or the mean at T1
i~~15*i ##<## this specifies the intercept variance at T1
s~5*1 ##<## this specifies the slope, or change per wave
s~~10*s ##<## this specifies the slope variance
i~~-3*s ##<## this specifies the intercept/slope covariance
'
sim_lgm_4waves_dat <- simulateData(sim_lgm_4waves, sample.nobs = 500)
#Sanity check of the column means. As expected the scores increase ~ 5 points per wave
colMeans(sim_lgm_4waves_dat)
## t1 t2 t3 t4
## 99.72662 104.54699 109.46029 114.34819
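The sample means above match the population wave means implied by the simulation parameters, intercept mean + slope mean × wave:

```r
# expected wave means for an intercept mean of 100 and a slope mean of 5
100 + 5 * 0:3
## [1] 100 105 110 115
```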
#Plotting. Now you can start tweaking parameters to see what the data looks like
sim_lgm_4waves_dat$id<-as.factor(1:nrow(sim_lgm_4waves_dat))
theme_set(theme_grey(base_size = 18)) #increase text size
plotdat<-melt(sim_lgm_4waves_dat)
ggplot(plotdat, aes(variable,value,group=id, fill=id, color=id)) +
geom_point() +
geom_line() +
theme_classic(base_size = 15) + # adding a classic theme; https://ggplot2.tidyverse.org/reference/ggtheme.html
theme(legend.position = "none") + # getting rid of legend
labs(x = "Wave", y = "Score")
# Specify linear model
fit_lgm_4waves<- '
i =~ 1*t1 + 1*t2 + 1*t3 + 1*t4
s =~ 0*t1 + 1*t2 + 2*t3 + 3*t4
i~~s
'
# Fit LGM
fit_fit_lgm_4waves <- growth(fit_lgm_4waves, data=sim_lgm_4waves_dat,missing='fiml')
# Output results
summary(fit_fit_lgm_4waves, fit.measures=TRUE, standardized=TRUE, rsquare=TRUE,ci=T)
## lavaan 0.6.17 ended normally after 153 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 9
##
## Number of observations 500
## Number of missing patterns 1
##
## Model Test User Model:
##
## Test statistic 1.620
## Degrees of freedom 5
## P-value (Chi-square) 0.899
##
## Model Test Baseline Model:
##
## Test statistic 3109.900
## Degrees of freedom 6
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 1.000
## Tucker-Lewis Index (TLI) 1.001
##
## Robust Comparative Fit Index (CFI) 1.000
## Robust Tucker-Lewis Index (TLI) 1.001
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -4814.731
## Loglikelihood unrestricted model (H1) -4813.921
##
## Akaike (AIC) 9647.463
## Bayesian (BIC) 9685.394
## Sample-size adjusted Bayesian (SABIC) 9656.828
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.000
## 90 Percent confidence interval - lower 0.000
## 90 Percent confidence interval - upper 0.027
## P-value H_0: RMSEA <= 0.050 0.992
## P-value H_0: RMSEA >= 0.080 0.000
##
## Robust RMSEA 0.000
## 90 Percent confidence interval - lower 0.000
## 90 Percent confidence interval - upper 0.027
## P-value H_0: Robust RMSEA <= 0.050 0.992
## P-value H_0: Robust RMSEA >= 0.080 0.000
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.004
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Observed
## Observed information based on Hessian
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) ci.lower ci.upper
## i =~
## t1 1.000 1.000 1.000
## t2 1.000 1.000 1.000
## t3 1.000 1.000 1.000
## t4 1.000 1.000 1.000
## s =~
## t1 0.000 0.000 0.000
## t2 1.000 1.000 1.000
## t3 2.000 2.000 2.000
## t4 3.000 3.000 3.000
## Std.lv Std.all
##
## 3.913 0.977
## 3.913 0.867
## 3.913 0.579
## 3.913 0.409
##
## 0.000 0.000
## 3.247 0.719
## 6.494 0.961
## 9.742 1.017
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) ci.lower ci.upper
## i ~~
## s -3.216 0.604 -5.323 0.000 -4.400 -2.032
## Std.lv Std.all
##
## -0.253 -0.253
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|) ci.lower ci.upper
## i 99.708 0.178 559.488 0.000 99.358 100.057
## s 4.877 0.146 33.329 0.000 4.590 5.163
## Std.lv Std.all
## 25.481 25.481
## 1.502 1.502
##
## Variances:
## Estimate Std.Err z-value P(>|z|) ci.lower ci.upper
## .t1 0.722 0.279 2.587 0.010 0.175 1.270
## .t2 0.966 0.119 8.121 0.000 0.733 1.200
## .t3 1.049 0.144 7.259 0.000 0.765 1.332
## .t4 0.776 0.344 2.259 0.024 0.103 1.449
## i 15.312 1.014 15.098 0.000 13.324 17.300
## s 10.545 0.677 15.570 0.000 9.217 11.872
## Std.lv Std.all
## 0.722 0.045
## 0.966 0.047
## 1.049 0.023
## 0.776 0.008
## 1.000 1.000
## 1.000 1.000
##
## R-Square:
## Estimate
## t1 0.955
## t2 0.953
## t3 0.977
## t4 0.992
parTable(fit_fit_lgm_4waves) #Parameters are recovered nicely
## id lhs op rhs user block group free ustart exo label plabel start est
## 1 1 i =~ t1 1 1 1 0 1 0 .p1. 1.000 1.000
## 2 2 i =~ t2 1 1 1 0 1 0 .p2. 1.000 1.000
## 3 3 i =~ t3 1 1 1 0 1 0 .p3. 1.000 1.000
## 4 4 i =~ t4 1 1 1 0 1 0 .p4. 1.000 1.000
## 5 5 s =~ t1 1 1 1 0 0 0 .p5. 0.000 0.000
## 6 6 s =~ t2 1 1 1 0 1 0 .p6. 1.000 1.000
## 7 7 s =~ t3 1 1 1 0 2 0 .p7. 2.000 2.000
## 8 8 s =~ t4 1 1 1 0 3 0 .p8. 3.000 3.000
## 9 9 i ~~ s 1 1 1 1 NA 0 .p9. 0.000 -3.216
## 10 10 t1 ~~ t1 0 1 1 2 NA 0 .p10. 8.028 0.722
## 11 11 t2 ~~ t2 0 1 1 3 NA 0 .p11. 10.091 0.966
## 12 12 t3 ~~ t3 0 1 1 4 NA 0 .p12. 22.894 1.049
## 13 13 t4 ~~ t4 0 1 1 5 NA 0 .p13. 45.877 0.776
## 14 14 i ~~ i 0 1 1 6 NA 0 .p14. 0.050 15.312
## 15 15 s ~~ s 0 1 1 7 NA 0 .p15. 0.050 10.545
## 16 16 t1 ~1 0 1 1 0 0 0 .p16. 0.000 0.000
## 17 17 t2 ~1 0 1 1 0 0 0 .p17. 0.000 0.000
## 18 18 t3 ~1 0 1 1 0 0 0 .p18. 0.000 0.000
## 19 19 t4 ~1 0 1 1 0 0 0 .p19. 0.000 0.000
## 20 20 i ~1 0 1 1 8 NA 0 .p20. 0.000 99.708
## 21 21 s ~1 0 1 1 9 NA 0 .p21. 0.000 4.877
## se
## 1 0.000
## 2 0.000
## 3 0.000
## 4 0.000
## 5 0.000
## 6 0.000
## 7 0.000
## 8 0.000
## 9 0.604
## 10 0.279
## 11 0.119
## 12 0.144
## 13 0.344
## 14 1.014
## 15 0.677
## 16 0.000
## 17 0.000
## 18 0.000
## 19 0.000
## 20 0.178
## 21 0.146
You might want to simulate different means for the intercepts and slopes of different groups (e.g., female and male). You can simulate those groups at the same time by supplying a vector of group-specific values with c().
Assignment 33: Simulate a linear growth model with three groups.
# Specify linear model
sim_lgm_4waves_MG<- '
i =~ 1*t1 + 1*t2 + 1*t3 + 1*t4
s =~ 0*t1 + 1*t2 + 2*t3 + 3*t4
i~c(100,90,80)*1 ##<## this specifies the intercept, or the mean at T1
i~~c(15,15,2)*i ##<## this specifies the intercept variance at T1
s~c(4,0,10)*1 ##<## this specifies the slope, or change per wave
s~~c(10,10,5)*s ##<## this specifies the slope variance
i~~c(3,3,3)*s ##<## this specifies the intercept/slope covariance
'
sim_sim_lgm_4waves_MG <- simulateData(sim_lgm_4waves_MG, sample.nobs = c(100,100,20))
#Sanity check of the column means.
colMeans(sim_sim_lgm_4waves_MG)
## t1 t2 t3 t4 group
## 94.083697 97.181676 100.267913 103.514015 1.636364
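As a sanity check on the multi-group simulation, the grand mean of t1 should be close to the sample-size-weighted average of the three simulated group intercepts (100, 90, 80 with n = 100, 100, 20):

```r
# weighted average of the simulated group intercepts
weighted.mean(c(100, 90, 80), w = c(100, 100, 20))
## [1] 93.63636
```

The observed t1 mean of 94.08 is within sampling error of this value.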
# Specify linear model
fit_lgm_4waves_MG<- '
i =~ 1*t1 + 1*t2 + 1*t3 + 1*t4
s =~ 0*t1 + 1*t2 + 2*t3 + 3*t4
'
# Fit LGM
fit_sim_sim_lgm_4waves_MG <- growth(fit_lgm_4waves_MG, data=sim_sim_lgm_4waves_MG,group='group')
# Output results
summary(fit_sim_sim_lgm_4waves_MG, fit.measures=TRUE, standardized=TRUE, rsquare=TRUE,ci=T)
## lavaan 0.6.17 ended normally after 388 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 27
##
## Number of observations per group:
## 1 100
## 2 100
## 3 20
##
## Model Test User Model:
##
## Test statistic 10.907
## Degrees of freedom 15
## P-value (Chi-square) 0.759
## Test statistic for each group:
## 1 5.397
## 2 3.880
## 3 1.629
##
## Model Test Baseline Model:
##
## Test statistic 1680.125
## Degrees of freedom 18
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 1.000
## Tucker-Lewis Index (TLI) 1.003
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -2038.603
## Loglikelihood unrestricted model (H1) -2033.150
##
## Akaike (AIC) 4131.206
## Bayesian (BIC) 4222.834
## Sample-size adjusted Bayesian (SABIC) 4137.271
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.000
## 90 Percent confidence interval - lower 0.000
## 90 Percent confidence interval - upper 0.078
## P-value H_0: RMSEA <= 0.050 0.869
## P-value H_0: RMSEA >= 0.080 0.047
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.017
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
##
## Group 1 [1]:
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) ci.lower ci.upper
## i =~
## t1 1.000 1.000 1.000
## t2 1.000 1.000 1.000
## t3 1.000 1.000 1.000
## t4 1.000 1.000 1.000
## s =~
## t1 0.000 0.000 0.000
## t2 1.000 1.000 1.000
## t3 2.000 2.000 2.000
## t4 3.000 3.000 3.000
## Std.lv Std.all
##
## 3.590 0.950
## 3.590 0.637
## 3.590 0.423
## 3.590 0.311
##
## 0.000 0.000
## 3.308 0.587
## 6.616 0.780
## 9.924 0.861
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) ci.lower ci.upper
## i ~~
## s 3.530 1.275 2.770 0.006 1.032 6.028
## Std.lv Std.all
##
## 0.297 0.297
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|) ci.lower ci.upper
## i 100.991 0.370 273.198 0.000 100.267 101.716
## s 4.775 0.333 14.346 0.000 4.123 5.427
## Std.lv Std.all
## 28.135 28.135
## 1.443 1.443
##
## Variances:
## Estimate Std.Err z-value P(>|z|) ci.lower ci.upper
## .t1 1.385 0.534 2.594 0.009 0.338 2.432
## .t2 0.859 0.234 3.679 0.000 0.402 1.317
## .t3 1.214 0.301 4.041 0.000 0.625 1.803
## .t4 0.286 0.654 0.437 0.662 -0.996 1.569
## i 12.885 1.937 6.652 0.000 9.089 16.681
## s 10.943 1.570 6.971 0.000 7.867 14.020
## Std.lv Std.all
## 1.385 0.097
## 0.859 0.027
## 1.214 0.017
## 0.286 0.002
## 1.000 1.000
## 1.000 1.000
##
## R-Square:
## Estimate
## t1 0.903
## t2 0.973
## t3 0.983
## t4 0.998
##
##
## Group 2 [2]:
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) ci.lower ci.upper
## i =~
## t1 1.000 1.000 1.000
## t2 1.000 1.000 1.000
## t3 1.000 1.000 1.000
## t4 1.000 1.000 1.000
## s =~
## t1 0.000 0.000 0.000
## t2 1.000 1.000 1.000
## t3 2.000 2.000 2.000
## t4 3.000 3.000 3.000
## Std.lv Std.all
##
## 3.752 0.977
## 3.752 0.664
## 3.752 0.452
## 3.752 0.335
##
## 0.000 0.000
## 3.142 0.556
## 6.285 0.757
## 9.427 0.842
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) ci.lower ci.upper
## i ~~
## s 3.632 1.257 2.889 0.004 1.168 6.096
## Std.lv Std.all
##
## 0.308 0.308
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|) ci.lower ci.upper
## i 89.934 0.381 235.767 0.000 89.186 90.682
## s 0.266 0.316 0.840 0.401 -0.354 0.886
## Std.lv Std.all
## 23.968 23.968
## 0.085 0.085
##
## Variances:
## Estimate Std.Err z-value P(>|z|) ci.lower ci.upper
## .t1 0.658 0.425 1.547 0.122 -0.176 1.492
## .t2 0.698 0.186 3.749 0.000 0.333 1.063
## .t3 0.802 0.240 3.343 0.001 0.332 1.273
## .t4 0.729 0.560 1.301 0.193 -0.369 1.826
## i 14.080 2.067 6.813 0.000 10.029 18.131
## s 9.874 1.417 6.969 0.000 7.097 12.651
## Std.lv Std.all
## 0.658 0.045
## 0.698 0.022
## 0.802 0.012
## 0.729 0.006
## 1.000 1.000
## 1.000 1.000
##
## R-Square:
## Estimate
## t1 0.955
## t2 0.978
## t3 0.988
## t4 0.994
##
##
## Group 3 [3]:
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) ci.lower ci.upper
## i =~
## t1 1.000 1.000 1.000
## t2 1.000 1.000 1.000
## t3 1.000 1.000 1.000
## t4 1.000 1.000 1.000
## s =~
## t1 0.000 0.000 0.000
## t2 1.000 1.000 1.000
## t3 2.000 2.000 2.000
## t4 3.000 3.000 3.000
## Std.lv Std.all
##
## 1.380 0.886
## 1.380 0.457
## 1.380 0.294
## 1.380 0.217
##
## 0.000 0.000
## 1.702 0.563
## 3.404 0.724
## 5.106 0.804
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) ci.lower ci.upper
## i ~~
## s 1.952 0.712 2.743 0.006 0.557 3.347
## Std.lv Std.all
##
## 0.831 0.831
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|) ci.lower ci.upper
## i 79.827 0.336 237.726 0.000 79.169 80.485
## s 9.432 0.388 24.316 0.000 8.671 10.192
## Std.lv Std.all
## 57.861 57.861
## 5.542 5.542
##
## Variances:
## Estimate Std.Err z-value P(>|z|) ci.lower ci.upper
## .t1 0.520 0.294 1.768 0.077 -0.057 1.098
## .t2 0.429 0.189 2.267 0.023 0.058 0.799
## .t3 0.795 0.376 2.114 0.035 0.058 1.532
## .t4 0.597 0.646 0.923 0.356 -0.670 1.864
## i 1.903 0.727 2.618 0.009 0.478 3.328
## s 2.896 0.954 3.036 0.002 1.027 4.766
## Std.lv Std.all
## 0.520 0.215
## 0.429 0.047
## 0.795 0.036
## 0.597 0.015
## 1.000 1.000
## 1.000 1.000
##
## R-Square:
## Estimate
## t1 0.785
## t2 0.953
## t3 0.964
## t4 0.985
#Plotting: now you can start tweaking parameters to see what the data look like
sim_sim_lgm_4waves_MG$id<-as.factor(1:nrow(sim_sim_lgm_4waves_MG))
sim_sim_lgm_4waves_MG$group<-as.factor(sim_sim_lgm_4waves_MG$group)
theme_set(theme_grey(base_size = 18)) #increase text size
plotdat<-melt(sim_sim_lgm_4waves_MG,id=c("id","group"))
ggplot(plotdat, aes(variable,value,group=id)) +
geom_point() +
geom_line() +
theme_classic(base_size = 15) + # adding a classic theme; https://ggplot2.tidyverse.org/reference/ggtheme.html
theme(legend.position = "none") + # getting rid of legend
labs(x = "Wave", y = "Score") +
facet_wrap(~group)
Simulating data can also be used for power analysis: you can test how much power your sample size and model give you to detect an effect of a given size.
Assignment 34: Use the previous simulation model to test the power to detect an effect of a predictor on the slope.
#General specifications
nsims<-100 #Define the number of iterations
samplesize<- 400 #Define the sample size
modelcomp<-data.frame(matrix(NA,nsims,6)) #create empty object to store the model comparison results (rows for further effect sizes are added as the loop runs)
colnames(modelcomp)<-c('effect size','LRT delta chi square','LRT delta df','LRT p-value','delta AIC','AIC logical')
# Create a vector of the effect sizes to test
effectsize <- c(0,0.05,0.1,0.15,0.2,0.25,0.3)
power_effectsize<-data.frame(matrix(NA,length(effectsize),4))
colnames(power_effectsize)<-c('effect_size','powerLRT','powerAIC','powerAICthres')
# Display the resulting data frame
print(effectsize)
## [1] 0.00 0.05 0.10 0.15 0.20 0.25 0.30
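All nsims &times; length(effectsize) runs are stored in one long results table; the counter a shifts the row index so each effect size fills its own block of rows. A self-contained sketch of this bookkeeping with tiny made-up numbers:

```r
# Row-offset bookkeeping: results for effect size j occupy rows (a+1):(a+nsims)
nsims <- 3
effectsize <- c(0, 0.1)
res <- data.frame(matrix(NA, nsims * length(effectsize), 2))
a <- 0
for (j in seq_along(effectsize)) {
  for (i in 1:nsims) {
    res[i + a, 1] <- effectsize[j]  # which effect size this run used
    res[i + a, 2] <- i              # iteration number within this effect size
  }
  a <- a + nsims                    # shift the offset for the next effect size
}
res  # rows 1-3 belong to effect size 0, rows 4-6 to effect size 0.1
```

The loop below uses exactly this pattern, with nsims = 100 iterations per effect size.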
a=0
for (j in 1:length(effectsize)) {
for (i in 1:nsims) #initiate loop
{
model_test<-NA #change model output back to NA
model_test_constraint<-NA #change model output back to NA
#Specify the true model:
#a linear growth model with a predictor of the slope
model_sim_lgm_4waves<-
paste('i =~ 1*t1 + 1*t2 + 1*t3 + 1*t4
s =~ 0*t1 + 1*t2 + 2*t3 + 3*t4
i~100*1 ##<## this specifies the intercept, or the mean at T1
i~~15*i ##<## this specifies the intercept variance at T1
s~5*1 ##<## this specifies the slope, or change per wave
s~~10*s ##<## this specifies the slope variance
i~~-3*s ##<## this specifies the intercept/slope covariance
pred~10*1 ##<## this specifies the mean of the predictor
pred~~5*pred ##<## this specifies the variance of the predictor
s~ ',effectsize[j],'*pred
')
#Simulate data
simdat_lgm_4waves<-simulateData(model_sim_lgm_4waves,sample.nobs = samplesize,meanstructure = T,empirical=TRUE)
#sanity check: do the column means match the values you specified?
colMeans(simdat_lgm_4waves)
model_test <- '
i =~ 1*t1 + 1*t2 + 1*t3 + 1*t4
s =~ 0*t1 + 1*t2 + 2*t3 + 3*t4
i~~s
s~ a*pred
'
fit_model_test <- growth(model_test, data=simdat_lgm_4waves, estimator='mlr',missing='fiml')
model_test_constraint <- '
i =~ 1*t1 + 1*t2 + 1*t3 + 1*t4
s =~ 0*t1 + 1*t2 + 2*t3 + 3*t4
i~~s
s~ 0*pred
'
fit_model_test_constraint <- growth(model_test_constraint, data=simdat_lgm_4waves, estimator='mlr',missing='fiml')
#Likelihood ratio test
LRTcomp<-anova(fit_model_test,fit_model_test_constraint)
modelaAIC<-AIC(fit_model_test) #grab AIC for the true model fit
modelbAIC<-AIC(fit_model_test_constraint) #grab AIC for the competitor model fit
modelcomp[i+a,1]<-effectsize[j]
modelcomp[i+a,2]<-LRTcomp[2,5] #LRT delta chi square
modelcomp[i+a,3]<-LRTcomp[2,6] #LRT delta df
modelcomp[i+a,4]<-LRTcomp[2,7] #LRT p-value
modelcomp[i+a,5]<-modelbAIC-modelaAIC #delta AIC: positive means the true model is preferred over the constrained model
modelcomp[i+a,6]<-modelaAIC<modelbAIC #logical: does the true model have a lower AIC than the constrained model?
}
power_effectsize[j,1]<- effectsize[j]
powerLRT<-(sum(modelcomp[c((a+1):(a+nsims)),4]<0.05)/nsims)*100 #percentage of simulations with LRT p < .05
power_effectsize[j,2]<- powerLRT
powerAIC<-(sum(as.numeric(modelcomp[c((a+1):(a+nsims)),6]))/nsims)*100 #percentage of simulations preferring the true model by AIC
power_effectsize[j,3]<- powerAIC
AICthres<-10 #AIC difference threshold for a clear preference
powerAICthres<-sum(modelcomp[c((a+1):(a+nsims)),5]>AICthres)/nsims*100 #percentage of simulations exceeding the AIC threshold
power_effectsize[j,4]<- powerAICthres
a=a+nsims #shift the row offset for the next effect size
}
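Power at each effect size is simply the proportion of the simulations in which the decision rule favors the true model, expressed as a percentage. A minimal self-contained sketch of that calculation, using made-up p-values rather than the fitted models above:

```r
# Hypothetical p-values from 10 simulated fits (for illustration only)
pvals <- c(0.01, 0.20, 0.03, 0.04, 0.60, 0.02, 0.07, 0.001, 0.30, 0.04)
power_pct <- mean(pvals < 0.05) * 100  # percentage of simulations rejecting H0
power_pct  # 60: six of the ten p-values fall below .05
```

This is exactly what the powerLRT line in the loop computes, column by column, for each effect size.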
ggplot(power_effectsize,aes(effect_size,powerLRT))+
geom_point()+
geom_line()+
labs(x = "Effect size", y = "Power (%)")