Create a scale based on multiple imputation at item level

This function creates a scale based on multiple imputation at item level and returns pooled descriptive statistics, including Cronbach's alpha. Note that Cronbach's alpha is the only reliability estimate this function supports, even for two-item scales.

make_scale_mi(
  data,
  scale_items,
  scale_name,
  proration_cutoff = 0,
  seed = NULL,
  alpha_ci = FALSE,
  boot = 5000,
  parallel = TRUE,
  ...
)

Source

The approach to pooling Cronbach's alpha is taken from Dion Groothof on Stack Overflow. The development of the function was motivated by Gottschall, West & Enders (2012), who showed that multiple imputation at item level results in much higher statistical power than multiple imputation at scale level.
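The pooling idea can be illustrated with a minimal base-R sketch. It assumes standard Rubin's rules applied to per-imputation alpha estimates and their bootstrapped variances; the helper names (cronbach_alpha, boot_alpha_var, pool_rubin) and the toy data are hypothetical, and the function's actual implementation may differ in details such as transformations or degrees of freedom.

```r
# Cronbach's alpha for a set of complete items (columns = items)
cronbach_alpha <- function(items) {
  items <- as.matrix(items)
  k <- ncol(items)
  item_vars <- apply(items, 2, var)
  total_var <- var(rowSums(items))
  (k / (k - 1)) * (1 - sum(item_vars) / total_var)
}

# Bootstrap the sampling variance of alpha within one dataset
boot_alpha_var <- function(items, n_boot = 200) {
  n <- nrow(items)
  boots <- replicate(n_boot, {
    idx <- sample.int(n, n, replace = TRUE)
    cronbach_alpha(items[idx, , drop = FALSE])
  })
  var(boots)
}

# Rubin's rules: pooled estimate plus total variance
pool_rubin <- function(estimates, variances) {
  m <- length(estimates)
  within <- mean(variances)        # average within-imputation variance
  between <- var(estimates)        # between-imputation variance
  list(
    estimate = mean(estimates),
    variance = within + (1 + 1 / m) * between
  )
}

# Toy data standing in for five imputed datasets (assumption, not real output)
set.seed(1)
toy_imputations <- lapply(1:5, function(i) {
  f <- rnorm(100)
  data.frame(a = f + rnorm(100), b = f + rnorm(100), c = f + rnorm(100))
})

alphas <- sapply(toy_imputations, cronbach_alpha)
alpha_vars <- sapply(toy_imputations, boot_alpha_var)
pooled <- pool_rubin(alphas, alpha_vars)
```

The pooled variance is never smaller than the average bootstrapped variance, because the between-imputation term adds the uncertainty due to missing data.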

Arguments

data

A data frame containing multiple imputations, distinguished by a .imp variable. Typically the output from mice::complete(mids, "long").

scale_items

Character vector with names of scale items (variables in data)

scale_name

Name of the scale

proration_cutoff

Applies only to raw data (.imp == 0) in data. Scale scores are only calculated for cases with at most this share of missing data.

seed

For pooling, the variance of Cronbach's alpha is bootstrapped. Set a seed to make this reproducible.

alpha_ci

Should a confidence interval for Cronbach's alpha be returned? Note that this requires bootstrapping and thus makes the function much slower. TRUE corresponds to a 95% confidence interval. Other widths can be specified as fractions, e.g., .9 for a 90% interval.

boot

Number of bootstrap resamples used to estimate the variance of Cronbach's alpha for pooling.

parallel

Should bootstrapping be conducted in parallel (using the parallel package)? Pass a number to select the number of cores. Otherwise, the function will use all but one core.

...

Arguments passed on to make_scale

reverse: Should scale items be reverse coded? One of "auto" - items are reversed if that contributes to scale consistency, "none" - no items reversed, or "spec" - items specified in reverse_items are reversed.
reverse_items: Character vector with names of scale items to be reversed (must be subset of scale_items)
r_key: (optional) Numeric. Set to the possible maximum value of the scale if the whole scale should be reversed, or to -1 to reverse the scale based on the observed maximum.
print_desc: Logical. Should descriptives for scales be printed?
return_list: Logical. Should only scale values be returned, or descriptives as well?
harmonize_ranges: Should items that have different ranges be rescaled? By default, items are not rescaled, but a message is issued to flag this potential issue - set to FALSE to suppress that message. If TRUE, items are rescaled to match the first item given. Alternatively, pass a vector (c(min, max)) to specify the desired range.
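
As a rough illustration of what reverse coding and range harmonization mean arithmetically, the following base-R sketch uses two hypothetical helpers (reverse_item and rescale_item). It assumes a simple linear transformation with known item ranges; make_scale may determine ranges differently, for example from the observed data.

```r
# Reverse-code an item, given its possible minimum and maximum
reverse_item <- function(x, min_val, max_val) {
  (max_val + min_val) - x
}

# Linearly rescale an item from its own range to a target range c(min, max)
rescale_item <- function(x, from, to) {
  (x - from[1]) / (from[2] - from[1]) * (to[2] - to[1]) + to[1]
}

reversed <- reverse_item(c(1, 2, 5), min_val = 1, max_val = 5)
# 5 4 1

rescaled <- rescale_item(c(0, 5, 10), from = c(0, 10), to = c(1, 5))
# 1 3 5
```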

Details

Scale scores are returned for the raw data as well (if it is included in data). Descriptive statistics and reliability estimates are only based on the imputed datasets.
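
The distinction between raw and imputed rows, and the effect of proration_cutoff on the raw rows, can be sketched in base R. This is an illustration under stated assumptions, not the function's code: it assumes the scale score is the row mean of the items, and prorated_mean and the toy data are hypothetical.

```r
# Toy long-format data: .imp == 0 is the raw data, .imp >= 1 are imputations
long <- data.frame(
  .imp = c(0, 0, 0, 1, 1, 1),
  .id  = c(1, 2, 3, 1, 2, 3),
  x1   = c(4, NA, NA, 4, 3, 2),
  x2   = c(2, 5, NA, 2, 5, 3)
)
items <- c("x1", "x2")

# Row mean of available items, set to NA when the share of
# missing items exceeds the cutoff
prorated_mean <- function(df, items, cutoff) {
  share_missing <- rowMeans(is.na(df[items]))
  scores <- rowMeans(df[items], na.rm = TRUE)
  scores[share_missing > cutoff] <- NA
  scores
}

raw <- long[long$.imp == 0, ]
imputed <- long[long$.imp > 0, ]

# cutoff = 0: only complete cases get a score in the raw data
raw_strict <- prorated_mean(raw, items, cutoff = 0)
# 3 NA NA

# cutoff = 0.5: cases with up to half of the items missing get a score
raw_lenient <- prorated_mean(raw, items, cutoff = 0.5)
# 3 5 NA

# Imputed rows are complete, so no proration is needed
imputed_scores <- rowMeans(imputed[items])
# 3 4 2.5
```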

Examples

library(dplyr)
library(mice)

# Create Dataset with missing data
ess_health <- ess_health %>% sample_n(500) %>% select(etfruit, eatveg, dosprt, health)
# Set roughly 10% of the values in each variable to NA
add_missing <- function(x) {x[!rbinom(length(x), 1, .9)] <- NA; x}
ess_health <- ess_health %>% mutate(across(everything(), add_missing))

# Impute data
ess_health_mi <- mice(ess_health, printFlag = FALSE)
ess_health_mi <- complete(ess_health_mi, "long")

# Create the scale from the two items and print pooled descriptives
scale <- make_scale_mi(ess_health_mi, c("etfruit", "eatveg"), "healthy")
#>     
#> Descriptives for healthy scale:
#> Mean: 2.950  SD: 1.134
#> Cronbach's alpha: 0.72