R/make_scales.R
make_scale.Rd
This function creates a scale by calculating the mean of a set of items,
and prints and returns descriptives that allow you to assess internal consistency
and spread. It is primarily based on the psych::alpha function, with
more parsimonious output and some added functionality. It also lets you specify
a threshold for proration so that missing data can be dealt with easily and
explicitly.
A dataframe
Character vector with names of scale items (variables in data)
Name of the scale
Should scale items be reverse coded? One of "auto" - items are
reversed if that contributes to scale consistency, "none" - no items reversed,
or "spec" - items specified in reverse_items are reversed.
Character vector with names of scale items to be reversed (must be subset of scale_items)
How should the reliability of two-item scales be reported? "spearman_brown" is the recommended default, but "cronbachs_alpha" and Pearson's "r" are also supported.
(optional) Numeric. Set to the possible maximum value of the scale if the whole scale should be reversed, or to -1 to reverse the scale based on the observed maximum.
Scale scores are only calculated for cases with at most this share of missing data - see details.
Logical. Should histograms for items and resulting scale be printed?
Logical. Should descriptives for scales be printed?
Logical. Should only scale values be returned, or descriptives as well?
Should items that have different ranges be rescaled? By default, items are not rescaled, but a message flags the potential issue - set to FALSE to suppress that message. If TRUE, items are rescaled to match the range of the first item given. Alternatively, pass a vector (c(min, max)) to specify the desired range.
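For two-item scales, the Spearman-Brown coefficient can be derived from the Pearson correlation r between the two items as 2r / (1 + r). The following is a minimal base-R sketch of that formula; spearman_brown_sketch() is a hypothetical helper written for illustration, it is not part of this package, and the handling of missing values (pairwise deletion) is an assumption.

```r
# Hypothetical helper for illustration only (not exported by the package).
# Computes the Spearman-Brown coefficient for two items from their
# Pearson correlation: 2r / (1 + r).
spearman_brown_sketch <- function(x, y) {
  # Assumption: cases with a missing value on either item are dropped
  r <- cor(x, y, use = "pairwise.complete.obs")
  2 * r / (1 + r)
}

# Toy data: these two items correlate at r = 0.5
item_a <- c(1, 2, 3)
item_b <- c(1, 3, 2)
spearman_brown_sketch(item_a, item_b)
# 2 * 0.5 / (1 + 0.5) = 0.667 (rounded)
```

Because the coefficient is a monotonic transformation of r, it is always at least as large as r for positive correlations, which is worth keeping in mind when comparing the three reporting options.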
Depends on the return_list argument. Either just the scale values,
or a list of scale values and descriptives. If descriptives are returned, check the text
element for a convenient summary.
Proration is the easiest way to deal with missing data at the item-level. Here,
scale means are calculated for cases that have less than a given share of missing data.
According to Wu et al. (2022), a 40% cut-off is defensible, so this is the default
used. For more precise estimation, consider item-level multiple imputation, which
can be done with make_scale_mi().
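To make the idea of proration concrete, here is a minimal base-R sketch of a prorated scale mean. It is an illustration under stated assumptions, not the function's actual implementation: prorated_mean_sketch() and the toy data are hypothetical, and the cut-off is treated as inclusive ("at most this share"), following the argument description.

```r
# Hypothetical helper for illustration only (not the package's code).
# Returns the mean of the available items for each case, or NA when the
# share of missing items exceeds the cut-off.
prorated_mean_sketch <- function(items, cutoff = 0.4) {
  share_missing <- rowMeans(is.na(items))          # share of missing items per case
  scores <- unname(rowMeans(items, na.rm = TRUE))  # mean of the observed items
  scores[share_missing > cutoff] <- NA             # too much missing data: no score
  scores
}

# Toy data: four cases, five items
toy <- data.frame(
  q1 = c(4, 2, NA, 3),
  q2 = c(5, NA, NA, 3),
  q3 = c(3, 2, 1, NA),
  q4 = c(4, 3, NA, 2),
  q5 = c(4, 2, 5, 4)
)

prorated_mean_sketch(toy)
# Cases 1, 2 and 4 get scores of 4, 2.25 and 3;
# case 3 has 60% missing data and therefore gets NA
```

Cases 2 and 4 each miss one of five items (20%), so their scores are based on the four observed items; this is what distinguishes proration from listwise deletion, which would drop them.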
scores <- make_scale(ess_health, scale_items = c("etfruit", "eatveg"),
                     scale_name = "Healthy eating")
#>
#> Descriptives for Healthy eating scale:
#> Mean: 3.033 SD: 1.107
#> spearman_brown: 0.66