lm() with standardised continuous variables

This runs lm() after standardising all continuous variables, while leaving factors intact.
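
As a rough illustration of what "standardising continuous variables while leaving factors intact" means, the following sketch z-standardises the numeric columns of a data frame by hand and then calls lm(). It is not the package's implementation: the helper standardise_numeric() is a hypothetical name introduced here, and it assumes that standardising means centring on the mean and dividing by the sample standard deviation, as scale() does.

```r
# Hypothetical helper (not part of the package): z-standardise numeric
# columns only, leaving factors and other column types untouched.
standardise_numeric <- function(df) {
  is_num <- vapply(df, is.numeric, logical(1))
  # scale() returns a one-column matrix; as.numeric() drops that structure
  df[is_num] <- lapply(df[is_num], function(x) as.numeric(scale(x)))
  df
}

iris_std <- standardise_numeric(iris)

# Species is still a factor, so lm() dummy-codes it as usual
fit_manual <- lm(Sepal.Length ~ Sepal.Width + Species, data = iris_std)
round(coef(fit_manual), 3)
```

Under these assumptions, the coefficient for Sepal.Width is a standardised slope, while the Species coefficients are differences between groups expressed in standard deviations of Sepal.Length.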

lm_std(formula, data = NULL, weights = NULL, ...)

Arguments

formula

an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. The details of model specification are given under ‘Details’ in the documentation of stats::lm.

data

an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which lm is called.

weights

an optional vector of weights to be used in the fitting process. Should be NULL or a numeric vector. If non-NULL, weighted least squares is used with the values of weights as the weights w (that is, minimizing sum(w*e^2)); otherwise ordinary least squares is used. See also ‘Details’.

...

Arguments passed on to stats::lm

na.action: a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The ‘factory-fresh’ default is na.omit. Another possible value is NULL, no action. Value na.exclude can be useful.
method: the method to be used; for fitting, currently only method = "qr" is supported; method = "model.frame" returns the model frame (the same as with model = TRUE, see below).
model,x,y,qr: logicals. If TRUE the corresponding components of the fit (the model frame, the model matrix, the response, the QR decomposition) are returned.
singular.ok: logical. If FALSE (the default in S but not in R) a singular fit is an error.
contrasts: an optional list. See the contrasts.arg of model.matrix.default.
offset: this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector or matrix of extents matching those of the response. One or more offset terms can be included in the formula instead or as well, and if more than one are specified their sum is used. See model.offset.

Details

In the model call, the weights variable will always be called .weights. This might pose a problem when you update the model later on. For the moment, the only workaround is to rename the weights variable accordingly (or to fix it and contribute a PR on GitHub).

References

See Fox (2015) for an argument why dummy variables should never be standardised. If you want to run a model with all variables standardised, one option is QuantPsyc::lm.beta().

Examples

lm_std(Sepal.Length ~ Sepal.Width + Species, iris)
#> 
#> Call:
#> lm(formula = Sepal.Length ~ Sepal.Width + Species, data = data)
#> 
#> Coefficients:
#>       (Intercept)        Sepal.Width  Speciesversicolor   Speciesvirginica  
#>            -1.371              0.423              1.762              2.351  
#>
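
The weights argument is described above as minimizing sum(w*e^2). The sketch below illustrates that criterion with plain stats::lm() rather than lm_std(), since how lm_std() combines weights with standardisation is not documented here. The weights w are arbitrary values made up for illustration.

```r
set.seed(1)
# Arbitrary illustrative weights, one per row of iris (an assumption, not real survey weights)
w <- runif(nrow(iris), min = 0.5, max = 2)

fit_w <- lm(Sepal.Length ~ Sepal.Width, data = iris, weights = w)

# Closed-form weighted least squares: solve (X'WX) b = X'Wy,
# which is the minimiser of sum(w * e^2)
X <- model.matrix(~ Sepal.Width, data = iris)
y <- iris$Sepal.Length
beta_wls <- solve(t(X) %*% (w * X), t(X) %*% (w * y))

# Both routes should agree up to numerical tolerance
all.equal(unname(coef(fit_w)), as.numeric(beta_wls))
```

The comparison only confirms what the weighted criterion is. It does not show whether lm_std() uses weighted or unweighted means and standard deviations when standardising.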