Turns a categorical variable with n categories into a tibble of n - 1 dummy-coded (logical) columns, given the default drop_first = TRUE. If x is a factor, the first level is omitted and thus treated as the reference level, matching the behavior of lm() and related functions. If x is not a factor, the first value in x is treated as the reference level. The returned variable names combine a common prefix with a cleaned-up version of the level names (special characters removed, converted to snake_case).
dummy_code(x, prefix = NA, drop_first = TRUE, verbose = interactive())
x: The categorical variable to be dummy-coded.
prefix: String to be prefixed to the new variable names (typically the name of the variable that is dummy-coded). If NULL, variables are named with the levels only. If left as NA, the function tries to extract the name of the variable passed.
drop_first: Should the first level be dropped? Defaults to TRUE.
verbose: Should a message naming the reference level be displayed? Defaults to interactive().
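The naming rule described for prefix can be approximated in base R. The sketch below is only an assumption about how such cleaning might work, not the package's actual code; sketch_dummy_names is a hypothetical helper introduced for illustration, and the real function may handle edge cases (such as camelCase levels) differently.

```r
# Hypothetical helper, NOT part of the package: one plausible way to build
# snake_case column names from an optional prefix and a set of levels.
sketch_dummy_names <- function(levels, prefix = NULL) {
  # Join prefix and level with "_" unless no prefix is wanted
  raw <- if (is.null(prefix)) levels else paste(prefix, levels, sep = "_")
  cleaned <- tolower(raw)
  cleaned <- gsub("[^a-z0-9]+", "_", cleaned)  # runs of special characters -> "_"
  gsub("^_+|_+$", "", cleaned)                 # trim leading/trailing "_"
}

sketch_dummy_names(c("versicolor", "virginica"), prefix = "Species")
#> [1] "species_versicolor" "species_virginica"

sketch_dummy_names(c("High School", "Post-grad (PhD)"))
#> [1] "high_school"   "post_grad_phd"
```

With a prefix of "Species" this reproduces the column names that dummy_code(iris$Species) returns; with prefix = NULL the names consist of the cleaned levels alone.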
dummy_code(iris$Species)
#> # A tibble: 150 × 2
#>    species_versicolor species_virginica
#>    <lgl>              <lgl>
#>  1 FALSE              FALSE
#>  2 FALSE              FALSE
#>  3 FALSE              FALSE
#>  4 FALSE              FALSE
#>  5 FALSE              FALSE
#>  6 FALSE              FALSE
#>  7 FALSE              FALSE
#>  8 FALSE              FALSE
#>  9 FALSE              FALSE
#> 10 FALSE              FALSE
#> # ℹ 140 more rows
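To see why the first factor level ends up as the reference, the same n - 1 coding can be rebuilt in base R and compared with the design matrix that lm() uses. This is a minimal sketch, not the function's implementation; the object names (ref, others, dummies, mm) are chosen here for illustration.

```r
# Base-R sketch of n - 1 dummy coding (not the package's implementation).
x <- iris$Species
ref <- levels(x)[1]                  # first level, "setosa", is the reference
others <- setdiff(levels(x), ref)    # remaining levels each get a column

# One logical column per non-reference level
dummies <- as.data.frame(
  lapply(setNames(others, others), function(l) x == l)
)

# The design matrix behind lm() drops the same first level
mm <- model.matrix(~ x)
colnames(mm)
#> [1] "(Intercept)" "xversicolor" "xvirginica"

# The manual columns agree with the 0/1 columns of the design matrix
all(mm[, "xversicolor"] == dummies$versicolor)
#> [1] TRUE
```

Rows where every dummy column is FALSE belong to the reference level, which is why the first ten rows of the output above (all setosa) contain only FALSE.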