Automatically determine the type of each variable (continuous/binary/ternary/truncated) in a data matrix.

get_types(X, tru_prop = 0.05)

Arguments

X

A numeric data matrix (n by p), where n is the number of samples and p is the number of variables. Missing values (NA) are allowed.

tru_prop

A scalar between 0 and 1 giving the minimal proportion of zeros that a variable must contain to be treated as "tru" (truncated or zero-inflated) rather than as "con" (continuous). The default value is 0.05: any variable with more than 5% zeros among the n samples is treated as truncated or zero-inflated.

Value

get_types returns

  • types: A vector of length p indicating the type of each of the p variables in X. Each element is one of "con" (continuous), "bin" (binary), "ter" (ternary) or "tru" (truncated).

Examples

X = gen_data(types = c("ter", "con"))$X
get_types(X)
#> [1] "ter" "con"
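
The role of tru_prop described under Arguments can be illustrated with a small base-R sketch. This is not the implementation of get_types; guess_types_sketch, X_demo and types_demo are hypothetical names introduced only for illustration, and the rule used (two distinct values give "bin", three give "ter", otherwise the proportion of zeros decides between "tru" and "con") is an assumption about how such a classification could work.

```r
# Hypothetical helper, NOT the package's get_types(): a rough sketch of
# classifying columns by distinct-value count and zero proportion.
guess_types_sketch <- function(X, tru_prop = 0.05) {
  apply(X, 2, function(x) {
    x <- x[!is.na(x)]                       # ignore missing values
    if (length(x) == 0) return(NA_character_)
    n_levels <- length(unique(x))
    if (n_levels == 2) return("bin")        # assumed rule: two distinct values
    if (n_levels == 3) return("ter")        # assumed rule: three distinct values
    if (mean(x == 0) > tru_prop) return("tru")  # enough zeros: truncated
    "con"
  })
}

set.seed(1)
n <- 200
X_demo <- cbind(
  rnorm(n),                         # continuous
  rbinom(n, 1, 0.5),                # binary
  sample(0:2, n, replace = TRUE),   # ternary
  pmax(rnorm(n), 0)                 # truncated at zero (about half zeros)
)

types_demo <- guess_types_sketch(X_demo)
print(types_demo)
#> [1] "con" "bin" "ter" "tru"

# Raising the threshold above the observed share of zeros flips the
# last column from "tru" to "con" in this sketch.
print(guess_types_sketch(X_demo, tru_prop = 0.9))
```

In this sketch only the last column is sensitive to tru_prop, since it is the only one with many distinct values and a nonzero share of exact zeros.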