Takes a data.frame object and summarizes its columns into a ready-to-export, human-readable summary table. Can stratify the data and perform appropriate hypothesis testing.
Usage
# S3 method for data.frame
build_table(
  .object,
  ...,
  .by,
  .inverse = FALSE,
  .label.stat = TRUE,
  .stat = c("mean", "median"),
  .stat.pct.sign = FALSE,
  .col.overall = TRUE,
  .col.missing = FALSE,
  .test.continuous = c("anova", "kruskal", "wilcoxon"),
  .test.nominal = c("chisq", "fisher"),
  .test.simulate.p = FALSE,
  .col.test = FALSE,
  .digits = 1,
  .p.digits = 4
)
Arguments
- .object
A data.frame.
- ...
One or more unquoted expressions separated by commas representing columns in the data.frame. May be specified using tidyselect helpers. If left empty, all columns are summarized.
- .by
An unquoted expression. The data column to stratify the summary by.
- .inverse
A logical. For logical data, report the frequency of FALSE values instead of TRUE values.
- .label.stat
A logical. Append the type of summary statistic to the column label.
- .stat
A character. Name of the summary statistic to use for numeric data. Supported options include the mean ('mean') and median ('median').
- .stat.pct.sign
A logical. Paste a percent symbol after all reported frequencies.
- .col.overall
A logical. Append a column with the statistic for all data. If .by is not specified, this parameter is ignored.
- .col.missing
A logical. Append a column listing the frequencies of missing data for each row.
- .test.continuous
A character. Name of statistical test to compare groups. Supported options include ANOVA linear model ('anova'), Kruskal-Wallis ('kruskal'), and Wilcoxon rank sum ('wilcoxon') tests.
- .test.nominal
A character. Name of statistical test to compare groups. Supported options include Pearson's Chi-squared Test ('chisq') and Fisher's Exact Test ('fisher').
- .test.simulate.p
A logical. Whether to use Monte Carlo simulation of the p-value when testing nominal data.
- .col.test
A logical. Append a column containing the test each p-value was derived from.
- .digits
An integer. The number of digits to round numbers to.
- .p.digits
An integer. The number of p-value digits to report.
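The test names accepted by .test.continuous and .test.nominal match standard routines in base R's stats package. The following is a minimal sketch of what each option would compute for a single column, assuming the function delegates to these base routines (this page does not confirm that). The data set and the p_values vector are made up for illustration.

```r
# Illustrative data shaped like the example data set (not from the package)
set.seed(42)
df <- data.frame(
  strata  = factor(sample(letters[2:3], 200, replace = TRUE)),
  numeric = sample(1:100, 200, replace = TRUE),
  factor  = factor(sample(1:5, 200, replace = TRUE))
)

# Continuous column compared across strata
p_anova    <- anova(lm(numeric ~ strata, data = df))[["Pr(>F)"]][1]  # 'anova'
p_kruskal  <- kruskal.test(numeric ~ strata, data = df)$p.value      # 'kruskal'
p_wilcoxon <- wilcox.test(numeric ~ strata, data = df)$p.value       # 'wilcoxon', two groups only

# Nominal column compared across strata via a contingency table
tab <- table(df$factor, df$strata)
p_chisq  <- chisq.test(tab)$p.value                                  # 'chisq'
p_fisher <- fisher.test(tab)$p.value                                 # 'fisher'

# Monte Carlo p-value, the base R counterpart of .test.simulate.p = TRUE
p_chisq_sim <- chisq.test(tab, simulate.p.value = TRUE, B = 2000)$p.value

p_values <- c(
  anova = p_anova, kruskal = p_kruskal, wilcoxon = p_wilcoxon,
  chisq = p_chisq, fisher = p_fisher, chisq_sim = p_chisq_sim
)
round(p_values, 4)
```

Note that the Wilcoxon rank sum test compares exactly two groups, so 'wilcoxon' is only meaningful when the .by column has two levels; 'kruskal' and 'anova' handle more.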
Examples
# Sample data
df <- data.frame(
  strata = factor(sample(letters[2:3], 1000, replace = TRUE)),
  numeric = sample(1:100, 1000, replace = TRUE),
  numeric2 = sample(1:100, 1000, replace = TRUE),
  factor = factor(sample(1:5, 1000, replace = TRUE)),
  logical = sample(c(TRUE, FALSE), 1000, replace = TRUE)
)
# Summarize all columns
build_table(df, .by = strata)
#> # A tibble: 10 × 5
#>    Variable            Overall      b            c            p
#>    <chr>               <chr>        <chr>        <chr>        <chr>
#>  1 "n(%)"              "1000"       "514 (51.4)" "486 (48.6)" ""
#>  2 "numeric, mean±SD"  "52.6 ±29"   "53.9 ±28.4" "51.4 ±29.5" "0.1698"
#>  3 "numeric2, mean±SD" "49.7 ±28.6" "48.5 ±29"   "50.9 ±28.1" "0.1838"
#>  4 "factor, n(%)"      ""           ""           ""           "0.7716"
#>  5 " 1"                "200 (20)"   "105 (20.4)" "95 (19.5)"  ""
#>  6 " 2"                "204 (20.4)" "97 (18.9)"  "107 (22)"   ""
#>  7 " 3"                "220 (22)"   "113 (22)"   "107 (22)"   ""
#>  8 " 4"                "193 (19.3)" "104 (20.2)" "89 (18.3)"  ""
#>  9 " 5"                "183 (18.3)" "95 (18.5)"  "88 (18.1)"  ""
#> 10 "logical, n(%)"     "482 (48.2)" "248 (48.2)" "234 (48.1)" "1.0000"
# Summarize selected columns
build_table(df, numeric2, factor, .by = strata)
#> # A tibble: 8 × 5
#>   Variable            Overall      b            c            p
#>   <chr>               <chr>        <chr>        <chr>        <chr>
#> 1 "n(%)"              "1000"       "514 (51.4)" "486 (48.6)" ""
#> 2 "numeric2, mean±SD" "49.7 ±28.6" "48.5 ±29"   "50.9 ±28.1" "0.1838"
#> 3 "factor, n(%)"      ""           ""           ""           "0.7716"
#> 4 " 1"                "200 (20)"   "105 (20.4)" "95 (19.5)"  ""
#> 5 " 2"                "204 (20.4)" "97 (18.9)"  "107 (22)"   ""
#> 6 " 3"                "220 (22)"   "113 (22)"   "107 (22)"   ""
#> 7 " 4"                "193 (19.3)" "104 (20.2)" "89 (18.3)"  ""
#> 8 " 5"                "183 (18.3)" "95 (18.5)"  "88 (18.1)"  ""
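The cell formats in the output above ("mean ±SD", "n (%)", and fixed-width p-values) can be approximated by hand, which may help when checking a table against the raw data. This is a rough base R sketch, not the package's implementation: the helper names fmt_mean_sd, fmt_n_pct, and fmt_p are hypothetical, and using the non-missing count as the percentage denominator is an assumption.

```r
# Hypothetical helper: "mean ±SD" cell, rounded like .digits
fmt_mean_sd <- function(x, digits = 1) {
  paste0(
    round(mean(x, na.rm = TRUE), digits), " ±",
    round(sd(x, na.rm = TRUE), digits)
  )
}

# Hypothetical helper: "n (%)" cell for logical data.
# inverse mimics .inverse, pct_sign mimics .stat.pct.sign.
# Assumption: the denominator is the number of non-missing values.
fmt_n_pct <- function(flag, digits = 1, inverse = FALSE, pct_sign = FALSE) {
  if (inverse) flag <- !flag
  n <- sum(flag, na.rm = TRUE)
  pct <- round(100 * n / sum(!is.na(flag)), digits)
  paste0(n, " (", pct, if (pct_sign) "%" else "", ")")
}

# Hypothetical helper: p-value with a fixed number of decimals, like .p.digits
fmt_p <- function(p, digits = 4) {
  formatC(p, format = "f", digits = digits)
}

fmt_mean_sd(c(1, 2, 3))                      # "2 ±1"
fmt_n_pct(c(TRUE, TRUE, FALSE, FALSE))       # "2 (50)"
fmt_n_pct(c(TRUE, FALSE, FALSE, FALSE),
          inverse = TRUE, pct_sign = TRUE)   # "3 (75%)"
fmt_p(1)                                     # "1.0000"
```

Because round() drops trailing zeros, a value such as 29.0 prints as "29", which is consistent with cells like "52.6 ±29" in the tables above.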