Takes a data.frame object and summarizes its columns into a ready-to-export, human-readable summary table. Can stratify the data by a column and perform appropriate hypothesis testing between strata.

Usage

# S3 method for data.frame
build_table(
  .object,
  ...,
  .by,
  .inverse = FALSE,
  .label.stat = TRUE,
  .stat = c("mean", "median"),
  .stat.pct.sign = FALSE,
  .col.overall = TRUE,
  .col.missing = FALSE,
  .test.continuous = c("anova", "kruskal", "wilcoxon"),
  .test.nominal = c("chisq", "fisher"),
  .test.simulate.p = FALSE,
  .col.test = FALSE,
  .digits = 1,
  .p.digits = 4
)

Arguments

.object

A data.frame.

...

One or more unquoted expressions separated by commas representing columns in the data.frame. May be specified using tidyselect helpers. If left empty, all columns are summarized.

.by

An unquoted expression. The data column to stratify the summary by.

.inverse

A logical. For logical data, report the frequency of FALSE values instead of TRUE values.

.label.stat

A logical. Append the type of summary statistic to the column label.

.stat

A character. Name of the summary statistic to use for numeric data. Supported options include the mean ('mean') and median ('median').

.stat.pct.sign

A logical. Paste a percent symbol after all reported frequencies.

.col.overall

A logical. Append a column with the statistic for all data. If .by is not specified, this parameter is ignored.

.col.missing

A logical. Append a column listing the frequencies of missing data for each row.

.test.continuous

A character. Name of statistical test to compare groups. Supported options include ANOVA linear model ('anova'), Kruskal-Wallis ('kruskal'), and Wilcoxon rank sum ('wilcoxon') tests.

.test.nominal

A character. Name of statistical test to compare groups. Supported options include Pearson's Chi-squared Test ('chisq') and Fisher's Exact Test ('fisher').

.test.simulate.p

A logical. Whether to use Monte Carlo simulation of the p-value when testing nominal data.

.col.test

A logical. Append a column containing the test each p-value was derived from.

.digits

An integer. The number of digits to round numbers to.

.p.digits

An integer. The number of p-value digits to report.
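The test names accepted by .test.continuous, .test.nominal, and .test.simulate.p correspond to standard tests in base R's stats package. The following is a minimal sketch of those tests run directly, assuming a two-level stratifying column; the data frame `demo` and its column names are made up for illustration, and nothing here claims to reproduce how build_table() calls these tests internally.

```r
# Illustration only: base R (stats) tests that share names with the options above.
# `demo` and its columns are hypothetical; build_table() may call these differently.
set.seed(42)
demo <- data.frame(
  strata = factor(sample(c("b", "c"), 200, replace = TRUE)),
  value  = rnorm(200, mean = 50, sd = 10),
  group  = factor(sample(1:3, 200, replace = TRUE))
)

# Continuous data: ANOVA on a linear model, Kruskal-Wallis, Wilcoxon rank sum
p_anova    <- anova(lm(value ~ strata, data = demo))[["Pr(>F)"]][1]
p_kruskal  <- kruskal.test(value ~ strata, data = demo)$p.value
p_wilcoxon <- wilcox.test(value ~ strata, data = demo)$p.value

# Nominal data: Pearson's Chi-squared (with and without Monte Carlo
# simulation of the p-value) and Fisher's Exact Test
tab         <- table(demo$group, demo$strata)
p_chisq     <- chisq.test(tab)$p.value
p_chisq_sim <- chisq.test(tab, simulate.p.value = TRUE, B = 2000)$p.value
p_fisher    <- fisher.test(tab)$p.value

p_values <- c(
  anova = p_anova, kruskal = p_kruskal, wilcoxon = p_wilcoxon,
  chisq = p_chisq, chisq_sim = p_chisq_sim, fisher = p_fisher
)
round(p_values, 4)
```

Note that the Wilcoxon rank sum test compares exactly two groups, whereas ANOVA and Kruskal-Wallis also handle three or more.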

Value

An object of class tbl_df (tibble) summarizing the provided object.

Examples

# Sample data
df <- data.frame(
  strata = factor(sample(letters[2:3], 1000, replace = TRUE)),
  numeric = sample(1:100, 1000, replace = TRUE),
  numeric2 = sample(1:100, 1000, replace = TRUE),
  factor = factor(sample(1:5, 1000, replace = TRUE)),
  logical = sample(c(TRUE,FALSE), 1000, replace = TRUE)
)

# Summarize all columns
build_table(df, .by = strata)
#> # A tibble: 10 × 5
#>    Variable            Overall      b            c            p       
#>    <chr>               <chr>        <chr>        <chr>        <chr>   
#>  1 "n(%)"              "1000"       "514 (51.4)" "486 (48.6)" ""      
#>  2 "numeric, mean±SD"  "52.6 ±29"   "53.9 ±28.4" "51.4 ±29.5" "0.1698"
#>  3 "numeric2, mean±SD" "49.7 ±28.6" "48.5 ±29"   "50.9 ±28.1" "0.1838"
#>  4 "factor, n(%)"      ""           ""           ""           "0.7716"
#>  5 "  1"               "200 (20)"   "105 (20.4)" "95 (19.5)"  ""      
#>  6 "  2"               "204 (20.4)" "97 (18.9)"  "107 (22)"   ""      
#>  7 "  3"               "220 (22)"   "113 (22)"   "107 (22)"   ""      
#>  8 "  4"               "193 (19.3)" "104 (20.2)" "89 (18.3)"  ""      
#>  9 "  5"               "183 (18.3)" "95 (18.5)"  "88 (18.1)"  ""      
#> 10 "logical, n(%)"     "482 (48.2)" "248 (48.2)" "234 (48.1)" "1.0000"

# Summarize selected columns only
build_table(df, numeric2, factor, .by = strata)
#> # A tibble: 8 × 5
#>   Variable            Overall      b            c            p       
#>   <chr>               <chr>        <chr>        <chr>        <chr>   
#> 1 "n(%)"              "1000"       "514 (51.4)" "486 (48.6)" ""      
#> 2 "numeric2, mean±SD" "49.7 ±28.6" "48.5 ±29"   "50.9 ±28.1" "0.1838"
#> 3 "factor, n(%)"      ""           ""           ""           "0.7716"
#> 4 "  1"               "200 (20)"   "105 (20.4)" "95 (19.5)"  ""      
#> 5 "  2"               "204 (20.4)" "97 (18.9)"  "107 (22)"   ""      
#> 6 "  3"               "220 (22)"   "113 (22)"   "107 (22)"   ""      
#> 7 "  4"               "193 (19.3)" "104 (20.2)" "89 (18.3)"  ""      
#> 8 "  5"               "183 (18.3)" "95 (18.5)"  "88 (18.1)"  ""
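The non-default arguments can be combined in the same way. The sketch below requests medians, non-parametric tests, a test-name column, and a missing-data column, then writes the resulting tibble to a CSV file. It assumes that build_table() is exported by a package named utile.tables (the page above does not state the package name), so the call is guarded and skipped if that package is not installed. No output is shown because it was not run against the original data.

```r
# Sketch under assumptions: the package name "utile.tables" is assumed,
# not stated on this page. Argument names are taken from the Usage section.
set.seed(1)
df <- data.frame(
  strata   = factor(sample(letters[2:3], 1000, replace = TRUE)),
  numeric2 = sample(1:100, 1000, replace = TRUE),
  logical  = sample(c(TRUE, FALSE), 1000, replace = TRUE)
)

result <- NULL
if (requireNamespace("utile.tables", quietly = TRUE)) {
  result <- utile.tables::build_table(
    df, numeric2, logical,
    .by = strata,
    .stat = "median",              # summarize numeric data by the median
    .test.continuous = "kruskal",  # Kruskal-Wallis for numeric columns
    .test.nominal = "fisher",      # Fisher's Exact Test for nominal columns
    .col.test = TRUE,              # add a column naming the test used
    .col.missing = TRUE,           # add a column of missing-data frequencies
    .stat.pct.sign = TRUE          # paste "%" after reported frequencies
  )

  # The result is a tibble of character columns, so it can be exported as-is
  write.csv(result, file.path(tempdir(), "summary.csv"), row.names = FALSE)
}
```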