Beginner-friendly table making in R: create the demographics Table 1 and export it to a Word document
Almost every manuscript needs a Table 1, the demographics summary. There are multiple ways to create tables in R; I will briefly cover one of my favorites, the gtsummary package. For more ways to create customizable tables, consult here.
Let’s work on the iris dataset. Our aim is to create a summary table of all 4 measurement variables, split by Species. As a first step, we use group_by from dplyr (part of the tidyverse).
Cut to the chase and take me to the final table.
library(dplyr) # provides %>%, group_by() and summarise()
iris %>%
  group_by(Species) %>% # the group by which we want to split the data
  summarise(mean = mean(Sepal.Length)) # let's start with one variable as a trial
# A tibble: 3 × 2
  Species     mean
  <fct>      <dbl>
1 setosa      5.01
2 versicolor  5.94
3 virginica   6.59
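The one-variable trial above can be extended to all four measurements. The following is a minimal sketch, assuming dplyr 1.0 or later (for across()); the object name iris_summary is only illustrative and is not used elsewhere in this post.

```r
library(dplyr)

# Mean and SD of every numeric column, split by Species.
# across() applies each function in the list to each selected column
# and names the results <column>_<function>, e.g. Sepal.Length_mean.
iris_summary <- iris %>%
  group_by(Species) %>%
  summarise(across(where(is.numeric), list(mean = mean, sd = sd)))

iris_summary
```

The result has one row per species and two columns (mean and sd) per measurement, which is handy for a quick check but still far from a publishable layout.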
This is something you would want to try for a quick look at the data. But for a publishable-quality table, we will use gtsummary.
library(gtsummary) # provides tbl_summary()
iris %>%
  select(Species, Sepal.Length) %>% # we select the variables of interest, including the group
  tbl_summary(by = Species) # the by argument of tbl_summary() plays the role of group_by
Characteristic | setosa, N = 50¹ | versicolor, N = 50¹ | virginica, N = 50¹ |
---|---|---|---|
Sepal.Length | 5.00 (4.80, 5.20) | 5.90 (5.60, 6.30) | 6.50 (6.23, 6.90) |
¹ Median (IQR)
We see that by default the median and interquartile range are displayed. If you want the mean and SD instead, we can have that too.
iris %>%
  select(Species, Sepal.Length) %>%
  tbl_summary(by = Species,
              # the statistic syntax may look complicated, but it is worth learning
              statistic = list(all_continuous() ~ "{mean} ({sd})"))
Characteristic | setosa, N = 50¹ | versicolor, N = 50¹ | virginica, N = 50¹ |
---|---|---|---|
Sepal.Length | 5.01 (0.35) | 5.94 (0.52) | 6.59 (0.64) |
¹ Mean (SD)
Here we are asking for all the continuous variables to be displayed as mean and SD.
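The string passed to statistic is a template: each name in curly braces is replaced by the statistic of that name. As a minimal sketch of how the template can be varied, the example below adds the range and fixes the number of decimals. It assumes the {min} and {max} placeholders and the digits argument of tbl_summary(); the object name range_table is only illustrative.

```r
library(dplyr)
library(gtsummary)

range_table <- iris %>%
  select(Species, Sepal.Length) %>%
  tbl_summary(
    by = Species,
    # each {placeholder} is filled in with the statistic of that name
    statistic = list(all_continuous() ~ "{mean} ({sd}); {min} to {max}"),
    # show two decimal places for every continuous variable
    digits = list(all_continuous() ~ 2)
  )

range_table
```

The left side of the formula selects which variables the template applies to, so a different template could be given to a single variable by naming it there instead of using all_continuous().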
Can we customize the variable names? Yes, we can.
iris %>%
  select(Species, Sepal.Length) %>%
  tbl_summary(by = Species,
              statistic = list(all_continuous() ~ "{mean} ({sd})"),
              label = list(Sepal.Length ~ "Sepal Length"))
Characteristic | setosa, N = 50¹ | versicolor, N = 50¹ | virginica, N = 50¹ |
---|---|---|---|
Sepal Length | 5.01 (0.35) | 5.94 (0.52) | 6.59 (0.64) |
¹ Mean (SD)
Here the variable names are already self-explanatory, and I just had to remove the period between the words. Note that the actual variable name from the dataset goes on the left of the tilde (~) and the new name goes on the right, within quotes.
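Because the renaming rule here is simply "replace the period with a space", the labels could also be generated rather than typed one by one. This is a sketch, not the approach used in the rest of this post: it assumes that tbl_summary() accepts a named list for label (variable name as the list name, new label as the value), and the names measure_vars, auto_labels and labelled_table are only illustrative.

```r
library(dplyr)
library(gtsummary)

# Every column except the grouping variable
measure_vars <- setdiff(names(iris), "Species")

# Replace the period with a space; fixed = TRUE makes gsub() treat "."
# as a literal period rather than a regular-expression wildcard.
auto_labels <- as.list(
  setNames(gsub(".", " ", measure_vars, fixed = TRUE), measure_vars)
)

labelled_table <- iris %>%
  tbl_summary(by = Species,
              statistic = list(all_continuous() ~ "{mean} ({sd})"),
              label = auto_labels)

labelled_table
```

For four variables, typing the labels by hand is just as quick; generating them pays off mainly when a dataset has many columns that follow one naming pattern.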
Let's change the other variable names as well.
iris %>% # your data, followed by the pipe
  select(Species, Sepal.Length, Petal.Length, Sepal.Width, Petal.Width) %>% # variables you want to summarize
  tbl_summary(by = Species, # the group by which you want to split
              statistic = list(all_continuous() ~ "{mean} ({sd})"),
              # we need mean and SD; the default is median (IQR)
              label = list(Sepal.Length ~ "Sepal Length", # changes the variable names
                           Petal.Length ~ "Petal Length",
                           Sepal.Width ~ "Sepal Width",
                           Petal.Width ~ "Petal Width"))
Characteristic | setosa, N = 50¹ | versicolor, N = 50¹ | virginica, N = 50¹ |
---|---|---|---|
Sepal Length | 5.01 (0.35) | 5.94 (0.52) | 6.59 (0.64) |
Petal Length | 1.46 (0.17) | 4.26 (0.47) | 5.55 (0.55) |
Sepal Width | 3.43 (0.38) | 2.77 (0.31) | 2.97 (0.32) |
Petal Width | 0.25 (0.11) | 1.33 (0.20) | 2.03 (0.27) |
¹ Mean (SD)
What a beautiful output!
To export the table, we repeat the same tbl_summary() code as before; the only difference is that we save the table in an object.
table1 <- iris %>% # give the table a name to the left of the assignment operator (<-)
  select(Species, Sepal.Length, Petal.Length, Sepal.Width, Petal.Width) %>%
  tbl_summary(by = Species,
              statistic = list(all_continuous() ~ "{mean} ({sd})"),
              label = list(Sepal.Length ~ "Sepal Length",
                           Petal.Length ~ "Petal Length",
                           Sepal.Width ~ "Sepal Width",
                           Petal.Width ~ "Petal Width"))
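library(flextable) # provides save_as_docx()
table1 %>%
  as_flex_table() %>% # convert the gtsummary table to a flextable
  save_as_docx(path = "demo.docx") # write the flextable to a Word file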
Note that this command starts with the table you created.
The file gets saved in your working directory. If you have been working inside a project, visit the project folder and you will find the table in docx format, which you can then customize as needed.
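If you are unsure where the file ended up, the working directory can be printed and the save location can be made explicit. The following is a minimal sketch: out_path is an illustrative name, and tempdir() is used here only so that the example does not clutter your project folder; in practice you would point it at a folder of your choice.

```r
library(dplyr)
library(gtsummary)
library(flextable)

# Where would a bare file name such as "demo.docx" be written?
getwd()

# Build an explicit path instead of relying on the working directory
out_path <- file.path(tempdir(), "demo.docx")

iris %>%
  select(Species, Sepal.Length) %>%
  tbl_summary(by = Species) %>%
  as_flex_table() %>%
  save_as_docx(path = out_path)

# Confirm that the document was actually written
file.exists(out_path)
```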
Please comment below on what further requirements and customizations you need for your table.
Read the felicitations for the creator of gtsummary here and a publication on gtsummary here.
For attribution, please cite this work as
Soundararajan (2021, Aug. 7). My R Space: Descriptives. Retrieved from https://github.com/soundarya24/SoundBlog/posts/2021-07-23-descriptives-part-1/
BibTeX citation
@misc{soundararajan2021descriptives,
  author = {Soundararajan, Soundarya},
  title = {My R Space: Descriptives},
  url = {https://github.com/soundarya24/SoundBlog/posts/2021-07-23-descriptives-part-1/},
  year = {2021}
}