My R Space: Descriptives

Soundarya Soundararajan

Almost every manuscript needs a Table 1, the demographics summary. There are multiple ways to create tables in R; I will briefly cover one of my favorites, the gtsummary package. For more ways to create customizable tables, consult here.

Libraries

library(tidyverse) #for many many things
library(gtsummary) #for making tables

Let’s work on the iris dataset. Our aim is to create a summary table of all four measurement variables, split by Species. For a first quick look we use group_by from dplyr (part of the tidyverse).

Cut to the chase and take me to the final table.

Quick summary

iris %>% 
  group_by(Species) %>% # this is the group by which we want to split the data
  summarise(mean=mean(Sepal.Length)) #let's start with one variable for trial

# A tibble: 3 × 2
  Species     mean
  <fct>      <dbl>
1 setosa      5.01
2 versicolor  5.94
3 virginica   6.59
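The same quick look can be extended to all four measurements at once. The following is a minimal sketch, assuming a dplyr version that provides across() (1.0.0 or later); quick_means is just a name chosen for illustration.

```r
library(dplyr)

# Mean of every numeric column, computed separately for each species
quick_means <- iris %>%
  group_by(Species) %>%
  summarise(across(where(is.numeric), mean))

quick_means
```

This gives one row per species and one column per measurement, which is handy for checking the numbers before building the formatted table.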

This is something you would want to try for a quick look. But for a publishable-quality table, we will use gtsummary.

Initiate the table

iris %>% 
  select(Species,Sepal.Length) %>% # we select the variables of interest including the group
  tbl_summary(by = Species) # by = is gtsummary's built-in equivalent of group_by

Characteristic	setosa, N = 50¹	versicolor, N = 50¹	virginica, N = 50¹
Sepal.Length	5.00 (4.80, 5.20)	5.90 (5.60, 6.30)	6.50 (6.23, 6.90)
¹ Median (IQR)

We see that the median and interquartile range are displayed by default. If you want the mean and SD instead, we can have that too.

Change the statistics

iris %>% 
  select(Species,Sepal.Length) %>% 
  tbl_summary(by = Species,
              # what follows might be a bit complicated but a good move to learn
              statistic = list(all_continuous() ~ "{mean} ({sd})"))

Characteristic	setosa, N = 50¹	versicolor, N = 50¹	virginica, N = 50¹
Sepal.Length	5.01 (0.35)	5.94 (0.52)	6.59 (0.64)
¹ Mean (SD)

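We are asking for all the continuous variables to be summarized as mean (SD). The text inside the curly braces names the statistic to compute, and everything outside the braces is printed as is.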
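The same pattern works with other statistics. As a rough sketch, assuming the {median}, {min} and {max} placeholders that gtsummary documents for continuous variables, the table could show the median with the range instead; tbl_range is a name I made up for this example.

```r
library(dplyr)
library(gtsummary)

# Median with minimum and maximum instead of mean (SD)
tbl_range <- iris %>%
  select(Species, Sepal.Length) %>%
  tbl_summary(by = Species,
              statistic = list(all_continuous() ~ "{median} ({min}, {max})"))

tbl_range
```

Only the string on the right of the tilde changes; the rest of the call stays the same.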

Can we customize the variable names? Yes, we can.

Change variable names

iris %>% 
  select(Species,Sepal.Length) %>% 
  tbl_summary(by = Species,
              statistic = list(all_continuous() ~ "{mean} ({sd})"),
              label = list(Sepal.Length ~ "Sepal Length"))

Characteristic	setosa, N = 50¹	versicolor, N = 50¹	virginica, N = 50¹
Sepal Length	5.01 (0.35)	5.94 (0.52)	6.59 (0.64)
¹ Mean (SD)

Here the variable names are already self-explanatory and I just had to replace the period between the words with a space. Note that the actual variable name from the dataset goes on the left of the tilde (~) and the new name goes on the right, within quotes.

Let's change the other variable names as well.

Final table

iris %>% # your data followed by pipe
  select(Species,Sepal.Length, Petal.Length, Sepal.Width, Petal.Width) %>% # variables you want to summarize 
  tbl_summary(by = Species, # the group by which to split the table
              statistic = list(all_continuous() ~ "{mean} ({sd})"), 
              # we need mean and sd, default is median (IQR)
              label = list(Sepal.Length ~ "Sepal Length", #changes variable names
                           Petal.Length ~ "Petal Length",
                           Sepal.Width ~ "Sepal Width",
                           Petal.Width ~ "Petal Width"))

Characteristic	setosa, N = 50¹	versicolor, N = 50¹	virginica, N = 50¹
Sepal Length	5.01 (0.35)	5.94 (0.52)	6.59 (0.64)
Petal Length	1.46 (0.17)	4.26 (0.47)	5.55 (0.55)
Sepal Width	3.43 (0.38)	2.77 (0.31)	2.97 (0.32)
Petal Width	0.25 (0.11)	1.33 (0.20)	2.03 (0.27)
¹ Mean (SD)

What a beautiful output!

Exporting tables from gtsummary

Saving the table in R

We repeat the same code from above; the only difference is that we now save the table in an object.

table1 <- iris %>% # give a name for the table to the left of assign operator (<-)
  select(Species,Sepal.Length, Petal.Length, Sepal.Width, Petal.Width) %>% 
  tbl_summary(by = Species,
              statistic = list(all_continuous() ~ "{mean} ({sd})"),
              label = list(Sepal.Length ~ "Sepal Length",
                           Petal.Length ~ "Petal Length",
                           Sepal.Width ~ "Sepal Width",
                           Petal.Width ~ "Petal Width"))

The table is now saved in the object table1; you can give it any name you want. Remember to use the same name in the next step.

Exporting the table as docx

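table1 %>% 
  as_flex_table() %>% # convert to a flextable object (needs the flextable package installed)
  flextable::save_as_docx(path = "demo.docx") # save_as_docx comes from flextable, not gtsummary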

Note that this command starts with the table you created.

The file gets saved in your working directory. If you have been working inside a project, visit the project folder and you will find the table in docx format which you can customize as needed.
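If you are unsure where the file ended up, a small check like the one below may help. It is only a sketch: it assumes the flextable package is installed, out_file is a name chosen for illustration, and it writes to a temporary folder rather than your working directory so that nothing gets overwritten.

```r
library(dplyr)
library(gtsummary)
library(flextable) # provides save_as_docx()

# Hypothetical output location: a temporary folder, not the project folder
out_file <- file.path(tempdir(), "demo.docx")

iris %>%
  select(Species, Sepal.Length) %>%
  tbl_summary(by = Species) %>%
  as_flex_table() %>%
  save_as_docx(path = out_file)

file.exists(out_file) # check that the file was written
```

When you use a plain file name such as "demo.docx" instead, getwd() shows the folder it is written to.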

Please comment below on any other requirements or customizations you need for your table.

Read the felicitations for the creator of gtsummary here and a publication on gtsummary here.
