Dot and boxplots

ggplot Beginner dot plots boxplots

Let us see how to draw boxplots with individual data points depicted on them.

Soundarya Soundararajan
08-09-2021

I chose to work with the iris and mtcars datasets because they ship with R and are easy to access. Today we will work with mtcars again to depict the difference in mileage between automatic and manual transmission types. To check the distribution of these variables, consult the density plots discussed in an earlier post.

There are three ways to achieve this combination of dot plots and boxplots.

  1. Base R method
  2. ggplot method
  3. ggpubr method

Let us load the libraries and create a new data frame to work with.

library(tidyverse)
library(ggpubr) #to depict group differences

Cut to the chase and take me to the final plot from ggplot.

To make am a factor variable, we recode it and store the result in a new data frame, df.

df <- mtcars %>%
  mutate(
    am = # am variable should be mutated (changed) to..
    factor(am, # a factor variable..
      levels = c(0, 1), # which has these levels
      labels = c("Automatic", "Manual") # add corresponding names to the levels
    )
  )
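
As a quick check that the recoding worked, the same step can be sketched in base R and the resulting levels inspected. This is only an illustrative equivalent of the pipeline above, not a replacement for it; the name df_base is made up for this example.

```r
# Base R sketch of the same recode (df_base is a hypothetical name for this example)
df_base <- mtcars
df_base$am <- factor(df_base$am,
  levels = c(0, 1),                 # the codes stored in mtcars
  labels = c("Automatic", "Manual") # the names we want to see on the plot
)

levels(df_base$am) # the two labels, in the order given above
table(df_base$am)  # number of cars in each transmission group
```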

1. Base R way

Box and whisker

boxplot(data=df, # note we tell R what the dataframe is
        mpg~am) # note the formula

In the formula, the y variable goes on the left-hand side of the ~ and the grouping variable on the right.
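
The same formula notation works in other base R functions. As a small sketch, aggregate() can use it to compute the group medians, which are the thick middle lines of the boxes. The name med is made up for this example, and the data frame is rebuilt in base R so the snippet runs on its own.

```r
# Recreate the data frame so this snippet runs on its own
df <- mtcars
df$am <- factor(df$am, levels = c(0, 1), labels = c("Automatic", "Manual"))

# mpg (left of ~) is summarised within each level of am (right of ~)
med <- aggregate(mpg ~ am, data = df, FUN = median)
med
```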

Alright, let's add the dots.

Add the stripchart (dotplot) to the boxplot

boxplot(data=df, mpg~am, xlab = "Transmission type",ylab = "Mileage")
# follow the boxplot with this stripchart
stripchart(data=df, mpg~am, 
           vertical=TRUE, # otherwise the dots will be horizontal
           method="jitter", # to avoid overlapping
           pch=16, #gives dark full dots
           add=TRUE # this lets the dots be drawn on top of the existing boxplot
           )
We now see the boxplot and the dot plot together.

2. ggplot way

Initiate the box plots

df %>% 
  ggplot(aes(x=am, y=mpg))+
  geom_boxplot()

Add colors and change box width

df %>% 
  ggplot(aes(x=am, y=mpg, fill=am))+
  geom_boxplot(width=0.3, alpha=0.4)

Adding some custom colors

df %>% 
  ggplot(aes(x=am, y=mpg, fill=am))+
  geom_boxplot(width=0.3,alpha=0.8)+
  scale_fill_manual(values=c("#639EAA", "#8D2D54"))

Add individual data points

df %>% 
  ggplot(aes(x=am, y=mpg, fill=am))+
  geom_boxplot(width=0.3, alpha=0.8)+
  geom_point(alpha=0.6)+
  scale_fill_manual(values=c("#639EAA", "#8D2D54"))

Individual points can overlap. To avoid this, we use geom_jitter(), which adds a small amount of random noise to the position of each point.

Jitter the points to avoid overlapping

df %>% 
  ggplot(aes(x=am, y=mpg, fill=am))+
  geom_boxplot(width=0.3, alpha=0.8)+
  geom_jitter(alpha=0.6)+
  scale_fill_manual(values=c("#639EAA", "#8D2D54"))

Oooh, too much jitter. Let's tame it.

Control the jitter

df %>% 
  ggplot(aes(x=am, y=mpg, fill=am))+
  geom_boxplot(width=0.3, alpha=0.8)+
  geom_jitter(width = 0.03, height = 0.03, alpha=0.4)+
  scale_fill_manual(values=c("#639EAA", "#8D2D54"))
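
geom_jitter() draws new random offsets on every run, and a non-zero height also nudges the mpg values themselves. One possible variation, sketched below, fixes the random seed with position_jitter() and jitters only horizontally. Setting outlier.shape = NA stops the boxplot from drawing outliers that the individual points already show. The seed value 123 and the object name p are arbitrary choices for this example.

```r
library(ggplot2)

# Recreate the data frame so this snippet runs on its own
df <- mtcars
df$am <- factor(df$am, levels = c(0, 1), labels = c("Automatic", "Manual"))

p <- ggplot(df, aes(x = am, y = mpg, fill = am)) +
  # hide the boxplot's own outlier dots; the raw points are drawn below
  geom_boxplot(width = 0.3, alpha = 0.8, outlier.shape = NA) +
  # horizontal jitter only, with a fixed seed so the plot is reproducible
  geom_point(
    position = position_jitter(width = 0.03, height = 0, seed = 123),
    alpha = 0.4
  ) +
  scale_fill_manual(values = c("#639EAA", "#8D2D54"))
p
```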

Changing theme and adding labels

df %>% 
  ggplot(aes(x=am, y=mpg, fill=am))+
  geom_boxplot(width=0.3, alpha=0.8)+
  geom_jitter(width = 0.03, height = 0.03, alpha=0.4)+
  scale_fill_manual(values=c("#639EAA", "#8D2D54"))+
  theme_classic()+
  theme(legend.position = "none")+
  labs(x="Transmission type", 
       y="Miles/(US) gallon")

Add group differences

df %>% 
  ggplot(aes(x=am, y=mpg, fill=am))+
  geom_boxplot(width=0.3, alpha=0.8)+
  geom_jitter(width = 0.03, height = 0.03, alpha=0.4)+
  scale_fill_manual(values=c("#639EAA", "#8D2D54"))+
  stat_compare_means()+ # make sure ggpubr is loaded for this to work
  theme_classic()+
  theme(legend.position = "none")+
  labs(x="Transmission type", 
       y="Miles/(US) gallon")

By default, stat_compare_means() prints the name of the test (a Wilcoxon test when there are two groups) along with the p value. We can modify this.
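
To see the numbers behind that label, the comparison can be run directly with wilcox.test(), the test that ggpubr documents as its default for two groups. This is only a sketch for cross-checking; exact = FALSE is set here to avoid the warning that tied mpg values would otherwise trigger, and the name res is made up for this example.

```r
# Recreate the data frame so this snippet runs on its own
df <- mtcars
df$am <- factor(df$am, levels = c(0, 1), labels = c("Automatic", "Manual"))

# Wilcoxon rank-sum test of mpg between the two transmission types
res <- wilcox.test(mpg ~ am, data = df, exact = FALSE)
res$p.value # compare this with the p value printed on the plot
```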

Final plot

df %>% 
  ggplot(aes(x=am, y=mpg, fill=am))+
  geom_boxplot(width=0.3, alpha=0.8)+
  geom_jitter(width = 0.03, height = 0.03, alpha=0.4)+
  scale_fill_manual(values=c("#639EAA", "#8D2D54"))+
  stat_compare_means(label = "p.signif",label.x = 1.5, label.y = 25)+
  theme_classic()+
  theme(legend.position = "none")+
  labs(x="Transmission type", 
       y="Miles/(US) gallon")

Export as TIFF/PDF/PNG

From the export option

Save the file wherever you want using the point-and-click interface.

Click on export under plots

Write code for export

Or you can save the plot the R way, by writing code. If you have stored the plot as an object, save that object. If not, call ggsave() right after you finish making the plot so that it saves your last plot.

ggsave(filename = "demo.tiff", 
       plot = last_plot() # make sure your last plot is the one you want to save
       )
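
If the plot is stored in an object, it can be passed to ggsave() explicitly, which avoids depending on whichever plot was drawn last. The sketch below is one way to do that; the object name p, the file name, and the dimensions are placeholder choices, and the file is written to a temporary folder so nothing in your project is overwritten.

```r
library(ggplot2)

# Recreate the data frame so this snippet runs on its own
df <- mtcars
df$am <- factor(df$am, levels = c(0, 1), labels = c("Automatic", "Manual"))

# Store the plot in an object instead of relying on last_plot()
p <- ggplot(df, aes(x = am, y = mpg)) +
  geom_boxplot()

# ggsave() picks the file format from the extension
out_file <- file.path(tempdir(), "demo.pdf")
ggsave(filename = out_file, plot = p, width = 5, height = 4, units = "in")
file.exists(out_file)
```

Changing the extension to .tiff or .png switches the format. For those raster formats, the dpi argument of ggsave() sets the resolution.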

3. ggpubr way

This is an easier approach, but it differs slightly from the ggplot way. Consult the walkthrough on sthda.com for details.
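
For a flavour of what that looks like, here is a minimal sketch using ggboxplot() from ggpubr. It is not taken from the sthda.com walkthrough; the argument values simply mirror the ggplot version above, and the object name p is made up for this example.

```r
library(ggpubr)

# Recreate the data frame so this snippet runs on its own
df <- mtcars
df$am <- factor(df$am, levels = c(0, 1), labels = c("Automatic", "Manual"))

p <- ggboxplot(df,
  x = "am", y = "mpg",               # column names are given as strings
  fill = "am",
  palette = c("#639EAA", "#8D2D54"),
  add = "jitter",                    # overlay the individual points
  width = 0.3,
  xlab = "Transmission type",
  ylab = "Miles/(US) gallon"
) +
  stat_compare_means(label = "p.signif", label.x = 1.5, label.y = 25) +
  ggplot2::theme(legend.position = "none")
p
```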

Try all three ways; my suggestion is to stick with whichever resonates with you.

Good luck!

Citation

For attribution, please cite this work as

Soundararajan (2021, Aug. 9). My R Space: Dot and boxplots. Retrieved from https://github.com/soundarya24/SoundBlog/posts/2021-08-09-dot-and-boxplots/

BibTeX citation

@misc{soundararajan2021dot,
  author = {Soundararajan, Soundarya},
  title = {My R Space: Dot and boxplots},
  url = {https://github.com/soundarya24/SoundBlog/posts/2021-08-09-dot-and-boxplots/},
  year = {2021}
}