Adding sample sizes to your ggplot.
Did Reviewer 2 ask you to add sample sizes to your plots? We have all been there. Return your figures in style with these simple steps for adding sample sizes to your ggplot.
Let’s use the in-built iris dataset to learn about plotting bars.
If you do not have the required packages installed, install them first with install.packages("package name"), then load them with library().
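The load commands themselves are not shown here, so the following is only a sketch of what they might look like. The package list is an assumption inferred from the functions used below: ggplot2 for the plotting functions, dplyr for the %>% pipe, and EnvStats for stat_n_text().

```r
# Assumed package set, inferred from the functions used in this post
needed <- c("ggplot2", "dplyr", "EnvStats")

# Check which of them are already installed
is_installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!is_installed]

if (length(missing_pkgs) > 0) {
  # Install these once with install.packages(), then rerun this block
  message("Please install: ", paste(missing_pkgs, collapse = ", "))
} else {
  library(ggplot2)   # ggplot(), geom_col(), theme_light()
  library(dplyr)     # provides the %>% pipe
  library(EnvStats)  # stat_n_text()
}
```

The last plot also uses theme_pomological(), which comes from the ggpomological package. As far as I know it is not on CRAN and is usually installed from GitHub (for example with remotes::install_github("gadenbuie/ggpomological")), so it is left out of the check above.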
names(iris) # checking what are the variable names
[1] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
[5] "Species"
table(iris$Species) # I pre-check any categorical variable with this command
setosa versicolor virginica
50 50 50
iris %>% # your dataset here
  ggplot( # call ggplot to draw the plot
    aes( # aes() maps variables to the x and y axes
      x = Species,
      y = Sepal.Length
    )
  ) +
  geom_col()
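One thing worth knowing before reading the bar heights: geom_col() draws one bar segment per row and stacks them, so with 50 flowers per species each bar shows the sum of Sepal.Length, not the mean. A quick base R check of the heights you should expect (the variable name species_sums is just for illustration):

```r
# Bar heights produced by geom_col(): the per-species sum of Sepal.Length
species_sums <- tapply(iris$Sepal.Length, iris$Species, sum)
species_sums
# setosa 250.3, versicolor 296.8, virginica 329.4
```

This is why the y-axis in these plots runs into the hundreds even though a single sepal is only a few centimetres long.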

We need colors!
iris %>%
  ggplot(aes(x = Species, y = Sepal.Length,
             color = Species)) + # let's color the bars by Species
  geom_col()

That only colored the bar outlines. To color the bars themselves, map Species to fill instead of color.
iris %>%
  ggplot(aes(x = Species, y = Sepal.Length,
             fill = Species)) + # this fills the bars
  geom_col()

We have just three species, and they are already labeled on the x-axis, so we do not need the legend. Let's remove it.
iris %>%
  ggplot(aes(x = Species, y = Sepal.Length, fill = Species)) +
  geom_col() +
  theme(legend.position = "none")

Now add the sample sizes with stat_n_text() from the EnvStats package.
iris %>%
  ggplot(aes(x = Species, y = Sepal.Length, fill = Species)) +
  geom_col() +
  theme(legend.position = "none") +
  stat_n_text() # adds the sample size
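If EnvStats is not available, a similar label can be built with ggplot2 alone. This is a swapped-in alternative, not what stat_n_text() does internally: it is a minimal sketch that summarises the data by hand and draws the counts with geom_text(). The helper names label_df and p are hypothetical.

```r
library(ggplot2)

# Per-species summary: bar height (sum of Sepal.Length) and sample size (n)
label_df <- aggregate(Sepal.Length ~ Species, data = iris, FUN = sum)
label_df$n <- as.vector(table(iris$Species)[as.character(label_df$Species)])
label_df$label <- paste0("n=", label_df$n)

p <- ggplot(iris, aes(x = Species, y = Sepal.Length, fill = Species)) +
  geom_col() +
  theme(legend.position = "none") +
  geom_text(
    data = label_df,
    aes(x = Species, y = Sepal.Length, label = label),
    vjust = -0.5,        # nudge the label just above the top of each bar
    inherit.aes = FALSE  # use only the aesthetics given here
  )
p
```

Doing the summary by hand is more typing, but it makes explicit which number ends up on the plot and where it is placed.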

We achieved what we wanted, but we can do better. Let’s change the theme and improve how the sample size is displayed.
iris %>%
  ggplot(aes(x = Species, y = Sepal.Length, fill = Species)) +
  geom_col() +
  theme_light() + # I like this theme
  theme(legend.position = "none") +
  stat_n_text(
    y.pos = 20, # where on the y-axis the sample size should be placed
    color = "black", # choose any color
    text.box = TRUE # draws a box around the n
  )

Wonderful. What if you want the sample sizes at different heights for different bars? Pass the desired positions as a vector with c().
iris %>%
  ggplot(aes(x = Species, y = Sepal.Length, fill = Species)) +
  geom_col() +
  theme_pomological() + # from the ggpomological package
  scale_fill_manual(values = c("#E87F4D", "#286B7B", "#E484A9")) +
  theme(legend.position = "none") +
  stat_n_text(
    y.pos = c(270, 315, 345), # 3 positions for 3 bars
    color = "black",
    text.box = TRUE
  )
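The positions 270, 315 and 345 were picked by eye; they sit roughly 15 to 20 units above the top of each bar. If you would rather derive them from the data, one possible sketch is to take the bar tops and add a fixed gap. The offset of 20 and the names bar_tops and label_pos are assumptions for illustration.

```r
# Top of each stacked bar: the per-species sum of Sepal.Length
bar_tops <- as.vector(tapply(iris$Sepal.Length, iris$Species, sum))

offset <- 20  # hypothetical gap above each bar, in y-axis units
label_pos <- bar_tops + offset
label_pos
# 270.3 316.8 349.4
```

The resulting vector can then be supplied as y.pos in place of the hand-typed c(270, 315, 345), which saves retuning the numbers if the data change.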

For attribution, please cite this work as
Soundararajan (2021, Aug. 6). My R Space: Add sample sizes to plots. Retrieved from https://github.com/soundarya24/SoundBlog/posts/2021-05-20-basics-of-bar-plots/
BibTeX citation
@misc{soundararajan2021add,
author = {Soundararajan, Soundarya},
title = {My R Space: Add sample sizes to plots},
url = {https://github.com/soundarya24/SoundBlog/posts/2021-05-20-basics-of-bar-plots/},
year = {2021}
}