Adding sample sizes to your ggplot.
Did Reviewer 2 ask you to add sample sizes to your plots? We have all been there. Return your figures in style with these simple steps for adding sample sizes to your ggplot.
Let’s use the built-in iris dataset to learn about plotting bars.
If you do not have the required libraries installed, install them first with install.packages("package name"), then load them with library().
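The library-loading command itself is not shown in this text, so here is a minimal sketch of what it presumably looks like. It assumes the tidyverse (for `%>%` and `ggplot()`) and EnvStats (for `stat_n_text()`); `theme_pomological()`, used at the end of the post, comes from the ggpomological package, which as far as I know is installed from GitHub rather than CRAN.

```r
# Assumed setup; the post's own library() calls are not shown in this text.
# install.packages(c("tidyverse", "EnvStats"))  # run once if not yet installed
library(tidyverse) # provides %>% and ggplot()
library(EnvStats)  # provides stat_n_text()

# Only needed for the last plot (GitHub install, an assumption):
# remotes::install_github("gadenbuie/ggpomological")
# library(ggpomological)
```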
names(iris) # check the variable names
[1] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
[5] "Species"
table(iris$Species) # I check the group counts of any categorical variable this way
    setosa versicolor  virginica
        50         50         50
iris %>% # your dataset here
  ggplot( # call ggplot to draw the plot
    aes( # aes() maps variables to the x and y axes
      x = Species,
      y = Sepal.Length
    )
  ) +
  geom_col()
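One thing worth knowing about this plot: `geom_col()` stacks the 50 rows of each species on top of one another, so each bar's height is the sum of Sepal.Length for that species, not the mean. A quick base R check (a sketch added for illustration; `bar_heights` is a name made up here):

```r
# Sum of Sepal.Length within each species; these match the bar heights
# that geom_col() produces when it stacks the 50 rows per species
bar_heights <- tapply(iris$Sepal.Length, iris$Species, sum)
bar_heights
#>     setosa versicolor  virginica 
#>      250.3      296.8      329.4 
```

This is why the y-axis runs into the hundreds, and why the label positions used later are values like 20 or 270 rather than something near 5.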
We need colors!
iris %>%
  ggplot(aes(x = Species, y = Sepal.Length,
             color = Species)) + # let's color the bars by Species
  geom_col()
That only colors the outlines. We need to fill the bars, not color them, so we map Species to fill instead of color.
iris %>%
  ggplot(aes(x = Species, y = Sepal.Length,
             fill = Species)) + # this fills the bars
  geom_col()
We have just three species, which are already labeled on the x-axis, so we do not actually need the legend. Let’s remove it.
iris %>%
  ggplot(aes(x = Species, y = Sepal.Length, fill = Species)) +
  geom_col() +
  theme(legend.position = "none")
Now for the sample sizes. stat_n_text() from the EnvStats package adds the number of observations for each group.
iris %>%
  ggplot(aes(x = Species, y = Sepal.Length, fill = Species)) +
  geom_col() +
  theme(legend.position = "none") +
  stat_n_text() # adds the sample size (n) for each group
We achieved what we wanted, but we can do better. Let’s change the theme and improve how the sample size is displayed.
iris %>%
  ggplot(aes(x = Species, y = Sepal.Length, fill = Species)) +
  geom_col() +
  theme_light() + # I like this theme
  theme(legend.position = "none") +
  stat_n_text(
    y.pos = 20, # where on the y-axis the sample size should be placed
    color = "black", # choose any color
    text.box = TRUE # draws a box around the n
  )
Wonderful. What if you want the sample sizes placed at a different height for each bar? Then pass a vector of positions, built with the c() command, to y.pos.
iris %>%
  ggplot(aes(x = Species, y = Sepal.Length, fill = Species)) +
  geom_col() +
  theme_pomological() + # from the ggpomological package
  scale_fill_manual(values = c("#E87F4D", "#286B7B", "#E484A9")) +
  theme(legend.position = "none") +
  stat_n_text(
    y.pos = c(270, 315, 345), # 3 positions for 3 bars
    color = "black",
    text.box = TRUE
  )
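Rather than reading the three positions off the plot, they can also be derived from the data. A minimal sketch in base R, assuming the labels should sit a fixed distance above each bar; `bar_heights`, `label_offset`, and `label_y` are names made up for this example, and the offset of 20 is an arbitrary choice:

```r
# Bar heights: geom_col() stacks the rows, so each bar is a per-species sum
bar_heights <- tapply(iris$Sepal.Length, iris$Species, sum)

# Hypothetical gap (in y-axis units) between the top of a bar and its label
label_offset <- 20

# One position per species, in factor-level order (setosa, versicolor, virginica)
label_y <- as.vector(bar_heights) + label_offset
label_y
#> [1] 270.3 316.8 349.4
```

Passing `y.pos = label_y` to `stat_n_text()` should then place each n about 20 units above its bar, close to the hand-picked `c(270, 315, 345)` used above, and the positions keep up if the data change.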
For attribution, please cite this work as
Soundararajan (2021, Aug. 6). My R Space: Add sample sizes to plots. Retrieved from https://github.com/soundarya24/SoundBlog/posts/2021-05-20-basics-of-bar-plots/
BibTeX citation
@misc{soundararajan2021add,
  author = {Soundararajan, Soundarya},
  title = {My R Space: Add sample sizes to plots},
  url = {https://github.com/soundarya24/SoundBlog/posts/2021-05-20-basics-of-bar-plots/},
  year = {2021}
}