Learn to count categorical data.
Welcome to Day 9 of “Viz with Me”!
Yesterday, we learnt how to create a line plot.
Goal for today: take our first step towards creating bar plots in R!
For this, we will return to the trusty penguins dataset. First, make sure to load the necessary libraries:
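The code chunk for this step is not shown here, so the following is only a sketch of what it probably looked like. It assumes the data comes from the palmerpenguins package (the column names in the glimpse output further down match that package) and that dplyr supplies the functions used in this post. Loading the whole tidyverse instead of dplyr alone would work just as well.

```r
# Assumed packages: the original chunk is not preserved in this post
library(dplyr)           # glimpse(), count() and the other data verbs
library(palmerpenguins)  # provides the penguins data frame
```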
We will draw bars to represent the number of penguins of each species. However, if you check the dataset, this information is not readily available in a summarized format.
Let’s take a quick look at the data using glimpse:
glimpse(penguins)
Rows: 344
Columns: 8
$ species <fct> Adelie, Adelie, Adelie, Adelie, Adelie, Ad…
$ island <fct> Torgersen, Torgersen, Torgersen, Torgersen…
$ bill_length_mm <dbl> 39.1, 39.5, 40.3, NA, 36.7, 39.3, 38.9, 39…
$ bill_depth_mm <dbl> 18.7, 17.4, 18.0, NA, 19.3, 20.6, 17.8, 19…
$ flipper_length_mm <int> 181, 186, 195, NA, 193, 190, 181, 195, 193…
$ body_mass_g <int> 3750, 3800, 3250, NA, 3450, 3650, 3625, 46…
$ sex <fct> male, female, female, NA, female, male, fe…
$ year <int> 2007, 2007, 2007, 2007, 2007, 2007, 2007, …
Now, let’s count the number of penguins for each species. This will give us a clearer idea of how many penguins we have for each species category before creating the bar plot:
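The call that produces the table below is not shown, but a minimal sketch that should reproduce it looks like this. It assumes dplyr and palmerpenguins are installed; the name species_counts is only an illustrative choice, and the native pipe |> needs R 4.1 or later (the magrittr pipe %>% would do the same job).

```r
library(dplyr)
library(palmerpenguins)

# Count the rows (one row per penguin) in each level of species
species_counts <- penguins |>
  count(species)

# Print the resulting tibble with columns species and n
species_counts
```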
# A tibble: 3 × 2
species n
<fct> <int>
1 Adelie 152
2 Chinstrap 68
3 Gentoo 124
The count() function gives us the counts of the categorical variable we specify, which in this case is species. The dataset has three species of penguins, and the count of each will be displayed. count() is part of the same dplyr package (Wickham et al. 2023) as filter() and glimpse().
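To make it clearer what count() does behind the scenes, here is a small sketch, again assuming dplyr and palmerpenguins are loaded. Counting a variable is roughly the same as grouping by it and tallying the rows in each group. The object names by_hand and sorted_counts are only illustrative.

```r
library(dplyr)
library(palmerpenguins)

# count(species) is shorthand for grouping and then tallying rows
by_hand <- penguins |>
  group_by(species) |>
  summarise(n = n())

# sort = TRUE orders the result from the largest count to the smallest
sorted_counts <- penguins |>
  count(species, sort = TRUE)
```

Both results have the same two columns, species and n. The sorted version simply lists the most common species first, which can be handy when the bars should appear in order of size.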
Once we have the counts, all that is left is to construct the bar plot, which we’ll cover tomorrow.
Get familiar with the output of this code, as we’ll use that information to pipe directly into our bar plot tomorrow. The output contains two columns: species, which is our variable of interest, and n, which gives the number of penguins for each species. The labels
Can you guess which geom_ function we’ll use to create the bars?
Jump to tomorrow’s post for the bar plots!
For attribution, please cite this work as
Soundararajan (2024, Oct. 9). My R Space: Day 9 of viz with me. Retrieved from https://github.com/soundarya24/SoundBlog/posts/2024-10-09-day-9-of-viz-with-me/
BibTeX citation
@misc{soundararajan2024day, author = {Soundararajan, Soundarya}, title = {My R Space: Day 9 of viz with me}, url = {https://github.com/soundarya24/SoundBlog/posts/2024-10-09-day-9-of-viz-with-me/}, year = {2024} }