We create a canvas today to plot tomorrow.
Welcome to Day 2 of “Viz with Me” in R!
On day 0, we installed R and RStudio.
On day 1, we installed the necessary packages and loaded the penguins data.
Goal for today: learn to create a canvas in R.
Let’s get started!
Imagine you’ve collected some data, and I ask you to find the relationship between two variables, X and Y, in your dataset. What would you do? You might run a correlation or plot a simple scatterplot, with X on the x-axis and Y on the y-axis.
To make a similar plot in R, we need to break it down into three components:
Data: We introduced you to the dataset yesterday. Do you remember its name? Yes, it’s the penguins dataset from the Palmer Penguins package.
Defining X and Y: We need to tell R what the X and Y variables are. These are specified inside the aes() function in ggplot. Remember the tidyverse package? It’s a collection of packages, including ggplot2, so you’re good to go! Yesterday, we talked about packages and functions. The aes() function is where we’ll specify the X and Y variables. Here’s a glimpse of how it will look:
ggplot(aes(x = , y = ))
We first use ggplot(), then the aes() function to introduce X and Y. This is the standard approach. We have not specified what X and Y are yet. We will do that in the next step.
If this code could speak, it would say: “Hey R, take the ggplot2 package and use the aesthetics function to plot X and Y.”
Plot type: The third component is the kind of plot we want, which ggplot draws with geoms; we’ll explore those tomorrow. For today, we’ll focus on the first two steps.
Let’s revisit the script from yesterday.
Run yesterday’s code to load the penguins dataset and to awaken the tidyverse package.
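In case yesterday’s script isn’t at hand, here is a minimal sketch of what that loading step might look like. It assumes the two packages from Day 1 are tidyverse and palmerpenguins and that both are already installed.

```r
# Assumption: both packages were installed on Day 1
library(tidyverse)       # attaches ggplot2 and the %>% pipe, among others
library(palmerpenguins)  # provides the penguins dataset

# Quick check that the data is available
head(penguins)
```

If head(penguins) prints a few rows, the data is loaded and you’re ready for the next step.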
But where is the data? We need to extract X and Y from the penguins dataset. Let’s assume x is bill length and y is body mass from the penguins dataset. We’re looking to find the relationship between them.
If you remember from yesterday, you’d access a variable like this:
penguins$bill_length_mm
That’s how you access the bill_length_mm variable, but it’s much simpler within ggplot. First, we call the data, like this:
penguins
# A tibble: 344 × 8
   species island    bill_length_mm bill_depth_mm flipper_length_mm
   <fct>   <fct>              <dbl>         <dbl>             <int>
 1 Adelie  Torgersen           39.1          18.7               181
 2 Adelie  Torgersen           39.5          17.4               186
 3 Adelie  Torgersen           40.3          18                 195
 4 Adelie  Torgersen           NA            NA                  NA
 5 Adelie  Torgersen           36.7          19.3               193
 6 Adelie  Torgersen           39.3          20.6               190
 7 Adelie  Torgersen           38.9          17.8               181
 8 Adelie  Torgersen           39.2          19.6               195
 9 Adelie  Torgersen           34.1          18.1               193
10 Adelie  Torgersen           42            20.2               190
# ℹ 334 more rows
# ℹ 3 more variables: body_mass_g <int>, sex <fct>, year <int>
Next, we inform ggplot what the variables are. Here’s how:
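A sketch of this step, using the column names bill_length_mm and body_mass_g from the output above. The try() wrapper and the name attempt are my additions, not part of the usual ggplot recipe. They are there because ggplot() reads its first argument as the data, so in recent ggplot2 versions I’d expect this call to stop with a message rather than draw anything.

```r
library(ggplot2)

# Only the aesthetic mapping is given; no dataset is connected yet.
# try() (added for this sketch) keeps the script running if R stops here.
attempt <- try(
  ggplot(aes(x = bill_length_mm, y = body_mass_g)),
  silent = TRUE
)

# Inspect what came back: a plot object or a captured error
class(attempt)
```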
What happens when you run this code?
Why Doesn’t It Plot Yet?
You might notice that no plot appears yet; depending on your ggplot2 version, R may even show an error about the data. Do you know why? The data and the plot are not yet connected because we haven’t told them to “talk” to each other. Remember the pipe (%>%) we discussed yesterday? That’s where it comes in handy!
You know how to comment out the code, right? Use the # symbol before the code to comment it out. Whatever you learn now, if you want to make notes, do it as a comment in your script.
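As a small illustration of commenting (the note text and the variable name total are made up for this example):

```r
# This whole line is a comment, so R skips it.
# ggplot(aes(x = , y = ))   code with # in front of it is not run either

total <- 1 + 1  # a comment can also follow code on the same line
total
```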
Let us use the pipe operator to connect the data and the plot. Here’s how:
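A sketch of the piped version, again assuming bill length on the x-axis and body mass on the y-axis. Storing the result in an object called canvas is my own choice for illustration; you can also run the pipe directly.

```r
library(tidyverse)       # ggplot2 and the %>% pipe
library(palmerpenguins)  # the penguins dataset

# The pipe hands penguins to ggplot() as its data,
# so aes() can now look up the two columns by name.
canvas <- penguins %>%
  ggplot(aes(x = bill_length_mm, y = body_mass_g))

canvas  # printing the object draws the canvas
```

Because no geom has been added, this should give only the axes and background, with no points.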
What does this code say? It says, “Hey R, take the penguins dataset, and then take the ggplot2 package and use the aesthetics function, and here are the X and Y variables.”
Now, run this – your canvas is ready!
If you don’t see any points on it yet, don’t worry; we’ll “draw” the scatterplot tomorrow.
Until then, ggplot will eagerly wait for us to tell it which plot we want to draw.
Don’t forget to save your script.
Happy holidays, and happy plotting!
Jump ahead to Day 3 to draw the scatterplot.
For attribution, please cite this work as
Soundararajan (2024, Oct. 2). My R Space: Day 2 of Viz with me. Retrieved from https://github.com/soundarya24/SoundBlog/posts/2024-10-02-day-2-of-viz-with-me/
BibTeX citation
@misc{soundararajan2024day,
  author = {Soundararajan, Soundarya},
  title = {My R Space: Day 2 of Viz with me},
  url = {https://github.com/soundarya24/SoundBlog/posts/2024-10-02-day-2-of-viz-with-me/},
  year = {2024}
}