Hey R, don’t sort my data!

barplots Beginner factors

We learn to override R's default ordering when it is not helpful, such as when it arranges the bars of a plot by itself.

Soundarya Soundararajan
08-12-2021

This is something I often stumble over while working with ggplot2. I will elaborate as we construct plots of my blog activity.

Create data

mymonths <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun",  # my months column for 2021
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")
bloggoal <- c(8, 8, 8, 4, 4, 4, 2, 2, 2, 1, 1, 1)   # my blog goal for each month
blogposts <- c(2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)  # the blog posts I have created so far, to compare with my goals

bloggoal holds my blog goal for every month. Note that the goal is 8 for each month of the first quarter and 2 for each month of the third quarter; I have already entered the values in calendar order. blogposts holds the posts I have published. I have not actually published 2 this month, as this is my first post of the year (and thus of this month), but I wanted a whole number to show in the comparison plots.

Now that our variables are ready, let's make them a data frame.

myblogtrack<- data.frame(mymonths, 
                         bloggoal,
                         blogposts)
myblogtrack #displays dataframe
   mymonths bloggoal blogposts
1       Jan        8         2
2       Feb        8         0
3       Mar        8         0
4       Apr        4         0
5       May        4         0
6       Jun        4         0
7       Jul        2         0
8       Aug        2         0
9       Sep        2         0
10      Oct        1         0
11      Nov        1         0
12      Dec        1         0

I am displaying the whole data frame because it is small. With larger data, we would use str() or head() from base R, or glimpse() from dplyr, to inspect it instead.
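
As a minimal sketch of that kind of inspection, the snippet below rebuilds the same data frame so it runs on its own, then looks at it with str() and head(). The glimpse() call is left commented out because it assumes dplyr is installed.

```r
# Rebuild the data frame from this post so the snippet is self-contained
myblogtrack <- data.frame(
  mymonths  = c("Jan", "Feb", "Mar", "Apr", "May", "Jun",
                "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"),
  bloggoal  = c(8, 8, 8, 4, 4, 4, 2, 2, 2, 1, 1, 1),
  blogposts = c(2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
)

str(myblogtrack)                   # compact summary: column names, types, first values
first_rows <- head(myblogtrack, 3) # only the first three rows
print(first_rows)

# dplyr::glimpse(myblogtrack)      # similar to str(); assumes dplyr is installed
```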

Let us plot

library(tidyverse)
ggplot(myblogtrack,                      # data
       aes(x = mymonths,                 # aesthetics
           y = bloggoal)) +
  geom_col() +                           # I just need a column to show my goal
  labs(x = "Year 2021",                  # renaming axes
       y = "Number of Blog Posts",
       title = "Sound Blog Goals") +
  theme_classic()

Oops! The months appear in alphabetical order (Apr, Aug, Dec, ...) rather than calendar order. This happens because mymonths is a column of plain text, and ggplot tries to help us by sorting the x-axis alphabetically. That might be useful elsewhere, but not here. So we will tell ggplot, at the data level, that the months are already in the order I want.

But how?
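
To see where that alphabetical order comes from, here is a small sketch using only base R. It converts the month names to a factor without specifying levels, which is a reasonable stand-in for how a discrete axis orders plain text by default. The name default_levels is just an illustrative variable, not something from the plot code.

```r
mymonths <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

# With no levels given, factor() sorts the unique values alphabetically
default_levels <- levels(factor(mymonths))
print(default_levels)  # starts with "Apr", "Aug", "Dec" instead of "Jan", "Feb", "Mar"
```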

Do not sort

myblogtrack$mymonths<-factor(myblogtrack$mymonths,
                             levels = myblogtrack$mymonths) 

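We achieve two things here: we convert mymonths from plain text to a factor, and we set the factor's levels to the order in which the months appear in the data.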
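
A quick way to confirm the change is to inspect the result, as in this sketch. It works on a standalone vector rather than the data frame column, and months_in_order, repeated, and repeated_factor are hypothetical names used only for illustration. It also compares the levels with month.abb, a constant built into base R that holds the abbreviated English month names in calendar order.

```r
mymonths <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

# Same idea as in the post: use the data's own order as the level order
months_in_order <- factor(mymonths, levels = mymonths)

is.factor(months_in_order)                     # it is now a factor
levels(months_in_order)                        # levels follow the data, not the alphabet
identical(levels(months_in_order), month.abb)  # matches the built-in calendar order

# If a month appears more than once, levels must not contain duplicates,
# so wrap the values in unique() to keep first-appearance order
repeated <- c("Jan", "Jan", "Feb")
repeated_factor <- factor(repeated, levels = unique(repeated))
levels(repeated_factor)
```

Passing levels = month.abb directly would be an alternative when the data are not already in calendar order.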

Now we use this updated dataset and plot again.

Corrected plot

ggplot(myblogtrack,
       aes(x=mymonths,
           y=bloggoal))+
  geom_col()+
  labs(x="Year 2021", 
       y="Number of Blog Posts",
       title = "Sound Blog Goals")+
  theme_classic()

There you go! The bars are now in the order of the months.

Citation

For attribution, please cite this work as

Soundararajan (2021, Aug. 12). My R Space: Hey R, don't sort my data!. Retrieved from https://github.com/soundarya24/SoundBlog/posts/2021-08-12-hey-r-dont-sort-my-data/

BibTeX citation

@misc{soundararajan2021hey,
  author = {Soundararajan, Soundarya},
  title = {My R Space: Hey R, don't sort my data!},
  url = {https://github.com/soundarya24/SoundBlog/posts/2021-08-12-hey-r-dont-sort-my-data/},
  year = {2021}
}