For beginners in R, importing data is the first step to master before exploring data visualization and analysis. This blog post takes a step-by-step approach to importing your data into RStudio.
Before you can analyze anything in R, you need to get your data into it. I will go over a step-by-step process for doing that. If you want me to add information on importing other file types, please leave a comment below.
There are three major ways to accomplish this.
1. Import by clicking on the data
2. Import by selecting File > Import Dataset
3. Import from the command line
No method is superior to the others, so take your pick.
It is as simple as it sounds.
You get a nice preview of your data. “Import options” and “code preview” below it are details we will skip for now. Click Import at the bottom-right corner.
The data is imported, and you can see it in the “Source” pane. Caution: do not rely on the location shown in my image, as the four panes are customizable and movable.
You can also confirm that your data is imported by looking at the “Environment” pane, which reads 10 obs. (observations) of 2 variables. Excellent. Now, instead of clicking Import when we previewed the data, we have another option.
Let’s go back to the import preview. Just above the Import button you clicked previously, you will see three lines of code.
You can copy and paste these. But wait, do not select and copy them by hand: while in preview mode, click the small white button at the top right of the code box. The commands are now copied to your clipboard. Cancel the dialog, go to your “Console” pane, then paste and run them (command+enter).
Either select all the lines and press command+enter, or run them line by line in the same order.
These commands tell R to do three things:
The first line loads the package called “readxl” (because our file is an Excel file).
The second line tells R how to read our Excel file and what to name it. Unless you change the name while previewing, the data is imported under the same name as the file.
The third line instructs R to open the data for you to “View” (note the capital V).
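As a rough sketch, the three copied lines have the shape shown below. The file is an assumption: instead of your own workbook, this uses `datasets.xlsx`, an example file that ships with the readxl package, so the snippet can run on any machine. RStudio would fill in your own file's path and name.

```r
# Line 1: load the readxl package
# (run install.packages("readxl") once if it is not installed)
library(readxl)

# Line 2: read the Excel file and store it under a name.
# Assumption: readxl's bundled example workbook stands in for your own file.
datasets <- read_excel(readxl_example("datasets.xlsx"))

# Line 3: open the data in RStudio's viewer (capital V).
# Guarded so the script also runs outside an interactive session.
if (interactive()) View(datasets)
```

The `interactive()` guard is not part of what RStudio generates; it is only there so the sketch does not fail when run as a script.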
Click File and choose Import Dataset.
Unlike before, the dialog starts out empty: there is no preview, and the code box mentions NULL data.
Don’t be scared; this is only because we have not chosen our file yet.
Once you choose your file, the preview pane updates just as before.
You will load the package with library(readxl) to open the Excel file, and then use the read_excel function.
library(readxl)
data <- read_excel("data.xlsx") # note the direction of the arrow: it points at the name
data_new <- read_excel("data.xlsx", sheet = 2) # you can specify which sheet to read
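If you are not sure which sheets a workbook contains, readxl can list them first. The sketch below assumes readxl's bundled example workbook in place of "data.xlsx"; the variable names are only illustrative.

```r
library(readxl)

# Assumption: readxl's bundled example workbook stands in for "data.xlsx".
path <- readxl_example("datasets.xlsx")

# List the sheet names so you know what is available
sheet_names <- excel_sheets(path)
print(sheet_names)

# Pick a sheet by position ...
second_sheet <- read_excel(path, sheet = 2)

# ... or by name, which keeps working if the sheets are reordered
same_sheet <- read_excel(path, sheet = sheet_names[2])
```

Selecting by name is usually the safer habit when a workbook is edited by several people.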
Sometimes we might want to import file types other than Excel. From an academic perspective, I can imagine wanting to import SPSS or Stata files.
I have used the haven package to import these files by calling the corresponding function. I have written the commands after a # (meaning they do not run and are treated as comments) because I do not have any SPSS or Stata file to demonstrate with. When you use these functions, remove the hashtags and run the command. To run a command, write or copy it into the console or an R script, keep the cursor on that line, and press Ctrl+Enter (Cmd+Enter on a Mac).
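For readers who want something runnable right away, here is a minimal sketch using haven's `read_sav()` (SPSS) and `read_dta()` (Stata). Since no real SPSS or Stata file is at hand, it first writes a small made-up data frame to temporary files; the data and all the names in it are hypothetical.

```r
library(haven)

# Hypothetical example data, written to temporary files
# so that there is something to read back in
example_data <- data.frame(id = 1:3, score = c(4.5, 3.2, 5.0))
spss_path  <- tempfile(fileext = ".sav")
stata_path <- tempfile(fileext = ".dta")
write_sav(example_data, spss_path)
write_dta(example_data, stata_path)

# With your own files, replace the paths with something like
# "mydata.sav" or "mydata.dta"
spss_data  <- read_sav(spss_path)   # SPSS file
stata_data <- read_dta(stata_path)  # Stata file
```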
I personally use CSV files. If you do too, here is the way to import them.
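One way to do this, sketched with base R's `read.csv()` so that no extra package is needed (readr's `read_csv()` is a common alternative). The file is a temporary one created on the spot with made-up values; with your own data you would pass your file's path instead.

```r
# Hypothetical example data written to a temporary CSV file
csv_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(name = c("a", "b", "c"), value = c(1, 2, 3)),
  csv_path,
  row.names = FALSE
)

# read.csv() is part of base R, so no library() call is needed
csv_data <- read.csv(csv_path)

# Quick look at the structure of what was imported
str(csv_data)
```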
In due course, you will find it much easier to write these commands yourself without going through any of the point-and-click steps. But to begin with, they are essential.
Some questions to ponder
Please feel free to comment on what other file types you use and I will update this blog as needed.
Until then, happy importing!!
For attribution, please cite this work as
Soundararajan (2021, Aug. 1). My R Space: Importing data into RStudio - a step-by-step approach. Retrieved from https://github.com/soundarya24/SoundBlog/posts/2021-05-13-how-to-import-your-data-into-r/
BibTeX citation
@misc{soundararajan2021importing,
  author = {Soundararajan, Soundarya},
  title = {My R Space: Importing data into RStudio - a step-by-step approach},
  url = {https://github.com/soundarya24/SoundBlog/posts/2021-05-13-how-to-import-your-data-into-r/},
  year = {2021}
}