This is my first post in this blog, and I start with my favourite programming language — R.
I am not saying that R is the best programming language. On the contrary, I think Python is much easier to learn, more efficient and powerful. But R is probably the best for what I do, and it makes my work so much easier.
I am a research scholar working in the field of behavioural ecology. For the data I collect and work with, I personally think R is the easiest and most useful to work with. R is developed for statistical analysis, and has packages and data structures built for that purpose, along with a superb online community.
But enough about my love for R. I just wanted to get this out of my chest, and also to warn readers that there will be several posts on this blog about R.
Let’s get straight to the point. The part from the previous paragraphs, which is useful for this post, is that R has great data structures for data analysis. One of the most useful ones, and also the most used by me, is called data frame. If you have worked with spreadsheet programs, like Microsoft Office Excel, you are already familiar with data frames.
In a data frame, the different categories or variables are represented using columns, and the rows represent replicates of the data. This is not a strict rule, but a format that is generally followed and is useful for R. Look at the image below for an example.
It is really easy to work with data frames in R, and I will recommend anyone learning R to get familiar with this data structure.
Most of us also use spreadsheets to collect and record data, and import the spreadsheets into R for analysis. The most recommended method is to convert the spreadsheet file into a format such as comma-separated or comma-delimited values (csv) — as it is only a text file and easier to work with across different softwares and operating systems. R functions such as
read.table() is used to import the csv file. One can also import MS Office Excel files directly into R using packages such as
xlsx or functions such as
readXL() from the package
The options described above are the recommended methods to import data, especially when the same data will be used across several analyses or will be used repeatedly, or when using big datasets. But there is also a shortcut method of importing data from spreadsheets, which is this post about!
Whew! finally getting to the point! This trick takes into account the clipboard feature of Microsoft Windows operating system.
Big disclaimer by the way! I think this method only works in Windows. I have never used Mac OS, therefore I do not know whether it will work or not, but it certainly does not work in the Linux distributions I have used.
Clipboard is a feature in Windows, which saves the last bit of information that you copy into memory. The shortcut, which I am describing, is the following:
- Select the data to be imported in the spreadsheet
- Copy the selected data by pressing Ctrl+C or right-clicking and selecting “Copy”. This will save the data in the clipboard.
- Open R, and go to the console (As I have said before, this trick is ideal for when you are doing a quick analysis or statistical test; thus I assume one would not be using R script files or R notebooks for this method!)
- write the following code to import the data from clipboard (Use whatever variable name you want to use instead of ‘data’):
data <- read.table(“clipboard”, header = TRUE, sep = “\t”)
header = TRUE only if you have headings, i.e., column names in your data. the
sep = “\t” argument specifies that the data is from a spreadsheet, as the columns are separated using tabs in a spreadsheet (In case of a csv files, the columns are separated by commas).
Below is a video tutorial I made on the same:
Hope this is helpful. And remember, if you are using really big datasets, or are going to use the same data in different analyses or repeatedly, always put the data in csv files or import directly from spreadsheets.