Summary and Schedule
This lesson is designed to help librarians and library professionals with little or no prior experience with R become acquainted with the programming language. Familiarity with R is useful both for assisting users with requests about cleaning, formatting, and visualizing data, and for librarians and library professionals themselves when they work with data they intend to use and analyze in their internal workflows.
Learners will become familiar with R, the RStudio software environment, and the Tidyverse. RStudio lets you run your code and see the results immediately in separate panels. Although R started as a statistical programming language, it is now used for many applications, such as data visualization, deploying web applications, and creating reproducible documentation. Given how broad these applications are, we will focus solely on importing, cleaning, and visualizing data.
By the end of this lesson, learners will be able to:
- Describe what R is and use the basic components of the RStudio software environment.
- Apply functions to import data into R and to format data.
- Employ functions in the `dplyr` package to perform data cleaning and transformation.
- Use the `ggplot2` package to create various types of plots and to change aesthetic features of plots.
Setup Instructions | Download files required for the lesson | |
Duration: 00h 00m | 1. Before we Start | What is R and why learn it? How to find your way around RStudio? How to interact with R? How to install packages? |
Duration: 00h 40m | 2. Introduction to R | What is an object? What is a function and how can we pass arguments to functions? How can values be initially assigned to variables of different data types? How can a vector be created? What are the available data types? How can subsets be extracted from vectors? How does R treat missing values? How can we deal with missing values in R? |
Duration: 02h 00m | 3. Starting with Data | What is a data.frame? How can I read a complete CSV file into R? How can I get basic summary information about my dataset? How can I change the way R treats strings in my dataset? Why would I want strings to be treated differently? How are dates represented in R and how can I change the format? |
Duration: 03h 20m | 4. Data cleaning & transformation with dplyr | How can I select specific rows and/or columns from a data frame? How can I combine multiple commands into a single command? How can I create new columns or remove existing columns from a data frame? How can I reformat a data frame to meet my needs? |
Duration: 04h 40m | 5. Data Visualisation with ggplot2 | What are the components of a ggplot? How do I create scatterplots, boxplots, and barplots? How can I change the aesthetics (e.g., colour, transparency) of my plot? How can I create multiple plots at once? |
Duration: 06h 35m | Finish | |
The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.
Learners must have R and RStudio installed on their computers. They also need to be able to install a number of R packages, create directories, and download files.
To avoid troubleshooting during the lesson, learners should follow the instructions below to download and install everything beforehand. This should be no problem on a personal computer, but on a computer managed by an organization’s IT department, learners might need help from an IT administrator.
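Once R is installed using the instructions that follow, one way to confirm that packages can be installed is sketched here. It is only an illustration: the package names are taken from the learning objectives above (`dplyr` and `ggplot2`), the helper name `missing_packages` is hypothetical, and the full set of packages the lesson needs may be larger.

```r
# Packages named in the learning objectives; the complete list required
# by the lesson is an assumption and may differ.
needed <- c("dplyr", "ggplot2")

# Hypothetical helper: returns the packages in `pkgs` that are not yet installed.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

to_install <- missing_packages(needed)

# Only attempt installation in an interactive session (e.g. the RStudio console).
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

If `install.packages()` fails on a managed computer, that is usually the point at which help from an IT administrator is needed.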
Install R and RStudio
R and RStudio are two separate pieces of software:
- R is a programming language that is especially powerful for data exploration, visualization, and statistical analysis
- RStudio is an integrated development environment (IDE) that makes using R easier. In this course we use RStudio to interact with R.
If you don’t already have R and RStudio installed, follow the instructions for your operating system below. You have to install R before you install RStudio.
Windows
- Download R from the CRAN website.
- Run the `.exe` file that was just downloaded
- Go to the RStudio download page
- Under Installers select RStudio x.yy.zzz - Windows Vista/7/8/10 (where x, y, and z represent version numbers)
- Double click the file to install it
- Once it’s installed, open RStudio to make sure it works and you don’t get any error messages.
macOS
- Download R from the CRAN website.
- Select the `.pkg` file for the latest R version
- Double click on the downloaded file to install R
- It is also a good idea to install XQuartz (needed by some packages)
- Go to the RStudio download page
- Under Installers select RStudio x.yy.zzz - Mac OS X 10.6+ (64-bit) (where x, y, and z represent version numbers)
- Double click the file to install RStudio
- Once it’s installed, open RStudio to make sure it works and you don’t get any error messages.
Linux
- Follow the instructions for your distribution from CRAN; they provide information to get the most recent version of R for common distributions. For most distributions, you could use your package manager (e.g., for Debian/Ubuntu run `sudo apt-get install r-base`, and for Fedora `sudo yum install R`), but we don’t recommend this approach as the versions provided this way are usually out of date. In any case, make sure you have at least R 3.3.1.
- Go to the RStudio download page
- Under Installers select the version that matches your distribution, and install it with your preferred method (e.g., with Debian/Ubuntu `sudo dpkg -i rstudio-x.yy.zzz-amd64.deb` at the terminal).
- Once it’s installed, open RStudio to make sure it works and you don’t get any error messages.
Update R and RStudio
If you already have R and RStudio installed, check if your R and RStudio are up to date:
- When you open RStudio your R version will be printed in the console on the bottom left. Alternatively, you can type `sessionInfo()` into the console. If your R version is 4.0.0 or later, you don’t need to update R for this lesson. If your version of R is older than that, download and install the latest version of R from the R project website for Windows, for macOS, or for Linux.
- To update RStudio to the latest version, open RStudio and click on Help > Check for updates. If a new version is available, quit RStudio and follow the instructions on screen.
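The R version check described above can also be done programmatically. The following is a minimal sketch that could be pasted into the console; the 4.0.0 threshold comes from the text above, while the helper name `r_is_current` is hypothetical and used only for illustration.

```r
# Minimum R version for this lesson, as stated in the update instructions.
minimum_version <- "4.0.0"

# Hypothetical helper: TRUE if `current` is at least `minimum`.
# getRversion() returns the running R version as a comparable version object.
r_is_current <- function(current = getRversion(), minimum = minimum_version) {
  current >= minimum
}

if (r_is_current()) {
  message("R ", as.character(getRversion()), " is recent enough for this lesson.")
} else {
  message("R ", as.character(getRversion()), " is older than ",
          minimum_version, "; please update R.")
}
```

Comparing version objects rather than plain strings avoids mistakes such as treating "4.10.0" as older than "4.9.0".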
Note: It is not necessary to remove old versions of R from your system, but if you wish to do so you can check here.