Learning R: 2019 Summer Workshop for NYU CDSC Lab

Meeting time: Tuesdays from 1:30 to 3 pm (beginning June 11) | Instructor: Kelsey Moty

This workshop will help you learn the fundamentals of R needed to manipulate, visualize, and describe your data. It places particular emphasis on producing clean, reproducible code in line with coding and open science best practices. Due to time limitations, we will not be able to cover statistical modeling in R; however, I provide a series of resources at the end of this list that you can look over on your own.

Some of these resources may at times be redundant with one another. Feel free to skip over material that you feel comfortable with. Most importantly, make sure to work through the exercises! The best way to learn how to code is by actually coding :)

Each week, we will meet for an hour and a half to go over the topic for that week. This meeting is meant to be collaborative: you will work together with other people in our lab to get started with each week's R skill. However, you will also need to complete some of the lesson on your own time, as an hour and a half is likely not enough time to practice that week's topic. If questions come up outside of our Tuesday meeting, feel free to post them on the R Workshop Slack channel!

You will get to apply the skills learned in this workshop to a dataset from a research project you are currently working on in the lab. At the end of the workshop, you will share with other members of the lab the dataset you cleaned up, a plot you created from that dataset, and some kind of analysis you did on that dataset (whether descriptive or inferential).

Before we begin: this workshop draws on resources written by a lot of amazing people, and they deserve credit for their work!

A number of the book chapters and other resources we are reading were written by Hadley Wickham, Danielle Navarro, Jenny Bryan, Jim Hester, Kieran Healy, and Andy Field. Several of the tutorials we are working through are from a course taught by Dale Barr and Lisa DeBruine.

Getting your data ready for statistical analysis

Downloading R:

Downloading RStudio:

Reading + exercises:

Learning basics about R (Part 1)

Reading + exercises:

Learning basics about R (Part 2)

Reading:

More about packages

Reading:

More about variables

Reading:

More about vectors

Resource:

Cheat sheet on how to use RStudio

Resource:

Cheat sheet on basic R functions

Reading:

What makes a good plot?

Notes + exercises:

Making plots

Reading + exercises:

Getting a better understanding of the code used to make plots (Chapter 3, especially 3.3 - 3.10)

Resource:

Examples of plots with corresponding R code

Resource:

Resource for helping you select the best way to visualize your data

Resource:

R Graphics Cookbook:

Resource:

Cheat sheet on data visualization

Reading:

Tidy Data

Reading:

Using pipes to tidy data

Notes + exercises:

Learning tidyr

Reading (optional):

Manipulating your data using tidyr

Resource:

Cheat sheet on importing and tidying data

Resource:

Cheat sheet on processing dates using lubridate

Reading:

Describing data

Reading + exercises:

Data transformation

Notes + exercises:

Learning the six main dplyr verbs

Resource:

Cheat sheet on data transformation with dplyr

Reading + exercises:

Relational data

Notes + exercises:

Joining data using dplyr's join verbs

Resource:

Cheat sheet on data transformation with dplyr

Reading + exercises:

Using loops in R

Reading + exercises:

Using branches in R

Reading + exercises:

Creating your own functions in R

Notes + exercises:

Iterating and more practice creating your own functions in R

Reading (optional):

More about loops and iterating in R

Reading (optional):

More about writing your own functions in R

Reading:

Using R Markdown

Notes + exercises:

Creating reproducible code in R

Reading:

How to properly set paths

Slides:

How to name files

Reading:

How to debug your R code

Resource:

Cheat sheet on R Markdown

Reading:

Why GitHub?

Reading + exercises:

Installing Git (Read Chapters 4 - 7; 8 is optional)

Reading + exercises:

Connecting GitHub and RStudio (Chapters 9 & 12; 14 is helpful if you are having problems connecting!)

Reading + exercises:

Using GitHub to store R code (Read through Chapter 15; 16 and 17 are for your reference for future projects)

Reading + exercises:

Basics of Git (Chapter 20; Chapters 21 - 23 for more advanced stuff)

Moving forward: Other things you should learn about R:

Reading:

Learning statistics with R: A tutorial for psychology students

Why do we learn statistics?

Introduction to research design

Descriptive statistics

Statistical theory:

Introduction to probability,

Estimating unknown quantities from a sample,

Hypothesis testing

Categorical data analysis,

Comparing two means,

Comparing several means,

Linear regression,

Factorial ANOVA

Bayesian statistics

Book:

Discovering Statistics Using R

Learning statistics with R

Exercises:

Interactive tutorials for learning how to do statistics in R

An Adventure in Statistics

Frequentist approaches:

Reading + exercises:

Mixed models in R:

Exercises:

Interactive tutorials on statistics in R

Reading:

Psychometrics in R

Reading:

Exploring interactions using the `interactions` package:

Bayesian approaches:

Book:

Bayesian statistics in R

Solutions for the book's exercises

Reading:

Style Guide

General syntax style guide

Style guide for using pipes

Improve your programming skills and gain a deep understanding of the R language:

Book:

Advanced R

Manipulating strings and pattern matching in R using regular expressions:

Reading + exercises:

Book chapter on manipulating strings

Reading + video tutorial:

Base R functions for doing regular expressions

Reading:

Using stringr (a tidyverse package) for regular expressions

Resource:

Cheat sheet for basic regular expressions in R

data.table: An alternative approach for wrangling data

Reading:

Introduction to data.table:

When pre-registering your study, one best practice is to also pre-register all the R code you will use for your analyses. How do you write code without data? One way: simulate a dataset and use that data as you work through your analyses.
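As a rough illustration of the idea above, here is a minimal sketch in base R. It assumes a hypothetical two-condition study; the variable names (`sim_data`, `condition`, `score`), the sample size, and the means and standard deviations are all placeholders rather than values from any real project. The point is only that the analysis code can be written and run before real data exist.

```r
# Hypothetical two-condition study: every name and number here is a placeholder.
set.seed(2019)  # fix the random seed so the simulated data are reproducible

n_per_group <- 30  # assumed number of participants per condition

sim_data <- data.frame(
  id        = seq_len(2 * n_per_group),
  condition = rep(c("control", "treatment"), each = n_per_group),
  score     = c(
    rnorm(n_per_group, mean = 50, sd = 10),  # assumed control distribution
    rnorm(n_per_group, mean = 55, sd = 10)   # assumed treatment distribution
  )
)

# The planned analysis, written before any real data are collected
planned_test <- t.test(score ~ condition, data = sim_data)
print(planned_test)
```

Once the real data are collected, the same analysis lines can be run on the real data frame in place of `sim_data`, provided it uses the same column names.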

Reading:

Getting started simulating data in R

Reading + exercises:

Lab to practice simulating data using R

Slides + exercises:

Slides on simulating data using R

Book:

Introduction to Scientific Programming and Simulation Using R

General Resources

Cheat sheets

Various cheat sheets on a range of topics, including dplyr, ggplot, R Markdown, and more!

Books / Tutorials

psyTeachR: Great resource that provides a number of interactive books and tutorials for doing reproducible research in R. This website covers a broad range of topics on data cleaning, visualization, reproducible workflows, and more. From their website: "Our curriculum now emphasizes essential ‘data science’ graduate skills that have been overlooked in traditional approaches to teaching, including programming skills, data visualisation, data wrangling and reproducible reports. Students learn about probability and inference through data simulation as well as by working with real datasets."