Automating data cleaning and analysis using R.

  1. Researchers are producing more data than ever before.
  2. Analyzing all of these data by hand is not feasible.
  3. Research involves a lot of repetitive tasks.
  4. Automating data processing and analysis will streamline your research.

The Coding and Cookies series will teach you the basics of R programming and of version control with Git, so you can make your research more efficient and reproducible.

Starting in Fall 2020, these workshops were adapted to support online learning. The workshops now use a flipped-classroom format: learners watch a video before each session, and a live online workshop is used to review key concepts and work through additional problems. Learning materials will continue to be made publicly available on this guide (see links to the left). If sessions are full, interested students are encouraged to watch the videos and contact the instructors with follow-up questions.

Sessions will be led by experienced statistics graduate students and facilitated by Mara Sedlins, PhD, Data Management Specialist at the CSU Libraries, and Julia Sharp, Associate Professor of Statistics and Director of the Graybill Statistics and Data Science Laboratory.  

Spring 2021 Workshops and Schedule

New to R or RStudio? We encourage you to attend the first session, R Basics. A basic working knowledge of R and RStudio is helpful to get the most out of the rest of the sessions.

R Basics

Learning how to code involves an investment of time and effort up front, but it will save you time and effort in the long run. The R Basics Coding and Cookies session covers the basics of working with tabular data in RStudio. By the end of this session, you will be able to load data into R, calculate summary statistics, and create exploratory graphs using R’s base graphics package. This session is geared toward beginners, so if you have experience using R, this may not be the class for you.
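As a rough preview of the three skills named above (loading data, summary statistics, exploratory graphs), here is a minimal sketch in base R. It is not taken from the workshop materials. The file is a hypothetical stand-in created from R's built-in `iris` data set, and the object names (`csv_path`, `flowers`, `mean_sepal`, `mean_by_species`) are assumptions chosen for illustration.

```r
# Hypothetical example file: write the built-in iris data to a temporary CSV
# so that this sketch is self-contained.
csv_path <- tempfile(fileext = ".csv")
write.csv(iris, csv_path, row.names = FALSE)

# Load tabular data into R
flowers <- read.csv(csv_path)
flowers$Species <- factor(flowers$Species)  # treat species as a category

# Inspect the data and calculate summary statistics
str(flowers)
summary(flowers)
mean_sepal <- mean(flowers$Sepal.Length)
mean_by_species <- aggregate(Sepal.Length ~ Species, data = flowers, FUN = mean)

# Exploratory graphs with the base graphics package
hist(flowers$Sepal.Length,
     main = "Distribution of sepal length",
     xlab = "Sepal length (cm)")
boxplot(Sepal.Length ~ Species, data = flowers,
        ylab = "Sepal length (cm)")
```

With your own data, you would replace `csv_path` with the path to your file.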

February 2nd, 10:00-11:30am
February 23rd, 10:00-11:30am

Tidy Data in R

The process of generating data can be messy, and what you can do with your data depends strongly on how it is formatted. This Coding and Cookies session will cover the definition of “tidy data”, a standardized way of formatting your data that makes it easier to work with. You will learn how to clean and reformat your data using a collection of R packages called the tidyverse. A basic working knowledge of R and RStudio will help you get the most out of this session.
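To make the idea of reformatting concrete, the sketch below reshapes a small "wide" table into a tidy one, with one row per observation. It assumes the tidyr and dplyr packages are installed. The data and column names (`site`, `count_2019`, `count_2020`) are invented for illustration and are not taken from the workshop materials.

```r
library(tidyr)
library(dplyr)

# Hypothetical "messy" data: one column per year
messy <- data.frame(
  site       = c("A", "B"),
  count_2019 = c(12, 7),
  count_2020 = c(15, 9)
)

# Tidy version: one row per site-year combination
tidy <- messy %>%
  pivot_longer(
    cols         = starts_with("count_"),
    names_to     = "year",
    names_prefix = "count_",
    values_to    = "count"
  ) %>%
  mutate(year = as.integer(year))

print(tidy)
```

In the tidy layout each variable (site, year, count) has its own column, which is the shape most tidyverse functions expect.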

March 9th, 10:00-11:30am

Data Visualization using ggplot2

So you’re familiar with R, but want to do more with your plots than the base graphics package allows. This Coding and Cookies session covers the ggplot2 package in R. After this session, you will be able to create a variety of plot types, alter their aesthetics, and create custom themes. A working knowledge of R, RStudio, and dplyr will help you get the most out of this session.
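The following is a minimal sketch of those three ideas (a plot type, its aesthetics, and a custom theme). It assumes the ggplot2 package is installed and uses R's built-in `mtcars` data. The helper name `theme_workshop` is hypothetical, not part of ggplot2 or the workshop materials.

```r
library(ggplot2)

# Hypothetical custom theme: starts from theme_minimal() and adjusts a few elements
theme_workshop <- function(base_size = 12) {
  theme_minimal(base_size = base_size) +
    theme(
      plot.title      = element_text(face = "bold"),
      legend.position = "bottom"
    )
}

# Scatter plot with colour mapped to the number of cylinders
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point(size = 3, alpha = 0.8) +
  labs(
    title  = "Fuel efficiency by vehicle weight",
    x      = "Weight (1000 lbs)",
    y      = "Miles per gallon",
    colour = "Cylinders"
  ) +
  theme_workshop()

print(p)
```

Swapping `geom_point()` for another geom, such as `geom_boxplot()`, changes the plot type while the rest of the structure stays the same.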

March 30th, 10:00-11:30am

Version Control using Git

We’ve all intuitively used some type of version control in our work, such as saving multiple versions of a document. While easy, this approach can cause file bloat and ultimately becomes harder to manage. Luckily, formal version control systems have been developed to streamline this process. This Coding and Cookies session covers version control using Git. After this session, you’ll be able to create a Git repository, make and add changes to the repository, and use GitHub to store your repository remotely.

April 20th, 10:00-11:30am
