Mon, Jan 10, 2022

2 PM – 5 PM EST (GMT-5)

Online Event

Details

This workshop introduces two modern R packages, both written by Hadley Wickham and part of R’s “tidyverse,” that provide intuitive tools for handling common data management tasks. The first package, tidyr, provides functions that reshape data so it conforms to a specific “tidy” structure where each variable is saved in its own column, each observation is saved in its own row, and each type of observational unit is stored in a separate table. The second package, dplyr, provides a set of functions (referred to as “verbs”) that allow you to easily subset observations, reorder observations, select specific variables, add new variables, group observations, and summarize groups of observations.
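
As a rough illustration of the two packages described above (a minimal sketch, not taken from the workshop materials), the example below reshapes a small table with tidyr and then applies several dplyr verbs. The data frame `scores_wide` and its column names are hypothetical and exist only for this example; it assumes tidyr and dplyr are already installed.

```r
library(tidyr)
library(dplyr)

# Hypothetical "wide" data: one row per student, one column per exam.
scores_wide <- data.frame(
  student = c("A", "B", "C"),
  exam1   = c(90, 75, 60),
  exam2   = c(85, 80, 70)
)

# tidyr: reshape so that each row holds one student-exam observation
# and each variable (student, exam, score) has its own column.
scores_tidy <- pivot_longer(
  scores_wide,
  cols      = c(exam1, exam2),
  names_to  = "exam",
  values_to = "score"
)

# dplyr verbs: subset rows, select variables, add a variable,
# group observations, summarize each group, and reorder the result.
exam_summary <- scores_tidy %>%
  filter(score >= 70) %>%
  select(exam, score) %>%
  mutate(passed = score >= 80) %>%
  group_by(exam) %>%
  summarize(mean_score = mean(score), n_passed = sum(passed)) %>%
  arrange(desc(mean_score))

print(exam_summary)
```

Each verb takes a data frame and returns a new one, which is why the steps chain naturally with the `%>%` pipe.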

Learning objectives:

Participants will walk away with both a general understanding of “tidy” representations of data and practical knowledge of how to create and work with them in R.

Knowledge prerequisites:

Participants should have at least basic familiarity with R and RStudio; this session is not appropriate for people with no prior R experience.

Hardware/software prerequisites:

This session is heavily hands-on. To follow along with the exercises, participants should have both R and RStudio installed on their laptops. Installation instructions can be found in the advance setup guide for PICSciE virtual workshops at https://researchcomputing.princeton.edu/learn/workshops-live-training/hardware-and-software-requirements-picscie-workshops. Ideally, participants will also have installed the tidyr and dplyr packages in advance. Alternatively, participants who prefer to run RStudio remotely on one of Princeton’s systems can do so via the “myadroit” web interface to the Adroit cluster. To do so, first register for an account on Adroit, as described in the same setup guide. Then connect to “myadroit” and start an RStudio session, as described at https://github.com/PrincetonUniversity/hpc_beginning_workshop/tree/master/03_web_interface.

Session format:

Lecture, discussion, and hands-on

What to expect:
Single workshop (one-off workshop, 2 hours total)

Meet the facilitators:
Dawn Koffman is a statistical programmer at the Office of Population Research at Princeton University. She earned an MS in Computer Science from the University of Wisconsin-Madison and an MPH in Epidemiology and Biostatistics from UMDNJ and Rutgers University.

Boriana Pratt is a statistical programmer at the Office of Population Research at Princeton University. She earned an MA in Biostatistics from UC Berkeley, where she also worked at the School of Public Health for a number of years.

To request accommodations for this event, please contact the workshop or event facilitator at least 3 working days prior to the event.

Co-hosted with: PICSciE/Research Computing
