Tue, Oct 11, 2022

4:30 PM – 6 PM EDT (GMT-4)

Details

This workshop provides an introduction to effective data visualization in Python. The training focuses on three plotting packages: Matplotlib, Seaborn and Plotly. Examples may include simple static 1D plots, 2D contour maps, heat maps, violin plots, and box plots. The session may also touch on more advanced interactive plots.

Learning objectives: Attendees will be exposed to different plotting packages in Python, along with how to integrate them with NumPy and Pandas, at least at a basic level. After the session, participants will know the basic mechanics of how to generate publication-quality plots using Python.
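
As a rough illustration of the kind of plot the session covers, the sketch below draws a simple static 1D plot from a NumPy array and a box plot from a Pandas data frame using Matplotlib. It is not taken from the workshop materials: the synthetic data, the column names `group` and `value`, and the output file name `example_plots.png` are all assumptions made for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic data (illustrative only): two groups of normally distributed values.
rng = np.random.default_rng(seed=0)
df = pd.DataFrame({
    "group": np.repeat(["a", "b"], 100),
    "value": np.concatenate([rng.normal(0.0, 1.0, 100),
                             rng.normal(1.0, 1.5, 100)]),
})

fig, (ax_line, ax_box) = plt.subplots(1, 2, figsize=(8, 3))

# Simple static 1D plot from a NumPy array.
x = np.linspace(0.0, 2.0 * np.pi, 200)
ax_line.plot(x, np.sin(x))
ax_line.set_xlabel("x")
ax_line.set_ylabel("sin(x)")

# Box plot built from a Pandas data frame, one box per group.
groups = [g["value"].to_numpy() for _, g in df.groupby("group")]
ax_box.boxplot(groups)
ax_box.set_xticks([1, 2])
ax_box.set_xticklabels(["a", "b"])
ax_box.set_ylabel("value")

fig.tight_layout()
# A high dpi is one common choice for print-quality raster output.
fig.savefig("example_plots.png", dpi=300)
```

Seaborn and Plotly offer higher-level interfaces for similar plots; this sketch uses only Matplotlib to keep the dependencies minimal.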

Knowledge prerequisites: Participants should have reasonable facility with the Python programming language, including a basic familiarity with NumPy arrays and Pandas data frames. No previous experience with Python plotting tools is required.

Hardware/software prerequisites: Participants have two options: (1) Bring a laptop with the Anaconda Python 3 distribution already installed. This will provide Jupyter notebooks, NumPy, Pandas and Matplotlib. (2) Create an account on Adroit at least 48 hours before the workshop (https://forms.rc.princeton.edu/registration/?q=adroit) and use the MyAdroit web interface for the workshop. Directions for using the MyAdroit interface will be provided at the workshop.

Session format: Presentation, demo, and hands-on

Instructor bios: Brian was born and raised in Minnesota, where he attended the University of Minnesota, Twin Cities and earned a degree in plant biology. Fascinated by the concept of using genomic data to understand evolution, Brian continued studying plants during his PhD at Harvard University and later studied bacterial genomics during his postdoc at the Harvard T.H. Chan School of Public Health. Afterwards, he worked as a Senior Bioinformatics Scientist at Harvard, where he continued working on genomics and taught introductory data science workshops. Brian joined Princeton University in 2020 as a Schmidt DataX fellow, where he works on biomedical cloud computing with large data sets.

Michal joined Princeton Research Computing in 2021 after five years working as a Research Software Engineer at Oregon Health & Science University, where his primary project involved studying the application of machine learning models to better understand the impacts of mutations commonly implicated in tumorigenesis. This involved implementing novel methods for representing the taxonomies of mutations present in cancer cohorts, as well as developing software for deploying and consolidating thousands of classification models on a high-performance compute cluster. His present work focuses on optimizing pipelines for generating quantitative assessments of the contributions various types of assets can make to a power grid’s ability to satisfy the demand for electricity over a given time frame.

Speakers

Brian Arnold

Michal Grzadkowski

Hosted By

PICSciE/Research Computing
Co-hosted with: GradFUTURES
