Multi-GPU Training with PyTorch and TensorFlow

by PICSciE/Research Computing

Training/Workshop Research & Data Analysis

Thu, Feb 24, 2022

4 PM – 5:30 PM EST (GMT-5)


Online Event


Details

The first portion of this workshop will show participants how to optimize single-GPU training. We will then introduce the concepts of multi-GPU training and demonstrate the use of Distributed Data Parallel (DDP) in PyTorch, along with a survey of other distributed deep learning frameworks. While the workshop focuses on PyTorch, demonstrations for TensorFlow will also be available.
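To give a flavor of what DDP usage looks like, here is a minimal sketch, not the workshop's actual material. It runs as a single CPU process with the "gloo" backend so no GPU is needed; real multi-GPU training would use the "nccl" backend with one process per GPU, typically launched via torchrun. The model, data, and hyperparameters below are hypothetical.

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# Rendezvous settings for a single local process (hypothetical values).
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group(backend="gloo", rank=0, world_size=1)

model = torch.nn.Linear(10, 1)                  # toy model
ddp_model = DDP(model)                          # wraps the model; gradients are all-reduced across ranks
optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.01)

x, y = torch.randn(8, 10), torch.randn(8, 1)    # fake batch
loss = torch.nn.functional.mse_loss(ddp_model(x), y)
loss.backward()                                 # DDP synchronizes gradients during backward
optimizer.step()

dist.destroy_process_group()
```

With more than one process, each rank would run this same script on its own shard of the data (e.g. via DistributedSampler), and DDP averages gradients across ranks so every replica takes the same optimizer step.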

Knowledge prerequisites: Participants should be familiar with training neural networks with PyTorch or TensorFlow using a GPU.

Hardware/software prerequisites: For this workshop, users must have an account on the Adroit cluster, and they should confirm that they can SSH into Adroit *at least 48 hours beforehand*. Details can be found in this guide. THERE WILL BE LITTLE TO NO TROUBLESHOOTING DURING THE WORKSHOP!

Workshop format: Lecture, demonstration and hands-on

Learning objectives: Attendees will learn how to accelerate the training of neural networks using distributed deep learning frameworks.
 

Speakers


Jonathan Halverson

Research Software and Computing Training Lead

Princeton University

Jonathan Halverson is the Research Software and Computing Training Lead with PICSciE and Research Computing. He has expertise in data science and is a founding organizer of the TensorFlow & PyTorch User Group at Princeton. Prior to his current position, Jonathan performed polymer physics research at the Max Planck Institute for Polymer Research and nanoscience research at Brookhaven National Laboratory. He holds a Ph.D. in Chemical Engineering from CUNY.

Hosted By

PICSciE/Research Computing
Co-hosted with: GradFUTURES
