
Big Data Analytics and Machine Learning with Spark
Registration
Registration is now closed (this event already took place).
Details
Knowledge prerequisites: Participants should be familiar with Python or R and have some knowledge of data analytics. The workshop assumes that attendees are new to working with big data.
Hardware/software prerequisites: Participants must have an account on the Adroit cluster and should confirm that they can SSH into Adroit *at least 48 hours before the workshop*. Details can be found in this guide. THERE WILL BE LITTLE TO NO TROUBLESHOOTING DURING THE WORKSHOP!
Workshop format: Lecture, demonstration, and hands-on exercises
Learning objectives: Attendees will learn how to use Apache Spark to analyze large datasets and train machine learning models.
Speakers
Jonathan Halverson
Research Software and Computing Training Lead
Princeton University
Jonathan Halverson is the Research Software and Computing Training Lead with PICSciE and Research Computing. He has expertise in data science and is a founding organizer of the TensorFlow & PyTorch User Group at Princeton. Prior to his current position, Jonathan performed polymer physics research at the Max Planck Institute for Polymer Research and nanoscience research at Brookhaven National Laboratory. He holds a Ph.D. in Chemical Engineering from CUNY.
Hosted By
Co-hosted with: GradFUTURES