Machine Learning with R: A Hands-On Introduction
R offers a wide variety of machine learning (ML) functions, each of which works in a slightly different way. This one-day, hands-on workshop starts with ML basics and takes you step by step through increasingly complex modeling styles. It makes ML modeling easier by using packages that standardize the way the various functions work. When finished, you should be able to use R to apply the most popular and effective machine learning models to make predictions and assess the likely accuracy of those predictions.
The instructor will guide attendees through hands-on exercises in R, covering:
- A brief introduction to R’s tidyverse functions, including a comparison of the caret and parsnip packages
- Pre-processing data
- Selecting variables
- Partitioning data for model development and validation
- Setting model training controls
- Developing predictive models using naïve Bayes, classification and regression trees, random forests, gradient boosting machines, and neural networks (more, if time permits)
- Evaluating model effectiveness using measures of accuracy and visualization
- Interpreting what “black-box” models are doing internally
Hardware: Bring Your Own Laptop
Each workshop participant is required to bring their own laptop.
Schedule
- Workshop starts: 8:30am PDT
- AM Break: 10:00 – 10:15am PDT
- Lunch Break: 12:00 – 12:45pm PDT
- PM Break: 2:15 – 2:30pm PDT
- Workshop ends: 4:30pm PDT
Jared P. Lander, Chief Data Scientist, Lander Analytics
Jared P. Lander is Chief Data Scientist of Lander Analytics, the Organizer of the New York Open Statistical Programming Meetup and the New York and Government R Conferences, an Adjunct Professor at Columbia Business School, and a Visiting Lecturer at Princeton University. With a master's in statistics from Columbia University and a bachelor's in mathematics from Muhlenberg College, he has experience in both academic research and industry. Jared oversees the long-term direction of the company and acts as Lead Data Scientist, researching the best strategy, models and algorithms for modern data needs, in addition to his client-facing consulting and training. He specializes in data management, multilevel models, machine learning, generalized linear models, visualization and statistical computing. He is the author of R for Everyone (now in its second edition), a book about R programming geared toward data scientists and non-statisticians alike. The book is available from Amazon, Barnes & Noble and InformIT. Its material is drawn from the classes he teaches at Columbia and is incorporated into his corporate training. Very active in the data community, Jared is a frequent speaker at conferences, universities and meetups around the world.