
## Machine Learning Summer School

### June 17-21, 2019

**Summary**

Machine learning is a field characterized by the development of algorithms that are implemented in software and run on a machine (e.g., computer, mobile device, etc.). Each such algorithm is characterized by a set of parameters, and particular parameter settings yield associated algorithm characteristics. The algorithms have the capacity to learn from observed data. Here, “learn” means that the algorithm can infer which parameter settings are best matched to the data of interest. Once the parameters are learned, the associated model ideally captures the underlying characteristics of the data. The algorithm, with learned parameters, may subsequently be applied to new data, with the goal of making predictions or gaining insights. Machine learning methodology is primarily concerned with designing appropriate models/algorithms for datasets and problems of interest, and with learning the model parameters from data (which becomes challenging when the data is massive in scale).
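As a rough illustration of what learning parameters from data can look like, the sketch below fits a two-parameter model, y ≈ w·x + b, to synthetic observations using stochastic gradient descent. It is not taken from the MLSS course material: the function name `fit_line`, the learning rate, the epoch count, and the noise-free data are all assumptions chosen for illustration, and plain Python is used rather than TensorFlow to keep the example self-contained.

```python
import random


def fit_line(xs, ys, lr=0.05, epochs=200, seed=0):
    """Learn slope w and intercept b for y ~ w*x + b (hypothetical helper).

    Uses stochastic gradient descent on the squared error, updating the
    parameters after every single observation.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0                      # initial parameter settings
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)             # visit the data in random order
        for i in indices:
            err = (w * xs[i] + b) - ys[i]    # prediction error on one example
            w -= lr * err * xs[i]            # gradient step for the slope
            b -= lr * err                    # gradient step for the intercept
    return w, b


# Synthetic "observed data", generated from y = 2x + 1 without noise (assumed).
xs = [i / 10 for i in range(-10, 11)]
ys = [2.0 * x + 1.0 for x in xs]

w, b = fit_line(xs, ys)

# Apply the learned parameters to a new input to make a prediction.
prediction = w * 3.0 + b
print(f"learned w={w:.3f}, b={b:.3f}, prediction at x=3: {prediction:.3f}")
```

The learned values should land close to the slope and intercept used to generate the data, and the final line applies them to an input that was not in the training set. Real problems involve noisy data and models with far more parameters, but the loop has the same shape.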

In the context of prediction, one may be interested in developing algorithms that are capable of automatically interpreting data in a healthcare setting, to improve clinical care. In this case, the healthcare data may be radiological images, doctor notes, and/or a history of patient care (e.g., previous diagnoses, medications taken, etc.). In healthcare, the goal is to use machine learning to make improved diagnoses and recommendations for care. Similar concepts are of interest in business, where one may be interested in tailoring advertising and products to individuals. In education, machine learning may be used to tailor educational material to the level and interests of each student. Machine learning is increasingly making an impact in almost all areas of personal and professional life.

Recently, with increasing access to massive datasets and significant advances in computing resources, the quality of machine learning performance (e.g., prediction accuracy) has improved markedly. Further, over the last five years, significant advances have been made in a subfield of machine learning called “deep learning.”

This class will focus on the areas of machine learning that have made the biggest advances in utility over the last several years, including deep learning. The class will concentrate on methods that allow machine-learning algorithms to train effectively on massive datasets, i.e., “big data.” Emphasis will be placed on the latest methods for image and video analysis, natural language processing, reinforcement learning, and data synthesis/modeling.

Professor Lawrence Carin, of the Duke Electrical & Computer Engineering Department, will lead the MLSS, and several other Duke professors will also lecture. Hands-on training with software will be assisted by Duke undergraduate and graduate students who have extensive experience with these tools. Case studies of machine learning in practice will also be presented.

**Who Should Attend**

The Machine Learning Summer School (MLSS) is targeted to individuals interested in learning about machine learning, with a focus on recent algorithms, like deep learning. The MLSS will introduce the mathematics and statistics at the foundation of modern machine learning. Additionally, the MLSS will provide hands-on training in the latest machine learning software, using the widely used (and free) Google TensorFlow platform.

All students at the MLSS should have a background in computing (e.g., with Python), sufficient at a minimum to learn how to use and apply modern machine learning software. Students who also have a strong mathematical and statistical background (calculus and basic statistics at the senior undergraduate level) will find the portion of the MLSS devoted to the fundamentals of machine learning most accessible. Strength in mathematics and statistics is a significant plus and will make all MLSS material accessible; however, it is not required to benefit from the hands-on software portion of the program.

**Curriculum**

The broad areas of emphasis for the five-day class are as follows.

Monday:

- Basic concepts in machine learning
- Introduction to model building
- Scaling to “big data” with stochastic gradient descent
- Backpropagation as an efficient computation method

Tuesday:

- Deep convolutional neural networks
- Image analysis
- Image segmentation, object detection and object localization

Wednesday:

- Methods for natural language processing
- Word embeddings
- Recurrent neural networks
- Temporal convolutional neural networks
- Transformer networks

Thursday:

- Data synthesis, with an emphasis on images
- Generative adversarial network (GAN)
- Deep networks for GAN
- Learning and applications for GAN

Friday:

- Reinforcement learning
- Basic concepts for optimal policies in complex environments
- Q-learning and leveraging deep networks
- Applications of reinforcement learning

**Program Format**

The five-day class will provide lectures on the mathematics and statistics at the heart of machine learning, plus hands-on training on implementing machine learning tools with the TensorFlow software platform.

Each day of the MLSS will be arranged as follows (see detailed schedule):

- 9:00-10:15am Lecture 1: Mathematically light introduction to the focus of the day
- 10:45am-noon Lecture 2: Mathematically rigorous discussion of the focus of the day

- 1:30-3:00pm Software discussion and hands-on training with TensorFlow
- 3:30-4:30pm Case Study of machine learning in practice

At the end of the MLSS, each student should be able to use TensorFlow to implement the latest machine learning methods for analysis of images, video and natural language (text). Students with sufficient mathematical background will also learn the underlying methodology of machine learning. Students will be given assignments that test their knowledge of the material taught, so that they can gauge how well they have absorbed the concepts.

**Program Details: Location, Registration and Cost**

MLSS is being held in Schiciano Auditorium, which is in the Fitzpatrick Center for Interdisciplinary Engineering, Medicine and Applied Sciences (FCIEMAS) on Duke’s West Campus. Visitor parking is available in the nearby Bryan Center Parking Garage.

Students (with a valid ID, at Duke or other universities) will pay a course fee of $500; the fee for non-students is $1,000, payable through the registration site. All fees are non-refundable. Once we reach maximum registration, we will maintain a waitlist and will contact those on the waitlist as spots become available.

**Lecturers**

Lecturers in MLSS include:

Registration for the June 2019 Machine Learning Summer School closes on June 11, 2019 at 11:59pm. For help or for more information, contact Carolyn Mackman at carolyn.mackman@duke.edu.