18.337J/6.338J: Parallel Computing and Scientific Machine Learning (Spring 2023)
Professor Alan Edelman (and Philip the Corgi)
MW 3:00 to 4:30 @ Room 2-190
TA and Office hours: (To be confirmed)
Piazza Link
Canvas will be used only for homework and project (+proposal) submission and for lecture videos.
Classes are recorded and will be uploaded to Canvas. Another great resource is Chris Rackauckas' videos from the Spring 2021 class; see the SciMLBook.
Julia:
- Really nice Julia tutorial from the Fall 2022 class: Tutorial
- Optional Julia tutorial by Steven Johnson: Wed Feb 8 @ 5pm, virtually via Zoom. The recording will be posted.
  A basic overview of the Julia programming environment for numerical computations that we will use in this class for simple computational exploration. This (Zoom-based) tutorial will cover what Julia is and the basics of interaction, scalar/vector/matrix arithmetic, and plotting — we'll be using it as just a "fancy calculator" and no "real programming" will be required. A small taste of this usage appears below.
- Tutorial materials (and links to other resources)

If possible, try to install Julia on your laptop beforehand using the instructions at the above link. Failing that, you can run Julia in the cloud (see instructions above).
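For a taste of the "fancy calculator" usage the tutorial covers, here is a minimal illustration of our own (not the tutorial's code):

```julia
using LinearAlgebra

A = [1 2; 3 4]      # a 2×2 matrix
b = [5, 6]          # a vector
x = A \ b           # solve A*x = b
A * x ≈ b           # true: verify the solution
eigvals(A)          # eigenvalues of A
```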
Announcement:
There will be homeworks, followed by the final project. Everyone needs to present their work and submit a project report.
1-page Final Project proposal due: March 24
Final Project presentations : April 26 to May 15
Final Project reports due: May 15
Grading:
50% for problem sets, 10% for the final project proposal, and 40% for the final project. Problem sets and final projects will be submitted electronically.
HW
# | Notebook | Due |
---|---|---|
1 | HW1 (for matrix calculus problems, do not use indices) | Thursday Feb 16, 2023 |
2 | HW2 | Wednesday March 1, 2023 |
3 | HW3 | Wednesday March 15, 2023 |
4 | HW4 | Wednesday April 19, 2023 |
Lecture Schedule (tentative)
# | Day | Date | Topic | SciML lecture | Materials |
---|---|---|---|---|---|
1 | M | 2/6 | Intro to Julia. My Two Favorite Notebooks. | | [Julia is fast], [AutoDiff], [autodiff video] |
2 | W | 2/8 | Matrix Calculus I and The Parallel Dream | | See [IAP 2023 Class on Matrix Calculus], [handwritten notes], [The Parallel Dream] |
3 | M | 2/13 | Matrix Calculus II | | [handwritten notes], [Corgi in the Washing Machine], [2x2 Matrix Jacobians] |
4 | W | 2/15 | Serial Performance | 2 | [handwritten notes], [Serial Performance .jl file], [Loop Fusion Blog] |
5 | T | 2/21 | Intro to PINNs and Automatic Differentiation I: Forward Mode AD | 3 and 8 | ODE and PINNs, intro to PINNs handwritten notes, autodiff handwritten notes |
6 | W | 2/22 | Automatic Differentiation II: Reverse Mode AD | 10 | pinn.jl, reverse mode AD demo, handwritten notes |
7 | M | 2/27 | Dynamical Systems & Serial Performance on Iterations | 4 | Lorenz many ways, Dynamical Systems, handwritten notes |
8 | W | 3/1 | HPC & Threading | 5 and 6 | pi.jl, threads.jl, HPC Slides |
9 | M | 3/6 | Parallelism | | Parallelism in Julia Slides, reduce/prefix notebook |
10 | W | 3/8 | Prefix (and more) | | ppt slides, reduce/prefix notebook, ThreadedScans.jl, CUDA blog |
11 | M | 3/13 | Adjoint Method Example | 10 | Handwritten Notes |
12 | W | 3/15 | Guest Lecture - Chris Rackauckas | ||
13 | M | 3/21 | Vectors, Operators and Adjoints | | Handwritten Notes |
14 | W | 3/23 | Adjoints of Linear, Nonlinear, Ode | 11 | Handwritten Notes, 18.335 adjoint notes (Johnson) |
 | | | Spring Break | | |
15 | M | 4/3 | Guest Lecture, Billy Moses | | Enzyme AD |
16 | W | 4/5 | Guest Lecture, Keaton Burns | | Dedalus PDE Solver |
17 | M | 4/10 | Adjoints of ODEs | | Handwritten Notes |
18 | W | 4/12 | Partitioning | ||
 | M | 4/17 | Patriots' Day | | |
19 | W | 4/19 | Fast Multipole and Parallel Prefix | | Unfinished Draft |
20 | M | 4/24 | |||
21 | W | 4/26 | Project Presentation I | ||
22 | M | 5/1 | Project Presentation II | ||
23 | W | 5/3 | Project Presentation III | ||
24 | M | 5/8 | Project Presentation IV | ||
25 | W | 5/10 | Project Presentation V | ||
 | M | 5/15 | Class Cancelled | | |
Lecture Summaries and Handouts
Lecture 1: Syllabus, Introduction to Performance, Introduction to Automatic Differentiation
Setting the stage for this course, which involves high-performance computing, mathematics, and scientific machine learning, we looked at two introductory notebooks. The first, [Julia is fast](https://github.com/mitmath/18337/blob/master/lecture1/Julia%20is%20fast.ipynb), primarily reveals just how much performance languages like Python can leave on the table; many people never benchmark one language against another, so they are unlikely to be aware of the gap. The second, [AutoDiff](https://github.com/mitmath/18337/blob/master/lecture1/AutoDiff.ipynb), reveals the "magic" of forward-mode automatic differentiation, showing how a compiler can "rewrite" a program through the use of software overloading and still maintain performance. This is a whole new way to see calculus: not the way you learned it in a first-year class, and not finite differences either.
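To make the overloading idea concrete, here is a minimal dual-number sketch of forward-mode AD. This is our own illustration, not the notebook's code; the `Dual` type and `derivative` helper are made up for this example.

```julia
# A minimal forward-mode AD sketch via operator overloading (illustrative).
# A Dual carries a value and the derivative of the computation so far.
struct Dual
    val::Float64   # f(x)
    der::Float64   # f'(x)
end

import Base: +, -, *

# Arithmetic on Duals propagates derivatives by the usual calculus rules.
+(a::Dual, b::Dual) = Dual(a.val + b.val, a.der + b.der)
-(a::Dual, b::Dual) = Dual(a.val - b.val, a.der - b.der)
*(a::Dual, b::Dual) = Dual(a.val * b.val, a.der * b.val + a.val * b.der)  # product rule

# Mixed Dual/number operations treat the plain number as a constant (derivative 0).
+(a::Dual, b::Real) = Dual(a.val + b, a.der)
+(a::Real, b::Dual) = b + a
*(a::Dual, b::Real) = Dual(a.val * b, a.der * b)
*(a::Real, b::Dual) = b * a

# Seed der = 1 at x, run f on Duals, read off the derivative.
derivative(f, x) = f(Dual(float(x), 1.0)).der

derivative(x -> x*x + 3x, 2.0)   # d/dx (x^2 + 3x) at x = 2 gives 7.0
```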
Lecture 2: The Parallel Dream and Intro to Matrix Calculus
We gave an example, [The Parallel Dream](https://github.com/mitmath/18337/blob/master/lecture1/the_dream.ipynb), of the ideal linear speedup one hopes parallelism will deliver, and began the introduction to matrix calculus. A minimal threaded sketch of this "dream" appears below.
Lecture and Notes
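As a toy illustration of the "dream" (our own sketch, not the notebook's code; `pi_serial` and `pi_threaded` are made-up names), here is a loop whose work divides perfectly across threads. Start Julia with `julia -t 4` (or similar) to see the speedup:

```julia
using Base.Threads

# Serial baseline: estimate pi via the Leibniz series.
function pi_serial(n)
    s = 0.0
    for k in 0:n-1
        s += (-1)^k / (2k + 1)
    end
    return 4s
end

# The "dream": the same loop split across threads with one accumulator
# per thread, so there are no data races. (:static pins iterations to
# threads, keeping threadid() stable within the loop.)
function pi_threaded(n)
    partial = zeros(nthreads())
    @threads :static for k in 0:n-1
        partial[threadid()] += (-1)^k / (2k + 1)
    end
    return 4 * sum(partial)
end

pi_serial(10^7) ≈ pi_threaded(10^7)   # same answer, ideally nthreads()× faster
```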
Homeworks
HW1 will be due Thursday, Feb 16. This is really just a getting-started homework.
Final Project
For the second half of the class, students will work on the final project. A one-page final project proposal must be submitted by Friday, March 24, through Canvas.
The last three weeks (tentative) will be student presentations.
Possible Project Topics
Here's a list of current projects of interest to the Julia Lab.
One possibility is to review an interesting algorithm not covered in the course and develop a high performance implementation. Some examples include:
- High performance PDE solvers for specific PDEs like Navier-Stokes
- Common high performance algorithms (Ex: Jacobian-Free Newton Krylov for PDEs)
- Recreation of a parameter sensitivity study in a field like biology, pharmacology, or climate science
- Augmented Neural Ordinary Differential Equations
- Neural Jump Stochastic Differential Equations
- Parallelized stencil calculations (a toy sketch follows this list)
- Distributed linear algebra kernels
- Parallel implementations of statistical libraries, such as survival statistics or linear models for big data. Here's one example parallel library and a second example.
- Parallelization of data analysis methods
- Type-generic implementations of sparse linear algebra methods
- A fast regex library
- Math library primitives (exp, log, etc.)
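For a flavor of the stencil item above, here is a minimal threaded 1-D stencil sketch of our own (the function name and parameters are made up for illustration): one explicit finite-difference step of the heat equation, parallelized with `Threads.@threads`.

```julia
using Base.Threads

# One explicit heat-equation step: unew[i] = u[i] + α*(u[i-1] - 2u[i] + u[i+1]).
# Each interior point depends only on the old array u, so the loop
# parallelizes trivially across threads.
function heat_step!(unew, u, α)
    @threads for i in 2:length(u)-1
        unew[i] = u[i] + α * (u[i-1] - 2u[i] + u[i+1])
    end
    unew[1] = u[1]; unew[end] = u[end]   # fixed (Dirichlet) boundaries
    return unew
end

u = zeros(1024); u[512] = 1.0            # a spike of heat in the middle
heat_step!(similar(u), u, 0.25)
```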
Another possibility is to work on state-of-the-art performance engineering: implementing a new auto-parallelization or performance enhancement. For these types of projects, implementing an application for benchmarking is not required; instead, one can benchmark the effects on already existing code to find cases where the change is beneficial (or leads to performance regressions). Possible examples are:
- Create a system for automatic multithreaded parallelism of array operations and see what kinds of packages end up more efficient
- Set up BLAS with a PARTR backend and investigate the downstream effects on multithreaded code like an existing PDE solver
- Investigate the effects of work-stealing in multithreaded loops
- Fast parallelized type-generic FFT. Starter code by Steven Johnson (creator of FFTW) and Yingbo Ma can be found here
- Type-generic BLAS. Starter code can be found here
- Implementation of parallelized map-reduce methods. For example, a `pmapreduce` extension to `pmap` that adds a parallelized reduction, or a fast GPU-based map-reduce (see the sketch after this list).
- Investigating auto-compilation of full package codes to GPUs using tools like CUDAnative and/or GPUifyLoops.
- Investigating alternative implementations of databases and dataframes. NamedTuple backends of DataFrames, alternative type-stable DataFrames, defaults for CSV reading and other large-table formats like JuliaDB.
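As a hedged sketch of the `pmapreduce` idea from the list above (the function body is our own illustration, not an existing API):

```julia
using Distributed
addprocs(4)   # worker count is illustrative

# Minimal pmapreduce sketch: pmap distributes f over the workers,
# then the mapped results are reduced with op on the master process.
# (A real implementation would also reduce on the workers, to avoid
# shipping every mapped value back.)
pmapreduce(f, op, xs) = reduce(op, pmap(f, xs))

pmapreduce(x -> x^2, +, 1:10_000)   # parallel sum of squares
```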
Additionally, Scientific Machine Learning is a wide-open field with lots of low-hanging fruit. Instead of a review, a suitable research project can be chosen for the final project. Possibilities include:
- Acceleration methods for adjoints of differential equations
- Improved methods for Physics-Informed Neural Networks
- New applications of neural differential equations
- Parallelized implicit ODE solvers for large ODE systems
- GPU-parallelized ODE/SDE solvers for small systems