awesome-dl-hw-resources
A curated list of awesome hardware/chip design resources for deep learning
Step 1: Get Inspired
Simon Knowles's talk on developing intelligent machines https://www.youtube.com/watch?v=tyW9x5ROl2E
Future of AI Hardware panel discussion https://vimeo.com/238818665
Existing lists:
https://amundtveit.com/2017/07/12/deep-learning-for-embedded-systems/
https://github.com/Piyush3dB/awesome-deep-computation
Energy Estimation
From Vivienne Sze's Lab https://energyestimation.mit.edu/
Machine Learning for Chip Design
- Self-Optimizing Memory Controllers: A Reinforcement Learning Approach https://people.inf.ethz.ch/omutlu/pub/rlmc_isca08.pdf
Chip Design for Machine Learning
Surveys:
Efficient Processing of Deep Neural Networks: A Tutorial and Survey (https://arxiv.org/abs/1703.09039)
Recent advances in efficient computation of deep convolutional neural networks https://link.springer.com/content/pdf/10.1631%2FFITEE.1700789.pdf
Dissertation: EFFICIENT METHODS AND HARDWARE FOR DEEP LEARNING https://stacks.stanford.edu/file/druid:qf934gh3708/EFFICIENT%20METHODS%20AND%20HARDWARE%20FOR%20DEEP%20LEARNING-augmented.pdf
Neural-inspired & neuromorphic computing http://www.sciencedirect.com/science/article/pii/S2212683X16300561
16 Views of Hot Chips ’17 http://www.eetimes.com/document.asp?doc_id=1332192
Papers
Google TPU1: https://arxiv.org/abs/1704.04760
Optimizing for Fisher's bound by bringing in HPC concepts on a chip https://arxiv.org/pdf/1705.05983.pdf
An Architecture to Accelerate Convolution in Deep Neural Networks https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8070363
Implementation
CNN Hardware Accelerator. http://cs231n.stanford.edu/reports/2017/pdfs/116.pdf https://github.com/kkiningh/cs231n-project
Lectures
Systolic Arrays https://www.youtube.com/watch?v=m_-zjdX7Lmw&t=2668s
Talks
Graphcore:
DeePhi:
- Efficient Methods and Hardware for Deep Learning https://www.youtube.com/watch?v=eZdOkDtYMoo&index=69
Google TPU:
- Dave Patterson's Berkeley Talk https://www.youtube.com/watch?v=fhHAArxwzvQ
- Jeff Dean's Systems & Machine Learning Talk https://www.youtube.com/watch?v=PWv4ROEvqmk
Nvidia:
- High-Performance Hardware for Machine Learning https://www.youtube.com/watch?v=6oofOSxwUvA
- Bill Dally's Talk https://www.youtube.com/watch?v=h3QKvUPg_AI
Companies
Graphcore:
- Preliminary IPU benchmarks https://www.graphcore.ai/posts/preliminary-ipu-benchmarks-providing-previously-unseen-performance-for-a-range-of-machine-learning-applications
Nvidia:
Wave Computing:
Microsoft:
- Brainwave Slides at https://www.microsoft.com/en-us/research/blog/microsoft-unveils-project-brainwave/?utm_source=t.co&utm_medium=referral
- At the edge https://www.youtube.com/watch?v=5ZDYWFXrhl8&t=176s
ThinCI:
- Graph Processor http://www.eetimes.com/document.asp?doc_id=1332176
- QnA http://www.eetimes.com/document.asp?doc_id=1332159
Baidu:
- XPU https://www.nextplatform.com/2017/08/22/first-look-baidus-custom-ai-analytics-processor/
- Mixed Precision Training https://arxiv.org/abs/1710.03740
Huawei:
DL on Embedded Devices
Anirudh Koul's presentation: https://www.slideshare.net/anirudhkoul/squeezing-deep-learning-into-mobile-phones
Pete Warden's blog https://petewarden.com/2017/06/22/what-ive-learned-about-neural-network-quantization/
Pete Warden's book: http://www.oreilly.com/data/free/building-mobile-applications-with-tensorflow.csp
Discussion of bandwidth and compute constraints on embedded devices https://www.youtube.com/watch?v=FATXK4yyaD0
Stick it
Movidius Neural Compute Stick http://uk.rs-online.com/web/p/processor-microcontroller-development-kits/1393655/
Quantization:
https://www.tensorflow.org/performance/quantization
Low Precision Math:
General Matrix Multiplication in Low Precision https://github.com/google/gemmlowp
Arm's Math Library http://arm-software.github.io/CMSIS_5/DSP/html/index.html
8-bit compression https://arxiv.org/abs/1511.04561
Self-Driving Car Compute
Daniel Rosenband's talk on the compute requirements of Google's (Waymo) self-driving cars https://www.youtube.com/watch?v=V_KLfSClcHg
Voyage's CEO Oliver Cameron's write-up on compute requirements: https://news.voyage.auto/under-the-hood-of-a-self-driving-car-78e8bbce62a6
Udacity Carla's internals https://medium.com/udacity/how-the-udacity-self-driving-car-works-575365270a40
DL Hardware Choice
Which GPU for deep learning? http://timdettmers.com/2017/04/09/which-gpu-for-deep-learning/