Awesome Spark
Course Info: University of California, Berkeley: Introduction to Big Data with Apache Spark
Prerequisites: http://ai.berkeley.edu/tutorial.html#PythonBasics http://www.mypythonquiz.com/
University website: https://courses.edx.org/dashboard
Piazza discussion group: https://piazza.com/edx_berkeley/summer2015/cs1001x
Course Content:
Week 1: Big Data and Data Science
- Introduction to Big Data and Data Science
- Performing Data Science and Preparing Data
- Setting up the Course Software Environment
Week 2: Introduction to Apache Spark
- Big Data, Hardware Trends, and the History of Apache Spark
- Spark Essentials
- Lab 1: Learning Apache Spark
Week 3: Data Management
- Semi-Structured Data
- Structured Data
- Lab 2: Web Server Log Analysis with Apache Spark
Week 4: Data Quality, Exploratory Data Analysis, and Machine Learning
- Data Quality
- Exploratory Data Analysis
- Machine Learning: Spark's machine learning library, MLlib
- Lab 3: Text Analysis and Entity Resolution
Week 5: Data Management
- Lab 4: Introduction to Machine Learning with Apache Spark