Spark

Course Info: University of California, Berkeley: Introduction to Big Data with Apache Spark

Prerequisites:
	- http://ai.berkeley.edu/tutorial.html#PythonBasics
	- http://www.mypythonquiz.com/

Course website (edX): https://courses.edx.org/dashboard

Piazza discussion group: https://piazza.com/edx_berkeley/summer2015/cs1001x

Course Content:

Week 1: Big Data and Data Science
	- Introduction to Big Data and Data Science
	- Performing Data Science and Preparing Data
	- Setting up the Course Software Environment

Week 2: Introduction to Apache Spark
	- Big Data, Hardware Trends, and the History of Apache Spark
	- Spark Essentials
	- Lab 1: Learning Apache Spark

Week 3: Data Management
	- Semi-Structured Data
	- Structured Data
	- Lab 2: Web Server Log Analysis with Apache Spark

Week 4: Data Quality, Exploratory Data Analysis, and Machine Learning
	- Data Quality
	- Exploratory Data Analysis
	- Machine Learning: Spark's machine learning library, MLlib
	- Lab 3: Text Analysis and Entity Resolution

Week 5: Data Management
	- Lab 4: Introduction to Machine Learning with Apache Spark