Clickstream & Social Analysis
This repository contains a couple of examples of using Apache Spark to process social media data (JSON) into an abstract 'Interaction' that we want to analyse.
Acquiring Data
The data used in these examples came from streams of Facebook data provided by Datasift. While we cannot redistribute the data used in our demonstration, you can acquire your own through Datasift for around $5 a day.
Running from Eclipse
If you'd like to import and use this project from Eclipse, make sure you have SBT 0.13+ installed and run the following:
sbt eclipse
This will generate the Eclipse project metadata and you can use File -> Import to load it into your workspace.
Building a JAR
You can also submit this job to Apache Spark as a JAR file, using sbt assembly to build the project and spark-submit to run the job on your existing cluster.
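As a rough sketch, the build-and-submit steps might look like the following. The main class, master URL, JAR path, and input path are all placeholders invented for illustration and are not taken from this repository; substitute the values for your own build and cluster. It also assumes the project's build already includes the sbt-assembly plugin, which provides the assembly task.

```shell
# Build a single "fat" JAR containing the job and its dependencies.
# sbt prints the path of the JAR it writes when the task finishes.
sbt assembly

# Submit the JAR to an existing cluster.
# com.example.InteractionJob      : hypothetical main class, use your own
# spark://your-master:7077        : hypothetical master URL
# target/scala-2.10/app-assembly.jar : hypothetical JAR path, use the one sbt printed
# /path/to/interactions.json      : hypothetical input argument for the job
spark-submit \
  --class com.example.InteractionJob \
  --master spark://your-master:7077 \
  target/scala-2.10/app-assembly.jar \
  /path/to/interactions.json
```

Any arguments placed after the JAR path are passed through to the job's main method, so whether an input path is needed there depends on how the job reads its data.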