awesome-ApacheSpark-collections or Awesome Spark

Bookkeeping of Apache Spark web searches!
Also a curated list of awesome Apache Spark packages and resources.

Other GitHub awesome links

Online Free Clusters

Notebooks and IDEs

Books on Apache Spark

Blogs

Must Read list

Introduction

Spark + Hadoop

Spark Internals

SparkSQL

Streaming

Spark on GPU / DeepLearning

Tips & Tricks

  1. http://blog.smaato.com/tuning-spark-streaming-applications/

Spark Packages

Videos on Apache Spark

Channels

Playlists

GitHub Projects - Ever-Growing List!

Setup

  1. https://github.com/clearstorydata-cookbooks/apache_spark
  2. https://github.com/gwik/spark-cookbook
  3. https://github.com/azavea/ansible-spark
  4. https://github.com/tzolov/apache-spark-build-pipeline
  5. https://github.com/aur-atomica-net/apache-spark
  6. https://github.com/GELOG/docker-ubuntu-spark
  7. https://github.com/kbastani/spark-neo4j

Spark Internals

  1. https://github.com/JerryLead/SparkInternals

Spark Learning/Workshop

  1. https://github.com/Mageswaran1989/aja
  2. https://github.com/deanwampler/spark-workshop
  3. https://github.com/ceteri/spark-exercises
  4. https://github.com/lenards/explore-spark
  5. https://github.com/seglo/learning-spark
  6. https://github.com/ceteri/intro_spark
  7. https://github.com/HadoopTW/CS100.1x
  8. https://github.com/EvanZ/myvagrant
  9. https://github.com/zfz/spark-cs100.1x
  10. https://github.com/StephenHarrington/spark
  11. https://github.com/gudiseva/Spark
  12. https://github.com/hoangtamvo/spark
  13. https://github.com/okaram/spark
  14. https://github.com/linshiu/spark
  15. https://github.com/jingjinggu/Apache_Spark
  16. https://github.com/aur-atomica-net/apache-spark
  17. https://github.com/dhesse/SparkTalk
  18. https://github.com/adamliesko/bigdata-spark
  19. https://github.com/skrusche63/spark-connect
  20. https://github.com/spirom/LearningSpark

Spark

  1. https://github.com/hohonuuli/sparknotebook
  2. https://github.com/googlegenomics/spark-examples
  3. https://github.com/sujee81/SparkApps
  4. https://github.com/praveensripati/spark-examples
  5. https://github.com/jdutton/spark-playground
  6. https://github.com/arjones/spark-news
  7. https://github.com/felixcheung/spark-notebook-examples
  8. https://github.com/manku-timma/spark
  9. https://github.com/joseratts/Spark
  10. https://github.com/giocode/SparkTutorial
  11. https://github.com/eenov8/apacheSpark
  12. https://github.com/yu-iskw/spark-dataframe-introduction
  13. https://github.com/rajanpupa/ApacheSparkExample
  14. https://github.com/XD-DENG/Spark-practice

Streaming

  1. https://github.com/prabeesh/SparkTwitterAnalysis
  2. https://github.com/cotdp/spark-example-clickstream-social
  3. https://github.com/ippontech/metrics-spark-receiver
  4. https://github.com/aleph-w/ApacheSparkLearning

SQL

  1. https://github.com/rnamboodiri/spark-cassandra-integrations
  2. https://github.com/choi258/Spark_apache

MLlib

  1. https://github.com/OndraFiedler/spark-recommender
  2. https://github.com/marklit/recommend
  3. https://github.com/staple/spark-agd
  4. https://github.com/tizfa/sparkboost
  5. https://github.com/rahmanusta/Spark-Bayes
  6. https://github.com/spacedotworks/decisiontree_ApacheSpark

Spark Machine Learning

  1. https://github.com/PredictionIO/PredictionIO
  2. https://github.com/BaiGang/spark_multiboost
  3. https://github.com/alitouka/spark_dbscan
  4. https://github.com/amplab/keystone
  5. https://github.com/krasserm/akka-analytics

Spark Streaming

  1. https://github.com/miguno/kafka-storm-starter
  2. https://github.com/killrweather/killrweather
  3. https://github.com/NFLabs/ambari
  4. https://github.com/rustyrazorblade/killranalytics

Spark + Visualization

  1. https://github.com/FRosner/spawncamping-dds

Spark + WebServer

  1. https://github.com/calrissian/spark-jetty-server

Spark + REST

  1. https://github.com/spark-jobserver/spark-jobserver

Spark + Cassandra

  1. https://github.com/datastax/spark-cassandra-connector

Spark + NoSQL datastore

  1. https://github.com/Stratio/deep-spark
  2. https://github.com/RussellSpitzer/spark-cassandra-csv
  3. https://github.com/haosdent/spark-hbase

Spark + Elasticsearch

  1. https://github.com/skrusche63/spark-elastic
  2. https://github.com/mhausenblas/elsa
  3. https://github.com/SHSE/spark-es

Spark + Azure + PowerBI

  1. https://github.com/granturing/spark-power-bi

Spark + Genomics

  1. https://github.com/bigdatagenomics/adam

Spark + Ruby

  1. https://github.com/ondra-m/ruby-spark

Useful Add-ons

  1. https://github.com/amplab/spark-indexedrdd
  2. https://github.com/mrsqueeze/spark-hash
  3. https://github.com/simplymeasured/phoenix-spark
  4. https://github.com/calrissian/spark-jetty-server
  5. https://github.com/cloudera/spark-timeseries
  6. https://github.com/skrusche63/spark-weblog

Tools

  1. https://github.com/andypetrella/spark-notebook
  2. https://github.com/ibm-et/spark-kernel
  3. https://github.com/mraad/SparkProject
  4. https://github.com/saurfang/sbt-spark-submit