# Apache Spark 2 for Beginners

This is the code repository for Apache Spark 2 for Beginners, published by Packt. It contains all the supporting project files necessary to work through the book from start to finish.

## Instructions and Navigation

All of the code is organized into folders. Each folder starts with a number followed by the application name. For example, Chapter02.

## Software and Hardware List

| Chapter number | Software required (with version) | Free/Proprietary | If proprietary, can code testing be performed using a trial version | If proprietary, cost of the software | Download links to the software | Hardware specifications | OS required |
| --- | --- | --- | --- | --- | --- | --- | --- |
| All | Apache Spark 2.0.0 | Free | NA | NA | http://spark.apache.org/downloads.html | x86 | UNIX or Mac OS X |
| 6 | Apache Kafka 0.9.0.0 | Free | NA | NA | http://kafka.apache.org/downloads.html | x86 | UNIX or Mac OS X |

## Detailed installation steps (software-wise)

The following steps prepare the system environment for testing the code in the book.

### 1. Apache Spark

a. Download the Spark version mentioned in the table.<br>
b. Build Spark from source, or use the binary download, following the detailed instructions given at http://spark.apache.org/docs/latest/building-spark.html.<br>
c. If building Spark from source, make sure that the R profile is also built; the instructions for that are given in the link in step b.<br>

### 2. Apache Kafka

a. Download the Kafka version mentioned in the table.<br>
b. The "Quick Start" section of the Kafka documentation gives the instructions to set up Kafka: http://kafka.apache.org/documentation.html#quickstart<br>
c. Apart from the installation instructions, topic creation and the other Kafka setup prerequisites are covered in detail in the relevant chapter of the book.<br>
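The Kafka quickstart steps referenced in (b) and (c) boil down to a handful of commands. The sketch below assumes the commands are run from the root of an unpacked Kafka 0.9.0.0 download with its default configuration files; the topic name `spark-topic` is an illustration, not a name from the book, so substitute whatever topic the chapter specifies.

```shell
# Start ZooKeeper (ships with Kafka), then start the Kafka broker
bin/zookeeper-server-start.sh config/zookeeper.properties &
bin/kafka-server-start.sh config/server.properties &

# Create a topic for the streaming examples
# (the topic name "spark-topic" is hypothetical; use the name from the chapter)
bin/kafka-topics.sh --create --zookeeper localhost:2181 \
  --replication-factor 1 --partitions 1 --topic spark-topic

# Produce a few test messages from the command line;
# type messages and press Enter, Ctrl-C to stop
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic spark-topic
```

These are environment-setup commands against a locally running broker, so run them interactively rather than as a script.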

The code will look like the following:

```
Python 3.5.0 (v3.5.0:374f501f4567, Sep 12 2015, 11:00:19)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
```

Spark 2.0.0 or above must be installed, at least on a standalone machine, to run the code samples and to explore the subject further. For the Spark stream processing examples, Kafka needs to be installed and configured as the message broker, with its command-line producer producing messages and the application developed using Spark acting as the consumer of those messages.
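As a sketch of that producer/broker/consumer arrangement, the following hypothetical PySpark Streaming application consumes messages from Kafka and counts the words in each batch. It is not code from the book: the topic name, broker address, and batch interval are assumptions, and it requires the `spark-streaming-kafka-0-8` package on the classpath when submitted with `spark-submit`.

```python
# Minimal Kafka-consumer sketch using Spark Streaming (DStream API, Spark 2.0).
# Assumptions: Kafka broker on localhost:9092, topic "spark-topic",
# 10-second micro-batches. Adjust all three to match your setup.
from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils

sc = SparkContext(appName="KafkaWordCount")
ssc = StreamingContext(sc, 10)  # 10-second batch interval

# Connect directly to the broker; each record is a (key, value) pair
stream = KafkaUtils.createDirectStream(
    ssc, ["spark-topic"], {"metadata.broker.list": "localhost:9092"})

# Count words across the messages in each batch
counts = (stream.map(lambda kv: kv[1])
                .flatMap(lambda line: line.split(" "))
                .map(lambda word: (word, 1))
                .reduceByKey(lambda a, b: a + b))
counts.pprint()

ssc.start()
ssc.awaitTermination()
```

A submission command along these lines would pull in the Kafka connector (the artifact coordinates are for Spark 2.0.0 built against Scala 2.11):

```shell
spark-submit --packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.0.0 app.py
```

Messages typed into the console producer from the Kafka setup should then appear as word counts in the application's output every batch interval.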

## Related Products