spark-tests

Utilities for writing tests that use Apache Spark.

SparkSuite: a SparkContext for each test suite

Subclasses can add Spark configuration options by calling sparkConf(…), as KryoSparkSuite does:

sparkConf(
  // Register this class as its own KryoRegistrator
  "spark.kryo.registrator" → getClass.getCanonicalName,
  "spark.serializer" → "org.apache.spark.serializer.KryoSerializer",
  "spark.kryo.referenceTracking" → referenceTracking.toString,
  "spark.kryo.registrationRequired" → registrationRequired.toString
)
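The same hook can be used from an ordinary test suite. The following is a minimal sketch rather than the library's documented usage: the import path, the `sc` member, and the ScalaTest-style `test(…)` blocks are assumptions about what SparkSuite exposes, and `WordCountTest` and its settings are hypothetical.

```scala
// Assumed package path; check it against the version of spark-tests you depend on.
import org.hammerlab.spark.test.suite.SparkSuite

// Hypothetical suite: one SparkContext is shared by every test case in it.
class WordCountTest extends SparkSuite {

  // Extra Spark settings for this suite, passed as key/value pairs.
  sparkConf(
    "spark.ui.enabled" -> "false",
    "spark.default.parallelism" -> "4"
  )

  // `sc` is assumed to be the suite's SparkContext.
  test("counts repeated words") {
    val counts = sc.parallelize(Seq("a", "b", "a")).countByValue()
    assert(counts("a") == 2L)
    assert(counts("b") == 1L)
  }
}
```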

PerCaseSuite: a SparkContext for each test case

KryoSparkSuite

A SparkSuite implementation that provides hooks for Kryo registration:

register(
  classOf[Foo],
  "org.foo.Bar",
  classOf[Bar] → new BarSerializer
)
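The last entry pairs a class with a custom Kryo serializer. As an illustration of what such a serializer might look like, here is a sketch in which `Bar` and its fields are hypothetical. It uses the `Serializer` method signatures from Kryo 4, the version bundled with recent Spark releases; Kryo 5 changes the type of the `read` method's class parameter.

```scala
import com.esotericsoftware.kryo.{Kryo, Serializer}
import com.esotericsoftware.kryo.io.{Input, Output}

// Hypothetical class standing in for Bar.
case class Bar(id: Int, label: String)

// Writes the two fields in a fixed order and reads them back in the same order.
class BarSerializer extends Serializer[Bar] {
  override def write(kryo: Kryo, output: Output, bar: Bar): Unit = {
    output.writeInt(bar.id)
    output.writeString(bar.label)
  }

  override def read(kryo: Kryo, input: Input, cls: Class[Bar]): Bar = {
    val id = input.readInt()
    val label = input.readString()
    Bar(id, label)
  }
}

// Round trip outside Spark, handy for checking the serializer on its own.
object BarRoundTrip {
  def roundTrip(bar: Bar): Bar = {
    val kryo = new Kryo()
    kryo.register(classOf[Bar], new BarSerializer)

    val output = new Output(64, -1) // 64-byte initial buffer, no maximum size
    kryo.writeObject(output, bar)
    output.close()

    kryo.readObject(new Input(output.toBytes), classOf[Bar])
  }
}
```

Checking the round trip in a plain unit test first keeps serializer bugs separate from Spark job failures.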

KryoSparkSuite is also useful as a per-project base class: subclass it once, fill in the project's default Kryo registrations there, and have concrete tests extend that subclass. See hammerlab/guacamole and hammerlab/pageant for examples.
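A sketch of that layering, with every name other than KryoSparkSuite and register hypothetical and the import path assumed:

```scala
// Assumed package path; check it against the version of spark-tests you depend on.
import org.hammerlab.spark.test.suite.KryoSparkSuite

// Hypothetical domain classes that the project's jobs serialize.
case class Read(name: String, bases: String)
case class Variant(contig: String, pos: Long)

// Project-wide base suite: the default registrations live in one place.
abstract class MyProjectSuite extends KryoSparkSuite {
  register(
    classOf[Read],
    classOf[Variant]
  )
}

// Concrete suites extend the project base and inherit its registrations.
class ReadsTest extends MyProjectSuite {
  // Test cases that put Read and Variant values in RDDs go here;
  // no per-suite Kryo setup is needed.
}
```

Keeping the registrations in one base class means a newly serialized type is registered once rather than in every suite.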

Miscellaneous RDD / Job / Stage utilities