Home

Awesome

closest-pair

A framework for Apache Mesos that given a file with a list of 2D points, calculates the closest distance between any two points in that set. Implements a divide-and-conquer algorithm which determines the result in O(nlogn) time instead of brute-force O(n^2).

Based on Rendler implementation.

Closest-Pair consists of two main components:

Quick Start with Vagrant

Requirements

Start the mesos-demo VM

$ wget http://downloads.mesosphere.io/demo/mesos.box -O /tmp/mesos.box
$ vagrant box add --name mesos-demo /tmp/mesos.box
$ git clone https://github.com/chenlily/closest-pair
$ vagrant up

Execution:

$ vagrant ssh
vagrant@mesos:~ $ cd hostfiles

# Update install dependencies
vagrant@mesos:cpp $ sudo apt-get update
vagrant@mesos:cpp $ sudo apt-get install libcurl4-openssl-dev libboost-regex1.55-dev \
          libprotobuf-dev libgoogle-glog-dev protobuf-compiler

# Build
vagrant@mesos:cpp $ make all

# Start the scheduler with the test file and the mesos master ip
vagrant@mesos:cpp $ ./closest-pair-scheduler testfile 127.0.1.1:5050
# <Ctrl+C> to stop...

Shutting down the mesos-demo VM

# Exit out of the VM
vagrant@mesos:hostfiles $ exit
# Stop the VM
$ vagrant halt
# To delete all traces of the vagrant machine
$ vagrant destroy

Test File Format

Points can be either integers or floating point

x1 y1
x2 y2
x3 y3 

Closest-Pair Architecture

Closest-Pair Scheduler

Data Structures

A point is represented by a struct x and y coordinates.

struct Point(
  double x,y; 
)

A split on an array into its left and right components is represented by a SplitInfo. It contains the left and right indices of its subarray, a parent ID, the minimum distances on its left and right halves, and also whether the split is a left or right child to its parent. If the split is the original/root array, the direction is set as NONE.

struct SplitInfo (
  int left, right parent; 
  double lMinDist, rMinDist; 

  // enum Direction {LEFT, RIGHT, NONE}; 
  Direction dir;   
)

Scheduler Behavior

The scheduler accepts a text file as command-line parameter to create a vector of points to calculate the minimum distance.

Combiner Executor