PyCon 2012 - PyData sprint
Sandbox for collaborating during the PyCon sprint on issues related to distributed data analytics.
Stuff to investigate
- Distributed grid search
- Efficient data broadcasting on the local stores of the nodes using a memoization pattern a la joblib.Memory
- Distributed random forests
- Leverage data locality from disco's DFS in IPython parallel engines
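
To make the memoization item above more concrete, here is a minimal sketch of an on-disk cache in the spirit of joblib.Memory, written with the standard library only. It is not joblib's implementation: the names `disk_memoize`, `load_dataset`, and `calls` are hypothetical, and the cache key (a hash of the pickled arguments) is an assumption made for illustration. The idea is that each node keeps results in its local store, so data is only fetched or computed once per node.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


def disk_memoize(cache_dir):
    """Hypothetical decorator: cache results as pickle files under cache_dir."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    def decorator(func):
        def wrapper(*args, **kwargs):
            # Assumption: arguments are picklable, and their pickled form
            # together with the function name is a good enough cache key.
            payload = pickle.dumps((func.__name__, args, sorted(kwargs.items())))
            key = hashlib.sha1(payload).hexdigest()
            path = cache_dir / (key + ".pkl")
            if path.exists():
                # Cache hit: read the stored result from the local store.
                with path.open("rb") as f:
                    return pickle.load(f)
            # Cache miss: compute, then persist for later calls.
            result = func(*args, **kwargs)
            with path.open("wb") as f:
                pickle.dump(result, f)
            return result
        return wrapper
    return decorator


calls = []  # records real executions so the cache hit is observable


@disk_memoize(tempfile.mkdtemp())
def load_dataset(name):
    # Stand-in for an expensive fetch of a broadcast dataset.
    calls.append(name)
    return [len(name)] * 3


first = load_dataset("train")
second = load_dataset("train")  # served from disk, function body not re-run
```

In a real setting the cache directory would be a fixed path on each node's local disk rather than a temporary directory, and joblib.Memory itself handles large NumPy arrays far more efficiently than plain pickling.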