<img src='https://s3.amazonaws.com/drivendata-public-assets/logo-white-blue.png' width='600'> <br><br>

Box-Plots for Education

Goal of the Competition

Budgets for schools and school districts are huge, complex, and unwieldy. It's no easy task to digest where and how schools are using their resources. Education Resource Strategies is a non-profit that tackles just this task with the goal of letting districts be smarter, more strategic, and more effective in their spending.

Your task is a multi-class-multi-label classification problem with the goal of attaching canonical labels to the freeform text in budget line items. These labels let ERS understand how schools are spending money and tailor their strategy recommendations to improve outcomes for students, teachers, and administrators.
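To make "multi-class-multi-label" concrete: each line item receives several labels at once, and each label picks exactly one class from its own category. The sketch below is only an illustration. The category names, class names, and the `encode_targets` helper are hypothetical and are not taken from the competition data or from any winning solution.

```python
# Hypothetical label categories and classes, for illustration only.
LABEL_SPACE = {
    "Function": ["Instruction", "Transportation", "Facilities"],
    "Object_Type": ["Salary", "Supplies", "Benefits"],
}


def encode_targets(labels):
    """One-hot encode one chosen class per category into a flat 0/1 vector."""
    vector = []
    for category, classes in LABEL_SPACE.items():
        chosen = labels[category]
        if chosen not in classes:
            raise ValueError(f"unknown class {chosen!r} for {category!r}")
        vector.extend(1 if c == chosen else 0 for c in classes)
    return vector


# A single budget line item carries one class for every category.
row = {"Function": "Transportation", "Object_Type": "Salary"}
encoded = encode_targets(row)
print(encoded)  # [0, 1, 0, 1, 0, 0]
```

A model for this kind of problem would typically output one probability per position of such a vector, with the probabilities inside each category summing to one.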

What's in this Repository

This repository contains code volunteered by leading competitors in the Box-Plots for Education competition on DrivenData. Code for all winning solutions is open source under the MIT License.

Winning code for other DrivenData competitions is available in the competition-winners repository.

Winning Submissions

| Place | Team or User | Public Score | Private Score | Summary of Model |
| --- | --- | --- | --- | --- |
| 1 | quocnle | 0.3665 | 0.3650 | My model is based on Online Learning, specifically a Logistic Regression model that uses the hashing trick and stochastic gradient descent with an adaptive learning rate. |
| 2 | Abhishek | 0.4409 | 0.4388 | The problem was treated as an NLP problem rather than a machine learning problem with some structured dataset. |
| 3 | giba | 0.4551 | 0.4534 | My approach is based in a Gradient Boosted Machine, so all text must be converted to an identification id (number). |
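The first-place summary combines three ideas: the hashing trick, logistic regression trained online with stochastic gradient descent, and an adaptive learning rate. The following is a minimal sketch of how those pieces can fit together, not the winner's actual code. The bucket count, the per-weight step size `alpha / (1 + sqrt(sum of squared gradients))`, the toy training data, and the single binary target are all assumptions made for illustration. For a multi-label problem, one such model could be trained per target class.

```python
import math
import zlib

D = 2 ** 18   # number of hash buckets (assumed value)
ALPHA = 0.5   # base learning rate (assumed value)


def hash_features(text, d=D):
    """Hashing trick: map each token to a bucket index, no vocabulary needed."""
    tokens = text.lower().split()
    return sorted({zlib.crc32(tok.encode("utf-8")) % d for tok in tokens})


class OnlineLogisticRegression:
    def __init__(self, d=D, alpha=ALPHA):
        self.alpha = alpha
        self.w = [0.0] * d    # one weight per bucket
        self.g2 = [0.0] * d   # accumulated squared gradient per bucket

    def predict(self, idx):
        z = sum(self.w[i] for i in idx)
        z = max(min(z, 35.0), -35.0)  # clip to keep exp() well behaved
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, idx, y):
        """One SGD step on a single example with binary label y (0 or 1)."""
        g = self.predict(idx) - y  # log-loss gradient for each active bucket
        for i in idx:
            self.g2[i] += g * g
            # Adaptive rate: buckets updated often take smaller steps.
            self.w[i] -= self.alpha / (1.0 + math.sqrt(self.g2[i])) * g


# Toy data (made up): 1 = salary-related line item, 0 = not.
data = [
    ("teacher salary regular", 1),
    ("bus fuel diesel", 0),
    ("substitute teacher salary", 1),
    ("diesel fuel for bus", 0),
]

model = OnlineLogisticRegression()
for _ in range(5):                 # a few passes over the stream
    for text, label in data:
        model.update(hash_features(text), label)

p_salary = model.predict(hash_features("teacher salary"))
p_fuel = model.predict(hash_features("bus fuel"))
print(p_salary, p_fuel)
```

Because features are addressed by hash bucket, the model processes one line item at a time and its memory use is fixed by `D`, regardless of how many distinct tokens appear in the text.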

Winner's Interview: Quoc Le