OnlineLearning

An implementation of online mini-batch learning for prediction in Julia.

Learners

A Learner is fit by repeated calls to update!(l::Learner, x::DSMat{Float64}, y::Vector{Float64}) on mini-batches (x, y) of a dataset. Each update incrementally optimizes some loss function; which loss is used depends on the concrete subtype of Learner. The actual optimization routine is implemented by an AbstractSGD object.

Values of the outcome are predicted with predict(l::Learner, x::DSMat{Float64}). The predict!(obj::Learner, pr::Vector{Float64}, x::DSMat{Float64}) method calculates predictions in place.

Features (x) can be either a dense or sparse matrix. (DSMat{T} is an alias for DenseMatrix{T} or SparseMatrixCSC{T, Ti <: Integer}.) A usage sketch follows.
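As a rough illustration, here is a minimal mini-batch training loop using this API. The constructor names LogisticModel, SimpleSGD, and GLMLearner are assumptions made for the sketch and may not match the package's exact constructors:

```julia
using OnlineLearning

# NOTE: LogisticModel, SimpleSGD, and GLMLearner below are assumed names for
# illustration; check the package for the exact constructors.
model     = LogisticModel()              # a GLMModel specifying the loss
optimizer = SimpleSGD(1.0, 1.0)          # an AbstractSGD holding tuning parameters
learner   = GLMLearner(model, optimizer)

# A toy data stream: ten mini-batches of 100 observations and 5 features each.
minibatches = [(randn(100, 5), rand(0.0:1.0, 100)) for _ in 1:10]

for (x, y) in minibatches
    update!(learner, x, y)               # one incremental optimization step per batch
end

xnew = randn(20, 5)                      # features may be dense or sparse (DSMat{Float64})
pr = predict(learner, xnew)              # allocate and return predictions
predict!(learner, pr, xnew)              # or compute them in place into `pr`
```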

Available learners

The type of GLM is specified by GLMModel. Choices are:

Optimization

All of the learners require an optimizer. Currently, stochastic-gradient-descent-style methods are provided by subtypes of AbstractSGD.

An AbstractSGD implements an update!(obj::AbstractSGD{Float64}, weights::Vector{Float64}, gr::Vector{Float64}) method. This takes the current value of the weight (coefficient) vector and the gradient, and updates the weight vector in place. The AbstractSGD instance stores tuning parameters and step information, and may allocate additional storage if necessary.
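To make the contract concrete, here is a minimal sketch of what a custom optimizer satisfying this interface could look like. It is not one of the optimizers shipped with the package, and the type and field names are assumptions:

```julia
# Illustrative only: a constant-step-size SGD written against the interface
# described above. The type and field names are assumptions, not part of the package.
import OnlineLearning: AbstractSGD, update!   # assuming these can be imported for extension

mutable struct ConstantSGD <: AbstractSGD{Float64}
    eta::Float64   # fixed step size (tuning parameter)
    t::Int         # number of updates performed so far (step information)
end

ConstantSGD(eta::Float64) = ConstantSGD(eta, 0)

function update!(obj::ConstantSGD, weights::Vector{Float64}, gr::Vector{Float64})
    obj.t += 1
    # Move the weights a fixed step against the gradient, modifying them in place.
    for i in eachindex(weights)
        weights[i] -= obj.eta * gr[i]
    end
    return weights
end
```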

Available optimizers:

Notes

This is a work in progress. Most testing has been in simulations rather than with real data. GLMLearner and GLMNetLearner with l_2 regularization seem to work pretty well. GLMNetLearner with l_1 regularization has not been thoroughly tested. Statistical performance tends to be quite sensitive to the choice of optimizer and tuning parameters.

TODO