Home


A tool to evaluate the performance of various machine learning algorithms and preprocessing steps to find a good baseline for a given task.

Installation

pip install evaluate

Example

from evaluate import evaluate
from sklearn import datasets

data = datasets.load_iris()
x, y = data.data, data.target

results = evaluate(task='classification', data=(x, y))
results['test_score'].plot.bar()

Documentation

This tool applies common preprocessing steps, such as feature scaling and one-hot encoding, and runs a range of ML algorithms, such as Random Forests and SVMs. It then scores each combination of preprocessing step and ML algorithm. Use these scores to quickly identify the preprocessing steps and algorithms that perform well, and take them as a baseline from which to develop better models.
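
As a rough illustration of the idea (not the library's actual implementation), the sketch below crosses a few preprocessing steps with a few scikit-learn estimators and collects the scores into one DataFrame per metric. The helper name `baseline_grid`, the use of `cross_validate`, and the particular preprocessors and estimators are assumptions made for this example.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler

KEYS = ("test_score", "train_score", "fit_time", "score_time")


def baseline_grid(x, y, preprocessors, estimators, cv=3):
    """Hypothetical helper: score every (estimator, preprocessor) pair."""
    # One table per metric: estimators as rows, preprocessors as columns.
    tables = {
        key: pd.DataFrame(
            index=list(estimators), columns=list(preprocessors), dtype=float
        )
        for key in KEYS
    }
    for est_name, estimator in estimators.items():
        for prep_name, preprocessor in preprocessors.items():
            pipeline = make_pipeline(preprocessor, estimator)
            cv_result = cross_validate(
                pipeline, x, y, cv=cv, return_train_score=True
            )
            # Average each metric over the cross-validation folds.
            for key in KEYS:
                tables[key].loc[est_name, prep_name] = cv_result[key].mean()
    return tables


x, y = load_iris(return_X_y=True)
tables = baseline_grid(
    x,
    y,
    # FunctionTransformer() with no function passes the data through unchanged.
    preprocessors={"raw": FunctionTransformer(), "scaled": StandardScaler()},
    estimators={
        "dummy": DummyClassifier(),
        "logistic": LogisticRegression(max_iter=1000),
    },
)
print(tables["test_score"])
```

Including a dummy estimator in the grid gives a floor against which the other scores can be read.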

evaluate(task,
         data,
         test_data=.2,
         columns=None,
         preprocessors=None,
         estimators=None)

Returns

A dictionary of pandas DataFrames with the following keys. Each DataFrame has estimator names as its index and preprocessor names as its columns:

{
    'test_score': ...,
    'train_score': ...,
    'fit_time': ...,
    'score_time': ...,
}

import pandas
from evaluate import evaluate

results = evaluate(...)
assert isinstance(results, dict)
scores = results['test_score']
assert isinstance(scores, pandas.DataFrame)
scores.plot.bar()
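
Because each value is an ordinary DataFrame, the usual pandas tools can be used to pick out the strongest combinations. The sketch below works on a hand-built stand-in for `results['test_score']`; the numbers in it are placeholders invented for the illustration, not output from the tool.

```python
import pandas as pd

# Stand-in for results['test_score']; the values are made up for illustration.
scores = pd.DataFrame(
    {"n,c,o": [0.90, 0.85, 0.33], "n:s,c,o": [0.91, 0.94, 0.33]},
    index=["RandomForestClassifier", "SVC", "DummyClassifier"],
)

# Best preprocessor for each estimator, and the score it reached.
best_per_estimator = pd.DataFrame(
    {"preprocessor": scores.idxmax(axis=1), "test_score": scores.max(axis=1)}
).sort_values("test_score", ascending=False)

# Single best (estimator, preprocessor) pair across the whole table.
best_estimator = scores.max(axis=1).idxmax()
best_preprocessor = scores.loc[best_estimator].idxmax()

print(best_per_estimator)
print(best_estimator, best_preprocessor)
```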

Preprocessors

Available Preprocessors

| Name  | Column Type | Description                                            |
|-------|-------------|--------------------------------------------------------|
| n     | numeric     | Handle missing data                                    |
| n:s   | numeric     | Standardize features                                   |
| c     | categorical | Handle missing data and perform one-hot encoding       |
| o     | ordinal     | Handle missing data and perform ordinal encoding       |
| t:c   | text        | Convert to a matrix of token counts                    |
| t:c=2 | text        | Convert to a matrix of token counts including bigrams  |
| t:t   | text        | Convert to a matrix of TF-IDF features                 |
| t:t=2 | text        | Convert to a matrix of TF-IDF features including bigrams |

Multiple preprocessors can be combined into one by separating their names with a comma (,):

results = evaluate(..., preprocessors=['n,c,o', 'n:s,c,o'])
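
To make the naming scheme concrete, the sketch below splits a combined name into its parts. It is not part of the library: the `split_spec` helper, the `Step` tuple, and the assumed grammar (`type[:variant[=option]]`, inferred from the names listed under Available Preprocessors) are all hypothetical.

```python
from typing import List, NamedTuple, Optional

# Column-type prefixes as listed under Available Preprocessors.
COLUMN_TYPES = {"n": "numeric", "c": "categorical", "o": "ordinal", "t": "text"}


class Step(NamedTuple):
    column_type: str        # e.g. 'numeric' for the 'n' prefix
    variant: Optional[str]  # e.g. 's' in 'n:s'
    option: Optional[str]   # e.g. '2' in 't:c=2'


def split_spec(spec: str) -> List[Step]:
    """Hypothetical helper: split 'n:s,c,o' into one Step per preprocessor."""
    steps = []
    for name in spec.split(","):
        prefix, _, rest = name.strip().partition(":")
        variant, _, option = rest.partition("=")
        if prefix not in COLUMN_TYPES:
            raise ValueError(f"unknown preprocessor prefix: {prefix!r}")
        steps.append(Step(COLUMN_TYPES[prefix], variant or None, option or None))
    return steps


for step in split_spec("n:s,c,o"):
    print(step)
```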

Custom Preprocessors

Add custom preprocessors as follows:

from evaluate import evaluate, Preprocessors

preprocessors = Preprocessors()
preprocessors.add('custom_preprocessor', CustomPreprocessor())
results = evaluate(..., preprocessors=preprocessors)

The name of each custom preprocessor must be unique.
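
The snippet above leaves `CustomPreprocessor` undefined. Assuming the tool accepts any object that follows the scikit-learn transformer interface (`fit` and `transform`), which the text here does not confirm, a minimal version might look like the following. The clipping behaviour is just an example.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin


class CustomPreprocessor(TransformerMixin, BaseEstimator):
    """Example transformer: clip each feature to percentiles seen during fit."""

    def __init__(self, lower=1.0, upper=99.0):
        self.lower = lower
        self.upper = upper

    def fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        # Learn per-feature clipping bounds from the training data only.
        self.low_ = np.percentile(X, self.lower, axis=0)
        self.high_ = np.percentile(X, self.upper, axis=0)
        return self

    def transform(self, X):
        return np.clip(np.asarray(X, dtype=float), self.low_, self.high_)


X = np.array([[0.0], [1.0], [2.0], [100.0]])
clipped = CustomPreprocessor(lower=0, upper=50).fit_transform(X)
print(clipped.ravel())
```

Inheriting from `BaseEstimator` provides `get_params` and `set_params`, so scikit-learn utilities such as `clone` and `Pipeline` can handle the object.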

Estimators

Available Estimators

| Classification             | Regression                |
|----------------------------|---------------------------|
| XGBClassifier              | XGBRegressor              |
| LGBMClassifier             | LGBMRegressor             |
| RandomForestClassifier     | RandomForestRegressor     |
| SVC                        | SVR                       |
| LogisticRegression         | LinearRegression          |
| KNeighborsClassifier       | KNeighborsRegressor       |
| AdaBoostClassifier         | AdaBoostRegressor         |
| ExtraTreesClassifier       | ExtraTreesRegressor       |
| GradientBoostingClassifier | GradientBoostingRegressor |
| DecisionTreeClassifier     | DecisionTreeRegressor     |
| DummyClassifier            | DummyRegressor            |

Custom Estimators

Add custom estimators as follows:

from evaluate import evaluate, Estimators

estimators = Estimators(task='classification')
estimators.add('custom_estimator', CustomEstimator())
results = evaluate(..., estimators=estimators)

The name of each custom estimator must be unique.
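
`CustomEstimator` is likewise left undefined above. Assuming a scikit-learn-compatible estimator (`fit`, `predict`, `score`) is what the tool expects, which is an assumption rather than something stated here, a minimal classifier could be sketched as follows. The nearest-centroid rule is chosen only to keep the example short.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin


class CustomEstimator(ClassifierMixin, BaseEstimator):
    """Example classifier: predict the class whose training mean is closest."""

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes_ = np.unique(y)
        # One centroid (mean feature vector) per class.
        self.centroids_ = np.stack(
            [X[y == label].mean(axis=0) for label in self.classes_]
        )
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        # Euclidean distance from every sample to every centroid.
        distances = np.linalg.norm(
            X[:, None, :] - self.centroids_[None, :, :], axis=2
        )
        return self.classes_[distances.argmin(axis=1)]


X_train = np.array([[0, 0], [0, 1], [10, 10], [10, 11]])
y_train = np.array([0, 0, 1, 1])
model = CustomEstimator().fit(X_train, y_train)
predictions = model.predict(np.array([[1, 1], [9, 9]]))
print(predictions)
```

`ClassifierMixin` supplies a default `score` method that reports accuracy, so only `fit` and `predict` need to be written by hand.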