JPMML-LightGBM
Java library and command-line application for converting LightGBM models to PMML.
Prerequisites
- LightGBM 2.0.0 or newer.
- Java 1.8 or newer.
Installation
Enter the project root directory and build using Apache Maven:
mvn clean install
The build produces a library JAR file pmml-lightgbm/target/pmml-lightgbm-1.5-SNAPSHOT.jar, and an executable uber-JAR file pmml-lightgbm-example/target/pmml-lightgbm-example-executable-1.5-SNAPSHOT.jar.
Usage
A typical workflow can be summarized as follows:
- Use LightGBM to train a model.
- Save the model to a text file in the local filesystem.
- Use the JPMML-LightGBM command-line converter application to convert this text file to a PMML file.
The LightGBM side of operations
Training a binary classification model using the Audit.csv dataset.
R language
library("lightgbm")
df = read.csv("Audit.csv", stringsAsFactors = TRUE)
# Three continuous features, followed by five categorical features
X = df[c("Age", "Hours", "Income", "Education", "Employment", "Gender", "Marital", "Occupation")]
y = df[["Adjusted"]]
cat_cols = c("Education", "Employment", "Gender", "Marital", "Occupation")
for(cat_col in cat_cols){
    X[[cat_col]] = as.numeric(X[[cat_col]])
}
audit.matrix = as.matrix(X)
audit.ds = lgb.Dataset(data = audit.matrix, label = y, categorical_feature = cat_cols)
audit.lgbm = lgb.train(params = list(objective = "binary"), data = audit.ds, nrounds = 131)
lgb.save(audit.lgbm, "LightGBMAudit.txt")
Python language
import lightgbm
import pandas
df = pandas.read_csv("Audit.csv")
# Three continuous features, followed by five categorical features
X = df[["Age", "Hours", "Income", "Education", "Employment", "Gender", "Marital", "Occupation"]].copy()
y = df["Adjusted"]
cat_cols = ["Education", "Employment", "Gender", "Marital", "Occupation"]
for cat_col in cat_cols:
    X[cat_col] = X[cat_col].astype("category")
audit_ds = lightgbm.Dataset(data = X, label = y, categorical_feature = cat_cols)
audit_booster = lightgbm.train({"objective" : "binary", "num_iterations" : 131}, audit_ds)
audit_booster.save_model("LightGBMAudit.txt")
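Before handing the text file to the converter, it can be useful to confirm that it contains what was intended. The following is a minimal sketch, not part of JPMML-LightGBM. It assumes that the saved file starts with a run of key=value lines (such as objective and feature_names) ahead of the first line starting with Tree=; that layout is an assumption about the LightGBM text format and may differ between LightGBM versions. The function names read_model_header and summarize_header are hypothetical.

```python
from pathlib import Path


def read_model_header(path):
    """Collect the key=value lines that precede the first tree block."""
    header = {}
    with Path(path).open(encoding="utf-8") as lines:
        for line in lines:
            line = line.strip()
            if line.startswith("Tree="):
                # Assumed marker for the start of the first tree definition
                break
            if "=" in line:
                key, _, value = line.partition("=")
                header[key] = value
    return header


def summarize_header(header):
    """Pick out the fields that are most useful for a quick sanity check."""
    return {
        "objective": header.get("objective"),
        # Assumption: feature names are stored as a space-separated list
        "feature_names": header.get("feature_names", "").split(),
    }
```

Calling summarize_header(read_model_header("LightGBMAudit.txt")) should then list the eight feature columns selected above, provided the file layout assumption holds.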
The JPMML-LightGBM side of operations
Converting the text file LightGBMAudit.txt to a PMML file LightGBMAudit.pmml:
java -jar pmml-lightgbm-example/target/pmml-lightgbm-example-executable-1.5-SNAPSHOT.jar --lgbm-input LightGBMAudit.txt --pmml-output LightGBMAudit.pmml
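When the conversion is one step of a Python training script, the same command can be launched from Python. The sketch below is an illustration only: it uses the JAR path and the two options shown in the command above, assumes that java is available on the PATH, and the helper names build_command and convert are hypothetical.

```python
import subprocess
from pathlib import Path

# JAR location as produced by the Maven build, relative to the project root
DEFAULT_JAR = "pmml-lightgbm-example/target/pmml-lightgbm-example-executable-1.5-SNAPSHOT.jar"


def build_command(lgbm_input, pmml_output, jar=DEFAULT_JAR):
    """Return the argument list for the converter, using only the two documented options."""
    return [
        "java", "-jar", str(jar),
        "--lgbm-input", str(lgbm_input),
        "--pmml-output", str(pmml_output),
    ]


def convert(lgbm_input, pmml_output, jar=DEFAULT_JAR):
    """Run the converter; raises CalledProcessError on a non-zero exit status."""
    if not Path(lgbm_input).is_file():
        # Fail early instead of starting a JVM for a missing input file
        raise FileNotFoundError(lgbm_input)
    subprocess.run(build_command(lgbm_input, pmml_output, jar), check=True)
    return Path(pmml_output)
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.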
Getting help:
java -jar pmml-lightgbm-example/target/pmml-lightgbm-example-executable-1.5-SNAPSHOT.jar --help
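After a conversion, a quick structural check of the resulting file can catch truncated or misdirected output. The following sketch uses only the Python standard library and relies on one fact about PMML documents in general: the root element is named PMML and carries a version attribute. It makes no claim about which model elements the converter emits, so it simply lists the top-level element names. The function name describe_pmml is hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the "{namespace}" prefix that ElementTree puts in front of tag names."""
    return tag.rpartition("}")[2]


def describe_pmml(path):
    """Parse a PMML file and report its version and top-level element names."""
    root = ET.parse(path).getroot()
    if local_name(root.tag) != "PMML":
        raise ValueError("Unexpected root element: " + root.tag)
    return {
        "version": root.get("version"),
        "children": [local_name(child.tag) for child in root],
    }
```

For example, describe_pmml("LightGBMAudit.pmml") returns a dictionary whose "children" entry can be inspected to see which sections are present.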
Documentation
- Stacking Scikit-Learn, LightGBM and XGBoost models
- Deploying LightGBM models on Java/JVM platform
- Extending Scikit-Learn with GBDT+LR ensemble models (Using LightGBM models on the GBDT side of GBDT+LR ensemble)
- Converting Scikit-Learn LightGBM pipelines to PMML
License
JPMML-LightGBM is licensed under the terms and conditions of the GNU Affero General Public License, Version 3.0.
If you would like to use JPMML-LightGBM in a proprietary software project, then it is possible to enter into a licensing agreement which makes JPMML-LightGBM available under the terms and conditions of the BSD 3-Clause License instead.
Additional information
JPMML-LightGBM is developed and maintained by Openscoring Ltd, Estonia.
Interested in using Java PMML API software in your company? Please contact info@openscoring.io