Awesome
ML.NET on Excel
This project showcases the consumption of ML.NET models in Excel as an add-in.
Statement of Purpose
This repository aims to grow the understanding of consuming ML.NET models in Excel by providing an example of implementation and references to support the Microsoft Build conference in 2022. It is not intended to be a released product. Therefore, this repository is not for discussing ML.NET, Excel add-in or requesting new features.
Table of Contents
- Project Structure
- Prerequisites
- Instructions for running the application
- Next Steps: Train your own model
- FAQ/Troubleshooting
Project Structure
- src
- MLModels: .NET Class Library used to train models
- MLBlazorExcelAddIn: Blazor WebAssembly application containing the Excel add-in.
- wwwroot/functions.json: Metadata about the custom functions contained in your add-in.
- js/interop.js: Code to set up JavaScript interop with Blazor and get data from Excel.
- Pages/Index.cshtml: Main page containing the UI displayed in the addin as well as the code to make predictions using ML.NET.
Prerequisites
-
Azure Storage Account with static website hosting enabled
RECOMMENDED: To simplify the process of setting up an Azure Storage account with static website hosting, use this ARM template:
Using the ARM template, create a new resource group or provide an existing one for your storage account to be placed in. Leave the defaults for the rest of the fields.
If you prefer to set up the storage account and static website manually, use the following guides to:
-
(Optional) Visual Studio 2022
Instructions
These instructions will help you get this project up and running.
Upload model to Azure Storage Account
-
In your Azure Storage account, create a new container called
models
using Azure Storage Explorer. For more information on how to create a container using Azure Storage Explorer, see Create a container. -
Set the public access level of the container to Public read access for blobs only. To do so using Azure Storage Explorer:
-
Right-click the
models
container and select Set Public Access Level...*. -
In the Set Container Public Access dialog, choose Public read access for blobs only.
-
Select Apply
-
-
Upload the IrisClassification.zip file in the MLModels project directory to the
models
container. Once uploaded, enable blob storage. For more information on uploading files to a container, see Upload blobs to the container. -
Copy the URL of your IrisClassification.zip blob to your clipboard. To do so in Azure Storage Explorer, right-click your blob and select Copy URL.
-
Open the Index.razor in the MLBlazorExcelAddIn/Pages directory in your preferred text editor.
-
Update the
_modelUrl
value with the URL you copied in previous steps._modelUrl = "YOUR-MODEL-BLOB-URL";
Update the add-in manifest.xml
- Open the manifest.xml file in the MLBlazorExcelAddIn/wwwroot directory in your preferred text editor.
- Find the text "STATIC-WEBSITE-URL" and replace all instances with the URL of your static website. For more information on finding your static website URL, see Find the website URL.
Build the MLBlazorExcelAddIn project
-
Open a terminal and navigate to the MLBlazorExcelAddIn project directory.
cd src/MLBlazorExcelAddIn
-
Enter the following command to build and publish the project:
dotnet publish -c Release
The output from the build process is stored in the MLBlazorExcelAddIn/bin/Release/net6.0/publish/wwwroot directory.
MacOS - If you run into an error similar to "The "GetFileHash" task failed unexpectedly" building and publishing your application, see the FAQ section.
Upload Excel Add-In to static website
Use Azure Storage Explorer to upload the contents from the MLBlazorExcelAddIn/bin/Release/net6.0/publish/wwwroot directory to the $web
Azure Blob Storage container. For more information on uploading files to your Azure static website container using Azure Storage Explorer, see Upload blobs to the container.
The contents of your $web
container should look similar to the following.
Configure CORS for your static website
-
Navigate to your Azure Storage resource in the Azure Portal.
-
Open the Settings > Resource sharing (CORS) page.
-
Select the Blob service tab.
-
Add entries for the following origins:
Allowed origins Allowed methods YOUR-STATIC-WEBSITE-URL GET 0.0.0.0 GET https://ppc-excel.officeapps.live.com GET Replace
YOUR-STATIC-WEBSITE-URL
with the URL you used in the manifest.xml file.NOTE: Make sure to remove the last backslash from the URL (i.e. https://<storage-acct-name>.z14.web.core.windows.net/ should be https://<storage-acct-name>.z14.web.core.windows.net)
Upload add-in
-
In your browser, navigate to excel.office.com.
-
Log in using your Microsoft Account.
-
Create a new blank workbook. For more information on how to create a new blank workbook, see Create a workbook in Excel for the web.
-
In the workbook, open the Office Add-ins dialog by selecting Insert > Office Add-Ins in the ribbon pane.
-
In the Office Add-Ins dialog, select Upload My Add-in to open up the upload add-in dialog.
-
Select Browse and use the file explorer to choose the manifest.xml file.
-
Select Upload.
After a few seconds, your add-in is loaded. For more information on uploading your add-in, see Sideload an Office Add-in in Office on the web manually
Use the add-in to make predictions
-
Add the following data to the first row of your workbook.
A B C D 5.1 3.5 1.4 0.2 -
Use the
MLNET.IRIS.PREDICT
custom function defined by your add-in to make a predictions using the data you previously entered as inputs.
Your prediction should display in the cell where you used the custom function.
Congratulations! You've now used an ML.NET model to make predictions in Excel!
Clean up resources
Remove Excel Add-In
To remove the Excel add-in from your account, clear your browser cache.
This sample was tested using Microsoft Edge. For more information on clearing your cache on Microsoft Edge, see View and delete browser history in Microsoft Edge
Delete Azure resources
The Azure resources that you created in this sample can incur ongoing costs. To avoid such costs, delete the resource group that contains all those resources. For more information on deleting resource groups, see Delete resource groups.
Next Steps: Train your own model
Now that you've used the sample to make predictions with a pretrained ML.NET model in Excel, it's time to train your own model. The machine learning task used for this section is still classification, but the scenario the model is used in is to categorize the risk level of restaurant violations found during health inspections.
Understand the data
The data set used to train and evaluate the machine learning model is originally from the San Francisco Department of Public Health Restaurant Safety Scores. For convenience, the dataset has been condensed to only include the columns relevant to train the model and make predictions. Visit the following website to learn more about the dataset.
Each row in the dataset contains information regarding violations observed during an inspection from the Health Department and a risk assessment of the threat those violations present to public health and safety.
InspectionType | ViolationDescription | RiskCategory |
---|---|---|
Routine - Unscheduled | Inadequately cleaned or sanitized food contact surfaces | Moderate Risk |
New Ownership | High risk vermin infestation | High Risk |
Routine - Unscheduled | Wiping cloths not clean or properly stored or inadequate sanitizer | Low Risk |
- InspectionType: the type of inspection. This can either be a first-time inspection for a new establishment, a routine inspection, a complaint inspection, and many other types.
- ViolationDescription: a description of the violation found during inspection.
- RiskCategory: the risk severity a violation poses to public health and safety.
The label
is the column you want to predict. When performing a classification task, the goal is to assign a category (text or numerical). In this classification scenario, the severity of the violation is assigned the value of low, moderate, or high risk. Therefore, the RiskCategory is the label. The features
are the inputs you give the model to predict the label. In this case, the InspectionType and ViolationDescription are used as features or inputs to predict the RiskCategory.
Prerequisites
In ML.NET, you have the option of using low-code tooling powered by AutoML like ML.NET Model Builder and the ML.NET CLI to train machine learning models. You also have the option of using the ML.NET API which provides all of the components required to train and consume models through code-first .NET experiences.
Model Builder is a Visual Studio extension that provides a wizard-like user-interface for Windows environments. The ML.NET CLI is a .NET global tool that provides similar functionality to Model Builder through the command-line for cross-platform experiences across x64 / ARM64 devices running Windows, Mac, or Linux. For this sample, you'll use the ML.NET CLI.
Train the model
If you prefer to train the model using Model Builder, see Classify the severity of restaurant health violations with Model Builder.
- Open the terminal and navigate to the MLNETExcel/src directory.
- Use the ML.NET CLI to train a classification model.
mlnet classification --dataset C:\Datasets\RestaurantScores.csv --has-header true --label-col risk_category --name InspectionClassification --train-time 60
Make sure to replace the --dataset
parameter with the location where you unzipped the RestaurantScores.csv file containing the dataset.
This command uses AutoML to search for the best algorithm and hyperparameters for your dataset and trains an ML.NET model in the span of 60 seconds (as specified by the --train-time
option).
Once the training process is complete, a new .NET console project containing the model as well as training and consumption code is created in the MLNETExcel/src/InspectionClassification directory.
Configure InspectionClassification project
By default, the ML.NET CLI creates a .NET console project that can be immediately used to make predictions with your newly trained model. The resulting output is an executable. However, since the model is intended to be consumed by the MLBlazorExcelAddIn
application and not as an executable, you need to configure the output type of the InspectionClassification.ConsoleApp
project.
-
Navigate to the MLNETExcel/src/InspectionClassification directory.
-
Open the InspectionClassification.ConsoleApp.csproj file in your preferred text editor.
-
Delete the following XML attribute:
<OutputType>Exe</OutputType>
Removing this attribute changes the output type from an executable to a Dynamically Linked Library (DLL).
-
Save your changes
-
(Optional) Delete the Program.cs file. Since you're not intending to run the application as an executable, there's no need for the application entrypoint defined in the Program.cs file.
-
Open the terminal inside the MLNETExcel/src/InspectionClassification directory.
-
Use the .NET CLI to clean the build output and rebuild the project.
dotnet clean dotnet build
In your build output directory (bin/Debug/net6.0) you should see a file InspectionClassification.ConsoleApp.dll file.
Add project reference
To use the inspection model in your MLBlazorExcelAddIn
project, add a reference to it.
-
Open the terminal and navigate to the MLNETExcel/src/MLBlazorExcelAddIn directory.
-
Use the .NET CLI to add a reference to the
InspectionClassification.ConsoleApp
project.dotnet add reference ..\InspectionClassification\InspectionClassification.ConsoleApp.csproj
-
Open the _Imports.razor file inside the MLNETExcel/src/MLBlazorExcelAddIn directory in your preferred text editor.
-
Add the following using statement to the file.
@using InspectionClassification.ConsoleApp
Upload inspection model to Azure Blob Storage
Use Azure Storage Explorer to upload the InspectionClassification.zip model file to the models
container. For more details, follow the guidance from the Upload model to Azure Storage Account section.
Update Index.razor
The Index.razor page contains the logic for loading the model from Azure Blob Storage and making predictions. Therefore it requires a few updates before using it to make predictions with the model.
-
Open the Index.razor page in the MLNETExcel/src/MLBlazorExcelAddIn in your preferred text editor.
-
Update the
_predictionEngine
variable declaration to use theInspectionClassification.ModelInput
andInspectionClassification.ModelOutput
schema classes.private PredictionEngine<InspectionClassification.ModelInput, InspectionClassification.ModelOutput> _predictionEngine;
-
Inside the
OnInitializedAsync
method:-
Update the
_modelUrl
variable with the the URL of the InspectionClassification.zip model file you recently uploaded to Azure Blob Storage. -
Update the value of
_predictionEngine
to use initialize a PredictionEngine with theINspectionClassification.ModelInput
andInspectionClassification.ModelOutput
classes._predictionEngine = mlContext.Model.CreatePredictionEngine<InspectionClassification.ModelInput,InspectionClassification.ModelOutput>(_model);
-
-
Replace the current
MakePrediction
method with the following:public string MakePrediction(string inspectionType, string violationDescription) { var input = new InspectionClassification.ModelInput { Inspection_type = inspectionType, Violation_description = violationDescription }; var prediction = _predictionEngine.Predict(input); return prediction.PredictedLabel; }
The main changes form the previous implementation are the parameters in the method signature which are used to create an instance of
InspectionClassification.ModelInput
that is assigned to theinput
variable.
Update JavaScript interop functions
The way predictions are made in Excel is using custom functions. The code to pass data from Excel to make predictions using the Blazor application is in the wwwroot/js/interop.js file inside the MLBlazorExcelAddIn
project.
-
Open the interop.js file in your preferred text editor.
-
Inside the
MLFunctions
class, update thepredict
function with the following code.static async predict(inspectionType, violationDescription) { const prediction = await MLFunctions.dotNetHelper.invokeMethodAsync('MakePrediction', inspectionType, violationDescription); console.log(prediction); return prediction; }
Like the
MakePrediction
method in theIndex.razor
file, the main changes from the previous implementation are the parameters in the function signature which are the inputs for theMakePrediction
function. -
Update the name of the function used in the in the custom function
associate
function toINSPECTION.PREDICT
.CustomFunctions.associate("INSPECTION.PREDICT", MLFunctions.predict);
Update Excel custom function schema
Now that you've updated the code, its time to make it discoverable. This is done through the functions.json file in the src/wwwroot directory inside the MLBlazorExcelAddIn
project. The functions.json contains metadata about the custom functions registered in your Excel Add-In.
-
Open the functions.json file in your preferred text editor.
-
Replace the contents of the file with the following:
{ "functions": [ { "description": "Categorize health inspections", "id": "INSPECTION.PREDICT", "name": "INSPECTION.PREDICT", "parameters": [ { "description": "Type of inspection", "name": "inspectionType", "type": "string" }, { "description": "Detailed explanation of violation", "name": "violationDescription", "type": "string" } ], "result": { "type": "string" } } ] }
The
description
,id
,name
, andparameters
properties have changed from the previous version to reflect your custom function for predicting violation severity.
Rebuild the project and update static website
Now that you've update the application, it's time to build a new version with the latest updates.
-
Open the terminal and navigate to the MLNETExcel/src/MLBlazorExcelAddIn directory.
-
Use the .NET CLI to build and publish your application.
dotnet clean dotnet publish -c Release
The output from the build process is stored in the MLBlazorExcelAddIn/bin/Release/net6.0/publish/wwwroot directory.
-
Use Azure Storage Explorer to upload the contents from the MLBlazorExcelAddIn/bin/Release/net6.0/publish/wwwroot directory to the
$web
Azure Blob Storage container. NOTE: Many of the files will already exist. To prevent conflicts, it's recommended that you delete the contents of the blob storage before uploading the new version of your application..
Re-upload manifest.xml
You shouldn't have to re-upload your manifest.xml file. However, if you cleared your cache or the add-in is no longer available in the Excel client, follow the steps in the Upload Excel Add-In section to upload the manifest.xml file to Excel.
Use the model to predict violation severity
-
Add the following data to the first row of your workbook.
A B Complaint Inadequate sewage or wastewater disposal -
Use the
MLNET.INSPECTION.PREDICT
custom function defined by your add-in to make a predictions using the data you previously entered as inputs.
Your prediction should display in the cell where you used the custom function.
Congratulations! You've now trained a custom ML.NET model and used it to make predictions in Excel!
Additional Next Steps
- Improve your model. Choose a longer training time. Doing so allows AutoML to explore more models and hyperparameters.
- Publish your office add-in. Doing so makes it easier for others in your organization to access your add-in. For more information on publishing add-ins, see Deploy and publish Office Add-Ins.
FAQ / Troubleshooting
The following are known issues with proposed workarounds
The "GetFileHash" task failed unexpectedly
On MacOs, you might fail to build the MLBlazorExcelAddIn
project and get a message like the following:
The "GetFileHash" task failed unexpectedly
This is a known issue in .NET.
To fix it, run the dotnet publish -c Release
command again.
$web container not public
In some instances you might need to make the $web
container public in order to view your website. For more information, see Set the public access level for a container.