<div id="top" align="center"> <h1 align="center"> <img src="./resources/images/icon.png" alt='icon.png' height="180px" width="180px"/> </h1>Mortred-AI-Web-Server: A Noob Web Server for AI Models
</div>Mortred AI Model Server is a toy web server for deep learning models. The server tries its best to make the most of your CPU and GPU resources. All deep learning models are trained with tensorflow/pytorch, deployed via the MNN toolkit, and finally exposed as a web service through the workflow framework.
Do not hesitate to let me know if you find bugs here, because I'm a c-with-struct noob :upside_down_face:
The three major components are illustrated in the architecture diagram below.
<p align="center"> <img src='./resources/images/simple_architecture.png' alt='simple_architecture' height="400px" width="500px"> </p>A quick overview and examples for both serving and model benchmarking are provided below. Detailed documentation and examples can be found in the docs folder.
You're welcome to ask questions and help me make it better!
All models and detectors can be downloaded from my Hugging Face Page.
Contents of this document
Quick Start
Before proceeding further with this document, make sure you have the following prerequisites:
1. Make sure you have CUDA, a GPU, and the GPU driver correctly installed. You may refer to this to install them
2. Make sure you have MNN installed. For install instructions you may find some help here. The MNN-2.7.0 release is recommended.
3. Make sure you have WORKFLOW installed. For install instructions you may find some help here
4. Make sure you have OPENCV installed. For install instructions you may find some help here
5. Make sure your GCC toolchain supports C++17
6. Segment-Anything needs the ONNXRUNTIME and TensorRT libraries. You may refer to this to install onnxruntime>=1.16.0 and this to install TensorRT-8.6.1.6
After all prerequisites are settled you may start building the Mortred AI Server framework.
Setup :fire::fire::fire:
Step 1: Prepare 3rd-party Libraries
Copy MNN headers and libs
cp -r $MNN_ROOT_DIR/include/MNN ./3rd_party/include
cp $MNN_ROOT_DIR/build/libMNN.so ./3rd_party/libs
cp $MNN_ROOT_DIR/build/source/backend/cuda/libMNN_Cuda_Main.so ./3rd_party/libs
Copy WORKFLOW headers and libs
cp -r $WORKFLOW_ROOT_DIR/_include/workflow ./3rd_party/include
cp -r $WORKFLOW_ROOT_DIR/_lib/libworkflow.so* ./3rd_party/libs
Copy ONNXRUNTIME headers and libs
cp -r $ONNXRUNTIME_ROOT_DIR/include/* ./3rd_party/include/onnxruntime
cp -r $ONNXRUNTIME_ROOT_DIR/_lib/libonnxruntime*.so* ./3rd_party/libs
Copy TensorRT headers and libs
cp -r $TENSORRT_ROOT_DIR/include/* ./3rd_party/include/TensorRT-8.6.1.6
cp -r $TENSORRT_ROOT_DIR/_lib/libnvinfer.so* ./3rd_party/libs
cp -r $TENSORRT_ROOT_DIR/_lib/libnvinfer_builder_resource.so.8.6.1 ./3rd_party/libs
cp -r $TENSORRT_ROOT_DIR/_lib/libnvinfer_plugin.so* ./3rd_party/libs
cp -r $TENSORRT_ROOT_DIR/_lib/libnvonnxparser.so* ./3rd_party/libs
Step 2: Build Mortred AI Server :coffee::coffee::coffee:
mkdir build && cd build
cmake ..
make -j10
Step 3: Download Pre-Built Models :tea::tea::tea:
Download the pre-built image models via BaiduNetDisk (extraction code: 1y98). Create a directory named weights in $PROJECT_ROOT_DIR and unzip the downloaded models into it. The weights directory structure should look like
Step 4: Test MobileNetv2 Benchmark Tool
The benchmark and server apps will be built in $PROJECT_ROOT_DIR/_bin and the libs will be built in $PROJECT_ROOT_DIR/_lib. Benchmark the mobilenetv2 classification model:
cd $PROJECT_ROOT_DIR/_bin
./mobilenetv2_benchmark.out ../conf/model/classification/mobilenetv2/mobilenetv2_config.ini
You should see the mobilenetv2 model benchmark profile as follows:
<p align="left"> <img src='./resources/images/mobilenetv2_demo_benchmark.png' alt='mobilenetv2_demo_benchmark'> </p>Step 5: Run MobileNetV2 Server Locally
A detailed description of the web server configuration can be found at Web Server Configuration. Now start serving the model:
cd $PROJECT_ROOT_DIR/_bin
./mobilenetv2_classification_server.out ../conf/server/classification/mobilenetv2/mobilenetv2_server_config.ini
The model service will start at http://localhost:8091 with 4 workers waiting to serve. A demo Python client is supplied to test the service
cd $PROJECT_ROOT_DIR/scripts
export PYTHONPATH=$PWD:$PYTHONPATH
python server/test_server.py --server mobilenetv2 --mode single
The client will repeatedly post demo images 1000 times. The server and client output should look like the screenshots below.
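For reference, a minimal hand-rolled client might look like the sketch below. The request layout here (a base64-encoded image and a request id in a JSON body) is an assumption for illustration; the authoritative payload format is whatever scripts/server/test_server.py actually sends.

```python
# Hypothetical minimal client for the mobilenetv2 classification server.
# The JSON field names below ("img_data", "req_id") are assumptions for
# illustration; check scripts/server/test_server.py for the real payload.
import base64
import json
import urllib.request


def build_request(image_bytes: bytes, req_id: str) -> dict:
    """Pack raw image bytes into the (assumed) JSON request body."""
    return {
        "img_data": base64.b64encode(image_bytes).decode("ascii"),
        "req_id": req_id,
    }


def post_image(url: str, image_bytes: bytes, req_id: str = "demo-0") -> str:
    """POST one image to the server and return the raw JSON reply."""
    body = json.dumps(build_request(image_bytes, req_id)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")
```

The exact URI path to post to is defined in the server's .ini config, so the base address http://localhost:8091 alone is not enough; look it up there before calling post_image.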
More server demos can be found in Tutorials :point_down::point_down::point_down:
Benchmark
The benchmark test environment is as follows:
OS: Ubuntu 20.04.5 LTS / 5.15.0-87-generic
MEMORY: 32G DIMM DDR4 Synchronous 2666 MHz
CPU: Intel(R) Core(TM) i5-10400 CPU @ 2.90GHz
GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
GPU: GeForce RTX 3080
CUDA: CUDA Version: 11.5
GPU Driver: Driver Version: 495.29.05
Model Inference Benchmark
All models are looped several times to avoid the influence of GPU warm-up, and only the model's inference time is counted.
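The warm-up-then-measure pattern described above can be sketched as follows. This is a generic timing sketch, not this repository's benchmark code: run_inference stands in for a real MNN session run.

```cpp
// Hypothetical sketch of the warm-up-then-measure benchmark pattern.
// "run_inference" stands in for a real MNN session run; it is not part
// of this repository's API.
#include <chrono>
#include <functional>

// Runs `warmup` untimed iterations first, then returns the average
// wall-clock time in milliseconds over `loops` timed iterations.
inline double benchmark_avg_ms(const std::function<void()>& run_inference,
                               int warmup, int loops) {
    for (int i = 0; i < warmup; ++i) {
        run_inference();  // discard warm-up runs (kernel compilation, caches)
    }
    auto start = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < loops; ++i) {
        run_inference();  // only these runs are counted
    }
    auto end = std::chrono::high_resolution_clock::now();
    std::chrono::duration<double, std::milli> elapsed = end - start;
    return elapsed.count() / loops;
}
```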
Benchmark Code Snippet
Tutorials
- Image Classification Model Server Tutorials
- Image Segmentation Model Server Tutorials
- Image Object Detection Model Server Tutorials
- Image Enhancement Model Server Tutorials
- Image Feature Point Model Server Tutorials
How To
- How To Add New Model :fire::fire:
- How To Add New Server :fire::fire:
Web Server Configuration
TODO
- Add more models to the model zoo
Repo-Status
Star History
Visitor Count
Acknowledgement
mortred_model_server refers to the following projects: