
TensorFlow 2 Object Detection Training GUI for Linux

Updated for CUDA 11 and TensorFlow 2!

This repository allows you to get started with training a State-of-the-art Deep Learning model with little to no configuration needed! You provide your labeled dataset and you can start the training right away and monitor it with TensorBoard. You can even test your model with our built-in Inference REST API. Training with TensorFlow has never been so easy.

You can also use our BMW-Labeltool-lite to label your dataset. The images and labels can be used directly for training.

Prerequisites

Setting Up Project Requirements Automated

This step is recommended to ensure the solution runs correctly.

The setup script will check and adjust all the requirements needed based on the user input.

Setting Up Project Requirements Manually

How to check for prerequisites

To check if you have docker-ce installed:

docker --version

To check if you have docker-compose installed:

docker-compose --version

To check if you have nvidia-docker installed:

dpkg -l | grep nvidia-docker

To check your NVIDIA driver version, open your terminal and type:

nvidia-smi
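The individual checks above can be combined into one small convenience script. This is just a sketch, not part of the repository:

```shell
#!/bin/sh
# Report whether each required tool is available on PATH.
for cmd in docker docker-compose nvidia-smi; do
    if command -v "$cmd" >/dev/null 2>&1; then
        echo "$cmd: installed"
    else
        echo "$cmd: NOT installed"
    fi
done
# nvidia-docker ships as a package rather than a single binary, so query dpkg:
if dpkg -l 2>/dev/null | grep -q nvidia-docker; then
    echo "nvidia-docker: installed"
else
    echo "nvidia-docker: NOT installed"
fi
```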

Installing Prerequisites

If you are behind a proxy:
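One common approach, taken from standard Docker practice rather than from this project, is to give the Docker daemon a systemd proxy drop-in. Replace proxy.example.com:8080 with your actual proxy address:

```ini
# /etc/systemd/system/docker.service.d/http-proxy.conf
# After creating this file, run:
#   sudo systemctl daemon-reload && sudo systemctl restart docker
[Service]
Environment="HTTP_PROXY=http://proxy.example.com:8080"
Environment="HTTPS_PROXY=http://proxy.example.com:8080"
Environment="NO_PROXY=localhost,127.0.0.1"
```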

Dataset Folder Structure

The following is an example of how a dataset should be structured. Please put all your datasets in the datasets folder.

datasets/
└── sample_dataset/
    ├── images
    │   ├── img_1.jpg
    │   └── img_2.jpg
    ├── labels
    │   ├── json
    │   │   ├── img_1.json
    │   │   └── img_2.json
    │   └── pascal
    │       ├── img_1.xml
    │       └── img_2.xml
    └── objectclasses.json

PS: you don't need both the json and pascal folders; either one is enough.
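The skeleton above can be created with a few commands. The dataset name "my_dataset" is only an example:

```shell
# Create an empty skeleton for a new dataset (layout as shown above).
mkdir -p datasets/my_dataset/images
mkdir -p datasets/my_dataset/labels/json    # or labels/pascal -- one is enough
touch datasets/my_dataset/objectclasses.json
```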

Objectclasses.json file example

You must include in your dataset an objectclasses.json file with a similar structure to the example below:
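The exact schema is defined by the repository; the fragment below is only a hypothetical illustration of the kind of structure expected (field names and values are assumptions — compare against a sample dataset shipped with the repo):

```json
[
    { "id": 0, "name": "dog" },
    { "id": 1, "name": "cat" }
]
```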

Midweight and Heavyweight Solution

Midweight: Downloading specific supported online pre-trained weights during the docker image build.<br> To do that, open the json file training_api/assets/networks.json and change the values of the networks you wish to download to true.

Heavyweight (default): Downloading all the supported online pre-trained weights during the docker image build. <br> To do that, open the json file training_api/assets/networks.json and change the value of "select_all" to true.

PS: if you don’t download the weights during the build, you won’t be able to use the corresponding network during training unless you rebuild the solution with the proper network selected.

All trainings use pre-trained network weights based on the COCO dataset.
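Toggling a network flag can also be done from the command line. The key name "ssd_mobilenet_v2" below is only an example; check training_api/assets/networks.json for the real keys:

```shell
# Enable downloading one specific network (midweight build).
# The key "ssd_mobilenet_v2" is illustrative -- use an actual key from the file.
sed -i 's/"ssd_mobilenet_v2": false/"ssd_mobilenet_v2": true/' training_api/assets/networks.json
```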

Build the Solution

If you wish to deploy the training workflow in GPU mode, run the following command from the repository's root directory:

docker-compose -f build_gpu.yml build

If you wish to deploy the training workflow in CPU mode, run the following command from the repository's root directory:

docker-compose -f build_cpu.yml build 

Run the Solution

If you wish to deploy the training workflow in GPU mode, run the following command:

docker-compose -f run_gpu.yml up

If you wish to deploy the training workflow in CPU mode, run the following command:

docker-compose -f run_cpu.yml up

After a successful run you should see something like the following:

Usage

1- Preparing Dataset

Prepare your dataset for training


2- Specifying General Settings

Specify the general parameters for your docker container


3- Specifying Hyperparameters

Specify the hyperparameters for the training job


4- Specifying Hyperparameters advanced

Specify the advanced hyperparameters for the training job


5- Checking training logs

Check your training logs to get better insights on the progress of the training


6- Monitoring the training

Monitor the training using TensorBoard


7- Checking the status of the job

Check the status to know when the job is completed successfully


8- Downloading and testing with Swagger

Download your model and easily test it with the built-in inference API using Swagger


9- Stopping and Deleting the model's container

Delete a job's container to stop an ongoing job or to remove the container of a finished job. (Finished jobs remain available for download.)


10- Visualizing graphs and metrics of Deleted Jobs

Visualize graphs and metrics of deleted jobs with TensorBoard


Training and Tensorboard Tips

Check our tips document for (1) better insight into training models based on our expertise and (2) a benchmark of inference speed.

Our TensorBoard document helps you find your way more easily while navigating TensorBoard.

Guidelines

<br><br>

Change Docker-sdk default port

To change the docker-sdk default port 2222 to any other port of your choice:
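The port mapping lives in the run compose files. The fragment below is illustrative only — the actual service name in run_gpu.yml / run_cpu.yml may differ; change only the host-side (left-hand) value of the mapping:

```yaml
# Illustrative fragment of run_gpu.yml / run_cpu.yml (service name assumed).
services:
  docker_sdk:
    ports:
      - "3333:2222"   # host port 3333 now forwards to the sdk's internal 2222
```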

<br><br>

Known Issues

You might face some errors in some cases during the training. Most common ones are:

Citing

If you use this repository in your research, consider citing it using the following Bibtex entry:

@misc{bmwtrainingtool,
  author = {BMW TechOffice MUNICH},
  title = {TensorFlow Training GUI},
  howpublished = {\url{https://github.com/BMW-InnovationLab/BMW-TensorFlow-Training-GUI}},
  year = {2022},
}

Acknowledgments