<div align="center"> <p> <a align="center" target="_blank"> <img width="900" src="./images/MiVOLO.jpg"></a> </p> <br> </div>

MiVOLO: Multi-input Transformer for Age and Gender Estimation

MiVOLO: Multi-input Transformer for Age and Gender Estimation, Maksim Kuprashevich, Irina Tolstykh, 2023 arXiv 2307.04616

Beyond Specialization: Assessing the Capabilities of MLLMs in Age and Gender Estimation, Maksim Kuprashevich, Grigorii Alekseenko, Irina Tolstykh, 2024 arXiv 2403.02302

[Paper 2023] [Paper 2024] [Demo] [Telegram Bot] [BibTex] [Data]

MiVOLO pretrained models

Gender & Age recognition performance.

<table style="margin: auto"> <tr> <th align="left">Model</th> <th align="left" style="color:LightBlue">Type</th> <th align="left">Dataset (train and test)</th> <th align="left">Age MAE</th> <th align="left">Age CS@5</th> <th align="left">Gender Accuracy</th> <th align="left">download</th> </tr> <tr> <td>volo_d1</td> <td align="left">face_only, age</td> <td align="left">IMDB-cleaned</td> <td align="left">4.29</td> <td align="left">67.71</td> <td align="left">-</td> <td><a href="https://drive.google.com/file/d/17ysOqgG3FUyEuxrV3Uh49EpmuOiGDxrq/view?usp=drive_link">checkpoint</a></td> </tr> <tr> <td>volo_d1</td> <td align="left">face_only, age, gender</td> <td align="left">IMDB-cleaned</td> <td align="left">4.22</td> <td align="left">68.68</td> <td align="left">99.38</td> <td><a href="https://drive.google.com/file/d/1NlsNEVijX2tjMe8LBb1rI56WB_ADVHeP/view?usp=drive_link">checkpoint</a></td> </tr> <tr> <td>mivolo_d1</td> <td align="left">face_body, age, gender</td> <td align="left">IMDB-cleaned</td> <td align="left">4.24 [face+body]<br>6.87 [body]</td> <td align="left">68.32 [face+body]<br>46.32 [body]</td> <td align="left">99.46 [face+body]<br>96.48 [body]</td> <td><a href="https://drive.google.com/file/d/11i8pKctxz3wVkDBlWKvhYIh7kpVFXSZ4/view?usp=drive_link">model_imdb_cross_person_4.24_99.46.pth.tar</a></td> </tr> <tr> <td>volo_d1</td> <td align="left">face_only, age</td> <td align="left">UTKFace</td> <td align="left">4.23</td> <td align="left">69.72</td> <td align="left">-</td> <td><a href="https://drive.google.com/file/d/1LtDfAJrWrw-QA9U5IuC3_JImbvAQhrJE/view?usp=drive_link">checkpoint</a></td> </tr> <tr> <td>volo_d1</td> <td align="left">face_only, age, gender</td> <td align="left">UTKFace</td> <td align="left">4.23</td> <td align="left">69.78</td> <td align="left">97.69</td> <td><a href="https://drive.google.com/file/d/1hKFmIR6fjHMevm-a9uPEAkDLrTAh-W4D/view?usp=drive_link">checkpoint</a></td> </tr> <tr> <td>mivolo_d1</td> <td align="left">face_body, age, 
gender</td> <td align="left">Lagenda</td> <td align="left">3.99 [face+body]</td> <td align="left">71.27 [face+body]</td> <td align="left">97.36 [face+body]</td> <td><a href="https://huggingface.co/spaces/iitolstykh/demo">demo</a></td> </tr> <tr> <td>mivolov2_d1_384x384</td> <td align="left">face_body, age, gender</td> <td align="left">Lagenda</td> <td align="left">3.65 [face+body]</td> <td align="left">74.48 [face+body]</td> <td align="left">97.99 [face+body]</td> <td><a href="https://t.me/AnyAgeBot">telegram bot</a></td> </tr> </table>
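
For reference, the two age metrics in the table can be computed as in the sketch below. It assumes CS@5 is the usual cumulative score, i.e. the percentage of samples whose absolute age error is at most 5 years. The function names `age_mae` and `age_cs` are illustrative and are not part of this repository.

```python
from typing import Sequence


def _check(true_ages: Sequence[float], pred_ages: Sequence[float]) -> None:
    # Both sequences must describe the same, non-empty set of samples.
    if len(true_ages) == 0 or len(true_ages) != len(pred_ages):
        raise ValueError("inputs must be non-empty and of equal length")


def age_mae(true_ages: Sequence[float], pred_ages: Sequence[float]) -> float:
    """Mean absolute error between true and predicted ages, in years."""
    _check(true_ages, pred_ages)
    total = sum(abs(t - p) for t, p in zip(true_ages, pred_ages))
    return total / len(true_ages)


def age_cs(
    true_ages: Sequence[float],
    pred_ages: Sequence[float],
    threshold: float = 5.0,
) -> float:
    """Cumulative score: percentage of samples with |error| <= threshold."""
    _check(true_ages, pred_ages)
    hits = sum(1 for t, p in zip(true_ages, pred_ages) if abs(t - p) <= threshold)
    return 100.0 * hits / len(true_ages)
```

With true ages `[20, 30, 40, 50]` and predictions `[22, 36, 40, 45]`, the absolute errors are 2, 6, 0 and 5 years, so the MAE is 3.25 and CS@5 is 75.0.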

MiVOLO regression benchmarks

Gender & Age recognition performance.

Use valid_age_gender.sh to reproduce results with our checkpoints.

<table style="margin: auto"> <tr> <th align="left">Model</th> <th align="left" style="color:LightBlue">Type</th> <th align="left">Train Dataset</th> <th align="left">Test Dataset</th> <th align="left">Age MAE</th> <th align="left">Age CS@5</th> <th align="left">Gender Accuracy</th> <th align="left">download</th> </tr> <tr> <td>mivolo_d1</td> <td align="left">face_body, age, gender</td> <td align="left">Lagenda</td> <td align="left">AgeDB</td> <td align="left">5.55 [face]</td> <td align="left">55.08 [face]</td> <td align="left">98.3 [face]</td> <td><a href="https://huggingface.co/spaces/iitolstykh/demo">demo</a></td> </tr> <tr> <td>mivolo_d1</td> <td align="left">face_body, age, gender</td> <td align="left">IMDB-cleaned</td> <td align="left">AgeDB</td> <td align="left">5.58 [face]</td> <td align="left">55.54 [face]</td> <td align="left">97.93 [face]</td> <td><a href="https://drive.google.com/file/d/11i8pKctxz3wVkDBlWKvhYIh7kpVFXSZ4/view?usp=drive_link">model_imdb_cross_person_4.24_99.46.pth.tar</a></td> </tr> </table>

MiVOLO classification benchmarks

Gender & Age recognition performance.

<table style="margin: auto"> <tr> <th align="left">Model</th> <th align="left" style="color:LightBlue">Type</th> <th align="left">Train Dataset</th> <th align="left">Test Dataset</th> <th align="left">Age Accuracy</th> <th align="left">Gender Accuracy</th> </tr> <tr> <td>mivolo_d1</td> <td align="left">face_body, age, gender</td> <td align="left">Lagenda</td> <td align="left">FairFace</td> <td align="left">61.07 [face+body]</td> <td align="left">95.73 [face+body]</td> </tr> <tr> <td>mivolo_d1</td> <td align="left">face_body, age, gender</td> <td align="left">Lagenda</td> <td align="left">Adience</td> <td align="left">68.69 [face]</td> <td align="left">96.51 [face]</td> </tr> <tr> <td>mivolov2_d1_384</td> <td align="left">face_body, age, gender</td> <td align="left">Lagenda</td> <td align="left">Adience</td> <td align="left">69.43 [face]</td> <td align="left">97.39 [face]</td> </tr> </table>
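
One plausible way to score an age regressor on a classification benchmark is to bin each predicted age into the benchmark's age groups and count matches. The sketch below assumes FairFace-style groups (0-2, 3-9, 10-19, ..., 60-69, 70+) and is not necessarily how the numbers above were computed. The names `age_to_group` and `group_accuracy` are hypothetical.

```python
from bisect import bisect_right
from typing import Sequence

# Lower bounds and labels of the assumed FairFace-style age groups.
GROUP_STARTS = [0, 3, 10, 20, 30, 40, 50, 60, 70]
GROUP_LABELS = ["0-2", "3-9", "10-19", "20-29", "30-39",
                "40-49", "50-59", "60-69", "70+"]


def age_to_group(age: float) -> str:
    """Map a (possibly fractional) age to its group label.

    Negative predictions are clamped to 0; fractional ages fall into the
    group whose lower bound they have reached (2.7 is still "0-2").
    """
    index = bisect_right(GROUP_STARTS, max(age, 0.0)) - 1
    return GROUP_LABELS[index]


def group_accuracy(true_groups: Sequence[str], pred_ages: Sequence[float]) -> float:
    """Percentage of samples whose binned prediction equals the true group."""
    if len(true_groups) == 0 or len(true_groups) != len(pred_ages):
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for g, a in zip(true_groups, pred_ages) if age_to_group(a) == g)
    return 100.0 * correct / len(true_groups)
```

Note that a prediction close to a bin edge (for example 29.9 for a true group of 30-39) counts as an error even though the absolute age error is small, which is one reason classification accuracy and MAE can tell different stories.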

Dataset

Please cite our papers if you use any of this data!

Install

Install PyTorch 1.13+ and the other requirements:

pip install -r requirements.txt
pip install .
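
To confirm that the installed PyTorch meets the 1.13+ requirement without importing it, a small check along these lines may help. This is a sketch, not part of the repository: the helper names are hypothetical, and it assumes the package is installed under the distribution name `torch`.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '2.1.0+cu118'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version: str, minimum: Tuple[int, int] = (1, 13)) -> bool:
    """True if the version is at least the given (major, minor) pair."""
    return parse_major_minor(version) >= minimum


def installed_torch_version() -> Optional[str]:
    """Return the installed torch version, or None if torch is not installed."""
    try:
        return metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None
```

For example, `meets_minimum("1.12.0")` is false, while `meets_minimum("1.13.1+cu117")` is true. Calling `installed_torch_version()` returns `None` when the package is missing.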

Demo

  1. Download the body + face detector model to models/yolov8x_person_face.pt
  2. Download the MiVOLO checkpoint to models/mivolo_imbd.pth.tar

wget https://variety.com/wp-content/uploads/2023/04/MCDNOHA_SP001.jpg -O jennifer_lawrence.jpg

python3 demo.py \
--input "jennifer_lawrence.jpg" \
--output "output" \
--detector-weights "models/yolov8x_person_face.pt" \
--checkpoint "models/mivolo_imbd.pth.tar" \
--device "cuda:0" \
--with-persons \
--draw

To run the demo on a YouTube video:

python3 demo.py \
--input "https://www.youtube.com/shorts/pVh32k0hGEI" \
--output "output" \
--detector-weights "models/yolov8x_person_face.pt" \
--checkpoint "models/mivolo_imbd.pth.tar" \
--device "cuda:0" \
--draw \
--with-persons
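
The two invocations above differ only in `--input`. If you launch the demo from Python, a helper like the following can assemble the argument list. This is a sketch: `build_demo_command` is a hypothetical name, and only the flags shown above are used. It strips stray whitespace from path arguments, since a trailing space inside a quoted path would make the file lookup fail.

```python
from typing import List


def build_demo_command(
    input_source: str,
    output_dir: str = "output",
    detector_weights: str = "models/yolov8x_person_face.pt",
    checkpoint: str = "models/mivolo_imbd.pth.tar",
    device: str = "cuda:0",
) -> List[str]:
    """Assemble the demo.py argument list for an image path or video URL."""
    return [
        "python3", "demo.py",
        "--input", input_source.strip(),
        "--output", output_dir.strip(),
        "--detector-weights", detector_weights.strip(),
        "--checkpoint", checkpoint.strip(),
        "--device", device,
        "--with-persons",
        "--draw",
    ]
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)`. Passing arguments as a list avoids shell quoting issues.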

Validation

To reproduce validation metrics:

  1. Download the prepared annotations for imdb-clean / utk / adience / lagenda / fairface.
  2. Download a checkpoint.
  3. Run validation:

python3 eval_pretrained.py \
  --dataset_images /path/to/dataset/utk/images \
  --dataset_annotations /path/to/dataset/utk/annotation \
  --dataset_name utk \
  --split valid \
  --batch-size 512 \
  --checkpoint models/mivolo_imbd.pth.tar \
  --half \
  --with-persons \
  --device "cuda:0"

Supported dataset names: "utk", "imdb", "lagenda", "fairface", "adience".

Changelog

CHANGELOG.md

ONNX and TensorRT export

As of 11.08.2023, ONNX export is technically feasible but not advisable, because the resulting model performs poorly with batch processing. TensorRT and OpenVINO export is not possible because neither supports col2im.

If you still need ONNX export, you can refer to these instructions.

The recommended export method at present is TorchScript. You can achieve this with a single line of code:

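traced_model = torch.jit.trace(model, example_inputs)  # example_inputs: sample tensor(s) matching the model's expected input shape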

This approach provides you with a model that maintains its original speed and only requires a single file for usage, eliminating the need for additional code.

License

Please see here

Citing

If you use our models, code or dataset, please cite the following papers and give the repository a :star:

@article{mivolo2023,
   Author = {Maksim Kuprashevich and Irina Tolstykh},
   Title = {MiVOLO: Multi-input Transformer for Age and Gender Estimation},
   Year = {2023},
   Eprint = {arXiv:2307.04616},
}
@article{mivolo2024,
   Author = {Maksim Kuprashevich and Grigorii Alekseenko and Irina Tolstykh},
   Title = {Beyond Specialization: Assessing the Capabilities of MLLMs in Age and Gender Estimation},
   Year = {2024},
   Eprint = {arXiv:2403.02302},
}