Myriad: A Large Multimodal Model Applying Vision Experts for Industrial Anomaly Detection.
Myriad: A Large Multimodal Model Applying Vision Experts for Industrial Anomaly Detection. [Paper] [[HF](coming soon)] <br> Yuanze Li, Haolin Wang, Shihao Yuan, Ming Liu, Debin Zhao, Yiwen Guo, Chen Xu, Guangming Shi, Wangmeng Zuo
TODO
- upload Myriad pre-trained weights.
- update evaluation guidance.
- update training guidance.
Contents
Install
Coming soon.
Myriad Weights
Myriad Weights are coming soon.
Demo
The demo code is coming soon.
Train
The training code is already in the repository. Guidance for the two-stage training procedure will be added.
Evaluation
We evaluate Myriad on two public anomaly detection benchmarks, MVTec and VisA. To ensure reproducibility, we evaluate the models with greedy decoding and do not use beam search.
The evaluation code is now public, and guidance for running it will follow in a few days.
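Until the official guidance is available, the following minimal sketch illustrates why greedy decoding makes evaluation reproducible: at every step the single highest-scoring token is chosen, so the same model and prompt always yield the same output. This is not Myriad's evaluation code. The `greedy_decode` helper and the `toy_scores` scoring function are hypothetical stand-ins for a real language model.

```python
from typing import Callable, List, Sequence


def greedy_decode(
    next_token_scores: Callable[[List[int]], Sequence[float]],
    prompt: List[int],
    eos_id: int,
    max_new_tokens: int = 16,
) -> List[int]:
    """Append the highest-scoring token at each step (no sampling, no beams)."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        scores = next_token_scores(tokens)
        # argmax over the vocabulary; ties resolve to the lowest token id
        next_id = max(range(len(scores)), key=lambda i: scores[i])
        tokens.append(next_id)
        if next_id == eos_id:
            break
    return tokens


def toy_scores(tokens: List[int]) -> List[float]:
    """Hypothetical stand-in for a model: favors (last token + 1) mod 5."""
    vocab_size = 5
    target = (tokens[-1] + 1) % vocab_size
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]


if __name__ == "__main__":
    first = greedy_decode(toy_scores, prompt=[0], eos_id=4)
    second = greedy_decode(toy_scores, prompt=[0], eos_id=4)
    print(first, first == second)
```

With Hugging Face `transformers`, the equivalent setting for `generate` is `do_sample=False` together with `num_beams=1`. Whether Myriad's evaluation script configures decoding this way is an assumption until the guidance is published.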
Citation
If you find Myriad useful for your research and applications, please cite using this BibTeX:
@article{Myriad,
title={Myriad: Large multimodal model by applying vision experts for industrial anomaly detection},
author={Li, Yuanze and Wang, Haolin and Yuan, Shihao and Liu, Ming and Zhao, Debin and Guo, Yiwen and Xu, Chen and Shi, Guangming and Zuo, Wangmeng},
journal={arXiv preprint arXiv:2310.19070},
year={2023}
}
Acknowledgement
- MiniGPT-4: the codebase we built upon, and our base model, MiniGPT4-v1. Thanks to the authors for their clear codebase and for their help with reproduction!