<div align="center">
<h1> VisionLLM Series </h1>
</div>
- VisionLLM: Large Language Model as Open-Ended Decoder for Vision-Centric Tasks (NeurIPS 2023)
- VisionLLM v2: A Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks (NeurIPS 2024)
<br>
<img src='VisionLLMv2/assets/arch.png' align="center" width="100%">
🚀 News
- 2024/06: We release VisionLLM v2, a generalist multimodal large language model that supports hundreds of vision-language tasks, covering visual understanding, perception, and generation.