VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation

Project Termination

We regret to inform you that the VL-GPT project has been terminated. Unfortunately, the authors Jinguo and Xiaohan left the company and were unable to refactor the codebase before their departure. As a result, the source code and weights for this work cannot be released.

However, the main contribution of this work, an image tokenizer with continuous embeddings and its application in a large multimodal model, has also been adopted in another project within our team called SEED-X, which has already been open-sourced. We recommend referring to the SEED-X project for insights and implementation details.

We sincerely apologize for not being able to release this work as an open-source project. Thank you for your understanding.

Introduction

<div align="center"> <span class="author-block"> <a href="https://scholar.google.com/citations?user=YfHg5lQAAAAJ&hl=en" target="_blank">Jinguo Zhu</a><sup>1*</sup>, </span> <span class="author-block"> <a href="https://dingxiaohan.xyz/" target="_blank">Xiaohan Ding</a><sup>2*</sup>, </span> <span class="author-block"> <a href="https://geyixiao.com/" target="_blank">Yixiao Ge</a><sup>2</sup>, </span> <span class="author-block"> <a href="https://geyuying.github.io/" target="_blank">Yuying Ge</a><sup>2</sup>, </span> <br> <span class="author-block"> <a target="_blank">Sijie Zhao</a><sup>2</sup>, </span> <span class="author-block"> <a href="https://hszhao.github.io/" target="_blank">Hengshuang Zhao</a><sup>3</sup>, </span> <span class="author-block"> <a href="https://gr.xjtu.edu.cn/web/xhw" target="_blank">Xiaohua Wang</a><sup>1</sup>, </span> <span class="author-block"> <a href="https://scholar.google.com/citations?user=4oXBp9UAAAAJ&hl=en&oi=ao" target="_blank">Ying Shan</a><sup>2</sup> </span> </div> <div align="center"> <sup>1</sup> <a target='_blank'>Xi'an Jiaotong University</a> <sup>2</sup> <a href='https://ai.tencent.com/' target='_blank'>Tencent AI Lab</a> <sup>3</sup> <a target='_blank'>The University of Hong Kong</a>&emsp; <br> <sup>*</sup> Equal Contribution&emsp; </div>

<a href="https://arxiv.org/abs/2312.09251"><img src="https://img.shields.io/badge/Paper-PDF-orange"></a> <a href="#LICENSE--citation"> <img alt="License: Apache2.0" src="https://img.shields.io/badge/LICENSE-Apache%202.0-blue.svg"/> </a>

<p align="center" width="100%"> <img src="assets/overview.png" width="100%" height="60%"> </p>

License

This project is released under the Apache 2.0 license. Please see the LICENSE file for more information.