AssistGPT: A General Multi-modal Assistant that can Plan, Execute, Inspect, and Learn

Difei Gao, Lei Ji, Luowei Zhou, Kevin Qinghong Lin, Joya Chen, Zihan Fan, Mike Zheng Shou

Project Website

:rocket: Introduction

AssistGPT reasons in an interleaved language-and-code format. Given a query and visual inputs, it plans the problem-solving path in natural language and uses structured code to call a range of powerful tools. The Inspector manages visual inputs and intermediate results, and assists the Planner in invoking tools. Meanwhile, the Learner assesses the reasoning process and collects in-context examples.
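Since the code is not yet released, the loop described above can only be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the class names `Inspector` and `Learner` borrow the paper's terms, but their methods, the `toy_planner` stand-in (a hard-coded function instead of a language model), the `caption` tool, and the `run` function are hypothetical and do not reflect the actual AssistGPT implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical planner signature: (query, inspector summary, step) -> (thought, tool, argument)
Planner = Callable[[str, str, int], Tuple[str, str, str]]


@dataclass
class Inspector:
    """Tracks visual inputs and intermediate results by name (illustrative only)."""
    records: Dict[str, str] = field(default_factory=dict)

    def register(self, name: str, content: str) -> None:
        self.records[name] = content

    def summary(self) -> str:
        return "; ".join(f"{k}: {v}" for k, v in self.records.items())


@dataclass
class Learner:
    """Stores traces that produced an answer as in-context examples (illustrative only)."""
    examples: List[List[str]] = field(default_factory=list)

    def assess(self, trace: List[str], answer: Optional[str]) -> bool:
        succeeded = answer is not None
        if succeeded:
            self.examples.append(list(trace))
        return succeeded


def toy_planner(query: str, summary: str, step: int) -> Tuple[str, str, str]:
    """Hard-coded stand-in for a language-model planner."""
    if step == 0:
        return ("I need a description of the image first.", "caption", "image_0")
    return ("The caption is enough to answer.", "finish", "caption_0")


def run(query: str, visuals: Dict[str, str], planner: Planner,
        tools: Dict[str, Callable[[str], str]], learner: Learner,
        max_steps: int = 5) -> Tuple[Optional[str], List[str]]:
    inspector = Inspector()
    for name, content in visuals.items():
        inspector.register(name, content)

    trace: List[str] = []
    answer: Optional[str] = None
    for step in range(max_steps):
        # Plan: a thought in language plus a structured tool call.
        thought, tool, arg = planner(query, inspector.summary(), step)
        trace.append(f"Thought: {thought}")
        trace.append(f"Action: {tool}({arg!r})")
        if tool == "finish":
            answer = inspector.records.get(arg)
            break
        # Execute the tool, then let the Inspector record the intermediate result.
        result = tools[tool](inspector.records[arg])
        out_name = f"{tool}_{step}"
        inspector.register(out_name, result)
        trace.append(f"Observation: {out_name} = {result}")

    # Learn: keep the trace if it led to an answer.
    learner.assess(trace, answer)
    return answer, trace


if __name__ == "__main__":
    demo_tools = {"caption": lambda source: f"caption for <{source}>"}
    final_answer, steps = run("What is in the image?", {"image_0": "dog.jpg"},
                              toy_planner, demo_tools, Learner())
    print("\n".join(steps))
    print("Answer:", final_answer)
```

In this sketch the planner only ever refers to inputs and results by the names the Inspector has registered, which is one simple way to let later steps reuse earlier tool outputs.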

:newspaper: News

The code will be released soon.