Awesome
AssistGPT: A General Multi-modal Assistant that can Plan, Execute, Inspect, and Learn
Difei Gao, Lei Ji, Luowei Zhou, Kevin Qinghong Lin, Joya Chen, Zihan Fan, Mike Zheng Shou
:rocket: Introduction
AssistGPT can reason in an interleaved language and code format. Given a query input and visual inputs, AssistGPT plans the problem-solving path in language, using structured code to call upon various powerful tools. The Inspector, part of the system, can manage visual inputs and intermediate results, assisting the Planner to invoke tools. Meanwhile, the Learner can assess the reasoning process and collect in-context examples.
:newspaper: News
The code will be released soon.