# Awesome-LLM-3D <a href="" target="_blank"><img src="https://visitor-badge.laobi.icu/badge?page_id=activevisionlab.llm3d&left_color=gray&right_color=blue"></a>
<div align="center"> <img src="assets/Figure1_v6.png" width="100%"> </div>

## 🏠 About
This is a curated list of papers on 3D-related tasks empowered by Large Language Models (LLMs), covering 3D understanding, reasoning, generation, and embodied agents. We also include work built on other foundation models (e.g., CLIP, SAM) to give a fuller picture of the area.

This repository is actively maintained; watch it to follow the latest advances. If you find it useful, please star ⭐ this repo and cite the paper.
## 🔥 News
- [2024-05-16] 📢 Check out the first survey paper in the 3D-LLM domain: When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models
- [2024-01-06] Runsen Xu added chronological information and Xianzheng Ma reorganized the list in reverse chronological order, so the latest advances appear first.
- [2023-12-16] Xianzheng Ma and Yash Bhalgat curated this list and published the first version.
## Table of Contents
- [3D Understanding via LLM](#3d-understanding-via-llm)
- [3D Understanding via other Foundation Models](#3d-understanding-via-other-foundation-models)
- [3D Reasoning](#3d-reasoning)
- [3D Generation](#3d-generation)
- [3D Embodied Agent](#3d-embodied-agent)
- [3D Benchmarks](#3d-benchmarks)
- [Contributing](#contributing)
## 3D Reasoning
| Date | Keywords | Institute (first author) | Paper | Publication | Others |
|---|---|---|---|---|---|
| 2023-05-20 | 3D-CLR | UCLA | 3D Concept Learning and Reasoning from Multi-View Images | CVPR '23 | github |
| - | Transcribe3D | TTI, Chicago | Transcribe3D: Grounding LLMs Using Transcribed Information for 3D Referential Reasoning with Self-Corrected Finetuning | CoRL '23 | github |
## 3D Generation
| Date | Keywords | Institute | Paper | Publication | Others |
|---|---|---|---|---|---|
| 2023-11-29 | ShapeGPT | Fudan University | ShapeGPT: 3D Shape Generation with A Unified Multi-modal Language Model | arXiv | github |
| 2023-11-27 | MeshGPT | TUM | MeshGPT: Generating Triangle Meshes with Decoder-Only Transformers | arXiv | project |
| 2023-10-19 | 3D-GPT | ANU | 3D-GPT: Procedural 3D Modeling with Large Language Models | arXiv | github |
| 2023-09-21 | LLMR | MIT | LLMR: Real-time Prompting of Interactive Worlds using Large Language Models | arXiv | - |
| 2023-09-20 | DreamLLM | MEGVII | DreamLLM: Synergistic Multimodal Comprehension and Creation | arXiv | github |
| 2023-04-01 | ChatAvatar | Deemos Tech | DreamFace: Progressive Generation of Animatable 3D Faces under Text Guidance | ACM TOG | website |
## 3D Embodied Agent
## 3D Benchmarks
## Contributing
Your contributions are always welcome!
If I'm not sure whether a pull request is a good fit for 3D LLMs, I will keep it open; you can vote for it by adding a 👍 reaction. New entries should follow the table format used throughout this list, as shown below.
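A minimal example of the expected entry format, reusing a row from the 3D Generation section above (the column layout is taken from the existing tables):

```markdown
| Date | Keywords | Institute | Paper | Publication | Others |
|---|---|---|---|---|---|
| 2023-10-19 | 3D-GPT | ANU | 3D-GPT: Procedural 3D Modeling with Large Language Models | arXiv | github |
```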
If you have any questions about this opinionated list, please get in touch at xianzheng@robots.ox.ac.uk or via WeChat ID: mxz1997112.
## Star History
## Citation
If you find this repository useful, please consider citing the paper:

```bibtex
@article{ma2024llmsstep3dworld,
  title={When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models},
  author={Xianzheng Ma and Yash Bhalgat and Brandon Smart and Shuai Chen and Xinghui Li and Jian Ding and Jindong Gu and Dave Zhenyu Chen and Songyou Peng and Jia-Wang Bian and Philip H Torr and Marc Pollefeys and Matthias Nießner and Ian D Reid and Angel X. Chang and Iro Laina and Victor Adrian Prisacariu},
  journal={arXiv preprint arXiv:2405.10255},
  year={2024},
}
```
## Acknowledgement

This repo is inspired by Awesome-LLM.