Personal LLM Agents - Survey
This repo maintains a curated list of papers related to Personal LLM Agents. For more details, please refer to our paper or join our discussion group:
Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security
Yuanchun Li, Hao Wen, Weijun Wang, Xiangyu Li, Yizhen Yuan, Guohong Liu, Jiacheng Liu, Wenxing Xu, Xiang Wang, Yi Sun, Rui Kong, Yile Wang, Hanfei Geng, Jian Luan, Xuefeng Jin, Zilong Ye, Guanjing Xiong, Fan Zhang, Xiang Li, Mengwei Xu, Zhijun Li, Peng Li, Yang Liu, Ya-Qin Zhang, Yunxin Liu
Personal LLM Agents are defined as a special type of LLM-based agents that are deeply integrated with personal data, personal devices, and personal services. They are preferably deployed to resource-constrained mobile/edge devices and/or powered by lightweight AI models. The main purpose of personal LLM agents is to assist end-users and augment their abilities, helping them to focus more and do better on interesting and important affairs.
This paper list covers several main aspects of Personal LLM Agents, including the capabilities, efficiency and security. Table of contents:
- Personal LLM Agents - Survey
Key Capabilities of Personal LLM Agents
Task Automation
Task automation is a core capability of personal LLM agents, which determines how well the agent can respond to user commands and/or automatically execute tasks for the user.
We focus on UI-based task automation agents in this list due to their popularity and close relevance to personal devices.
UI-grounded Agents for Task Automation
LLM-based Approaches
- WebGPT: Browser-assisted question-answering with human feedback. [paper]
- Enabling Conversational Interaction with Mobile UI Using Large Language Models. [CHI 2023] [paper]
- Language Models can Solve Computer Tasks. [NeurIPS 2023] [paper]
- DroidBot-GPT: GPT-powered UI Automation for Android. [arxiv] [code]
- Responsible Task Automation: Empowering Large Language Models as Responsible Task Automators. [paper]
- Mind2Web: Towards a Generalist Agent for the Web. arxiv 2023 [paper][code][code]
- AutoDroid: LLM-powered Task Automation in Android. [paper] [code]
- You Only Look at Screens: Multimodal Chain-of-Action Agents. ArXiv Preprint [paper] [code]
- AXNav: Replaying Accessibility Tests from Natural Language. [paper]
- Automatic Macro Mining from Interaction Traces at Scale. [paper]
- A Zero-Shot Language Agent for Computer Control with Structured Reflection. [paper]
- Reinforced UI Instruction Grounding: Towards a Generic UI Task Automation API. [paper]
- GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation. [paper][code]
- UGIF: UI Grounded Instruction Following. [paper]
- Explore, Select, Derive, and Recall: Augmenting LLM with Human-like Memory for Mobile Task Automation. [paper][code]
- CogAgent: A Visual Language Model for GUI Agents. [paper][code]
- From Pixels to UI Actions: Learning to Follow Instructions via Graphical User Interfaces [paper][code]
- AppAgent: Multimodal Agents as Smartphone Users. [paper][code]
- GPT-4V(ision) is a Generalist Web Agent, if Grounded [paper][code]
- SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents [paper][code]
- Octopus v2: On-device language model for super agent [paper][code]
- Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs [paper]
- Octopus v3: Technical Report for On-device Sub-billion Multimodal AI Agent [paper]
Traditional Approaches
- uLink: Enabling User-Defined Deep Linking to App Content. [MobiSys 2016]
- SUGILITE: Creating Multimodal Smartphone Automation by Demonstration. [CHI 2017] [paper][code]
- Programming IoT devices by demonstration using mobile apps. [IS-EUD 2017]
- Kite: Building Conversational Bots from Mobile Apps. [MobiSys 2018]. [paper]
- Reinforcement Learning on Web Interfaces using Workflow-Guided Exploration. [ICLR 2018]. [paper][code]
- Mapping Natural Language Instructions to Mobile UI Action Sequences. [ACL 2020] [paper][code]
- Glider: A Reinforcement Learning Approach to Extract UI Scripts from Websites. [SIGIR 2021] [paper]
- UIBert: Learning Generic Multimodal Representations for UI Understanding. [IJCAI-21] [paper]
- META-GUI: Towards Multi-modal Conversational Agents on Mobile GUI. [EMNLP 2022][paper][code]
- UINav: A maker of UI automation agents. [paper]
Benchmarks of UI Automation
- Mapping natural language commands to web elements. [EMNLP 2018] [paper][code]
- UIBert: Learning Generic Multimodal Representations for UI Understanding. [IJCAI-21] [paper]
- Mapping Natural Language Instructions to Mobile UI Action Sequences. [ACL 2020] [paper][code]
- A Dataset for Interactive Vision Language Navigation with Unknown Command Feasibility. [ECCV 2022][paper] [code]
- META-GUI: Towards Multi-modal Conversational Agents on Mobile GUI. [EMNLP 2022][paper][code]
- UGIF: UI Grounded Instruction Following. [paper]
- ASSISTGUI: Task-Oriented Desktop Graphical User Interface Automation. [paper][code]
- Mind2Web: Towards a Generalist Agent for the Web. arxiv 2023 [paper][code][code]
- Android in the Wild: A Large-Scale Dataset for Android Device Control. [paper][code]
- Empowering LLM to use Smartphone for Intelligent Task Automation. [paper] [code]
- World of Bits: An Open-Domain Platform for Web-Based Agents. [ICML 2017] [paper][code]
- Reinforcement Learning on Web Interfaces using Workflow-Guided Exploration. [ICLR 2018]. [paper][code]
- WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents. [NeurIPS 2022] [paper]
- AndroidEnv: A Reinforcement Learning Platform for Android [paper][code]
- Mobile-Env: An Evaluation Platform and Benchmark for Interactive Agents in LLM Era. [paper][code]
- WebArena: A Realistic Web Environment for Building Autonomous Agents. [paper][code]
- OmniACT: A Dataset and Benchmark for Enabling Multimodal Generalist Autonomous Agents for Desktop and Web. [paper][code]
- AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent [paper][code]
- VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding? [paper][code]
- ScreenAgent: A Vision Language Model-driven Computer Control Agent [paper][code]
- AgentStudio: A Toolkit for Building General Virtual Agents [paper][code]
- OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments [paper][code]
Sensing
The ability to understand the current context is crucial for Personal LLM Agents to offer personalized, context-aware services. This includes techniques for sensing user activity, mental state, environment dynamics, etc.
LLM-based Approaches
- “Automated Mobile Sensing Strategies Generation for Human Behaviour Understanding” (Gao et al., 2023, p. 521) arxiv
- “Cue-CoT: Chain-of-thought Prompting for Responding to In-depth Dialogue Questions with LLMs” (Wang et al., 2023, p. 1) EMNLP 2023
- “Exploring Large Language Models for Human Mobility Prediction under Public Events” (Liang et al., 2023, p. 1) arxiv
- “Penetrative AI: Making LLMs Comprehend the Physical World” (Xu et al., 2023, p. 1) arxiv
- “Evaluating Subjective Cognitive Appraisals of Emotions from Large Language Models” (Zhan et al., 2023, p. 1) arxiv
- “PALR: Personalization Aware LLMs for Recommendation” (Yang et al., 2023, p. 1) arxiv
- “Sentiment Analysis through LLM Negotiations” (Sun et al., 2023, p. 1) arxiv
- “Bridging the Information Gap Between Domain-Specific Model and General LLM for Personalized Recommendation” (Zhang et al., 2023, p. 1) arxiv
- “Conversational Health Agents: A Personalized LLM-Powered Agent Framework” (Abbasian et al., 2023, p. 1) arxiv
Traditional Approaches
- “Affective State Prediction from Smartphone Touch and Sensor Data in the Wild” (Wampfler et al., 2022, p. 1) CHI'22
- “Mobile Localization Techniques for Wireless Sensor Networks: Survey and Recommendations” (Oliveira et al., 2023, p. 361) ACM Transactions on Sensor Networks
- “Are You Killing Time? Predicting Smartphone Users’ Time-killing Moments via Fusion of Smartphone Sensor Data and Screenshots” (Chen et al., 2023, p. 1) CHI'23
- “Remote Breathing Rate Tracking in Stationary Position Using the Motion and Acoustic Sensors of Earables” (Ahmed et al., 2023, p. 1) CHI'23
- “SAMoSA: Sensing Activities with Motion and Subsampled Audio” (Mollyn et al., 2022, p. 1321) IMWUT
- “A Systematic Survey on Android API Usage for Data-Driven Analytics with Smartphones” (Lee et al., 2023, p. 1) ACM Computing Surveys
- “A Multi-Sensor Approach to Automatically Recognize Breaks and Work Activities of Knowledge Workers in Academia” (Di Lascio et al., 2020, p. 781) IMWUT
- “Robust Inertial Motion Tracking through Deep Sensor Fusion across Smart Earbuds and Smartphone” (Gong et al., 2021, p. 621) IMWUT
- “DancingAnt: Body-empowered Wireless Sensing Utilizing Pervasive Radiations from Powerline” (Cui et al., 2023, p. 873) ACM MobiCom'23
- “DeXAR: Deep Explainable Sensor-Based Activity Recognition in Smart-Home Environments” (Arrotta et al., 2022, p. 11) IMWUT
- “MUSE-Fi: Contactless MUti-person SEnsing Exploiting Near-field Wi-Fi Channel Variation” (Hu et al., 2023, p. 1135) IMWUT
- “SenCom: Integrated Sensing and Communication with Practical WiFi” (He et al., 2023, p. 903) ACM MobiCom'23
- “SleepMore: Inferring Sleep Duration at Scale via Multi-Device WiFi Sensing” (Zakaria et al., 2022, p. 1931) IMWUT
- “COCOA: Cross Modality Contrastive Learning for Sensor Data” (Deldari et al., 2022, p. 1081) IMWUT
- “M3Sense: Affect-Agnostic Multitask Representation Learning Using Multimodal Wearable Sensors” (Samyoun et al., 2022, p. 731) IMWUT
- “Predicting Subjective Measures of Social Anxiety from Sparsely Collected Mobile Sensor Data” (Rashid et al., 2020, p. 1091) IMWUT
- “Attend and Discriminate: Beyond the State-of-the-Art for Human Activity Recognition Using Wearable Sensors” (Abedin et al., 2021, p. 11) IMWUT
- “Fall Detection based on Interpretation of Important Features with Wrist-Wearable Sensors” (Kim et al., 2022, p. 1) IMWUT
- “PowerPhone: Unleashing the Acoustic Sensing Capability of Smartphones” (Cao et al., 2023, p. 842) ACM MobiCom'23
- “I Spy You: Eavesdropping Continuous Speech on Smartphones via Motion Sensors” (Zhang et al., 2022, p. 1971) IMWUT
- “Watching Your Phone’s Back: Gesture Recognition by Sensing Acoustical Structure-borne Propagation” (Wang et al., 2021, p. 821) IMWUT
- “Gesture Recognition Method Using Acoustic Sensing on Usual Garment” (Amesaka et al., 2022, p. 411) IMWUT
- “Complex Daily Activities, Country-Level Diversity, and Smartphone Sensing: A Study in Denmark, Italy, Mongolia, Paraguay, and UK” (Assi et al., 2023, p. 1) CHI'23
- “Generalization and Personalization of Mobile Sensing-Based Mood Inference Models: An Analysis of College Students in Eight Countries” (Meegahapola et al., 2022, p. 1761) IMWUT
- “Detecting Social Contexts from Mobile Sensing Indicators in Virtual Interactions with Socially Anxious Individuals” (Wang et al., 2023, p. 1341) IMWUT
- “Examining the Social Context of Alcohol Drinking in Young Adults with Smartphone Sensing” (Meegahapola et al., 2021, p. 1211) IMWUT
- “Towards Open-Domain Twitter User Profile Inference” (Wen et al., 2023, p. 3172) ACL 2023
- “One More Bite? Inferring Food Consumption Level of College Students Using Smartphone Sensing and Self-Reports” (Meegahapola et al., 2021, p. 261) IMWUT
- “FlowSense: Monitoring Airflow in Building Ventilation Systems Using Audio Sensing” (Chhaglani et al., 2022, p. 51) IMWUT
- “MicroCam: Leveraging Smartphone Microscope Camera for Context-Aware Contact Surface Sensing” (Hu et al., 2023, p. 981) IMWUT
- “Mobile and Wearable Sensing Frameworks for mHealth Studies and Applications: A Systematic Review” (Kumar et al., 2021, p. 81) ACM Transactions on Computing for Healthcare
- “FeverPhone: Accessible Core-Body Temperature Sensing for Fever Monitoring Using Commodity Smartphones” (Breda et al., 2022, p. 31) IMWUT
- “Guard Your Heart Silently: Continuous Electrocardiogram Waveform Monitoring with Wrist-Worn Motion Sensor” (Cao et al., 2022, p. 1031) IMWUT
- “Listen2Cough: Leveraging End-to-End Deep Learning Cough Detection Model to Enhance Lung Health Assessment Using Passively Sensed Audio” (Xu et al., 2021, p. 431) IMWUT
- “HealthWalks: Sensing Fine-grained Individual Health Condition via Mobility Data” (Lin et al., 2020, p. 1381) IMWUT
- “Identifying Mobile Sensing Indicators of Stress-Resilience” (Adler et al., 2021, p. 511) IMWUT
- “MoodExplorer: Towards Compound Emotion Detection via Smartphone Sensing” (Zhang et al., 2018, p. 1761) IMWUT
- “mTeeth: Identifying Brushing Teeth Surfaces Using Wrist-Worn Inertial Sensors” (Akther et al., 2021, p. 531) IMWUT
- “Detecting Job Promotion in Information Workers Using Mobile Sensing” (Nepal et al., 2020, p. 1131) IMWUT
- “First-Gen Lens: Assessing Mental Health of First-Generation Students across Their First Year at College Using Mobile Sensing” (Wang et al., 2022, p. 951) IMWUT
- “Predicting Personality Traits from Physical Activity Intensity” (Gao et al., 2019, p. 1) IEEE Computer
- “Predicting Symptom Trajectories of Schizophrenia using Mobile Sensing” (Wang et al., 2017, p. 1101) IMWUT
- “Predictors of Life Satisfaction based on Daily Activities from Mobile Sensor Data” (Yürüten et al., 2014, p. 1) CHI'14
- “SmartGPA: How Smartphones Can Assess and Predict Academic Performance of College Students” (Wang et al., 2015, p. 1) UbiComp'15
- “Social Sensing: Assessing Social Functioning of Patients Living with Schizophrenia using Mobile Phone Sensing” (Wang et al., 2020, p. 1) CHI'20
- “SmokingOpp: Detecting the Smoking ‘Opportunity’ Context Using Mobile Sensors” (Chatterjee et al., 2020, p. 41) IMWUT
Memorization
Memorization is about the ability of Personal LLM Agents to maintain information about the user, so that the agents can provide more customized services and evolve themselves according to user preferences.
Memory Obtaining
- “LifeLogging: Personal Big Data” Foundations and Trends in information retrieval
- “Vision-based human activity recognition: a survey” Multimedia Tools and Applications
- “Predicting personality from patterns of behavior collected with smartphones” Proceedings of the National Academy of Sciences
- “Facial Emotion Detection Using Deep Learning” 2020 international conference for emerging technology (INCET)
- “Emotion detection of textual data: An interdisciplinary survey” 2021 IEEE World AI IoT Congress
Memory Management
- “Privacystreams: Enabling transparency in personal data processing for mobile apps” IMWUT
- “Tree of Thoughts: Deliberate Problem Solving with Large Language Models” arxiv
- “Chain-of-Thought Prompting Elicits Reasoning in Large Language Models” Advances in Neural Information Processing Systems
- “ReAct: Synergizing Reasoning and Acting in Language Models” arxiv
- “Generative Agents: Interactive Simulacra of Human Behavior” Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology
- “Show Your Work: Scratchpads for Intermediate Computation with Language Models” arxiv
- “Cognitive Architectures for Language Agents” arxiv
Agent Self-evolution
- “DreamCoder: growing generalizable, interpretable knowledge with wake–sleep Bayesian program learning” Proceedings of the 42nd acm sigplan international conference on programming language design and implementation
- “Voyager: An Open-Ended Embodied Agent with Large Language Models” arxiv
- “Language models as zero-shot planners: Extracting actionable knowledge for embodied agents” International Conference on Machine Learning
- “Bootstrap Your Own Skills: Learning to Solve New Tasks with Large Language Model Guidance” arxiv
- “FireAct: Toward Language Agent Fine-tuning” arxiv
Efficiency of LLM Agents
The efficiency of LLM agents is closely related to the efficiency of LLM inference, LLM training/customization, and memory management.
Efficient LLM Inference & Training
LLM inference/training efficiency has been comprehensively summarized in existing surveys (e.g. this link). Therefore, we omit this part in this list.
Efficient Memory Retrieval & Management
Here we mainly list the papers related to efficient memory management, an important component of LLM-based agents.
Organizing the Memory
(with vector library, vector DB, and others)
Vector Library
- RETRO: Improving language models by retrieving from trillions of tokens. [ICML, 2021] [paper]
- RETA-LLM: A Retrieval-Augmented Large Language Model Toolkit. [arXiv, 2023] [paper] [code]
- TRIME: Training Language Models with Memory Augmentation. [EMNLP, 2022] [paper] [code]
- Enhancing LLM Intelligence with ARM-RAG: Auxiliary Rationale Memory for Retrieval Augmented Generation. [arXiv, 2023] [paper] [code]
Vector Database
- Survey of Vector Database Management Systems. [arXiv, 2023] [paper]
- Vector database management systems: Fundamental concepts, use-cases, and current challenges. [arXiv, 2023] [paper]
- A Comprehensive Survey on Vector Database: Storage and Retrieval Technique, Challenge. [arXiv, 2023] [paper]
Other Forms of Memory
- Memorizing Transformers. [ICLR, 2022] [paper] [code]
- RET-LLM: Towards a General Read-Write Memory for Large Language Models. [arXiv, 2023] [paper]
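For readers new to vector-based memory, the following is a minimal sketch of the basic idea behind the vector library and vector database entries above: memories are stored as embedding vectors and retrieved by similarity to a query vector. It is an illustration only and is not taken from any listed paper. The class name `VectorMemory`, the hand-written 3-dimensional vectors, and the brute-force cosine search are all assumptions; a real agent would obtain embeddings from a model and would typically use an approximate index instead of a linear scan.

```python
import math
from dataclasses import dataclass, field


@dataclass
class VectorMemory:
    """Hypothetical toy store: brute-force cosine-similarity search."""
    items: list = field(default_factory=list)  # (text, unit-length vector) pairs

    @staticmethod
    def _normalize(vec):
        # Scale to unit length so a dot product equals cosine similarity.
        norm = math.sqrt(sum(x * x for x in vec))
        if norm == 0:
            raise ValueError("zero vector cannot be normalized")
        return [x / norm for x in vec]

    def add(self, text, vec):
        self.items.append((text, self._normalize(vec)))

    def search(self, query, k=1):
        # Score every stored item against the query, highest similarity first.
        q = self._normalize(query)
        scored = [(sum(a * b for a, b in zip(q, v)), text) for text, v in self.items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]


# Made-up embeddings standing in for the output of an embedding model.
memory = VectorMemory()
memory.add("user prefers dark mode", [0.9, 0.1, 0.0])
memory.add("user's flight is on Friday", [0.0, 0.2, 0.9])
memory.add("user likes jazz", [0.1, 0.9, 0.1])
top = memory.search([0.0, 0.1, 1.0], k=1)
print(top)
```

The linear scan costs time proportional to the number of stored memories per query, which is the cost that the indexing and search-execution work listed in the next subsection aims to reduce.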
Optimizing the Efficiency of Memory
Searching Design
- Milvus: A purpose-built vector data management system. [SIGMOD, 2021] [[paper](https://dl.acm.org/doi/10.1145/3448016.3457550)] [code]
- Analyticdb-v: A hybrid analytical engine towards query fusion for structured and unstructured data. [Proceedings of the VLDB Endowment, Volume 13, Issue 12, pp 3152–3165] [paper]
- Hqann: Efficient and robust similarity search for hybrid queries with structured and unstructured constraints. [CIKM, 2022] [paper]
- Qdrant [github]
Searching Execution
- Faiss: Facebook AI Similarity Search. [wiki] [code]
- Milvus: A purpose-built vector data management system. [SIGMOD, 2021] [paper] [code]
- Quicker ADC: Unlocking the Hidden Potential of Product Quantization With SIMD. [IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019] [paper] [code]
Efficient Indexing
- LSH: Locality-sensitive hashing scheme based on p-stable distributions. [SCG, 2004] [paper]
- Random projection trees and low dimensional manifolds. [STOC, 2008] [paper]
- SPANN: Highly-efficient Billion-scale Approximate Nearest Neighborhood Search. [NeurIPS, 2021] [paper] [code]
- Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs. [IEEE Transactions on Pattern Analysis and Machine Intelligence, VOL. 42, NO. 4, 2020] [paper]
- DiskANN: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node. [NeurIPS, 2019] [paper] [code]
- DiskANN++: Efficient Page-based Search over Isomorphic Mapped Graph Index using Query-sensitivity Entry Vertex. [arXiv, 2023] [paper]
- CXL-ANNS: Software-Hardware Collaborative Memory Disaggregation and Computation for Billion-Scale Approximate Nearest Neighbor Search. [USENIX ATC, 2023] [paper]
- Co-design Hardware and Algorithm for Vector Search. [SC, 2023] [paper] [code]
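As a rough illustration of the first entry in this list, the sketch below hashes vectors with the p-stable scheme for Euclidean distance, where each hash function has the form h(v) = floor((a · v + b) / w), with `a` drawn from a Gaussian distribution and `b` drawn uniformly from [0, w). The class name `PStableLSH`, the default parameters, and the single-table design are assumptions made for brevity; practical systems use several tables and tune `w` and the number of hash functions for the data.

```python
import math
import random


class PStableLSH:
    """Hypothetical single-table LSH index using h(v) = floor((a . v + b) / w)."""

    def __init__(self, dim, num_hashes=4, w=4.0, seed=0):
        rng = random.Random(seed)
        self.w = w
        # One Gaussian projection vector and one uniform offset per hash function.
        self.projections = [
            [rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_hashes)
        ]
        self.offsets = [rng.uniform(0.0, w) for _ in range(num_hashes)]
        self.buckets = {}

    def key(self, vec):
        # Concatenate the individual hash values into one bucket key.
        return tuple(
            math.floor((sum(a * x for a, x in zip(proj, vec)) + b) / self.w)
            for proj, b in zip(self.projections, self.offsets)
        )

    def insert(self, label, vec):
        self.buckets.setdefault(self.key(vec), []).append(label)

    def candidates(self, vec):
        # Only items in the same bucket are returned as candidate neighbors.
        return list(self.buckets.get(self.key(vec), []))


index = PStableLSH(dim=3)
index.insert("note-1", [1.0, 2.0, 3.0])
hits = index.candidates([1.0, 2.0, 3.0])
print(hits)
```

Nearby vectors land in the same bucket only with some probability, so a query may miss true neighbors; this is the accuracy-for-speed trade-off that the graph-based and disk-based indexes in this list approach in different ways.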
Security & Privacy of Personal LLM Agents
Security & Privacy of AI/ML is a huge area with lots of related papers. Here we only focus on the ones related to LLM and LLM agents.
Confidentiality (of User Data)
- THE-X: Privacy-Preserving Transformer Inference with Homomorphic Encryption. [ACL, 2022][paper]
- TextFusion: Privacy-Preserving Pre-trained Model Inference via Token Fusion [EMNLP, 2022] [paper][code]
- TextObfuscator: Making Pre-trained Language Model a Privacy Protector via Obfuscating Word Representations. [ACL, 2023] [paper][code]
- Adversarial Training for Large Neural Language Models. [arXiv, 2020] [paper][code]
Integrity (of Agent Behavior)
Adversarial Attacks
- Certifying LLM Safety against Adversarial Prompting. [arXiv, 2023] [paper][code]
- On evaluating adversarial robustness of large vision-language models. [arXiv, 2023] [paper][code]
- Jailbroken: How does llm safety training fail? [arXiv, 2023] [paper]
- On the adversarial robustness of multi-modal foundation models. [arXiv, 2023] [paper]
- Misusing Tools in Large Language Models With Visual Adversarial Examples. [arXiv, 2023] [paper]
- Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal Language Models. [arXiv, 2023] [paper]
Backdoor Attacks
- Backdoor attacks for in-context learning with language models. [arXiv, 2023] [paper]
- Prompt as Triggers for Backdoor Attack: Examining the Vulnerability in Language Models. [arXiv, 2023] [paper]
- PoisonPrompt: Backdoor Attack on Prompt-based Large Language Models. [arXiv, 2023] [paper][code]
- Defending against backdoor attacks in natural language generation. [arXiv, 2021] [paper][code]
Prompt Injection Attacks
- Not What You've Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection. [arXiv, 2023] [paper]
- Ignore Previous Prompt: Attack Techniques For Language Models. [arXiv, 2022] [paper][code]
- Prompt Injection attack against LLM-integrated Applications. [arXiv, 2023] [paper][code]
- Jailbreaking Black Box Large Language Models in Twenty Queries. [arXiv, 2023] [paper][code]
- Extracting Training Data from Large Language Models. [arXiv, 2020] [paper]
- SmoothLLM: Defending Large Language Models Against Jailbreaking Attacks. [arXiv, 2023] [paper][code]
Reliability (of Agent Decisions)
Problems
- Survey of Hallucination in Natural Language Generation. [ACM Computing Surveys 2023] [paper]
- A Survey of Hallucination in Large Foundation Models. [arXiv, 2023] [paper]
- DERA: Enhancing Large Language Model Completions with Dialog-Enabled Resolving Agents. [arXiv, 2023] [paper]
- Cumulative Reasoning with Large Language Models. [arXiv, 2023] [paper]
- Learning From Mistakes Makes LLM Better Reasoner. [arXiv, 2023] [paper]
- Large Language Models can Learn Rules. [arXiv, 2023] [paper]
Improvement
- PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts. [ACL 2022] [paper]
- Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks. [EMNLP 2022] [paper]
- Finetuned Language Models are Zero-Shot Learners. [ICLR 2022] [paper]
- SELFCHECKGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models. [EMNLP 2023] [paper]
- Large Language Models Can Self-Improve. [arXiv, 2022] [paper]
- Self-Refine: Iterative Refinement with Self-Feedback. [arXiv, 2023] [paper]
- Teaching Large Language Models to Self-Debug. [arXiv, 2023] [paper]
- Prompt-Guided Retrieval Augmentation for Non-Knowledge-Intensive Tasks. [ACL 2023] [paper]
- Chain-of-Note: Enhancing Robustness in Retrieval-Augmented Language Models. [arXiv, 2023] [paper]
- Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection. [arXiv, 2023] [paper]
- Self-Knowledge Guided Retrieval Augmentation for Large Language Models. [Findings of EMNLP, 2023] [paper]
Inspection
- CGMH: Constrained Sentence Generation by Metropolis-Hastings Sampling. [AAAI 2019] [paper]
- Gradient-Based Constrained Sampling from Language Models. [EMNLP 2022] [paper]
- Large Language Models are Better Reasoners with Self-Verification. [Findings of EMNLP 2023] [paper]
- Explainability for Large Language Models: A Survey. [arXiv, 2023] [paper]
- Self-Consistency Improves Chain of Thought Reasoning in Language Models. [ICLR, 2023] [paper]
- Enhancing Chain-of-Thoughts Prompting with Iterative Bootstrapping in Large Language Models. [arXiv, 2023] [paper]
- Mutual Information Alleviates Hallucinations in Abstractive Summarization. [EMNLP, 2023] [paper]
- Overthinking the Truth: Understanding how Language Models Process False Demonstrations. [arXiv, 2023] [paper]
- Inference-Time Intervention: Eliciting Truthful Answers from a Language Model. [NeurIPS, 2023] [paper]
Acknowledgment
We sincerely thank the many domain experts who provided valuable feedback, including Xiaobo Peng (Autohome), Ligeng Chen (Honor Device), Miao Wei, Pengpeng He (Huawei), Hansheng Hong, Wenjun Chen, Zhiyao Yang (Oppo), Xuesheng Qi (vivo), Liang Tao, Lishun Sun, Shuang Dong (Xiaomi), and others who remain anonymous.
Citation
@article{li2024personal_llm_agents,
title={Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security},
author={Yuanchun Li and Hao Wen and Weijun Wang and Xiangyu Li and Yizhen Yuan and Guohong Liu and Jiacheng Liu and Wenxing Xu and Xiang Wang and Yi Sun and Rui Kong and Yile Wang and Hanfei Geng and Jian Luan and Xuefeng Jin and Zilong Ye and Guanjing Xiong and Fan Zhang and Xiang Li and Mengwei Xu and Zhijun Li and Peng Li and Yang Liu and Ya-Qin Zhang and Yunxin Liu},
year={2024},
journal={arXiv preprint arXiv:2401.05459}
}