Awesome

Lemur: Open Foundation Models for Language Agents

Lemur is an openly accessible language model optimized for both natural language and coding capabilities to serve as the backbone of versatile language agents. As language models continue to evolve from conversational chatbots to functional agents that can act in the real world, they need both strong language understanding and the ability to execute actions. Lemur balances natural language and coding skills to enable agents to follow instructions, reason for tasks, and take grounded actions.

Please refer to our paper and code for more details:

[Paper] Lemur: Harmonizing Natural Language and Code for Language Agents
[Blog] Introducing Lemur: Open Foundation Models for Language Agents

🔥 News

[18 October, 2023]: 🎉 We open-sourced code for OpenAgents: An Open Platform for Language Agents in the Wild.
[11 October, 2023]: 🎉 We released the research paper and codebase. We will continue updating this repository.
[23 August, 2023]: 🎉 We released the weights of OpenLemur/lemur-70b-v1, and OpenLemur/lemur-70b-chat-v1! Check it out in HuggingFace Hub.

Models

We released our models on the HuggingFace Hub:

Lemur: Open Foundation Models for Language Agents

Why Lemur?

Most existing open-source models specialize in either natural language or code. Lemur combines both strengths by:

Pretraining Llama-2-70B on a 90B token corpus with the 10:1 ratio of code to text and obtaining Lemur-70B-v1
Instruction tuning Lemur-70B-v1 on 300K examples covering both text and code and obtaining Lemur-70B-Chat-v1

This two-stage training produces state-of-the-art performance averaged across diverse language and coding benchmarks, surpassing other available open-source models and narrowing the gap between open-source and commercial models on agent abilities.

Quickstart

Setup

First, we have to install all the libraries listed in requirements.txt

conda create -n xchat python=3.10
conda activate xchat
conda install pytorch==2.0.1 pytorch-cuda=11.8 -c pytorch -c nvidia
conda install -c "nvidia/label/cuda-11.8.0" cuda-nvcc

Then, install the xchat package:

git clone git@github.com:OpenLemur/Lemur.git
cd Lemur
pip install -e .

Lemur-70B

For the base model lemur-70b-v1, you can use it in this way:

<details> <summary>Click me</summary> <p>

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("OpenLemur/lemur-70b-v1")
model = AutoModelForCausalLM.from_pretrained("OpenLemur/lemur-70b-v1", device_map="auto", load_in_8bit=True)

# Text Generation Example
prompt = "The world is "
input = tokenizer(prompt, return_tensors="pt")
output = model.generate(**input, max_length=50, num_return_sequences=1)
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)

# Code Generation Example
prompt = """
def factorial(n):
   if n == 0:
      return 1
"""
input = tokenizer(prompt, return_tensors="pt")
output = model.generate(**input, max_length=200, num_return_sequences=1)
generated_code = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_code)

</p> </details>

Lemur-70B-Chat

We instruction-finetune lemur-70b-v1 model with ChatML format to obtain lemur-70b-chat-v1. You can use lemur-70b-chat-v1 in this way:

<details> <summary>Click me</summary> <p>

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("OpenLemur/lemur-70b-chat-v1")
model = AutoModelForCausalLM.from_pretrained("OpenLemur/lemur-70b-chat-v1", device_map="auto", load_in_8bit=True)

# Text Generation Example
prompt = """<|im_start|>system
You are a helpful, respectful, and honest assistant.
<|im_end|>
<|im_start|>user
What's a lemur's favorite fruit?<|im_end|>
<|im_start|>assistant
"""
input = tokenizer(prompt, return_tensors="pt")
output = model.generate(**input, max_length=50, num_return_sequences=1)
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)

# Code Generation Example
prompt = """<|im_start|>system
Below is an instruction that describes a task. Write a response that appropriately completes the request.
<|im_end|>
<|im_start|>user
Write a Python function to merge two sorted lists into one sorted list without using any built-in sort functions.<|im_end|>
<|im_start|>assistant
"""
input = tokenizer(prompt, return_tensors="pt")
output = model.generate(**input, max_length=200, num_return_sequences=1)
generated_code = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_code)

</p> </details>

Training

Evaluation

We evaluated Lemur across:

8 language and code datasets like MMLU, BBH, GSM8K, HumanEval, and Spider to validate balanced capabilities
13 interactive agent datasets to test skills like tool usage, adapting to feedback from environments or humans, and exploring partially observable digital or physical environments.

Foundational Abilities

We build the evaluation suite based on open-instruct. We will keep updating more tasks and models.

Currently, we support the following tasks:

[✅] MMLU
[✅] BBH
[✅] GSM8K
[✅] HumanEval
[✅] MBPP
[🚧] Spider
[🚧] MultiPL-E
[🚧] DS-1000
[🚧] ...

Interactive Agent Skills

We use the evaluation frameworks provided by MINT, InterCode, and WebArena to evaluate interactive agent skills.

Deploy

We use vLLM to serve the Lemur model. However, the official FastChat codebase does not yet support Lemur-Chat. Therefore, we provide a docker to serve vLLM for Lemur. Please refer to vllm_lemur.sh for more detailed information.

bash scripts/deploy/vllm_lemur.sh

MINT

We fork MINT codebase to share the configs we used. Please refer to this config folder for more details. Please run vllm with vllm_lemur.sh script.

WebArena

We fork WebArena codebase to enable vLLM evaluation. To run the evaluation on WebArena, please refer to our forked WebArena codebase.

InterCode

We fork InterCode codebase and do modifications to enable Lemur evaluation. Please refer to this script folder for more details. Please run text-generation-inference with tgi_lemur.sh script.

Citation

If you find our work helpful, please cite us:

@misc{xu2023lemur,
      title={Lemur: Harmonizing Natural Language and Code for Language Agents}, 
      author={Yiheng Xu and Hongjin Su and Chen Xing and Boyu Mi and Qian Liu and Weijia Shi and Binyuan Hui and Fan Zhou and Yitao Liu and Tianbao Xie and Zhoujun Cheng and Siheng Zhao and Lingpeng Kong and Bailin Wang and Caiming Xiong and Tao Yu},
      year={2023},
      eprint={2310.06830},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Acknowledgements

The Lemur project is an open collaborative research effort between XLang Lab and Salesforce Research. We thank the following institutions for their gift support: