Multi-Candidate Speculative Decoding

Code Release

See here.

Data Release

For the Alpaca dataset, we use exactly the same source as SpecInfer.

For the WMT dataset, we follow the SpecInfer procedure and randomly sample 1,000 examples from the test set. We wrap each source sentence in the following template:

Translate the input English sentence into German.
Input: {source sentence}
Output: 
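
The sampling and wrapping steps above can be sketched roughly as follows. This is a minimal illustration, not the released preprocessing script: the function name `build_wmt_prompts`, the `seed` value, and the exact whitespace after `Output:` are assumptions made for the example, and loading the WMT test set itself is left out.

```python
import random

# Prompt template copied from the text above. The single trailing space
# after "Output:" is an assumption about the exact whitespace.
TEMPLATE = (
    "Translate the input English sentence into German.\n"
    "Input: {source}\n"
    "Output: "
)


def build_wmt_prompts(source_sentences, num_samples=1000, seed=0):
    """Sample sentences without replacement and wrap each in the template.

    `seed` is a hypothetical value used only to make the sketch reproducible;
    the original sampling seed is not stated.
    """
    rng = random.Random(seed)
    sentences = list(source_sentences)
    # Never ask for more samples than there are sentences.
    k = min(num_samples, len(sentences))
    sampled = rng.sample(sentences, k)
    return [TEMPLATE.format(source=sentence) for sentence in sampled]


if __name__ == "__main__":
    # Placeholder sentences standing in for the English side of the test set.
    demo_sources = [f"Sentence {i}." for i in range(5000)]
    prompts = build_wmt_prompts(demo_sources)
    print(len(prompts))
    print(prompts[0])
```

Using a dedicated `random.Random` instance keeps the sampling independent of any global random state, so the same seed always yields the same subset.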

Model Release

We release our fine-tuned draft models on Hugging Face; see Vicuna-68M and Vicuna-160M. They are fine-tuned from LLaMA-68M and LLaMA-160M, respectively, on ShareGPT data. The training setup follows FastChat.