SeaKR
Getting Started
Install Environment
conda create -n seakr python=3.10
conda activate seakr
pip install beir==1.0.1 spacy==3.7.2 aiofiles tenacity
python -m spacy download en_core_web_sm
We modify vLLM to obtain the uncertainty measures.
cd vllm_uncertainty
pip install -e .
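To make sure the patched package is the one actually being imported, a quick check can help. This is only a minimal sketch; it assumes the fork keeps the upstream vllm module name.

# Sanity check: the editable install from vllm_uncertainty should be the vllm that gets imported.
import vllm
print(vllm.__version__)  # version string of the patched build
print(vllm.__file__)     # should point into your vllm_uncertainty checkout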
Prepare Retriever
Following DRAGIN, we use the Wikipedia dump and Elasticsearch to build the retriever.
Download Wikipedia dump
mkdir -p data/dpr
wget -O data/dpr/psgs_w100.tsv.gz https://dl.fbaipublicfiles.com/dpr/wikipedia_split/psgs_w100.tsv.gz
pushd data/dpr
gzip -d psgs_w100.tsv.gz
popd
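The dump is a tab-separated file with id, text, and title columns. A minimal sketch to peek at the first few passages:

# Peek at the first few passages of the DPR Wikipedia dump (tab-separated: id, text, title).
import csv

with open("data/dpr/psgs_w100.tsv", newline="", encoding="utf-8") as f:
    reader = csv.DictReader(f, delimiter="\t")
    for i, row in enumerate(reader):
        print(row["id"], row["title"], row["text"][:80])
        if i == 2:
            break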
Run Elasticsearch service
cd data
wget -O elasticsearch-7.17.9.tar.gz https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.17.9-linux-x86_64.tar.gz # download Elasticsearch
tar zxvf elasticsearch-7.17.9.tar.gz
rm elasticsearch-7.17.9.tar.gz
cd elasticsearch-7.17.9
nohup bin/elasticsearch & # run Elasticsearch in background
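Before building the index, you can confirm the service is reachable. The sketch below assumes the Elasticsearch 7.x defaults (port 9200, security disabled); adjust the URL if your configuration differs.

# Check that the local Elasticsearch service is up (7.x defaults: port 9200, no auth).
import requests

resp = requests.get("http://localhost:9200")
print(resp.json()["version"]["number"])  # should print 7.17.9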
Build the index
python build_wiki_index.py --data_path $YOUR_WIKIPEDIA_TSV_PATH --index_name wiki --port $YOUR_ELASTIC_SERVICE_PORT
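Once indexing finishes, you can sanity-check the index with the elasticsearch Python client (pulled in by beir; otherwise pip install elasticsearch). The txt field name below is an assumption, so check build_wiki_index.py for the actual mapping.

# Sanity-check the index: count documents and run a test BM25 query.
# The "txt" field name is an assumption -- check build_wiki_index.py for the actual mapping.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")
print(es.count(index="wiki")["count"])  # roughly 21M passages for the full dump
hits = es.search(index="wiki", body={"query": {"match": {"txt": "Eiffel Tower"}}}, size=3)
for hit in hits["hits"]["hits"]:
    print(hit["_score"], list(hit["_source"].keys()))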
Run SeaKR on Multihop QA
For the multihop QA datasets, we use the same files as DRAGIN. We provide a packed archive of the multihop data here: multihop_data.zip. Download and unzip it into the data/multihop_data folder.
We use an asynchronous reasoning engine to accelerate multi-hop reasoning; a sketch of the general pattern follows below.
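The snippet below is not the engine shipped in this repo, only a generic illustration of the pattern: each question is answered by its own coroutine and asyncio.gather runs them concurrently under a bounded semaphore. answer_question is a hypothetical stand-in for one SeaKR reasoning loop.

# Generic asyncio pattern for running many reasoning chains concurrently.
# answer_question is a hypothetical coroutine standing in for one SeaKR reasoning loop.
import asyncio

async def answer_question(question: str) -> str:
    await asyncio.sleep(0.1)  # placeholder for async LLM and retriever calls
    return f"answer to: {question}"

async def run_all(questions):
    # Bound concurrency so the LLM server and Elasticsearch are not overwhelmed.
    sem = asyncio.Semaphore(16)

    async def bounded(q):
        async with sem:
            return await answer_question(q)

    return await asyncio.gather(*(bounded(q) for q in questions))

answers = asyncio.run(run_all(["q1", "q2", "q3"]))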
2WikiHop
python main_multihop.py \
--n_shot 10 \
--retriever_port $YOUR_ELASTIC_SERVICE_PORT \
--dataset_name twowikihop \
--eigen_threshold -6.0 \
--save_dir "outputs/twowikihop" \
--model_name_or_path $YOUR_MODEL_CHECKPOINT_PATH \
--served_model_name llama2-7b-chat \
--max_reasoning_steps 7 \
--max_docs 5
HotpotQA
python main_multihop.py \
--n_shot 10 \
--retriever_port $YOUR_ELASTIC_SERVICE_PORT \
--dataset_name hotpotqa \
--eigen_threshold -6.0 \
--save_dir "outputs/hotpotqa" \
--model_name_or_path $YOUR_MODEL_CHECKPOINT_PATH \
--served_model_name llama2-7b-chat \
--max_reasoning_steps 7 \
--max_docs 5
IIRC
python main_multihop.py \
--n_shot 10 \
--retriever_port $YOUR_ELASTIC_SERVICE_PORT \
--dataset_name iirc \
--eigen_threshold -6.0 \
--save_dir "outputs/iirc" \
--model_name_or_path $YOUR_MODEL_CHECKPOINT_PATH \
--served_model_name llama2-7b-chat \
--max_reasoning_steps 7 \
--max_docs 5
Evaluate
We provide a Jupyter notebook, eval_multihop.ipynb, for evaluation. You only need to replace the output JSON Lines file name with your own output path.
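If you prefer a script over the notebook, here is a minimal sketch of the usual EM/F1 metrics computed over a JSON Lines output. The prediction and gold_answers field names and the file path are assumptions; match them to your own output schema.

# Minimal EM/F1 computation over a JSON Lines output file.
# The "prediction" and "gold_answers" field names are assumptions -- adapt them to your output schema.
import json, re, string
from collections import Counter

def normalize(s):
    s = s.lower()
    s = "".join(ch for ch in s if ch not in string.punctuation)
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())

def f1(pred, gold):
    p, g = normalize(pred).split(), normalize(gold).split()
    overlap = sum((Counter(p) & Counter(g)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(p), overlap / len(g)
    return 2 * precision * recall / (precision + recall)

em_scores, f1_scores = [], []
with open("outputs/hotpotqa/output.jsonl") as f:  # hypothetical path
    for line in f:
        ex = json.loads(line)
        pred, golds = ex["prediction"], ex["gold_answers"]
        em_scores.append(max(float(normalize(pred) == normalize(g)) for g in golds))
        f1_scores.append(max(f1(pred, g) for g in golds))

print(f"EM={sum(em_scores)/len(em_scores):.4f}  F1={sum(f1_scores)/len(f1_scores):.4f}")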
Run SeaKR on Single QA
The original files are from DPR. We provide a packed version containing the top-10 retrieved documents, singlehop_data.zip. Download and unzip it into the data folder.
python main_simpleqa.py \
--dataset_name tq \
--model_name_or_path $YOUR_MODEL_CHECKPOINT_PATH \
--selected_intermediate_layer 15 \
--output_dir $OUTPUT_DIR
You can evaluate the output with the eval_singlehop.ipynb notebook.