<div align="center">

# 🐊 Large Language Models as Instructors: A Study on Multilingual Clinical Entity Extraction

<a href="https://pytorch.org/get-started/locally/"><img alt="PyTorch" src="https://img.shields.io/badge/PyTorch-ee4c2c?logo=pytorch&logoColor=white"></a> <a href="https://pytorchlightning.ai/"><img alt="Lightning" src="https://img.shields.io/badge/-Lightning-792ee5?logo=pytorchlightning&logoColor=white"></a> <a href="https://hydra.cc/"><img alt="Config: Hydra" src="https://img.shields.io/badge/Config-Hydra-89b8cd"></a> <a href="https://github.com/ashleve/lightning-hydra-template"><img alt="Template" src="https://img.shields.io/badge/-Lightning--Hydra--Template-017F2F?style=flat&logo=github&labelColor=gray"></a><br> Paper Conference

</div>

## 👁️ Description

This repository contains the codebase for our weak supervision experiments on the E3C dataset, annotated with InstructGPT-3 and with a dictionary-based approach.

Using the E3C dataset, we compare models trained on each annotation source across all available languages, in both monolingual and multilingual settings.
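For intuition, here is a minimal sketch of the dictionary-based weak labeling idea. The term list and the `weak_label` helper are hypothetical illustrations, not the project's actual implementation:

```python
import re

# Hypothetical term list: the real dictionary is much larger and multilingual.
CLINICAL_TERMS = {"hypertension", "diabetes", "myocardial infarction"}

def weak_label(text: str) -> list[tuple[int, int, str]]:
    """Return (start, end, surface form) spans matched against the dictionary."""
    spans = []
    for term in CLINICAL_TERMS:
        for match in re.finditer(re.escape(term), text, flags=re.IGNORECASE):
            spans.append((match.start(), match.end(), match.group(0)))
    return sorted(spans)

print(weak_label("Patient has a history of diabetes and hypertension."))
# [(25, 33, 'diabetes'), (38, 50, 'hypertension')]
```

Spans produced this way (or by prompting InstructGPT-3) serve as weak "silver" labels for training the entity extraction models.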

## 🚀 Quick start

Install the dependencies with [Poetry](https://python-poetry.org/):

```bash
poetry install
```
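Once installed, commands can be executed through Poetry without manually activating the virtual environment; for example, assuming Hydra's standard `--help` flag:

```bash
# Inspect the training entry point inside the Poetry-managed environment
poetry run python weak_supervision/train.py --help
```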

Train the model with the default configuration:

```bash
python weak_supervision/train.py
```

Train the model with a chosen experiment configuration from `configs/experiment/`:

```bash
python weak_supervision/train.py experiment={experiment_name}
```
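For reference, an experiment file under `configs/experiment/` typically looks like the sketch below; the file name, config group names, and values here are hypothetical, so consult the actual files for the real options:

```yaml
# configs/experiment/example.yaml (hypothetical)
# @package _global_

defaults:
  - override /data: e3c                 # hypothetical data config
  - override /model: token_classifier   # hypothetical model config

trainer:
  max_epochs: 20

data:
  batch_size: 64
```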

You can override any parameter from the command line, for example:

```bash
python weak_supervision/train.py trainer.max_epochs=20 data.batch_size=64
```
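Hydra can also sweep over several values in one go with its `--multirun` flag; the parameter values below are illustrative:

```bash
# Launch one training run per combination of the listed values
python weak_supervision/train.py --multirun trainer.max_epochs=10,20 data.batch_size=32,64
```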

To run the project inside a Docker container:

```bash
# Build the image
docker build -t weak_supervision .

# Mount the project, forward the Weights & Biases key, and expose all GPUs
docker run -v $(pwd):/workspace/project -e WANDB_API_KEY=$WANDB_API_KEY --gpus all -it --rm weak_supervision zsh
```

## ⚗️ Experiments

Below is a description of each experiment defined in the Makefile; the corresponding Hydra configuration for each one lives in `configs/experiment/`: