<p> <img align="left" width="110" height="120" src="weasel.jpg"> </p> <div align="center">

WeaSEL: Weakly Supervised End-to-end Learning

<a href="https://pytorch.org/get-started/locally/"><img alt="Python" src="https://img.shields.io/badge/-Python 3.7--3.9-blue?style=for-the-badge&logo=python&logoColor=white"></a> <a href="https://pytorch.org/get-started/locally/"><img alt="PyTorch" src="https://img.shields.io/badge/-PyTorch 1.7+-ee4c2c?style=for-the-badge&logo=pytorch&logoColor=white"></a> <a href="https://pytorchlightning.ai/"><img alt="Lightning" src="https://img.shields.io/badge/-Lightning-792ee5?style=for-the-badge&logo=pytorchlightning&logoColor=white"></a> <a href="https://hydra.cc/"><img alt="Config: hydra" src="https://img.shields.io/badge/config-hydra-89b8cd?style=for-the-badge&labelColor=gray"></a>

This is a PyTorch Lightning framework, built on our End-to-End Weak Supervision paper (NeurIPS 2021), that lets you train your favorite neural network for weakly supervised classification<sup>1</sup> from multiple labeling functions (LFs)<sup>2</sup>.

</div>

<sup>1</sup> This includes learning from crowdsourced labels or annotations! <br> <sup>2</sup> LFs are labeling heuristics that output noisy labels for (subsets of) the training data (e.g. crowdworkers or keyword detectors).

If you use this code, please consider citing our work:

End-to-End Weak Supervision
Salva Rühling Cachay, Benedikt Boecking, and Artur Dubrawski
Advances in Neural Information Processing Systems (NeurIPS), 2021
arXiv:2107.02233v3


Getting Started

This library assumes familiarity with (multi-source) weak supervision. If you are new to it, you may want to first learn the basics from, e.g., these overview slides from Stanford or this Snorkel tutorial.
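For readers new to the setting, here is a minimal, library-independent sketch of what labeling functions (LFs) produce. It is not Weasel's API: the function names, the toy spam/ham task, and the use of -1 to mark an abstaining LF are all assumptions made for illustration.

```python
from typing import Callable, List

# Assumed label convention for this sketch only.
ABSTAIN, HAM, SPAM = -1, 0, 1


def lf_contains_link(text: str) -> int:
    """Hypothetical keyword LF: vote SPAM if the text contains a URL."""
    return SPAM if "http" in text.lower() else ABSTAIN


def lf_short_message(text: str) -> int:
    """Hypothetical heuristic LF: vote HAM for very short messages."""
    return HAM if len(text.split()) <= 3 else ABSTAIN


def lf_mentions_prize(text: str) -> int:
    """Hypothetical keyword LF: vote SPAM if the text mentions a prize."""
    return SPAM if "prize" in text.lower() else ABSTAIN


def apply_lfs(texts: List[str], lfs: List[Callable[[str], int]]) -> List[List[int]]:
    """Build the (num_examples x num_LFs) matrix of noisy votes."""
    return [[lf(text) for lf in lfs] for text in texts]


texts = [
    "Claim your prize at http://example.com now",
    "See you soon",
    "The meeting has been moved to Thursday afternoon",
]
label_matrix = apply_lfs(texts, [lf_contains_link, lf_short_message, lf_mentions_prize])
for row in label_matrix:
    print(row)
# [1, -1, 1]
# [-1, 0, -1]
# [-1, -1, -1]
```

Each row holds the votes of all LFs for one example. The last row shows that LFs typically cover only a subset of the data, which is why their outputs are treated as noisy, partial supervision rather than ground truth.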

That being said, have a look at our examples and the notebooks therein, which show how to use Weasel with your own dataset, LF set, or end-model.

Reproducibility

Please have a look at the research code branch, which is written in pure PyTorch.

Installation

<details> <summary><b>1. New environment </b>(recommended, but optional)</summary>
conda create --name weasel python=3.9
conda activate weasel  
</details> <details> <summary><b> 2a: From source</b></summary>
python -m pip install git+https://github.com/autonlab/weasel#egg=weasel[all]
</details> <details> <summary><b> 2b: From source, <a href="https://huggingface.co/transformers/installation.html#editable-install">editable install</a></b></summary>
git clone https://github.com/autonlab/weasel.git
cd weasel
pip install -e .[all]
</details> <details><p> <summary><b>Minimal dependencies</b></summary>

Minimal dependencies, in particular without Hydra, can be installed with:

python -m pip install git+https://github.com/autonlab/weasel

The required environment corresponds to conda env create -f env_gpu_minimal.yml.

If you choose this variant, you won't be able to run some of the examples. You may want to have a look at this notebook, which walks you through using Weasel without Hydra as the config manager.

</p></details>

Note: Weasel is under active development, so some edge cases may not be covered yet. Any feedback is very welcome!
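After installing, a quick sanity check of the environment can save some debugging. The sketch below uses only the Python standard library. The import names are assumptions (in particular `weasel` for this package and `hydra` for the optional Hydra dependency), and the version range is taken from the Python 3.7-3.9 badge at the top of this README.

```python
import importlib.util
import sys

# Assumed import names: "weasel" for this package, "hydra" for the optional
# Hydra dependency that the minimal install leaves out.
REQUIRED_MODULES = ["torch", "pytorch_lightning", "weasel"]
OPTIONAL_MODULES = ["hydra"]


def is_installed(module_name: str) -> bool:
    """Return True if the top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def python_version_supported(version_info=sys.version_info) -> bool:
    """Check against the Python 3.7-3.9 range advertised by the README badge."""
    return (3, 7) <= tuple(version_info[:2]) <= (3, 9)


def environment_report() -> dict:
    """Map each module name to whether it is importable in this environment."""
    return {name: is_installed(name) for name in REQUIRED_MODULES + OPTIONAL_MODULES}


if __name__ == "__main__":
    print("Python version in supported range:", python_version_supported())
    for name, found in environment_report().items():
        kind = "optional" if name in OPTIONAL_MODULES else "required"
        print(f"{name:<18} ({kind}): {'found' if found else 'MISSING'}")
```

With the minimal install, `hydra` is expected to be reported as missing. That is fine as long as you do not run the Hydra-based examples.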

Apply WeaSEL to your own problem

Configuration with Hydra

Optional: This template config will help you get started with your own application. An analogous config is used in this tutorial script, which you may want to check out.

Pre-defined or custom downstream models & Baselines

Please have a look at the detailed instructions in this Readme.
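For orientation, a simple baseline commonly used in weak supervision is a majority vote over the LF outputs. The sketch below is a plain-Python illustration under stated assumptions, not the baseline implementation shipped with this repository: the function name, the -1 abstain marker, and the choice to return `None` on ties are all hypothetical.

```python
from collections import Counter
from typing import List, Optional

ABSTAIN = -1  # assumed abstain marker for this sketch


def majority_vote(votes: List[int]) -> Optional[int]:
    """Return the most frequent non-abstain vote.

    Returns None when every LF abstains or when the top two classes tie,
    so that such examples can be dropped or handled separately.
    """
    counts = Counter(vote for vote in votes if vote != ABSTAIN)
    if not counts:
        return None
    ranked = counts.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # tie between the two most frequent classes
    return ranked[0][0]


# One row per example, one column per LF.
label_matrix = [
    [1, -1, 1],
    [-1, 0, -1],
    [0, 1, -1],
    [-1, -1, -1],
]
pseudo_labels = [majority_vote(row) for row in label_matrix]
print(pseudo_labels)  # [1, 0, None, None]
```

The resulting pseudo-labels could then be used to train any downstream classifier on the examples that received a label. Majority voting weights every LF equally, which is the limitation that learned label models aim to address.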

Using your own dataset and/or labeling heuristics

Please have a look at the detailed instructions in this Readme.

Citation

@article{cachay2021endtoend,
  author={R{\"u}hling Cachay, Salva and Boecking, Benedikt and Dubrawski, Artur},
  journal={Advances in Neural Information Processing Systems}, 
  title={End-to-End Weak Supervision},
  year={2021}
}