Medplexity
<p align="center"> <a href="https://www.medplexityai.com/">Medplexity explorer</a> • <a href="https://github.com/MaksymPetyak/medplexity-frontend">Frontend GitHub repository</a> • <a href="https://medplexity.substack.com/">Substack</a> </p>

Medplexity is a Python library for evaluating LLMs in medical applications.
<img src="images/medplexity-logo.png" alt="medplexity-logo" width="512px" style="border-radius: 16px;"/>

It is designed to help with the following tasks:
- Evaluating the performance of LLMs on existing medical datasets and benchmarks, such as MedQA and PubMedQA.
- Comparing the performance of different prompts, models, and architectures.
- Exporting evaluation results for visualisation and further analysis.
The goal is to help answer questions like "How much better would GPT-4 perform given a vector database to load certain resources?".
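The three tasks above can be pictured with a small, library-independent sketch. It does not use Medplexity's actual API: the `Question` class, the `evaluate` function, the `always_first` stand-in model, and the toy questions are all hypothetical names made up for illustration. It only shows the general shape of scoring a model on multiple-choice items, comparing two prompts, and exporting the results as JSON.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class Question:
    """A multiple-choice item in the style of MedQA or MedMCQA (toy data, hypothetical class)."""
    text: str
    options: list[str]
    answer: int  # index of the correct option


def evaluate(model: Callable[[str], int], prompt: str, questions: list[Question]) -> dict:
    """Run `model` on every question and return accuracy plus per-item results."""
    rows = []
    for q in questions:
        filled = prompt.format(question=q.text, options="; ".join(q.options))
        predicted = model(filled)
        rows.append({
            "question": q.text,
            "predicted": predicted,
            "correct": predicted == q.answer,
        })
    accuracy = sum(r["correct"] for r in rows) / len(rows)
    return {"prompt": prompt, "accuracy": accuracy, "rows": rows}


def always_first(filled_prompt: str) -> int:
    """Stand-in for a real LLM call (assumption): always picks option 0."""
    return 0


questions = [
    Question("Which vitamin deficiency causes scurvy?", ["Vitamin C", "Vitamin D"], 0),
    Question("Which organ produces insulin?", ["Liver", "Pancreas"], 1),
]

# Two prompt variants to compare against each other.
prompts = [
    "Question: {question}\nOptions: {options}\nAnswer:",
    "You are a medical expert.\nQuestion: {question}\nOptions: {options}\nAnswer:",
]

results = [evaluate(always_first, p, questions) for p in prompts]

# Export as JSON so the results can be visualised or analysed elsewhere.
exported = json.dumps(results, indent=2)
print(exported)
```

Replacing `always_first` with a function that calls a real model, optionally with retrieved context added to the prompt, would turn this into the kind of comparison described above.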
🔧 Quick install
pip install medplexity
📖 Documentation
Documentation can be found here.
Example
See our "Getting Started" notebook for a full example with the MedMCQA dataset.
Contributions
Contributions are welcome! Check out the todos below, and feel free to open a pull request.
Remember to install pre-commit to comply with our standards:
pre-commit install
Feel free to raise any questions on Discord.
Explorer
In addition to the library, we are also building a web app for exploring evaluation results. The explorer is available at medplexityai.com. It is also open source; see the frontend repository.
📜 License
Medplexity is licensed under the MIT License. See the LICENSE file for more details.