databricks-ml-examples

databricks/databricks-ml-examples is a repository of machine learning examples for the Databricks platform.

Currently this repository contains:

SOTA LLM examples

Databricks works with thousands of customers to build generative AI applications. While you can use Databricks to work with any generative AI model, including commercial and research models, the table below lists our current model recommendations for popular use cases. Note: The table only lists open source models that are free for commercial use.

| Use case | Quality-optimized | Balanced | Speed-optimized |
| --- | --- | --- | --- |
| Text generation following instructions | Mixtral-8x7B-Instruct-v0.1 <br><br> Llama-2-70b-chat-hf | mistral-7b <br><br> MPT-7B-Instruct <br> MPT-7B-8k-Instruct <br><br> Llama-2-7b-chat-hf <br> Llama-2-13b-chat-hf | phi-2 |
| Text embeddings (English only) | e5-mistral-7b-instruct (7B) | bge-large-en-v1.5 (0.3B) <br> e5-large-v2 (0.3B) | bge-base-en-v1.5 (0.1B) <br> e5-base-v2 (0.1B) |
| Transcription (speech to text) | | whisper-large-v2 (1.6B) <br> whisper-medium (0.8B) | |
| Image generation | | stable-diffusion-xl | |
| Code generation | CodeLlama-70b-hf <br> CodeLlama-70b-Instruct-hf <br> CodeLlama-70b-Python-hf (Python optimized) <br> CodeLlama-34b-hf <br> CodeLlama-34b-Instruct-hf <br> CodeLlama-34b-Python-hf (Python optimized) | CodeLlama-13b-hf <br> CodeLlama-13b-Instruct-hf <br> CodeLlama-13b-Python-hf (Python optimized) <br> CodeLlama-7b-hf <br> CodeLlama-7b-Instruct-hf <br> CodeLlama-7b-Python-hf (Python optimized) | |

Model Evaluation Leaderboard

Text generation models

The model evaluation results presented below are measured with the Mosaic Eval Gauntlet framework. This framework comprises a series of tasks specifically designed to assess the performance of language models, including widely adopted benchmarks such as MMLU, Big-Bench, HellaSwag, and more.

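| Model Name | Core Average | World Knowledge | Commonsense Reasoning | Language Understanding | Symbolic Problem Solving | Reading Comprehension |
| --- | --- | --- | --- | --- | --- | --- |
| Mistral-7B-v0.1 | 0.522 | 0.558 | 0.513 | 0.555 | 0.342 | 0.641 |
| falcon-40b | 0.501 | 0.556 | 0.55 | 0.535 | 0.269 | 0.597 |
| falcon-40b-instruct | 0.5 | 0.542 | 0.571 | 0.544 | 0.264 | 0.58 |
| Llama-2-13b-hf | 0.479 | 0.515 | 0.482 | 0.52 | 0.279 | 0.597 |
| Llama-2-13b-chat-hf | 0.476 | 0.522 | 0.512 | 0.514 | 0.271 | 0.559 |
| Mistral-7B-Instruct-v0.1 | 0.469 | 0.48 | 0.502 | 0.492 | 0.266 | 0.604 |
| mpt-30b-instruct | 0.465 | 0.48 | 0.513 | 0.494 | 0.238 | 0.599 |
| mpt-30b | 0.431 | 0.494 | 0.47 | 0.477 | 0.234 | 0.481 |
| Llama-2-7b-chat-hf | 0.42 | 0.476 | 0.447 | 0.478 | 0.221 | 0.478 |
| Llama-2-7b-hf | 0.401 | 0.457 | 0.41 | 0.454 | 0.217 | 0.465 |
| mpt-7b-8k-instruct | 0.36 | 0.363 | 0.41 | 0.405 | 0.165 | 0.458 |
| mpt-7b-instruct | 0.354 | 0.399 | 0.415 | 0.372 | 0.171 | 0.415 |
| mpt-7b-8k | 0.354 | 0.427 | 0.368 | 0.426 | 0.171 | 0.378 |
| falcon-7b | 0.335 | 0.371 | 0.421 | 0.37 | 0.159 | 0.355 |
| mpt-7b | 0.324 | 0.356 | 0.384 | 0.38 | 0.163 | 0.336 |
| falcon-7b-instruct | 0.307 | 0.34 | 0.372 | 0.333 | 0.108 | 0.38 |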
<p align="center"> <img src="https://github.com/databricks/databricks-ml-examples/assets/12763339/acdfb7ce-c233-4ede-884c-4e0b4ce0a4f6" /> </p>
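The Core Average column in the leaderboard appears to be the unweighted mean of the five category scores. This README does not state that, so treat it as an assumption; the rows checked against it do agree to three decimal places. A minimal Python sketch under that assumption, using the Mistral-7B-v0.1 row (the `core_average` helper is hypothetical and not part of the Mosaic Eval Gauntlet):

```python
from statistics import mean

# Category scores for Mistral-7B-v0.1, copied from the leaderboard table.
CATEGORY_SCORES = {
    "World Knowledge": 0.558,
    "Commonsense Reasoning": 0.513,
    "Language Understanding": 0.555,
    "Symbolic Problem Solving": 0.342,
    "Reading Comprehension": 0.641,
}


def core_average(scores: dict[str, float]) -> float:
    """Return the unweighted mean of the category scores, rounded to 3 places.

    Assumption: every category carries equal weight. The README does not
    document how the Core Average is aggregated.
    """
    return round(mean(scores.values()), 3)


print(core_average(CATEGORY_SCORES))  # 0.522, matching the table's Core Average
```

Under this reading a model's rank is driven equally by each category, so a weak Symbolic Problem Solving score pulls the Core Average down as much as a weak Reading Comprehension score would.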

Other examples: