MLPerf™ Inference Benchmark Suite

MLPerf Inference is a benchmark suite for measuring how fast systems can run models in a variety of deployment scenarios.

Please see the MLPerf Inference benchmark paper for a detailed description of the benchmarks along with the motivation and guiding principles behind the benchmark suite. If you use any part of this benchmark (e.g., reference implementations, submissions, etc.), please cite the following:

@misc{reddi2019mlperf,
    title={MLPerf Inference Benchmark},
    author={Vijay Janapa Reddi and Christine Cheng and David Kanter and Peter Mattson and Guenther Schmuelling and Carole-Jean Wu and Brian Anderson and Maximilien Breughe and Mark Charlebois and William Chou and Ramesh Chukka and Cody Coleman and Sam Davis and Pan Deng and Greg Diamos and Jared Duke and Dave Fick and J. Scott Gardner and Itay Hubara and Sachin Idgunji and Thomas B. Jablin and Jeff Jiao and Tom St. John and Pankaj Kanwar and David Lee and Jeffery Liao and Anton Lokhmotov and Francisco Massa and Peng Meng and Paulius Micikevicius and Colin Osborne and Gennady Pekhimenko and Arun Tejusve Raghunath Rajan and Dilip Sequeira and Ashish Sirasao and Fei Sun and Hanlin Tang and Michael Thomson and Frank Wei and Ephrem Wu and Lingjie Xu and Koichi Yamada and Bing Yu and George Yuan and Aaron Zhong and Peizhao Zhang and Yuchen Zhou},
    year={2019},
    eprint={1911.02549},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

The MLPerf Inference documentation website includes automated commands to run the MLPerf Inference benchmarks using different implementations.

MLPerf Inference v4.1 (submission deadline July 26, 2024)

For submissions, please use the master branch at any commit since the 4.1 seed release; the latest commit is preferred. The v4.1 tag will be created from the master branch after the results are published.

For power submissions, please use SPEC PTD 1.10 (needs special access) and any commit of the power-dev repository after the code freeze.

| model | reference app | framework | dataset | category |
| --- | --- | --- | --- | --- |
| resnet50-v1.5 | vision/classification_and_detection | tensorflow, onnx, tvm, ncnn | imagenet2012 | edge,datacenter |
| retinanet 800x800 | vision/classification_and_detection | pytorch, onnx | openimages resized to 800x800 | edge,datacenter |
| bert | language/bert | tensorflow, pytorch, onnx | squad-1.1 | edge,datacenter |
| dlrm-v2 | recommendation/dlrm_v2 | pytorch | Multihot Criteo Terabyte | datacenter |
| 3d-unet | vision/medical_imaging/3d-unet-kits19 | pytorch, tensorflow, onnx | KiTS19 | edge,datacenter |
| gpt-j | language/gpt-j | pytorch | CNN-Daily Mail | edge,datacenter |
| stable-diffusion-xl | text_to_image | pytorch | COCO 2014 | edge,datacenter |
| llama2-70b | language/llama2-70b | pytorch | OpenOrca | datacenter |
| mixtral-8x7b | language/mixtral-8x7b | pytorch | OpenOrca, MBXP, GSM8K | datacenter |

MLPerf Inference v4.0 (submission February 23, 2024)

A one-week extension is allowed for llama2-70b submissions only. For submissions, please use the master branch at any commit since the 4.0 seed release; the latest commit is preferred. The v4.0 tag will be created from the master branch after the results are published.

For power submissions, please use SPEC PTD 1.10 (needs special access) and any commit of the power-dev repository after the code freeze.

| model | reference app | framework | dataset | category |
| --- | --- | --- | --- | --- |
| resnet50-v1.5 | vision/classification_and_detection | tensorflow, onnx, tvm, ncnn | imagenet2012 | edge,datacenter |
| retinanet 800x800 | vision/classification_and_detection | pytorch, onnx | openimages resized to 800x800 | edge,datacenter |
| bert | language/bert | tensorflow, pytorch, onnx | squad-1.1 | edge,datacenter |
| dlrm-v2 | recommendation/dlrm_v2 | pytorch | Multihot Criteo Terabyte | datacenter |
| 3d-unet | vision/medical_imaging/3d-unet-kits19 | pytorch, tensorflow, onnx | KiTS19 | edge,datacenter |
| rnnt | speech_recognition/rnnt | pytorch | OpenSLR LibriSpeech Corpus | edge,datacenter |
| gpt-j | language/gpt-j | pytorch | CNN-Daily Mail | edge,datacenter |
| stable-diffusion-xl | text_to_image | pytorch | COCO 2014 | edge,datacenter |
| llama2-70b | language/llama2-70b | pytorch | OpenOrca | datacenter |

MLPerf Inference v3.1 (submission August 18, 2023)

Please use the v3.1 tag (git checkout v3.1) if you would like to reproduce the v3.1 results.

For reproducing power submissions, please use the master branch of the MLCommons power-dev repository and check out commit e9e16b1299ef61a2a5d8b9abf5d759309293c440.

You can see the individual README files in the benchmark task folders for more details regarding the benchmarks. For reproducing the submitted results please see the README files under the respective submitter folders in the inference v3.1 results repository.

| model | reference app | framework | dataset | category |
| --- | --- | --- | --- | --- |
| resnet50-v1.5 | vision/classification_and_detection | tensorflow, onnx, tvm, ncnn | imagenet2012 | edge,datacenter |
| retinanet 800x800 | vision/classification_and_detection | pytorch, onnx | openimages resized to 800x800 | edge,datacenter |
| bert | language/bert | tensorflow, pytorch, onnx | squad-1.1 | edge,datacenter |
| dlrm-v2 | recommendation/dlrm_v2 | pytorch | Multihot Criteo Terabyte | datacenter |
| 3d-unet | vision/medical_imaging/3d-unet-kits19 | pytorch, tensorflow, onnx | KiTS19 | edge,datacenter |
| rnnt | speech_recognition/rnnt | pytorch | OpenSLR LibriSpeech Corpus | edge,datacenter |
| gpt-j | language/gpt-j | pytorch | CNN-Daily Mail | edge,datacenter |

MLPerf Inference v3.0 (submission 03/03/2023)

Please use the v3.0 tag (git checkout v3.0) if you would like to reproduce v3.0 results.

You can see the individual Readme files in the reference app for more details.

| model | reference app | framework | dataset | category |
| --- | --- | --- | --- | --- |
| resnet50-v1.5 | vision/classification_and_detection | tensorflow, onnx, tvm | imagenet2012 | edge,datacenter |
| retinanet 800x800 | vision/classification_and_detection | pytorch, onnx | openimages resized to 800x800 | edge,datacenter |
| bert | language/bert | tensorflow, pytorch, onnx | squad-1.1 | edge,datacenter |
| dlrm | recommendation/dlrm | pytorch, tensorflow | Criteo Terabyte | datacenter |
| 3d-unet | vision/medical_imaging/3d-unet-kits19 | pytorch, tensorflow, onnx | KiTS19 | edge,datacenter |
| rnnt | speech_recognition/rnnt | pytorch | OpenSLR LibriSpeech Corpus | edge,datacenter |

MLPerf Inference v2.1 (submission 08/05/2022)

Use the r2.1 branch (git checkout r2.1) if you want to submit or reproduce v2.1 results.

See the individual Readme files in the reference app for details.

| model | reference app | framework | dataset | category |
| --- | --- | --- | --- | --- |
| resnet50-v1.5 | vision/classification_and_detection | tensorflow, onnx | imagenet2012 | edge,datacenter |
| retinanet 800x800 | vision/classification_and_detection | pytorch, onnx | openimages resized to 800x800 | edge,datacenter |
| bert | language/bert | tensorflow, pytorch, onnx | squad-1.1 | edge,datacenter |
| dlrm | recommendation/dlrm | pytorch, tensorflow | Criteo Terabyte | datacenter |
| 3d-unet | vision/medical_imaging/3d-unet-kits19 | pytorch, tensorflow, onnx | KiTS19 | edge,datacenter |
| rnnt | speech_recognition/rnnt | pytorch | OpenSLR LibriSpeech Corpus | edge,datacenter |

MLPerf Inference v2.0 (submission 02/25/2022)

Use the r2.0 branch (git checkout r2.0) if you want to submit or reproduce v2.0 results.

See the individual Readme files in the reference app for details.

| model | reference app | framework | dataset | category |
| --- | --- | --- | --- | --- |
| resnet50-v1.5 | vision/classification_and_detection | tensorflow, onnx | imagenet2012 | edge,datacenter |
| ssd-mobilenet 300x300 | vision/classification_and_detection | tensorflow, pytorch, onnx | coco resized to 300x300 | edge |
| ssd-resnet34 1200x1200 | vision/classification_and_detection | tensorflow, pytorch, onnx | coco resized to 1200x1200 | edge,datacenter |
| bert | language/bert | tensorflow, pytorch, onnx | squad-1.1 | edge,datacenter |
| dlrm | recommendation/dlrm | pytorch, tensorflow | Criteo Terabyte | datacenter |
| 3d-unet | vision/medical_imaging/3d-unet-kits19 | pytorch, tensorflow, onnx | KiTS19 | edge,datacenter |
| rnnt | speech_recognition/rnnt | pytorch | OpenSLR LibriSpeech Corpus | edge,datacenter |

MLPerf Inference v1.1 (submission 08/13/2021)

Use the r1.1 branch (git checkout r1.1) if you want to submit or reproduce v1.1 results.

See the individual Readme files in the reference app for details.

| model | reference app | framework | dataset | category |
| --- | --- | --- | --- | --- |
| resnet50-v1.5 | vision/classification_and_detection | tensorflow, onnx | imagenet2012 | edge,datacenter |
| ssd-mobilenet 300x300 | vision/classification_and_detection | tensorflow, pytorch, onnx | coco resized to 300x300 | edge |
| ssd-resnet34 1200x1200 | vision/classification_and_detection | tensorflow, pytorch, onnx | coco resized to 1200x1200 | edge,datacenter |
| bert | language/bert | tensorflow, pytorch, onnx | squad-1.1 | edge,datacenter |
| dlrm | recommendation/dlrm | pytorch, tensorflow | Criteo Terabyte | datacenter |
| 3d-unet | vision/medical_imaging/3d-unet | pytorch, tensorflow(?), onnx(?) | BraTS 2019 | edge,datacenter |
| rnnt | speech_recognition/rnnt | pytorch | OpenSLR LibriSpeech Corpus | edge,datacenter |

MLPerf Inference v1.0 (submission 03/19/2021)

Use the r1.0 branch (git checkout r1.0) if you want to submit or reproduce v1.0 results.

See the individual Readme files in the reference app for details.

| model | reference app | framework | dataset | category |
| --- | --- | --- | --- | --- |
| resnet50-v1.5 | vision/classification_and_detection | tensorflow, onnx | imagenet2012 | edge,datacenter |
| ssd-mobilenet 300x300 | vision/classification_and_detection | tensorflow, pytorch, onnx | coco resized to 300x300 | edge |
| ssd-resnet34 1200x1200 | vision/classification_and_detection | tensorflow, pytorch, onnx | coco resized to 1200x1200 | edge,datacenter |
| bert | language/bert | tensorflow, pytorch, onnx | squad-1.1 | edge,datacenter |
| dlrm | recommendation/dlrm | pytorch, tensorflow(?) | Criteo Terabyte | datacenter |
| 3d-unet | vision/medical_imaging/3d-unet | pytorch, tensorflow(?), onnx(?) | BraTS 2019 | edge,datacenter |
| rnnt | speech_recognition/rnnt | pytorch | OpenSLR LibriSpeech Corpus | edge,datacenter |

MLPerf Inference v0.7 (submission 9/18/2020)

Use the r0.7 branch (git checkout r0.7) if you want to submit or reproduce v0.7 results.

See the individual Readme files in the reference app for details.

| model | reference app | framework | dataset |
| --- | --- | --- | --- |
| resnet50-v1.5 | vision/classification_and_detection | tensorflow, pytorch, onnx | imagenet2012 |
| ssd-mobilenet 300x300 | vision/classification_and_detection | tensorflow, pytorch, onnx | coco resized to 300x300 |
| ssd-resnet34 1200x1200 | vision/classification_and_detection | tensorflow, pytorch, onnx | coco resized to 1200x1200 |
| bert | language/bert | tensorflow, pytorch, onnx | squad-1.1 |
| dlrm | recommendation/dlrm | pytorch, tensorflow(?), onnx(?) | Criteo Terabyte |
| 3d-unet | vision/medical_imaging/3d-unet | pytorch, tensorflow(?), onnx(?) | BraTS 2019 |
| rnnt | speech_recognition/rnnt | pytorch | OpenSLR LibriSpeech Corpus |

MLPerf Inference v0.5

Use the r0.5 branch (git checkout r0.5) if you want to reproduce v0.5 results.

See the individual Readme files in the reference app for details.

| model | reference app | framework | dataset |
| --- | --- | --- | --- |
| resnet50-v1.5 | v0.5/classification_and_detection | tensorflow, pytorch, onnx | imagenet2012 |
| mobilenet-v1 | v0.5/classification_and_detection | tensorflow, pytorch, onnx | imagenet2012 |
| ssd-mobilenet 300x300 | v0.5/classification_and_detection | tensorflow, pytorch, onnx | coco resized to 300x300 |
| ssd-resnet34 1200x1200 | v0.5/classification_and_detection | tensorflow, pytorch, onnx | coco resized to 1200x1200 |
| gnmt | v0.5/translation/gnmt/ | tensorflow, pytorch | See Readme |
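
The checkout instructions in the sections above follow a pattern: rounds v0.5 through v2.1 name an `r<version>` branch, v3.0 and v3.1 name a `v<version>` tag, and v4.0 and v4.1 point at the master branch because their tags are described as not yet created. The sketch below only transcribes that pattern into a lookup. The function name `git_ref_for_round` and the three set names are hypothetical and not part of the repository, and the mapping should be rechecked against this README once the v4.0 and v4.1 tags exist.

```python
# Hypothetical helper: the rounds below are transcribed from the checkout
# instructions in this README; nothing here is part of the MLPerf code base.

TAGGED_ROUNDS = {"3.0", "3.1"}  # README says "git checkout v3.x"
BRANCH_ROUNDS = {"0.5", "0.7", "1.0", "1.1", "2.0", "2.1"}  # "git checkout r<x>"
MASTER_ROUNDS = {"4.0", "4.1"}  # README says to use master; tag comes later


def git_ref_for_round(version: str) -> str:
    """Return the git ref this README names for an MLPerf Inference round."""
    v = version.lstrip("vr")  # accept "3.1", "v3.1", or "r2.1"
    if v in TAGGED_ROUNDS:
        return f"v{v}"
    if v in BRANCH_ROUNDS:
        return f"r{v}"
    if v in MASTER_ROUNDS:
        return "master"
    raise ValueError(f"round {version!r} is not listed in this README")


if __name__ == "__main__":
    # Print the checkout command for a few rounds.
    for round_name in ("4.1", "3.1", "2.1", "0.5"):
        print(round_name, "->", "git checkout " + git_ref_for_round(round_name))
```

Raising `ValueError` for an unlisted round, rather than guessing a ref, keeps the helper from suggesting a branch or tag that this README never mentions.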