DISCONTINUATION OF PROJECT

This project will no longer be maintained by Intel. Intel has ceased development and contributions including, but not limited to, maintenance, bug fixes, new releases, or updates, to this project. Intel no longer accepts patches to this project.

MetricsDiscoveryHelper

MetricsDiscoveryHelper is a runtime library that helps configure and collect Intel GPU metrics using the Intel® Metrics Discovery API. It is intended to be a fairly thin layer over that API.

Please see CONTRIBUTING for information on how to request features, report issues, or contribute code changes.

License

Copyright 2018 Intel Corporation

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

MetricsDiscovery API

Intel® Metrics Discovery API is the Intel graphics driver's API providing access to GPU architecture properties, metrics, and override controls. For more information on Intel GPU architecture, which is useful background for understanding the metrics, refer to Intel's Graphics Developer Guides and/or Intel's Programmer's Reference Manuals.

See metrics_discovery_helper.h for documentation on how to use this library, as well as the following samples:

| Example | Description |
| --- | --- |
| samples/list_metrics | Enumerate all global symbols, overrides, concurrent groups, metric sets, metrics, and information available on the system. |
| samples/periodic_sample | Demonstrates sampling metrics at a periodic interval. |
| samples/range_sample_dx11 | Demonstrates sampling metrics around a range of DX11 commands. |

Metrics are organized into a hierarchy of metric sets within concurrent groups. Within each concurrent group, only one metric set can be active at a time. Metric sets from different concurrent groups, however, can be active simultaneously, one per group.
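The activation rule above can be illustrated with a small stand-alone model. This is only a sketch: `MetricSet`, `ConcurrentGroup`, and `ActiveSets` are hypothetical types invented for this example and are not part of the MetricsDiscovery API or of metrics_discovery_helper.h. It models nothing beyond the "one active metric set per concurrent group" constraint.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical model types, for illustration only. These are NOT the
// MetricsDiscovery or MetricsDiscoveryHelper API.
struct MetricSet {
    std::string name;
    std::vector<std::string> metrics;  // metric symbol names
};

struct ConcurrentGroup {
    std::string name;
    std::vector<MetricSet> metricSets;
};

// Tracks which metric set (by index) is active in each concurrent group.
class ActiveSets {
public:
    // Activates group.metricSets[setIndex]. Returns false if the index is
    // out of range or if the group already has an active metric set.
    bool Activate(const ConcurrentGroup& group, std::size_t setIndex) {
        if (setIndex >= group.metricSets.size()) return false;
        // map::emplace does not overwrite an existing entry, so a second
        // activation in the same group is rejected.
        return active_.emplace(group.name, setIndex).second;
    }

    // Deactivates whichever metric set is active in the group, if any.
    void Deactivate(const ConcurrentGroup& group) { active_.erase(group.name); }

    // Returns true if group.metricSets[setIndex] is currently active.
    bool IsActive(const ConcurrentGroup& group, std::size_t setIndex) const {
        std::map<std::string, std::size_t>::const_iterator it = active_.find(group.name);
        return it != active_.end() && it->second == setIndex;
    }

private:
    std::map<std::string, std::size_t> active_;  // group name -> active set index
};
```

In this model, activating a second metric set in a group fails until the first is deactivated, while activating one set in each of two different groups succeeds.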

Individual metrics are architecture-specific and may not be available on other GPUs. The following are example metrics from the "RenderBasic" metric set in the "OA" concurrent group available on a Skylake (Gen9) GT2 GPU. There are more metrics available in other metric sets; use the list_metrics sample to list all the metrics available on your system.

| Metric SymbolName | Description |
| --- | --- |
| GpuTime | Time elapsed on the GPU during the measurement. |
| GpuCoreClocks | The total number of GPU core clocks elapsed during the measurement. |
| AvgGpuCoreFrequencyMHz | Average GPU Core Frequency in the measurement. |
| VsThreads | The total number of vertex shader hardware threads dispatched. |
| HsThreads | The total number of hull shader hardware threads dispatched. |
| DsThreads | The total number of domain shader hardware threads dispatched. |
| GsThreads | The total number of geometry shader hardware threads dispatched. |
| PsThreads | The total number of pixel shader hardware threads dispatched. |
| CsThreads | The total number of compute shader hardware threads dispatched. |
| GpuBusy | The percentage of time in which the GPU has been processing GPU commands. |
| EuActive | The percentage of time in which the Execution Units were actively processing. |
| EuStall | The percentage of time in which the Execution Units were stalled. |
| EuFpuBothActive | The percentage of time in which both EU FPU pipelines were actively processing. |
| VsFpu0Active | The percentage of time in which the EU FPU0 pipeline was actively processing a vertex shader instruction. |
| VsFpu1Active | The percentage of time in which the EU FPU1 pipeline was actively processing a vertex shader instruction. |
| VsSendActive | The percentage of time in which the EU send pipeline was actively processing a vertex shader instruction. |
| PsFpu0Active | The percentage of time in which the EU FPU0 pipeline was actively processing a pixel shader instruction. |
| PsFpu1Active | The percentage of time in which the EU FPU1 pipeline was actively processing a pixel shader instruction. |
| PsSendActive | The percentage of time in which the EU send pipeline was actively processing a pixel shader instruction. |
| PsEuBothFpuActive | The percentage of time in which pixel shaders were actively processed on both FPUs. |
| Sampler0Busy | The percentage of time in which Sampler 0 has been processing EU requests. |
| Sampler1Busy | The percentage of time in which Sampler 1 has been processing EU requests. |
| SamplersBusy | The percentage of time in which samplers have been processing EU requests. |
| Sampler0Bottleneck | The percentage of time in which Sampler 0 has been slowing down the pipe when processing EU requests. |
| Sampler1Bottleneck | The percentage of time in which Sampler 1 has been slowing down the pipe when processing EU requests. |
| RasterizedPixels | The total number of rasterized pixels. |
| HiDepthTestFails | The total number of pixels dropped on early hierarchical depth test. |
| EarlyDepthTestFails | The total number of pixels dropped on early depth test. |
| SamplesKilledInPs | The total number of samples or pixels dropped in pixel shaders. |
| PixelsFailingPostPsTests | The total number of pixels dropped on post-PS alpha, stencil, or depth tests. |
| SamplesWritten | The total number of samples or pixels written to all render targets. |
| SamplesBlended | The total number of blended samples or pixels written to all render targets. |
| SamplerTexels | The total number of texels seen on input (with 2x2 accuracy) in all sampler units. |
| SamplerTexelMisses | The total number of texel lookups (with 2x2 accuracy) that missed the L1 sampler cache. |
| SamplerL1Misses | The total number of sampler cache misses in all LODs in all sampler units. |
| SlmBytesRead | The total number of GPU memory bytes read from shared local memory. |
| SlmBytesWritten | The total number of GPU memory bytes written into shared local memory. |
| ShaderMemoryAccesses | The total number of shader memory accesses to L3. |
| ShaderAtomics | The total number of shader atomic memory accesses. |
| L3Lookups | The total number of L3 cache lookup accesses w/o IC. |
| L3Misses | The total number of L3 misses. |
| L3SamplerThroughput | The total number of GPU memory bytes transferred between samplers and L3 caches. |
| L3ShaderThroughput | The total number of GPU memory bytes transferred between shaders and L3 caches w/o URB. |
| ShaderBarriers | The total number of shader barrier messages. |
| GtiVfThroughput | The total number of GPU memory bytes transferred between 3D Pipeline (Command Dispatch, Input Assembly and Stream Output) and GTI. |
| GtiDepthThroughput | The total number of GPU memory bytes transferred between depth caches and GTI. |
| GtiRccThroughput | The total number of GPU memory bytes transferred between render color caches and GTI. |
| GtiL3Throughput | The total number of GPU memory bytes transferred between L3 caches and GTI. |
| GtiHdcLookupsThroughput | The total number of GPU memory bytes transferred between GTI and HDC, when HDC is doing TLB lookups. |
| GtiReadThroughput | The total number of GPU memory bytes read from GTI. |
| GtiWriteThroughput | The total number of GPU memory bytes written to GTI. |
| SamplerBottleneck | The percentage of time in which samplers have been slowing down the pipe when processing EU requests. |
| QueryBeginTime | The measurement begin time. |
| CoreFrequencyMHz | The last GPU core (unslice) frequency in the measurement. |
| EuSliceFrequencyMHz | The last GPU Execution Unit slice frequency in the measurement. |
| ReportReason | The reason for the report. |
| ContextId | The context tag in which the report was taken. |
| CoreFrequencyChanged | The flag indicating that the GPU core frequency has changed. |
| QuerySplitOccurred | The flag indicating that the query was split during execution on the GPU. |
| ReportId | Query report identification number. |
| ReportsCount | The number of available query reports. |
| OverrunOccured | The flag indicating that the OA buffer has been overrun. |
| SampleSequence | Number identifying report sequence. |
| SamplePhase | Buffer execution phase. |
| GpuNode | GPU node buffer was sent to. |
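
Several of the counters above are raw totals that are often easier to read as ratios. The following sketch shows one way to turn them into percentages. `RenderBasicCounters` and the helper functions are hypothetical names introduced for this example, not part of the library, and the code assumes the counter values have already been read from a report by some other means. The L3 ratio is only approximate, since L3Lookups is described as excluding IC accesses while L3Misses carries no such qualifier.

```cpp
#include <cstdint>

// Returns numerator / denominator as a percentage, or 0 when the
// denominator is zero (for example, when nothing was measured).
inline double PercentOf(std::uint64_t numerator, std::uint64_t denominator) {
    if (denominator == 0) return 0.0;
    return 100.0 * static_cast<double>(numerator) / static_cast<double>(denominator);
}

// Hypothetical container for a few counter values taken from one report.
struct RenderBasicCounters {
    std::uint64_t samplerTexels;       // SamplerTexels
    std::uint64_t samplerTexelMisses;  // SamplerTexelMisses
    std::uint64_t l3Lookups;           // L3Lookups
    std::uint64_t l3Misses;            // L3Misses
};

// Share of sampled texels whose lookup missed the L1 sampler cache.
inline double SamplerTexelMissPercent(const RenderBasicCounters& c) {
    return PercentOf(c.samplerTexelMisses, c.samplerTexels);
}

// Approximate share of L3 lookups that missed (see the caveat above).
inline double L3MissPercent(const RenderBasicCounters& c) {
    return PercentOf(c.l3Misses, c.l3Lookups);
}
```

Guarding against a zero denominator matters in practice because a measurement that covers no sampler or L3 activity would otherwise divide by zero.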