NRD SAMPLE

An all-in-one repository containing everything needed to see NRD (NVIDIA Real-time Denoisers) in action. The sample is cross-platform: it is built on NRI (NVIDIA Rendering Interface), which provides support for multiple graphics APIs.

The NRD sample is a playground for high-performance path tracing for games. Some features to highlight:

A NOTE ABOUT THE TRACER

The path tracer in the sample has been designed with performance in mind. It does not use the commonly used solution, which in general looks like this:

// Resources
ByteAddressBuffer g_BindlessBuffers[];
Texture2D g_BindlessTextures[];
StructuredBuffer<InstanceData> g_InstanceData;
StructuredBuffer<GeometryData> g_GeometryData;
StructuredBuffer<MaterialData> g_MaterialData;

// Geometry fetching
uint instanceIndex = rayQuery.InstanceIndex();
uint geometryIndex = rayQuery.GeometryIndex();
uint primitiveIndex = rayQuery.PrimitiveIndex();

InstanceData instanceData = g_InstanceData[ instanceIndex ];
GeometryData geometryData = g_GeometryData[ instanceData.geometryBaseIndex + geometryIndex ];

ByteAddressBuffer indexBuffer = g_BindlessBuffers[ NonUniformResourceIndex( geometryData.indexBufferIndex ) ];
ByteAddressBuffer vertexBuffer = g_BindlessBuffers[ NonUniformResourceIndex( geometryData.vertexBufferIndex ) ];

uint3 indices = indexBuffer.Load3( geometryData.indexOffset + primitiveIndex * INDEX_STRIDE );

float3 p0 = DecodePosition( vertexBuffer.Load3( geometryData.vertexOffset + indices[0] * VERTEX_STRIDE ) );
float3 p1 = DecodePosition( vertexBuffer.Load3( geometryData.vertexOffset + indices[1] * VERTEX_STRIDE ) );
float3 p2 = DecodePosition( vertexBuffer.Load3( geometryData.vertexOffset + indices[2] * VERTEX_STRIDE ) );
float3 p = Interpolate( p0, p1, p2, barycentrics );

float3 n0 = DecodeNormal( vertexBuffer.Load3( geometryData.vertexOffset + offset1 + indices[0] * VERTEX_STRIDE ) );
float3 n1 = DecodeNormal( vertexBuffer.Load3( geometryData.vertexOffset + offset1 + indices[1] * VERTEX_STRIDE ) );
float3 n2 = DecodeNormal( vertexBuffer.Load3( geometryData.vertexOffset + offset1 + indices[2] * VERTEX_STRIDE ) );
float3 n = Interpolate( n0, n1, n2, barycentrics );
n = Rotate( instanceData.transform, n ); // rotate the interpolated normal into world space

float2 uv0 = DecodeUv( vertexBuffer.Load2( geometryData.vertexOffset + offset2 + indices[0] * VERTEX_STRIDE ) );
float2 uv1 = DecodeUv( vertexBuffer.Load2( geometryData.vertexOffset + offset2 + indices[1] * VERTEX_STRIDE ) );
float2 uv2 = DecodeUv( vertexBuffer.Load2( geometryData.vertexOffset + offset2 + indices[2] * VERTEX_STRIDE ) );
float2 uv = Interpolate( uv0, uv1, uv2, barycentrics );

// Material fetching
MaterialData materialData = g_MaterialData[ geometryData.materialIndex ];

Texture2D texture1 = g_BindlessTextures[ NonUniformResourceIndex( materialData.textureIndex1 ) ];
float4 data1 = texture1.SampleLevel( ... );

Texture2D texture2 = g_BindlessTextures[ NonUniformResourceIndex( materialData.textureIndex2 ) ];
float4 data2 = texture2.SampleLevel( ... );

Texture2D texture3 = g_BindlessTextures[ NonUniformResourceIndex( materialData.textureIndex3 ) ];
float4 data3 = texture3.SampleLevel( ... );

To get vertex data we need:

The path tracer uses the following scheme:

// Resources
StructuredBuffer<InstanceData> g_InstanceData;
StructuredBuffer<PrimitiveData> g_PrimitiveData;
Texture2D g_BindlessTextures[];

// Geometry fetching
uint instanceIndex = rayQuery.InstanceIndex();
uint geometryIndex = rayQuery.GeometryIndex();
uint primitiveIndex = rayQuery.PrimitiveIndex();

InstanceData instanceData = g_InstanceData[ instanceIndex + geometryIndex ];
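// PrimitiveIndex() is local to the geometry, so a base offset into the shared buffer is required (the field name "primitiveBaseIndex" is assumed)
PrimitiveData primitiveData = g_PrimitiveData[ instanceData.primitiveBaseIndex + primitiveIndex ];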

float3x3 mObjectToWorld = (float3x3)rayQuery.ObjectToWorld3x4();
if( instanceData.isStatic )
    mObjectToWorld = (float3x3)instanceData.mWorldToWorldPrev;

float3 p = rayOrigin + rayDirection * rayQuery.RayT();

float3 n0 = DecodeNormal( primitiveData.n0 );
float3 n1 = DecodeNormal( primitiveData.n1 );
float3 n2 = DecodeNormal( primitiveData.n2 );
float3 n = Interpolate( n0, n1, n2, barycentrics );
n = Rotate( mObjectToWorld, n ); // rotate the interpolated normal into world space

float2 uv0 = DecodeUv( primitiveData.uv0 );
float2 uv1 = DecodeUv( primitiveData.uv1 );
float2 uv2 = DecodeUv( primitiveData.uv2 );
float2 uv = Interpolate( uv0, uv1, uv2, barycentrics );

// Material fetching
Texture2D texture1 = g_BindlessTextures[ NonUniformResourceIndex( instanceData.textureBaseIndex ) ];
float4 data1 = texture1.SampleLevel( ... );

Texture2D texture2 = g_BindlessTextures[ NonUniformResourceIndex( instanceData.textureBaseIndex + 1 ) ];
float4 data2 = texture2.SampleLevel( ... );

Texture2D texture3 = g_BindlessTextures[ NonUniformResourceIndex( instanceData.textureBaseIndex + 2 ) ];
float4 data3 = texture3.SampleLevel( ... );

To get vertex data we need:

This approach simplifies and accelerates ray tracing, but complicates BVH management. Deleting a BLAS leaves a contiguous region of free elements in g_PrimitiveData, which must be tracked and, ideally, reused later when an object of a suitable size appears. If geometry sizes can be estimated in advance, the approach works well and memory fragmentation can be avoided.
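The bookkeeping of free regions in g_PrimitiveData could be handled on the CPU side with a small range allocator. The following is a minimal C++ sketch under stated assumptions, not the sample's actual code: the class name PrimitiveRangeAllocator, its methods and the first-fit policy are all hypothetical and chosen only for illustration. Offsets and sizes are counted in PrimitiveData elements.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <map>
#include <optional>

// Hypothetical tracker for free regions of a g_PrimitiveData-like buffer.
// All values are element indices and counts, not bytes.
class PrimitiveRangeAllocator
{
public:
    explicit PrimitiveRangeAllocator(uint32_t capacity)
    {
        if (capacity > 0)
            m_Free.emplace(0u, capacity);
    }

    // First-fit search. Returns the base element index of the reserved
    // region, or nothing if no free region is large enough.
    std::optional<uint32_t> Allocate(uint32_t count)
    {
        if (count == 0)
            return std::nullopt;

        for (auto it = m_Free.begin(); it != m_Free.end(); ++it)
        {
            if (it->second < count)
                continue;

            const uint32_t base = it->first;
            const uint32_t rest = it->second - count;
            m_Free.erase(it);
            if (rest > 0)
                m_Free.emplace(base + count, rest); // keep the unused tail free
            return base;
        }
        return std::nullopt;
    }

    // Returns a region (for example after a BLAS has been deleted) and
    // merges it with adjacent free regions. Double frees are not detected.
    void Free(uint32_t base, uint32_t count)
    {
        if (count == 0)
            return;

        auto next = m_Free.lower_bound(base);
        if (next != m_Free.begin())
        {
            auto prev = std::prev(next);
            if (prev->first + prev->second == base) // merge with left neighbor
            {
                base = prev->first;
                count += prev->second;
                m_Free.erase(prev);
            }
        }
        if (next != m_Free.end() && base + count == next->first) // merge with right neighbor
        {
            count += next->second;
            m_Free.erase(next);
        }
        m_Free.emplace(base, count);
    }

    // Size of the largest contiguous free region, a rough fragmentation indicator.
    uint32_t LargestFreeRegion() const
    {
        uint32_t largest = 0;
        for (const auto& region : m_Free)
            largest = std::max(largest, region.second);
        return largest;
    }

private:
    std::map<uint32_t, uint32_t> m_Free; // base index -> element count
};
```

In such a setup, the value returned by Allocate would be stored per instance as the base offset of its primitives, and Free would be called with the same base and count when the corresponding BLAS is destroyed. First-fit is only one possible policy; if geometry sizes are known in advance, fixed-size slots would make the merging logic unnecessary.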

BUILD INSTRUCTIONS

CMAKE OPTIONS

HOW TO RUN

REQUIREMENTS

Any ray tracing compatible GPU.

USAGE

By default, NRD is used in its common (NORMAL) mode. In the sample it can also be used in occlusion-only (including directional occlusion) and SH (spherical harmonics) modes. To switch modes, change the NRD_MODE macro in Shared.hlsli from NORMAL to OCCLUSION, SH or DIRECTIONAL_OCCLUSION.

Notes: