VFPR - a Vulkan Forward Plus Renderer

A final project for University of Pennsylvania, CIS 565: GPU Programming and Architecture

In this project, we created a Forward Plus (tiled forward) renderer in Vulkan, using a compute shader for light culling.

Our implementation is roughly 9 times faster than a regular forward renderer (both tested in Vulkan) with 200 lights: 5.97 ms versus 55.95 ms per frame in the Sponza scene.

Yes, Vulkan is really powerful! We learned about Vulkan at SIGGRAPH 2016 in Anaheim and decided to dive into it for our final project. We learned a lot from Alexander Overvoorde's Vulkan Tutorial, a great resource. Thanks so much!

Let us give you a very detailed introduction to our project. :)

Team Members

Tested on: Windows 10 x64, i7-6700K @ 4.00GHz 16GB, GTX 970 4096MB (Personal Desktop)

Overview of Forward Plus Technique

The ideas behind this cool renderer come from this amazing paper: Forward+: Bringing Deferred Lighting to the Next Level. Many thanks to its incredible authors!

Forward plus is an extension of traditional forward rendering. A forward renderer normally limits the number of lights evaluated when shading, which also limits the visibility computation.

Forward plus extends the forward rendering pipeline by adding a light-culling stage before final shading. The pipeline consists of three stages: depth prepass, light culling, and final shading. We describe each stage below, together with our Vulkan structure. The advantage of this method is that a scene can be rendered with many lights, because culling stores only the lights that contribute to each tile. Definitely a cool technique, right? :)

Now, let us introduce these three stages in our basic forward plus renderer. Since we use Vulkan, we create three command buffers, one for each step.

We implemented the depth prepass by creating a pipeline without a fragment shader in Vulkan, with depth write and depth test enabled.

As the picture above shows, this pass outputs a depth map, which is used as the input for the light culling stage.

Light culling calculates the list of light indices overlapping each tile. In our project, the default tile size is 16 x 16.

As mentioned above, the depth map generated by the depth prepass is used to determine the minimum and maximum depth values across each tile.

Note that in Vulkan, we run a compute shader for this stage between render pass one (depth prepass) and render pass two (final shading).
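The per-tile test can be sketched on the CPU as follows. This is a minimal C++ reference under stated assumptions, not the project's compute shader: depths are assumed to be linear view-space distances, each light is only tested against the tile's depth range (the side planes of the tile frustum are omitted), and all names (`Light`, `TileDepthRange`, `computeTileDepthRange`, `cullLightsForTile`) are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical point light; z is assumed to be the view-space distance from the camera.
struct Light {
    float x, y, z;
    float radius;
};

// Minimum and maximum depth found inside one tile.
struct TileDepthRange {
    float minDepth;
    float maxDepth;
};

// Reduce the depth-prepass samples of one tile (e.g. 16 x 16 = 256 values)
// to a single [minDepth, maxDepth] range.
TileDepthRange computeTileDepthRange(const std::vector<float>& tileDepths) {
    TileDepthRange range{std::numeric_limits<float>::max(),
                         std::numeric_limits<float>::lowest()};
    for (float d : tileDepths) {
        range.minDepth = std::min(range.minDepth, d);
        range.maxDepth = std::max(range.maxDepth, d);
    }
    return range;
}

// Collect the indices of lights whose bounding sphere overlaps the tile's
// depth range, stopping once the per-tile capacity is reached.
std::vector<std::uint32_t> cullLightsForTile(const std::vector<Light>& lights,
                                             const TileDepthRange& range,
                                             std::size_t maxLightsPerTile) {
    std::vector<std::uint32_t> visible;
    for (std::size_t i = 0;
         i < lights.size() && visible.size() < maxLightsPerTile; ++i) {
        const Light& light = lights[i];
        const bool reachesNear = light.z + light.radius >= range.minDepth;
        const bool reachesFar = light.z - light.radius <= range.maxDepth;
        if (reachesNear && reachesFar) {
            visible.push_back(static_cast<std::uint32_t>(i));
        }
    }
    return visible;
}
```

In a compute shader, the min/max reduction would typically be shared by all threads of a tile's work group; the loop above only illustrates the logic for one tile.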

Lastly, we created another render pass for final shading, which accumulates all the lights in the list we calculated for each tile and shades each fragment based on the result. To support multiple materials, we run the pipeline once per material group, which lets us render the full Sponza scene.
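As an illustration of the final shading lookup, the sketch below maps a pixel to its tile and sums the lights listed for that tile. It is a C++ stand-in for shader logic and rests on assumptions: tiles are stored row by row, the linear falloff is chosen only for simplicity, and all names (`kTileSize`, `tileIndexForPixel`, `PointLight`, `accumulateLights`) are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

constexpr std::uint32_t kTileSize = 16;  // default tile size in pixels

// Number of tiles needed to cover `pixels`, rounding up for partial tiles.
constexpr std::uint32_t tilesAlong(std::uint32_t pixels) {
    return (pixels + kTileSize - 1) / kTileSize;
}

// Row-major tile index for a pixel (assumed layout).
constexpr std::uint32_t tileIndexForPixel(std::uint32_t px, std::uint32_t py,
                                          std::uint32_t screenWidth) {
    return (py / kTileSize) * tilesAlong(screenWidth) + (px / kTileSize);
}

// Hypothetical light with a scalar intensity.
struct PointLight {
    float x, y, z;
    float radius;
    float intensity;
};

// Sum the contribution of the lights listed for one tile at a surface point.
// Lights that are not in the tile's list are never touched.
float accumulateLights(const std::vector<PointLight>& lights,
                       const std::vector<std::uint32_t>& tileLightIndices,
                       float px, float py, float pz) {
    float total = 0.0f;
    for (std::uint32_t index : tileLightIndices) {
        const PointLight& light = lights[index];
        const float dx = light.x - px;
        const float dy = light.y - py;
        const float dz = light.z - pz;
        const float dist = std::sqrt(dx * dx + dy * dy + dz * dz);
        // Linear falloff to zero at the light radius (assumption).
        const float falloff = std::max(0.0f, 1.0f - dist / light.radius);
        total += light.intensity * falloff;
    }
    return total;
}
```

The point of the design is that the inner loop runs over the tile's short list rather than over every light in the scene.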

UPenn CIS 565 Course Presentation Slide

Please Click Here!!!

Debug Views

Views shown: Render, Heatmap, Heatmap Only, Depth Pre-Pass Result

In the heatmap, a lighter tile means that more lights affect its bounding frustum.
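One simple way to produce such a heatmap is to scale the per-tile light count into a brightness value. The sketch below assumes a linear mapping clamped to 1.0; the actual mapping used by the debug view is not stated here, and `heatmapBrightness` is a hypothetical name.

```cpp
#include <algorithm>
#include <cstdint>

// Map a tile's light count to a gray value in [0, 1].
// The linear mapping is an assumption made for illustration.
float heatmapBrightness(std::uint32_t lightCount,
                        std::uint32_t maxLightsPerTile) {
    if (maxLightsPerTile == 0) {
        return 0.0f;
    }
    const float t = static_cast<float>(lightCount) /
                    static_cast<float>(maxLightsPerTile);
    return std::min(t, 1.0f);
}
```

For example, a tile holding 32 lights out of a capacity of 64 would be drawn at half brightness.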

Frame Breakdown

Performance Analysis

In our analysis, we use two scenes:

Sponza scene: 262,267 triangles, 145,186 vertices

Rungholt scene: 5,815,868 triangles, 3,289,722 vertices

Forward VS Forward Plus

NOTE: performance comparison is based on commit e4b440b (Forward+ shading) and de332d2 (forward shading)

As mentioned above, a forward renderer has to evaluate every light for every fragment in the entire scene, which scales poorly with the number of lights.

Forward Plus only needs to evaluate the lights in the list computed for the tile that a fragment falls in.

Since this comparison is central to our project, we tested many cases to get accurate results.

Here we draw a chart to list our results:

| Scene (light radius) | Forward+ ms per frame | Forward ms per frame | Forward+ FPS | Forward FPS |
|---|---|---|---|---|
| Sponza 10 lights (5.0f) | 2.71 | 5.98 | 369 | 167.22 |
| Sponza 200 lights (5.0f) | 5.97 | 55.95 | 167.5 | 17.9 |
| Sponza 1000 small lights (2.0f) | 4.91 | 264.82 | 203.66 | 3.78 |
| Sponza 1000 lights (5.0f) | 21.98 | 268.26 | 45.5 | 3.73 |
| Sponza 1000 large lights (10.0f) | 73.23 | 293.09 | 13.66 | 3.41 |
| Sponza 20000 small lights (2.0f) | 54.83 | Crashed Computer | 18.23 | N/A |
| Rungholt 10 lights (vertex heavy) | 9.9 | 8.72 | 101.01 | 114.68 |
| Rungholt 200 lights | 11.82 | 122.43 | 84.63 | 2.67 |
| Rungholt 1000 lights | 24.59 | 641.06 | 40.67 | 1.56 |
| Rungholt 20000 lights | 345.82 | Crashed Computer | 2.89 | N/A |

We can see a huge performance increase with the Forward Plus renderer.
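The FPS columns follow from the frame times (FPS = 1000 / ms per frame), and the speedup is the ratio of the two frame times. A minimal sketch with hypothetical helper names:

```cpp
// Frames per second for a given frame time in milliseconds.
constexpr double fpsFromFrameTime(double msPerFrame) {
    return 1000.0 / msPerFrame;
}

// How many times faster Forward+ is than forward for the same test case.
constexpr double speedup(double forwardMs, double forwardPlusMs) {
    return forwardMs / forwardPlusMs;
}
```

For Sponza with 200 lights, `speedup(55.95, 5.97)` is about 9.4; for Sponza with 1000 small lights, `speedup(264.82, 4.91)` is about 54.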

Different Light Num

First, we vary the number of lights to compare forward and forward plus.

![](documents/Charts/LightNum Compare1.PNG)

For the same Sponza scene with the same light radius, say 5.0f, we can see that:

Next, we use the Rungholt scene as our test scene. Rungholt is much larger than Sponza, and it is vertex heavy. Something interesting happens! ![](documents/Charts/LightNum Compare2.PNG)

Different Light Radius

As we mentioned above, the performance of the forward plus renderer also depends on the light radius.

Here we draw a chart to show the case.

When we are using 1000 large lights in the Sponza scene, from the above chart we could see that:

After this very careful and detailed comparison, WE FEEL SO PROUD TO SEE THE PROGRESS WE MADE WITH THE FORWARD PLUS RENDERER!!!

Tile Size

Here we draw a chart to show the differences among different tile sizes.

We run this test on the full Sponza scene with 1000 small lights (radius 2.0f) and a tile capacity of 1023 lights.

It is worth mentioning that with Vulkan the FPS is really high. For tile sizes 8x8, 16x16, 32x32, 64x64, and 128x128, the FPS is 147.49, 203.66, 211.57, 184.84, and 131.58, respectively.

How do we choose our default tile size? If the tile size is too small, there are many more tiles and frustums, so the culling stage performs a huge number of computations. But if the tile size is too large, each thread has to do far more work than with small tiles, which is not optimal either.
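To make the trade-off concrete, the sketch below counts tiles and pixels per tile for a given tile size. The 1920 x 1080 resolution used in the note after the code is an assumption (the test resolution is not stated here), and the helper names are hypothetical.

```cpp
#include <cstdint>

// Integer division that rounds up, so partial tiles at the edges are counted.
constexpr std::uint32_t ceilDiv(std::uint32_t a, std::uint32_t b) {
    return (a + b - 1) / b;
}

// Number of tiles (and therefore tile frustums) covering the screen.
constexpr std::uint32_t tileCount(std::uint32_t width, std::uint32_t height,
                                  std::uint32_t tileSize) {
    return ceilDiv(width, tileSize) * ceilDiv(height, tileSize);
}

// Number of pixels that share one light list.
constexpr std::uint32_t pixelsPerTile(std::uint32_t tileSize) {
    return tileSize * tileSize;
}
```

At an assumed 1920 x 1080, tile sizes 8, 16, 32, 64, and 128 give 32400, 8160, 2040, 510, and 135 tiles, while the pixels per tile grow from 64 to 16384. Small tiles mean many frustums to cull against; large tiles mean coarser lists shared by many more pixels.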

We also noticed that the share of frame time spent in each of the three stages changes with the tile size. Let us draw a chart to show our test result.

So we need to choose a balanced tile size that keeps both costs low.

Considering all of the above, we chose 16 x 16 as the best size for our scene :)

Light Per Tile

In this test, we use the full Sponza scene with 1000 small lights (radius 2.0f). The tile size is 16x16.

We compare between 63 lights per tile and 1023 lights per tile.

First is the ms per frame comparison:

The chart above shows that with 63 lights per tile we need 4.9 ms per frame, and with 1023 lights per tile we need 4.91 ms per frame. The difference is not that big!

The second is the SSBO (Shader Storage Buffer Object) Comparison:

This comparison shows a big difference. With 63 lights per tile, the SSBO size is 2,073,600 bytes, but with 1023 lights per tile, it is 33,145,200 bytes.

Graphics card memory is limited, so after this comparison we conclude that it is better to choose a small number of lights per tile, which saves a lot of memory while keeping a high FPS.
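The reported SSBO sizes can be reproduced with simple arithmetic. The sketch below is an assumption about the buffer layout, not the project's actual definition: it takes 8100 tiles (1920 x 1080 pixels divided by 256 pixels per 16 x 16 tile, without rounding up partial tiles; the resolution is assumed) and 4-byte slots. The name `lightListBufferBytes` is hypothetical.

```cpp
#include <cstdint>

// Size in bytes of a buffer holding `slotsPerTile` fixed-size slots per tile.
constexpr std::uint64_t lightListBufferBytes(std::uint64_t tiles,
                                             std::uint64_t slotsPerTile,
                                             std::uint64_t bytesPerSlot = 4) {
    return tiles * slotsPerTile * bytesPerSlot;
}
```

Under these assumptions, `lightListBufferBytes(8100, 64)` gives 2,073,600 bytes (63 indices plus one extra slot, possibly a light count), and `lightListBufferBytes(8100, 1023)` gives 33,145,200 bytes. Either way, the buffer grows linearly with the per-tile capacity, which is why the larger capacity costs about 16 times the memory.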

Install and Build Instructions

Use CMake to build the program.

Download the Rungholt model and put it in the content folder, if you need it.

Windows

  1. Make sure you have the Vulkan SDK and Visual Studio 2015 or later, then run:
mkdir build
cd build
cmake-gui ..
  2. Configure (select "Visual Studio 2015 x64") and Generate to get the Visual Studio project files.

  3. Set vfpr as the startup project and build the solution.

Linux

Make sure VULKAN_SDK is set to the x86_64 folder under the Vulkan SDK path and that LD_LIBRARY_PATH and VK_LAYER_PATH are set by running source ./setup-env.sh in the Vulkan SDK folder, then run:

mkdir build
cd build
cmake ..
make

Controls

Hold RMB and move the cursor: rotate camera
W, S, A, D, Q, E: move camera
Z: toggle debug view

Tips

Milestones: How we finished our project step by step :)

Milestone 1 (11/21/2016)

Milestone 2 (11/28/2016)

Milestone 3 (12/12/2016)

Final Milestone (12/15/2016)

Third-Party Credits

References

Libraries

Assets

Tools