Implementing new sample: sparse-image and virtual textures (KhronosGroup#811)

* Initial commit.

* MAJOR: removed entirely mesh calculation based on the trigonometric model. Instead, whole mesh is calculated using MVP transformation matrix which makes calculations way more trouble-proof. MINOR: some quick changes, using already written wrappers for some functionalities, and adding some descriptions here and there.

* Major changes. Making use of GL_ARB_sparse_texture2 extension in the fragment shader - from now on the shader checks if the required fragment is resident in memory before sampling from it (no explicit calculations required, the shader samples from whatever is ACTUALLY available). On the CPU side, the workload was divided into stages, and many phases were removed/rewritten.

* update

* Core functionalities are working as expected. To be reviewed.

* Minor updates, mostly cleanup

* Clang Tidy Check fixes

* Clang Format Check fixes

* * Rebased with main branch
* Fixed run-time errors from validation layers
* Changed .md to .adoc style

* * Now loading the least detailed mip level is done before the first frame is rendered.
* Removed transition_mip_layout() and replaced it with vkb::image_layout_transition().
* Minor bugs and typos.

* * Added features from the UI level, now displaying memory-related stats + allow to toggle color highlighting
* Added additional uniform buffer containing data needed by fragment shader
* Removed call to bind_sparse_image() from free_unused_memory() as it is redundant
* Merged UPDATE_AND_GENERATE and FREE_UNUSED_MEMORY stages
* Removed virtual_texture.free_list as it is redundant
* Replaced virtual_texture.bind_list with to_be_bound flag under the virtual_texture.page_table
* Changed available_memory_index_list from std::list to std::vector
* Minor bugs / code refactoring

* * Fixed the bug that caused black-spots to appear on screen
* Block update is now better prioritized to update the closest areas first
* Refactoring

* * Now memory is allocated dynamically during the run-time
* Memory defragmentation can be enabled for the better memory management

* * Fixed validation-layer-errors related to defragmentation process
* From now number of vertical/horizontal blocks can be changed from the UI
* Number of blocks processed per-frame can be customized from UI too
* Frame-counter mechanism can be enabled/disabled from the UI
* Moved sparse_queue to separate (if possible) queue family, other than graphics queue
* Introduced synchronization between binding the image and submitting the image
* update_and_generate() was reduced to single command_buffer and one-time flush()
* Several minor reworks/fixes for better code styling and efficiency

* * Updated the screenshot
* Put additional descriptions to README.adoc
* Minor issues, cleanups & typos

* * Minor changes to match the PR's General & Sample Checklists

* * Fixed Ubuntu compilation issues and Copyright checks
* Minor typos and reworks

* * Minor stylistic changes

* * Rebase
* unified copyrights

* * missing static_cast

* * minor rework, fixing a validation layer warning
Grzegorz-Smagacz-Mobica authored Dec 15, 2023
1 parent 2be6f82 commit 8ba886e
Showing 9 changed files with 2,308 additions and 0 deletions.
1 change: 1 addition & 0 deletions antora/modules/ROOT/nav.adoc
@@ -74,6 +74,7 @@
** xref:samples/extensions/ray_queries/README.adoc[Ray queries]
** xref:samples/extensions/ray_tracing_reflection/README.adoc[Ray tracing reflection]
** xref:samples/extensions/shader_object/README.adoc[Shader Object]
** xref:samples/extensions/sparse_image/README.adoc[Sparse Image]
** xref:samples/extensions/synchronization_2/README.adoc[Synchronization 2]
** xref:samples/extensions/timeline_semaphore/README.adoc[Timeline semaphore]
** xref:samples/extensions/vertex_dynamic_state/README.adoc[Vertex dynamic state]
1 change: 1 addition & 0 deletions samples/CMakeLists.txt
@@ -87,6 +87,7 @@ set(ORDER_LIST
"fragment_shader_barycentric"
"gshader_to_mshader"
"color_write_enable"
"sparse_image"

#Performance Samples
"swapchain_images"
30 changes: 30 additions & 0 deletions samples/extensions/sparse_image/CMakeLists.txt
@@ -0,0 +1,30 @@
# Copyright (c) 2023, Mobica Limited
#
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 the "License";
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

get_filename_component(FOLDER_NAME ${CMAKE_CURRENT_LIST_DIR} NAME)
get_filename_component(PARENT_DIR ${CMAKE_CURRENT_LIST_DIR} PATH)
get_filename_component(CATEGORY_NAME ${PARENT_DIR} NAME)

add_sample(
ID ${FOLDER_NAME}
CATEGORY ${CATEGORY_NAME}
AUTHOR "Mobica"
NAME "sparse_image"
DESCRIPTION "This sample is showcasing the potential usage of the sparse-image-binding and sparse-image-residency features. It works with the concept of Virtual Textures, allowing textures to be rendered without being entirely allocated in the memory."
SHADER_FILES_GLSL
"sparse_image/sparse.vert"
"sparse_image/sparse.frag")
161 changes: 161 additions & 0 deletions samples/extensions/sparse_image/README.adoc
@@ -0,0 +1,161 @@
////
- Copyright (c) 2023, Mobica Limited
-
- SPDX-License-Identifier: Apache-2.0
-
- Licensed under the Apache License, Version 2.0 the "License";
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
-
////
== Sparse image

ifdef::site-gen-antora[]
TIP: The source for this sample can be found in the https://github.com/KhronosGroup/Vulkan-Samples/tree/main/samples/extensions/sparse_image[Khronos Vulkan samples github repository].
endif::[]

image::./images/sparse_image_screenshot.png[Sample]

== Overview

https://registry.khronos.org/vulkan/site/spec/latest/chapters/sparsemem.html[Sparse
Resources] allow less restrictive memory binding than standard
resources.

The key differences between standard and sparse resources showcased in
this sample are:

* Sparse resources can be bound non-contiguously to one or more
VkDeviceMemory allocations;
* Sparse resources can be re-bound to different memory allocations over
the lifetime of the resource.

The sample demonstrates the Sparse Image feature by rendering a
high-resolution texture with only a fraction of the total image size
actually allocated in device memory. This is possible by dynamically
loading the required memory areas, generating mip levels for the outer
parts, freeing unused memory and, finally, binding the image in real
time.

== Enabling features

There are 3 features to be enabled:

* sparseBinding;
* sparseResidencyImage2D;
* shaderResourceResidency;

The first two are the key features required to use sparse image
resources. The last one, shaderResourceResidency, is required for the
fragment shader to detect which parts of the image are resident in
memory.

[source,c++]
----
void SparseImage::request_gpu_features(vkb::PhysicalDevice &gpu)
{
    // Request the features only if the device supports all three of them.
    if (gpu.get_features().sparseBinding && gpu.get_features().sparseResidencyImage2D && gpu.get_features().shaderResourceResidency)
    {
        gpu.get_mutable_requested_features().sparseBinding = VK_TRUE;
        gpu.get_mutable_requested_features().sparseResidencyImage2D = VK_TRUE;
        gpu.get_mutable_requested_features().shaderResourceResidency = VK_TRUE;
    }
}
----

== Enabling extensions

A single GLSL extension is used in this sample:

* GL_ARB_sparse_texture2;

This extension is used only by the fragment shader and requires the
shaderResourceResidency feature to be enabled first. It allows the
fragment shader to check whether the memory for the sampled part of the
texture is actually allocated. Thanks to this, the shader can keep
checking residency and use the most detailed data available.

[source,glsl]
----
#extension GL_ARB_sparse_texture2 : enable
----

[source,glsl]
----
for(; (lod <= maxLOD) && !sparseTexelsResidentARB(residencyCode); lod += 1)
{
    residencyCode = sparseTextureLodARB(texSampler, fragTexCoord, lod, color);
}
----
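
The fallback behaviour of this loop can be modelled on the CPU. The following is only an illustrative sketch, not code from the sample: `pick_resident_lod` and the `resident` vector are hypothetical names, and the vector stands in for what `sparseTexelsResidentARB` reports at each mip level.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU-side model of the shader loop: resident[i] tells whether
// the sampled area is backed by memory at mip level i.
// Returns the first resident level at or above required_lod, or -1 if none.
inline int pick_resident_lod(const std::vector<bool> &resident, std::size_t required_lod)
{
    for (std::size_t lod = required_lod; lod < resident.size(); ++lod)
    {
        if (resident[lod])
        {
            return static_cast<int>(lod);
        }
    }
    return -1;
}
```

For example, with levels 0 and 1 not resident and levels 2 and 3 resident, a request for level 0 falls back to level 2. Because the sample loads the least detailed mip level before the first frame, the shader loop should always find some resident data.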


== How is the required LOD calculated?

The method is described in detail in the source file. In general, the
LOD value is obtained by asking: what is the ratio between a step along
x or y on the screen and the resulting step along u or v on the texture?

The idea is that if a small on-screen step (one pixel along the x or y
axis) causes a large on-texture step, then the area is far away from the
observer and a less detailed mip level is sufficient.

The formula used for those calculations is:

`LOD = log2(max(dT / dx, dT / dy))`, where:

* dT is an on-texture-step in texels,
* dx, dy are on-screen-steps in pixels.
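
As a rough illustration, the formula can be written as a small C++ helper. This is a sketch under assumptions: `compute_lod` is a hypothetical name, its first two arguments are the already computed ratios `dT / dx` and `dT / dy`, and the clamping to `[0, max_lod]` is added here for safety rather than taken from the sample.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper: dT_dx and dT_dy are the on-texture steps (in texels)
// caused by a one-pixel step along the screen's x and y axes.
inline float compute_lod(float dT_dx, float dT_dy, float max_lod)
{
    assert(max_lod >= 0.0f);
    // Largest texel step per pixel; the tiny floor avoids log2(0).
    const float step = std::max({dT_dx, dT_dy, 1e-6f});
    // A negative log2 means magnification, so clamp to the most detailed level.
    return std::clamp(std::log2(step), 0.0f, max_lod);
}
```

With this helper, a one-pixel step that covers four texels along one axis and two along the other selects LOD 2, while a step of one texel or less selects LOD 0.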


== User Interface

The user can change the application's behaviour through the GUI.

The following options are available:

* Color highlight - if enabled, areas of a particular LOD usage are
color-highlighted.
* Memory defragmentation - if enabled, memory pages are reallocated from
low-occupied sectors to higher-occupied (but available) sectors to keep the
overall number of allocations as low as possible.
* Update prioritization - if enabled, the application focuses on
processing the most recent requests and discards whatever remains of
previous requests. This can be observed when dynamically moving the
camera around.
* Blocks per cycle - describes up to how many blocks can be updated per
a single render cycle. The total number of blocks is defined as: (Vertical
blocks) * (Horizontal blocks).
* Vertical blocks - describes the number of columns the texture is
divided into.
* Horizontal blocks - describes the number of rows the texture is
divided into.
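
The idea behind the memory defragmentation option can be sketched as a greedy compaction. This is not the sample's implementation: `defragment`, the `used` vector and the fixed per-sector `capacity` are hypothetical, and the sketch only tracks page counts rather than real memory bindings.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical model: used[i] is the number of occupied pages in sector i,
// and every sector can hold `capacity` pages.
// Moves pages from the emptiest sectors into the fullest ones that still
// have room, then returns how many sectors remain in use.
inline std::size_t defragment(std::vector<std::size_t> &used, std::size_t capacity)
{
    // Fullest sectors first, emptiest last.
    std::sort(used.begin(), used.end(), std::greater<std::size_t>());
    assert(used.empty() || used.front() <= capacity);

    std::size_t dst = 0;            // fullest sector that may still have room
    std::size_t src = used.size();  // one past the emptiest sector not yet drained
    while (src > 0 && dst < src - 1)
    {
        const std::size_t moved = std::min(capacity - used[dst], used[src - 1]);
        used[dst] += moved;
        used[src - 1] -= moved;
        if (used[dst] == capacity)
        {
            ++dst;  // destination is full
        }
        if (used[src - 1] == 0)
        {
            --src;  // source is empty and could be freed
        }
    }
    return static_cast<std::size_t>(
        std::count_if(used.begin(), used.end(), [](std::size_t n) { return n > 0; }));
}
```

Starting from sectors holding 1, 3, 2 and 0 pages with a capacity of 4, the sketch ends with two sectors in use (4 and 2 pages) and two empty sectors that could be released.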

Additionally, the GUI shows memory usage data, expressed in pages: the
virtual requirement (how many pages would be needed if the whole image
were allocated in memory) and the current, actual allocation on the
device.
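
The virtual requirement can be estimated by counting pages over all mip levels. The sketch below is an assumption-laden illustration: `Extent2D`, `pages_in_level` and `virtual_page_count` are hypothetical names, the page size in texels is passed in by the caller (in Vulkan it would come from `VkSparseImageFormatProperties::imageGranularity`), and the mip tail is ignored for simplicity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical 2D size in texels.
struct Extent2D
{
    uint32_t width;
    uint32_t height;
};

// Number of pages needed to cover one mip level (rounded up per axis).
inline uint64_t pages_in_level(Extent2D level, Extent2D page)
{
    assert(page.width > 0 && page.height > 0);
    const uint64_t cols = (uint64_t{level.width} + page.width - 1) / page.width;
    const uint64_t rows = (uint64_t{level.height} + page.height - 1) / page.height;
    return cols * rows;
}

// Pages needed if every mip level were fully allocated.
inline uint64_t virtual_page_count(Extent2D base, Extent2D page, uint32_t mip_levels)
{
    uint64_t total = 0;
    Extent2D level = base;
    for (uint32_t mip = 0; mip < mip_levels; ++mip)
    {
        total += pages_in_level(level, page);
        // Each following level halves both dimensions, down to 1 texel.
        level.width  = std::max<uint32_t>(level.width / 2, 1);
        level.height = std::max<uint32_t>(level.height / 2, 1);
    }
    return total;
}
```

For a 1024x1024 texture with 128x128-texel pages and four mip levels, this gives 64 + 16 + 4 + 1 = 85 pages. The actual allocation reported by the GUI would be the subset of those pages currently bound to memory.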


== Conclusion

Sparse images are primarily intended for cases where a fully allocated
image would occupy too much device memory. Keeping a low-detail mip
level constantly in memory and dynamically loading the required areas as
the camera moves is a way to handle terrain mega-textures. One downside
of this solution is a possible bottleneck when constantly transferring
the required memory chunks from the CPU to the device. Another is that,
since the application decides which memory is allocated, it must also
perform calculations such as "`what level of detail is required?`",
which adds CPU overhead.