forked from KhronosGroup/Vulkan-Samples
Commit
Implementing new sample: sparse-image and virtual textures (KhronosGroup#811)

* Initial commit.
* MAJOR: removed entirely mesh calculation based on the trigonometric model. Instead, whole mesh is calculated using MVP transformation matrix which makes calculations way more trouble-proof. MINOR: some quick changes, using already written wrappers for some functionalities, and adding some descriptions here and there.
* Major changes. Making use of GL_ARB_sparse_texture2 extension in the fragment shader - from now the shader checks if the required fragment is resident in the memory before sampling from it (no explicit calculations required,shader will sample from whatever is ACTUALLY available). On the CPU side, workload was divided into stages, many phases removed/rewritten.
* update
* Core functionalities are working as expected. To be reviewed.
* Minor updates, mostly cleanup
* Clang Tidy Check fixes
* Clang Format Check fixes
* * Rebased with main branch
* Fixed run-time errors from validation layers
* Changed .md to .adoc style
* * Now loading the least detailed mip level is done before the first frame is rendered.
* Removed transition_mip_layout() and replaced it with vkb::image_layout_transition().
* Minor bugs and typos.
* * Added features from the UI level, now displaying memory-related stats + allow to toggle color highlighting
* Added additional uniform buffer containing data needed by fragment shader
* Removed call to bind_sparse_image() from free_unused_memory() as it is redundant
* Merged UPDATE_AND_GENERATE and FREE_UNUSED_MEMORY stages
* Removed virtual_texture.free_list as it is redundant
* Replaced virtual_texture.bind_list with to_be_bound flag under the virtual_texture.page_table
* Changed available_memory_index_list from std::list to std::vector
* Minor bugs / code refactoring
* * Fixed the bug that caused black-spots to appear on screen
* Block update is now better prioritized to update the closest areas first
* Refactoring
* * Now memory is allocated dynamically during the run-time
* Memory defragmentation can be enabled for the better memory management
* * Fixed validation-layer-errors related to deframentation process
* From now number of vertical/horizontal blocks can be changed from the UI
* Number of blocks processed per-frame can be customized from UI too
* Frame-counter mechanism can be enabled/disabled from the UI
* Moved sparse_queue to separate (if possible) queue family, other than graphics queue
* Introduced synchronization between binding the image and submitting the image
* update_and_generate() was reduced to single command_buffer and one-time flush()
* Several minor reworks/fixes for better code styling and efficiency
* * Updated the screenshot
* Put additional descriptions to README.adoc
* Minor issues, cleanups & typo's
* * Minor changes to match the PR's General & Sample Checklists
* * Fixed Ubuntu compilation issues and Copyright checks
* Minor typos and reworks
* * Minor stylistic changes
* * Rebase
* unified copyrights
* * missing static_cast
* * minor rework, fixing a validation layer warning
1 parent
2be6f82
commit 8ba886e
Showing
9 changed files
with
2,308 additions
and
0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
# Copyright (c) 2023, Mobica Limited | ||
# | ||
# SPDX-License-Identifier: Apache-2.0 | ||
# | ||
# Licensed under the Apache License, Version 2.0 the "License"; | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
# | ||
|
||
get_filename_component(FOLDER_NAME ${CMAKE_CURRENT_LIST_DIR} NAME) | ||
get_filename_component(PARENT_DIR ${CMAKE_CURRENT_LIST_DIR} PATH) | ||
get_filename_component(CATEGORY_NAME ${PARENT_DIR} NAME) | ||
|
||
add_sample( | ||
ID ${FOLDER_NAME} | ||
CATEGORY ${CATEGORY_NAME} | ||
AUTHOR "Mobica" | ||
NAME "sparse_image" | ||
DESCRIPTION "This sample is showcasing the potential usage of the sparse-image-binding and sparse-image-residency features. It works with the concept of Virtual Textures, allowing textures to be rendered without being entirely allocated in the memory." | ||
SHADER_FILES_GLSL | ||
"sparse_image/sparse.vert" | ||
"sparse_image/sparse.frag") |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,161 @@ | ||
//// | ||
- Copyright (c) 2023, Mobica Limited | ||
- | ||
- SPDX-License-Identifier: Apache-2.0 | ||
- | ||
- Licensed under the Apache License, Version 2.0 the "License"; | ||
- you may not use this file except in compliance with the License. | ||
- You may obtain a copy of the License at | ||
- | ||
- http://www.apache.org/licenses/LICENSE-2.0 | ||
- | ||
- Unless required by applicable law or agreed to in writing, software | ||
- distributed under the License is distributed on an "AS IS" BASIS, | ||
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
- See the License for the specific language governing permissions and | ||
- limitations under the License. | ||
- | ||
//// | ||
== Sparse image | ||
|
||
ifdef::site-gen-antora[] | ||
TIP: The source for this sample can be found in the https://github.com/KhronosGroup/Vulkan-Samples/tree/main/samples/extensions/sparse_image[Khronos Vulkan samples github repository]. | ||
endif::[] | ||
|
||
image::./images/sparse_image_screenshot.png[Sample] | ||
|
||
== Overview | ||
|
||
The use of | ||
https://registry.khronos.org/vulkan/site/spec/latest/chapters/sparsemem.html[Sparse | ||
Resources] allows for less restrictive memory binding than a | ||
standard resource. | ||
|
||
The key differences between standard and sparse resources showcased in | ||
this sample are: | ||
|
||
* Sparse resources can be bound non-contiguously to one or more | ||
VkDeviceMemory allocations; | ||
* Sparse resources can be re-bound to different memory allocations over | ||
the lifetime of the resource; | ||
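The two properties above can be modeled on the CPU side with a small page table. The following is a minimal sketch, not the sample's implementation: `PageBinding` and `PageTable` are hypothetical names, and the actual binding through the Vulkan sparse binding queue is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record describing where one page of a sparse image lives.
struct PageBinding
{
    bool          bound;
    std::uint32_t allocation_index;        // which memory allocation backs the page
    std::uint64_t offset;                  // byte offset inside that allocation
};

// Hypothetical page table: one entry per page. Pages can be bound in any
// order, to any allocation, and re-bound later.
class PageTable
{
  public:
    explicit PageTable(std::size_t page_count) : pages_(page_count, PageBinding{false, 0, 0}) {}

    // Bind (or re-bind) a page to a location in one of the allocations.
    void bind(std::size_t page, std::uint32_t allocation_index, std::uint64_t offset)
    {
        assert(page < pages_.size());
        pages_[page].bound            = true;
        pages_[page].allocation_index = allocation_index;
        pages_[page].offset           = offset;
    }

    // Mark a page as no longer backed by memory.
    void unbind(std::size_t page)
    {
        assert(page < pages_.size());
        pages_[page] = PageBinding{false, 0, 0};
    }

    const PageBinding &at(std::size_t page) const
    {
        assert(page < pages_.size());
        return pages_[page];
    }

    // Number of pages currently backed by memory.
    std::size_t bound_count() const
    {
        std::size_t count = 0;
        for (const PageBinding &p : pages_)
        {
            count += p.bound ? 1 : 0;
        }
        return count;
    }

  private:
    std::vector<PageBinding> pages_;
};
```

Binding page 2 before page 0, and later moving page 2 to a different allocation, are both valid operations in this model, which mirrors the non-contiguous and re-bindable nature of sparse resources.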
|
||
The sample demonstrates the Sparse Image feature by rendering a | ||
high-resolution texture with only a fraction of the total image size | ||
actually allocated in device memory. This is made possible by | ||
dynamically loading the required memory areas, generating mip levels for | ||
the outer parts, freeing unused memory and, finally, binding the image in | ||
real time. | ||
|
||
== Enabling features | ||
|
||
There are 3 features to be enabled: | ||
|
||
* sparseBinding; | ||
* sparseResidencyImage2D; | ||
* shaderResourceResidency; | ||
|
||
The first two are the key features required to use sparse | ||
image resources. The last one, shaderResourceResidency, is required for | ||
the fragment shader to be able to detect which parts of the image are | ||
resident in memory. | ||
|
||
[source,c++] | ||
---- | ||
void SparseImage::request_gpu_features(vkb::PhysicalDevice &gpu) | ||
{ | ||
    // Request the features only if the device supports all three of them. | ||
    if (gpu.get_features().sparseBinding && gpu.get_features().sparseResidencyImage2D && gpu.get_features().shaderResourceResidency) | ||
    { | ||
        gpu.get_mutable_requested_features().sparseBinding = VK_TRUE; | ||
        gpu.get_mutable_requested_features().sparseResidencyImage2D = VK_TRUE; | ||
        gpu.get_mutable_requested_features().shaderResourceResidency = VK_TRUE; | ||
    } | ||
} | ||
---- | ||
|
||
== Enabling extensions | ||
|
||
There is a single extension used in this sample: | ||
|
||
* GL_ARB_sparse_texture2; | ||
|
||
This extension is used only by the fragment shader, but requires the | ||
shaderResourceResidency feature to be enabled first. It allows | ||
the fragment shader to check whether the memory for a particular | ||
texel is actually resident. Thanks to this extension, the | ||
fragment shader can keep checking residency and | ||
fall back to the most detailed data available. | ||
|
||
[source,glsl] | ||
---- | ||
#extension GL_ARB_sparse_texture2 : enable | ||
---- | ||
|
||
[source,glsl] | ||
---- | ||
for(; (lod <= maxLOD) && !sparseTexelsResidentARB(residencyCode); lod += 1) | ||
{ | ||
residencyCode = sparseTextureLodARB(texSampler, fragTexCoord, lod, color); | ||
} | ||
---- | ||
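The fallback performed by the shader loop above can be mirrored on the CPU for illustration. This is a sketch under assumptions: `first_resident_lod` is a hypothetical helper, and the `resident` vector stands in for the residency information that the shader obtains from `sparseTexelsResidentARB`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical CPU-side model of the shader loop: resident[lod] tells whether
// the page covering a given texel is backed by memory at that mip level.
// Returns the most detailed resident LOD at or after start_lod,
// or -1 if no level from start_lod onwards is resident.
int first_resident_lod(const std::vector<bool> &resident, int start_lod)
{
    assert(start_lod >= 0);
    for (int lod = start_lod; lod < static_cast<int>(resident.size()); ++lod)
    {
        if (resident[lod])
        {
            return lod;
        }
    }
    return -1;
}
```

As long as the least detailed mip level is always resident, a search of this kind always finds a level to sample from.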
|
||
|
||
== How is the required LOD calculated? | ||
|
||
The whole method is described in detail in the source file. In general, the | ||
LOD value is derived from the ratio between a step along x or y | ||
on the screen and the resulting step along u or v on the texture. | ||
|
||
The idea is that, when moving pixel by pixel along the x or y axis | ||
on-screen, if a small on-screen step causes a significant step | ||
on-texture, then the area is far away from the observer and | ||
a less detailed mip level is sufficient. | ||
|
||
The formula used for those calculations is: | ||
|
||
LOD = log2 (max(dT / dx, dT / dy)); where: | ||
|
||
* dT is an on-texture-step in texels, | ||
* dx, dy are on-screen-steps in pixels. | ||
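The formula above can be written as a small function. This is a minimal sketch: `required_lod` is a hypothetical name, and clamping the result to the range from 0 to `max_lod` is an assumption added here so that the value maps onto an existing mip level.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// dT_dx, dT_dy: on-texture step (in texels) caused by a one-pixel step along
//               the screen x and y axes respectively.
// max_lod:      index of the least detailed mip level (assumed parameter).
float required_lod(float dT_dx, float dT_dy, float max_lod)
{
    assert(max_lod >= 0.0f);
    const float ratio = std::max(dT_dx, dT_dy);
    if (ratio <= 1.0f)
    {
        // At most one texel per pixel: the most detailed level is needed.
        return 0.0f;
    }
    return std::min(std::log2(ratio), max_lod);
}
```

For example, a one-pixel step that covers 4 texels gives log2(4) = 2, so mip level 2 is enough for that area.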
|
||
|
||
== User Interface | ||
|
||
The user can alter the application's behavior through the GUI. | ||
|
||
The following options are available: | ||
|
||
* Color highlight - if enabled, areas of a particular LOD usage are | ||
color-highlighted. | ||
* Memory defragmentation - if enabled, memory pages are moved from | ||
sparsely occupied sectors to more heavily occupied (but not yet full) sectors to keep the | ||
overall number of allocations as low as possible. | ||
* Update prioritization - if enabled, the application focuses on | ||
processing the most recent requests and discards leftovers from | ||
previous requests. This can be observed when moving the | ||
camera around. | ||
* Blocks per cycle - the maximum number of blocks that can be updated in | ||
a single render cycle. The total number of blocks is defined as: (Vertical | ||
blocks) * (Horizontal blocks). | ||
* Vertical blocks - describes the number of columns the texture is | ||
divided into. | ||
* Horizontal blocks - describes the number of rows the texture is | ||
divided into. | ||
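The idea behind the memory defragmentation option can be illustrated with a simplified model. This is only a sketch under assumptions, not the sample's code: `defragment` is a hypothetical function, each sector stands for one memory allocation holding up to `capacity` pages, and the re-binding and data copies that a real implementation needs are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// sectors[i] is the number of pages in use in sector i (at most `capacity`).
// Pages are moved from the emptiest sectors into the fullest ones that still
// have room. Returns the number of sectors still needed afterwards.
std::size_t defragment(std::vector<std::size_t> &sectors, std::size_t capacity)
{
    assert(capacity > 0);
    for (std::size_t used : sectors)
    {
        assert(used <= capacity);
    }

    // Fullest sectors first, so pages flow from the tail towards the head.
    std::sort(sectors.begin(), sectors.end(), std::greater<std::size_t>());

    std::size_t dst = 0;                     // sector currently being filled
    std::size_t src = sectors.size();        // one past the sector being emptied
    while (src > 0 && dst < src - 1)
    {
        if (sectors[dst] == capacity)
        {
            ++dst;        // destination is full, move on to the next one
            continue;
        }
        if (sectors[src - 1] == 0)
        {
            --src;        // source is empty, move on to the previous one
            continue;
        }
        const std::size_t moved = std::min(capacity - sectors[dst], sectors[src - 1]);
        sectors[dst] += moved;
        sectors[src - 1] -= moved;
    }

    // Sectors that ended up empty can be released.
    sectors.erase(std::remove(sectors.begin(), sectors.end(), std::size_t(0)), sectors.end());
    return sectors.size();
}
```

With four sectors holding 1, 3, 2 and 4 pages and a capacity of 4 pages per sector, the 10 pages fit into three sectors after compaction, so one allocation can be freed.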
|
||
Additionally, the GUI shows memory usage data. It reports (in pages) | ||
the virtual requirement (the size if the whole image were allocated | ||
in memory) and the actual, current allocation on the | ||
device. | ||
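To give a feel for the scale of the virtual requirement, the page count of a fully allocated image can be estimated as follows. This is a rough sketch under assumptions: `virtual_page_count` is a hypothetical helper, the page extent in texels is passed in by the caller (in Vulkan it depends on the image format and the device), and every mip level is rounded up to whole pages, which ignores the mip tail packing of real implementations.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Counts the pages needed to back every mip level of a 2D image,
// rounding each level up to a whole number of pages.
std::size_t virtual_page_count(std::uint32_t width, std::uint32_t height,
                               std::uint32_t page_width, std::uint32_t page_height)
{
    assert(width > 0 && height > 0 && page_width > 0 && page_height > 0);
    std::size_t total = 0;
    while (true)
    {
        const std::size_t pages_x = (width + page_width - 1) / page_width;
        const std::size_t pages_y = (height + page_height - 1) / page_height;
        total += pages_x * pages_y;
        if (width == 1 && height == 1)
        {
            break;        // the last mip level has been counted
        }
        width  = std::max<std::uint32_t>(width / 2, 1);
        height = std::max<std::uint32_t>(height / 2, 1);
    }
    return total;
}
```

Under these assumptions, a 4096 x 4096 image with 128 x 128 texel pages needs 1024 pages for the most detailed level alone, which is why keeping only the visible areas resident saves so much memory.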
|
||
|
||
== Conclusion | ||
|
||
The sparse image feature is primarily | ||
intended for cases where a resource would otherwise occupy too much device memory. Keeping | ||
a low-detail mip level constantly in memory and dynamically | ||
loading the required areas when the camera changes is one way to handle | ||
terrain mega-textures. The downside of this solution is that | ||
constantly transferring the required memory chunks from the CPU to the device | ||
can become a bottleneck. The other downside is | ||
that, since the application decides what memory is going to be allocated, | ||
it must take care of calculations such as "`what level of detail is | ||
required?`". This creates additional CPU overhead. |