Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First pass at performance testing #1198

Merged
merged 112 commits into from
Oct 16, 2023
Merged
Show file tree
Hide file tree
Changes from 86 commits
Commits
Show all changes
112 commits
Select commit Hold shift + click to select a range
371e81c
Add new functional test stub
csciguy8 Aug 23, 2023
b9fb0f3
Run clang format
csciguy8 Aug 23, 2023
ccbf5b0
Fix non-unity build errors
csciguy8 Aug 23, 2023
4db3031
Set up programmatic scene with typical Cesium objects
csciguy8 Aug 23, 2023
e4d636c
Fix camera not looking at correct place when entering PIE mode
csciguy8 Aug 23, 2023
7cad869
Add option to create test in editor or in standalone world
csciguy8 Aug 23, 2023
6ec8d79
Fix formatting
csciguy8 Aug 23, 2023
a89e5a9
Setup denver (ion) example. Steps towards ticking world after objects…
csciguy8 Aug 24, 2023
459ef0e
Fixup issues with Denver test
csciguy8 Aug 25, 2023
9db4b93
Refactor and add logic to stop reset tilesets before loop start
csciguy8 Aug 25, 2023
e0deda6
Refactor common vars into context class. Add structure to loop until …
csciguy8 Aug 25, 2023
e647f08
Add additional camera manager for updates. Add debug logging
csciguy8 Aug 25, 2023
24d211d
First iteration of working test that ticks correctly
csciguy8 Aug 28, 2023
e48572d
Refactor suspend update calls (remove duplication)
csciguy8 Aug 28, 2023
c89f0ca
Separate tests into separate namespaces (unit, performance)
csciguy8 Aug 29, 2023
20c81ee
Refine logging messages for clarity
csciguy8 Aug 29, 2023
43cf257
Implement two tests for demonstration (Denver, Googleplex)
csciguy8 Aug 29, 2023
12b9c6d
Remove debug and test code
csciguy8 Aug 29, 2023
fd2da60
Add code to let UI tick when running tests
csciguy8 Aug 29, 2023
bbd61c0
Tweaks to performance testing framework.
kring Aug 31, 2023
7614bb3
Add logic to start / end play session
csciguy8 Aug 31, 2023
4dd0060
Unfreeze updates when test ends, so we can see something
csciguy8 Aug 31, 2023
b9f594f
Update ion token for test (sync with default in samples project)
csciguy8 Sep 1, 2023
9c9ef24
Refactor scene setup code into separate files
csciguy8 Sep 1, 2023
b642e17
Add missing control path
csciguy8 Sep 5, 2023
291ac14
Rework tick logic to use latent commands
csciguy8 Sep 5, 2023
2fff0b4
Let tests define separable passes for setup / verification
csciguy8 Sep 5, 2023
9942bab
Remove extraneous include
csciguy8 Sep 5, 2023
7562b49
Add missing ifdef to fix non-Windows50 builds
csciguy8 Sep 5, 2023
014ff80
Create performance-profiling.md
csciguy8 Sep 6, 2023
0cc5214
Update performance-profiling.md
csciguy8 Sep 6, 2023
6589f08
Update performance-profiling.md
csciguy8 Sep 6, 2023
d16df29
Add more to interpretation section
csciguy8 Sep 6, 2023
8479aa1
Finish up conclusions section. Misc revisions
csciguy8 Sep 6, 2023
89d8081
Rename performance-profiling.md to performance-profiling-with-cpu-usa…
csciguy8 Sep 7, 2023
9b8292b
Create performance-profiling-with-unreal-insights
csciguy8 Sep 7, 2023
1db3309
Rename performance-profiling-with-unreal-insights to performance-prof…
csciguy8 Sep 7, 2023
9fa6019
Stub out major sections
csciguy8 Sep 7, 2023
6a8a2ba
Flesh out prepare for capture section
csciguy8 Sep 7, 2023
705c628
Merge branch 'ue5-main' into simple-performance-test
csciguy8 Sep 8, 2023
98397dd
Merge branch 'simple-performance-test' into performance-test-tweaks
csciguy8 Sep 8, 2023
4fc27ac
Flesh out timing capture sections (add pics)
csciguy8 Sep 8, 2023
637504e
Add first part of Interpret section
csciguy8 Sep 8, 2023
1079c0c
Flesh out Examine low use areas section
csciguy8 Sep 8, 2023
e544bb3
Finish the last sections
csciguy8 Sep 8, 2023
5586afa
Create performance-profiling-setup-test.md
csciguy8 Sep 8, 2023
1240f52
Link to separate setup page
csciguy8 Sep 8, 2023
46b8fcc
Link to separate setup page
csciguy8 Sep 8, 2023
264fbe6
Remove duplicate wording
csciguy8 Sep 8, 2023
8a13f86
Merge branch 'simple-performance-test' into profiling-documentation
csciguy8 Sep 8, 2023
d92e84f
Fix non-unity build (broke this again, oops)
csciguy8 Sep 11, 2023
61feff1
Add cold / warm cache tests + test names to logging marks
csciguy8 Sep 11, 2023
a6a353b
Add runtime access to cache database. Let load tests reset cache for …
csciguy8 Sep 11, 2023
b1f22d8
Move clear db cache step to initial test setup
csciguy8 Sep 11, 2023
7a8b725
Fix formatting
csciguy8 Sep 12, 2023
e73fee9
Add trace events before calling updateView, updateViewOffline
csciguy8 Sep 18, 2023
fb130b1
Add wait before exiting play mode
csciguy8 Sep 18, 2023
a6a7b34
Merge branch 'ue5-main' into simple-performance-test
csciguy8 Sep 19, 2023
26a9b7f
Fix for non-unity build
csciguy8 Sep 19, 2023
1f6ecc7
Update to renamed georeference functions
csciguy8 Sep 19, 2023
4366558
Merge branch 'simple-performance-test' into performance-test-tweaks
csciguy8 Sep 19, 2023
7c93ae0
Update to renamed georeference functions
csciguy8 Sep 19, 2023
a3fd233
Update test to Cesium.Unit.X namespace
csciguy8 Sep 19, 2023
7af3191
Add note about building Release and RelWithDebInfo configs
csciguy8 Sep 20, 2023
bcb30e7
Add note about building release with debug symbols
csciguy8 Sep 20, 2023
748e19a
Disable world bounds checks for test world
csciguy8 Sep 21, 2023
56ed677
Add missing include header
csciguy8 Sep 25, 2023
741e9a0
Merge branch 'simple-performance-test' into performance-test-tweaks
csciguy8 Sep 25, 2023
bfac249
Merge pull request #1200 from CesiumGS/performance-test-tweaks
csciguy8 Sep 25, 2023
a075071
Tweak LoadProgress
csciguy8 Sep 25, 2023
facf08d
Merge pull request #1210 from CesiumGS/profiling-documentation
csciguy8 Sep 26, 2023
868c71a
Move final cleanup of setSuspendUpdate to completion of test
csciguy8 Sep 27, 2023
cb1431f
Fix start mark encompassing PIE startup time
csciguy8 Oct 3, 2023
c921427
Let tests run with predetermined viewport sizes
csciguy8 Oct 3, 2023
a692f42
Break up code, tests separate from core logic
csciguy8 Oct 4, 2023
df50be9
Move google tiles key next to ion key
csciguy8 Oct 4, 2023
2b37835
Fix unity build (header include)
csciguy8 Oct 4, 2023
2bf0b3b
Let tests specify viewport sizes
csciguy8 Oct 4, 2023
b6df20e
Refactor away some duplicate code when creating locations
csciguy8 Oct 4, 2023
c603ce0
Initial drop
csciguy8 Oct 4, 2023
75a7a9b
Let editor viewport show the same camera orientation as the tests
csciguy8 Oct 5, 2023
ade11b9
Merge branch 'simple-performance-test' into more-google-tiles-tests
csciguy8 Oct 5, 2023
a8091d9
Add sunSky to the scene generation context (so we can adjust time zone)
csciguy8 Oct 5, 2023
2ff62b0
Merge branch 'simple-performance-test' into more-google-tiles-tests
csciguy8 Oct 5, 2023
58f04d5
Set location time zones. Batch tests under GoogleTiles.X namespace
csciguy8 Oct 5, 2023
934e803
Fix camera FOV not being set for generated scene
csciguy8 Oct 6, 2023
3750f99
Merge branch 'simple-performance-test' into more-google-tiles-tests
csciguy8 Oct 6, 2023
eaa28ee
Fixup heading
csciguy8 Oct 6, 2023
2cf8dc8
Add automatic screenshots for every test pass
csciguy8 Oct 6, 2023
fbe5651
Fix formatting
csciguy8 Oct 6, 2023
91ed29d
Merge branch 'simple-performance-test' into more-google-tiles-tests
csciguy8 Oct 6, 2023
e3a4362
Use pretty names for logging. Get rid of warm cache tests
csciguy8 Oct 6, 2023
aa14d79
Add missing header
csciguy8 Oct 6, 2023
aaccba4
Bump test timeout to 30 secs
csciguy8 Oct 6, 2023
8eae954
Merge branch 'simple-performance-test' into more-google-tiles-tests
csciguy8 Oct 6, 2023
9a8a53b
Fix starting orientations
csciguy8 Oct 6, 2023
9d622f2
Merge branch 'ue5-main' into simple-performance-test
csciguy8 Oct 16, 2023
4e22ffd
Move new unit tests to Cesium.Unit.X namespace
csciguy8 Oct 16, 2023
d3d61f0
Remove google tests (moving to different branch)
csciguy8 Oct 16, 2023
2564e9a
Merge branch 'simple-performance-test' into more-google-tiles-tests
csciguy8 Oct 16, 2023
349aca5
Add googleplex test back in
csciguy8 Oct 16, 2023
f52722d
Update google key to latest from samples project
csciguy8 Oct 16, 2023
0b3e828
Update release-process.md for API keys
csciguy8 Oct 16, 2023
a76fb1c
Avoid unnecessary copy of ViewUpdateResult
csciguy8 Oct 16, 2023
079819c
Move location setup to samples .cpp
csciguy8 Oct 16, 2023
502105d
Update release-process.md
csciguy8 Oct 16, 2023
8324a89
Merge branch 'simple-performance-test' into more-google-tiles-tests
csciguy8 Oct 16, 2023
018cd00
Update ion token
csciguy8 Oct 16, 2023
962d4b2
result -> pResult
kring Oct 16, 2023
e288dc3
Add warm cache tests
csciguy8 Oct 16, 2023
60e28d2
Add #define for test window size (remove duplication)
csciguy8 Oct 16, 2023
7b47a8b
Merge pull request #1249 from CesiumGS/more-google-tiles-tests
kring Oct 16, 2023
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .github/workflows/testWindows.yml
Original file line number Diff line number Diff line change
Expand Up @@ -64,7 +64,7 @@ jobs:
timeout-minutes: 60
run: |
cd "${{ inputs.unreal-binaries-path }}"
./UnrealEditor-Cmd.exe "$env:TESTS_PROJECT_ROOT/TestsProject.uproject" -execcmds="Automation RunTests Cesium.;quit" -nullrhi -unattended -nosplash -ReportExportPath="$env:TESTS_PROJECT_LOGS"
csciguy8 marked this conversation as resolved.
Show resolved Hide resolved
./UnrealEditor-Cmd.exe "$env:TESTS_PROJECT_ROOT/TestsProject.uproject" -execcmds="Automation RunTests Cesium.Unit.;quit" -nullrhi -unattended -nosplash -ReportExportPath="$env:TESTS_PROJECT_LOGS"
- name: Display tests log
if: always()
run: |
Expand Down
1 change: 1 addition & 0 deletions Documentation/developer-setup-windows.md
Original file line number Diff line number Diff line change
Expand Up @@ -74,6 +74,7 @@ To build a "Release" build of cesium-native,
- Add a new configuration by clicking the `+` and choose `x64-Release`.
- Select the new "x64-Release" from the Solution Configuration dropdown.
- Right-click on `CMakeLists.txt` again and choose "Install".
> In Visual Studio, this defaults to the "RelWithDebInfo" configuration type. You can change this at any time

## Visual Studio Code

Expand Down
29 changes: 29 additions & 0 deletions Documentation/performance-profiling-setup-test.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
# Set up a repeatable test

We need an area of code to execute repeatedly, with as many variables locked down as possible.


### Set up Unreal
1) Open Unreal Editor (UnrealEditor.exe)
2) Create a blank map (project doesn't matter. Choose an existing one or create a new one)
3) Go to Edit->Plugins
4) Search for "Functional Testing plugin". Check it to enable it
![smaller](https://github.com/CesiumGS/cesium-unreal/assets/130494071/5a3bc9de-cdaf-4d9d-842d-104719426663)
5) Save all
6) Set this map as the 'Editor Startup Map' so it loads when starting from Visual Studio
![smaller 2](https://github.com/CesiumGS/cesium-unreal/assets/130494071/8ba5c6c2-8c97-4048-afe2-db74770d85cc)


### Build Release Code

We need to make sure all our C++ code is building in release mode, preferably with debug symbols.

> This assumes that you have already built your code successfully and are familiar with the concepts from our [developer setup guide](https://github.com/CesiumGS/cesium-unreal/blob/ue5-main/Documentation/developer-setup-windows.md). Although you could profile a debug build, it is typically more useful to build in release, since this is how a game is usually packaged.

1) If building the cesium-native library, make sure you are using a release configuration derived from "RelWithDebInfo"
2) Open your Unreal project's Visual Studio solution (.sln). This example uses the solution generated from [cesium-unreal-samples](https://github.com/CesiumGS/cesium-unreal-samples)
3) Choose "Development Editor"

![smaller 3](https://github.com/CesiumGS/cesium-unreal/assets/130494071/0e70065f-c717-466b-a92b-cab1dcfdd29b)

4) From the menu, choose Build -> Build Solution
83 changes: 83 additions & 0 deletions Documentation/performance-profiling-with-cpu-usage.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,83 @@

This guide will help you find performance problems in your C++ code using the [CPU Usage tool](https://learn.microsoft.com/en-us/visualstudio/profiling/beginners-guide-to-performance-profiling?view=vs-2022) included in Visual Studio's Diagnostic tools window.

The CPU Usage tool is easy to set up with minimal impact on how your app is built or how it runs. If you use Visual Studio often, you may have this running already. This is a sampling-based profiler, with pros and cons detailed [here](https://learn.microsoft.com/en-us/visualstudio/profiling/understanding-performance-collection-methods-perf-profiler?view=vs-2022).

# Set up a repeatable test

In this example, we will use our Cesium performance tests. Follow the steps outlined [here](https://github.com/CesiumGS/cesium-unreal/blob/profiling-documentation/Documentation/performance-profiling-setup-test.md).

# Prepare for capture

### Visual Studio

1) Open your project's Visual Studio solution (.sln). This example uses the solution generated from [cesium-unreal-samples](https://github.com/CesiumGS/cesium-unreal-samples)
2) From the menu, choose Debug->Windows->Show Diagnostic Tools
3) Configure it. Uncheck 'Memory Usage'. Under Settings, Uncheck "Enable CPU Profiling", we'll turn this back on later.

<img width="484" alt="DiagSetup" src="https://github.com/CesiumGS/cesium-unreal/assets/130494071/798d794c-19f0-4f15-93e1-1815e3f1e75b">

4) Optionally, find two places in your code to set breakpoints. In our example, performance test start / end marks are perfect.

![Breakpoint Set small](https://github.com/CesiumGS/cesium-unreal/assets/130494071/5a793b9c-fd68-42ed-96ae-6ec884c38951)

>We could profile the entire debugging session if we needed to. But it's generally good practice to reduce your timing capture as much as possible. This can improve responsiveness when using resource intensive profiling tools, like memory tracking.

# Run the timing capture session

1) From Visual Studio, start your debugging session (Debug->Start Debugging, F5)
2) Find the performance tests in Unreal. Tools->Test Automation
![Automation Window small](https://github.com/CesiumGS/cesium-unreal/assets/130494071/d27e7d67-3658-4cb2-ab10-777498cba0da)

3) Check "LoadTestDenver"
4) Click "Start Tests"
5) Your first break point should hit in Visual Studio
6) Go back to the Diagnostic Tools window, click on "Record CPU Profile". It should turn red.

![image](https://github.com/CesiumGS/cesium-unreal/assets/130494071/ce0c7e86-c1ef-4a01-97fd-c97275b6f62b)

7) Continue the debugging session (Debug->Continue, F5)
8) Your second break point should hit
9) Go back to the Diagnostic Tools window, you should now see a report

# Interpret the report

![image](https://github.com/CesiumGS/cesium-unreal/assets/130494071/a9fb3e0b-86f5-4239-b4ab-c7f9b1dba4a5)

This can be a bit daunting at first, but most profiling tools have a similar workflow.

### Start at the timeline

![image](https://github.com/CesiumGS/cesium-unreal/assets/130494071/da733adc-6cae-4c89-8a6c-01a367667a0d)

Note the highlighted area to the right where the CPU usage spikes. This corresponds to the breakpoints that we set.

All data from the report will reflect this selection only.

### Trace calls with high usage

From the main window, click on "Open Details"
![image](https://github.com/CesiumGS/cesium-unreal/assets/130494071/f34b5ee3-15b7-485a-a90a-8f71310b1b44)

The CPU Usage window will appear. Set "Current View" to "Functions", then find the 'Self CPU' column and sort descending (down arrow).

![image](https://github.com/CesiumGS/cesium-unreal/assets/130494071/20836178-8337-4d53-be58-f388db905f9f)

This window now shows the functions that have the highest time spent within themselves only. Useful for finding individual functions that are called often or need to be optimized.

In this example, `stbir_resample_horizontal_downsample` is of particular interest because it's in the code base we built. Entries with `[External]` or originate from an unfamiliar module are generally ignored, although it is useful to know we are calling into them.

Right click on the `stbir_resample_horizontal_downsample` row, select "View in Call Tree".

![image](https://github.com/CesiumGS/cesium-unreal/assets/130494071/e5a88c06-5b76-4a07-83be-db5147a961b2)

# Draw conclusions

The window above is starting to show some actionable information:
- 20% of the sampled CPU time was spent in `CesiumTextureUtility::loadTextureAnyThreadPart`. Basically, we're loading textures
- There are a variety of `stbir_XXX functions` that are taking the bulk of the time, and might be candidates for optimization
- The highest cost single function is `stbir_resample_horizontal_downsample`, but keep in mind the limits of a sampling profiler. We don't know how many times it was called, just that it was being executed ~6% of the time.

Are these functions worth investigating and potentially optimizing? Maybe. Again, know this is a sampling profiler. Even if you optimize the highest cost function to 0.001%, you are only improving CPU efficiency.

If your goal is to reach absolute numbers, like specific loading times, or frames per second, you may need another type of profiling tool.
108 changes: 108 additions & 0 deletions Documentation/performance-profiling-with-unreal-insights.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,108 @@
This guide will help you find performance problems in your C++ code using [Unreal Insights](https://docs.unrealengine.com/5.0/en-US/unreal-insights-in-unreal-engine/), included with Unreal Engine.

Unreal Insights can display the scope of timing events as well as activity across threads. There is minimal impact to app execution, and you can set up your own custom events. It provides more functionality than an exclusive [CPU sampling-based profiler](https://learn.microsoft.com/en-us/visualstudio/profiling/understanding-performance-collection-methods-perf-profiler?view=vs-2022), although both tools can complement each other.

# Set up a repeatable test

In this example, we will use our Cesium performance tests. Follow the steps outlined [here](https://github.com/CesiumGS/cesium-unreal/blob/profiling-documentation/Documentation/performance-profiling-setup-test.md).

# Prepare for capture

### Unreal Editor
1) In Visual Studio, click Debug -> Start Debugging (F5)
2) In Unreal, click Tools->Test Automation
3) Check the Cesium.Performance.LoadTestDenver row (don't start the test yet)
![Automation Window small](https://github.com/CesiumGS/cesium-unreal/assets/130494071/d27e7d67-3658-4cb2-ab10-777498cba0da)
4) Click Tools->Run Unreal Insights
5) In Unreal Insights, click on the "Connection" tab (don't connect yet)

> You can also find UnrealInsights.exe in UE_5.X\Engine\Binaries\Win64

![image](https://github.com/CesiumGS/cesium-unreal/assets/130494071/eadd4013-ca10-4b61-bb7d-0ab233440a39)

# Run the timing capture session
1) In Unreal Insights, click "Connect"
2) In Unreal Editor, click "Start Tests" (you should already have the Test Automation window open)
3) When the test ends, close Unreal Editor. We don't need it anymore.
4) In Unreal Insights, click the Trace Store tab, notice the trace that was just created
5) Click on it, then click on the 'Open Trace' button

> On the right side, there's a "Explore Trace Store Directory" button. You can click on this to delete or organize your traces

![image](https://github.com/CesiumGS/cesium-unreal/assets/130494071/f1e34fbc-35cd-4bc3-b935-5e322f5d9ba6)

# Interpret the report

![image](https://github.com/CesiumGS/cesium-unreal/assets/130494071/9cab7cf1-ab6d-4b58-a362-fc21ccff0334)

By default, the Timings Insights Tab is shown. More detail can be found [here](https://docs.unrealengine.com/5.0/en-US/timing-insights-in-unreal-engine-5/).

For this session, there are several sections of interest for us:
- The Frames panel (top, a timeline view)
- The Timings panel (middle, mostly empty because nothing is selected)
- The Log Panel (bottom)
- The Timers tab (right)

### Isolate your area of interest

1) In the Log Panel, search for "mark". This will show the logging of our timing marks for our test. Select the start mark, then hold shift and down arrow to select the end mark too
![image](https://github.com/CesiumGS/cesium-unreal/assets/130494071/7cccc075-edf7-4b8e-b704-b9efc9de1a3c)

2) Notice that the Timings panel is now displaying timing data, with a specific time region highlighted
3) In the Timings panel, select View Mode -> Compact Mode to see more of a bird's eye view
4) Select All Tracks and uncheck the following threads that don't have much activity for our test: ```RenderThread 3-7, BackgroundThreadPool #1, ForegroundWorker #0-#1, DDC IO ThreadPool #0-#2, Reserve Worker #0-#13, AudioMixerXXX```
5) Use the mouse wheel to zoom in to selected region. Right click and drag to pan left and right.

The view should be a lot cleaner now
![image](https://github.com/CesiumGS/cesium-unreal/assets/130494071/aca0680e-3dc3-4d23-9838-8f598f384089)


### Examine high traffic timers

Let's look at the Timers tab.

![image](https://github.com/CesiumGS/cesium-unreal/assets/130494071/823fc4d4-25d3-40dc-9b41-1cffee560454)

Every row is a timing event. Some events come from the engine, some are custom timers in the Cesium for Unreal plugin code. You'll notice that Incl is sorting descending, showing the events with the highest inclusive time.

> You may feel the need to jump right in to `Cesium::CreateRHITexture2D`. It seems to have one of the highest exclusive times (Excl) of any of the events in the list, 1 second. After all, our selection is only 1.2 seconds long, so this must be the performance bottleneck right? Hold on. The total sampled time at the top (CPU) is 19.8s, indicating the times are the total sampled times across threads, not absolute session duration.

Given that the sampled time of the highest cost calls are actually somewhat small compared to the total sampled CPU time, our bottleneck is most likely outside of our timed events.

This brings us to...


### Examine low use areas

1) Go back to the Timings panel.
2) In All Tracks, check Game Frames
2) Turn off compact mode by unchecking "View Mode->Compact Mode".
3) In View Mode, set "Depth Limit" to "4 lanes"
4) Zoom and pan to an area of the selection where the background workers haven't started loading yet

![image](https://github.com/CesiumGS/cesium-unreal/assets/130494071/caa47e66-b088-46d8-9aa0-1916a65777de)

The selected area is the first phase of the loading test. This is a region between when the start mark was logged until when background workers start loading models.

It lasts about 8 game frames, or 388 ms, and does not seem to be making use of background threads at all. Could be something to investigate.


### Examine fragmented use areas

1) Find the Timings panel
2) In View Mode, set "Depth Limit" to "Unlimited"
3) Zoom a bit into an area where our background workers are very busy

![image](https://github.com/CesiumGS/cesium-unreal/assets/130494071/29d7c3a2-3710-4a2b-a4f1-09050bdb9287)

This selected area is zoomed in enough to see that the background workers are all calling the same functions. They finish their work, then wait for more work to be available. Some of this work seems to take longer than others, especially at the beginning.

Note the gaps between the work. In general, there seems to be more inactivity than activity during this timeframe. Ideally, we would like to see all work squished together, with no waits in between. Improvements like this should bring the total execution duration lower. In this case, total load time.

# Draw conclusions

We've identified some actionable information so far, even if it only leads to investigation:
* There is a 388 ms low use area at the beginning of the test (~30%). What is happening here? Can it be faster?
* During high use areas, background threads seems to inactive more than they are active. Why? Can this be optimized so they are always active?

It's very common for profiling to be an iterative process. The result of a profiling session could easily be just adding more event timers, or digging deeper into how something works. Before we can expect that code change that results in a heroic 10x speedup, we need to be able to see clearly what is going on.
1 change: 1 addition & 0 deletions Source/CesiumEditor/Private/CesiumEditorSubLevelMutex.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@
#include "CesiumSubLevelComponent.h"
#include "CesiumSubLevelSwitcherComponent.h"
#include "Components/ActorComponent.h"
#include "Engine/World.h"
csciguy8 marked this conversation as resolved.
Show resolved Hide resolved
#include "LevelInstance/LevelInstanceActor.h"

CesiumEditorSubLevelMutex::CesiumEditorSubLevelMutex() {
Expand Down
65 changes: 43 additions & 22 deletions Source/CesiumRuntime/Private/Cesium3DTileset.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -881,27 +881,41 @@ class UnrealResourcePreparer
void ACesium3DTileset::UpdateLoadStatus() {
TRACE_CPUPROFILER_EVENT_SCOPE(Cesium::UpdateLoadStatus)

this->LoadProgress = this->_pTileset->computeLoadProgress();
float nativeLoadProgress = this->_pTileset->computeLoadProgress();

if (this->LoadProgress < 100 ||
this->_lastTilesWaitingForOcclusionResults > 0) {
this->_activeLoading = true;
} else if (this->_activeLoading && this->LoadProgress == 100) {
// If native tileset still loading, just copy its progress
if (nativeLoadProgress < 100) {
this->LoadProgress = nativeLoadProgress;
return;
}

// Native tileset is 100% loaded, but there might be a few frames where
// nothing needs to be loaded as we are waiting for occlusion results to come
// back, which means we are not done with loading all the tiles in the tileset
// yet. Interpret this as 99% (almost) done
if (this->_lastTilesWaitingForOcclusionResults > 0) {
this->LoadProgress = 99;
return;
}

// There might be a few frames where nothing needs to be loaded as we
// are waiting for occlusion results to come back, which means we are not
// done with loading all the tiles in the tileset yet.
if (this->_lastTilesWaitingForOcclusionResults == 0) {
TRACE_CPUPROFILER_EVENT_SCOPE(Cesium::BroadcastOnTilesetLoaded)
// If we have tiles to hide next frame, we haven't completely finished loading
// yet. We need to tick once more. We're really close to done.
if (!this->_tilesToHideNextFrame.empty()) {
this->LoadProgress = glm::min(this->LoadProgress, 99.9999f);
return;
}

// Tileset just finished loading, we broadcast the update
UE_LOG(LogCesium, Verbose, TEXT("Broadcasting OnTileLoaded"));
OnTilesetLoaded.Broadcast();
// We can now report 100 percent loaded
float lastLoadProgress = this->LoadProgress;
this->LoadProgress = 100;

// Tileset remains 100% loaded if we don't have to reload it
// so we don't want to keep on sending finished loading updates
this->_activeLoading = false;
}
// Only broadcast the update when we first hit 100%, not everytime
if (lastLoadProgress != LoadProgress) {
TRACE_CPUPROFILER_EVENT_SCOPE(Cesium::BroadcastOnTilesetLoaded)

// Tileset just finished loading, we broadcast the update
UE_LOG(LogCesium, Verbose, TEXT("Broadcasting OnTileLoaded"));
OnTilesetLoaded.Broadcast();
}
}

Expand Down Expand Up @@ -1013,6 +1027,8 @@ void ACesium3DTileset::LoadTileset() {

this->_startTime = std::chrono::high_resolution_clock::now();

this->LoadProgress = 0;

Cesium3DTilesSelection::TilesetOptions options;

options.enableOcclusionCulling =
Expand Down Expand Up @@ -2028,12 +2044,15 @@ void ACesium3DTileset::Tick(float DeltaTime) {
CreateViewStateFromViewParameters(camera, unrealWorldToCesiumTileset));
}

const Cesium3DTilesSelection::ViewUpdateResult& result =
this->_captureMovieMode
? this->_pTileset->updateViewOffline(frustums)
: this->_pTileset->updateView(frustums, DeltaTime);
Cesium3DTilesSelection::ViewUpdateResult result;
if (this->_captureMovieMode) {
TRACE_CPUPROFILER_EVENT_SCOPE(Cesium::updateViewOffline)
result = this->_pTileset->updateViewOffline(frustums);
csciguy8 marked this conversation as resolved.
Show resolved Hide resolved
} else {
TRACE_CPUPROFILER_EVENT_SCOPE(Cesium::updateView)
result = this->_pTileset->updateView(frustums, DeltaTime);
}
updateLastViewUpdateResultState(result);
this->UpdateLoadStatus();

removeCollisionForTiles(result.tilesFadingOut);

Expand Down Expand Up @@ -2066,6 +2085,8 @@ void ACesium3DTileset::Tick(float DeltaTime) {
updateTileFade(pTile, false);
}
}

this->UpdateLoadStatus();
}

void ACesium3DTileset::EndPlay(const EEndPlayReason::Type EndPlayReason) {
Expand Down
Loading