Broadcom Videocore IV Performance Recommendations

Yours3lf edited this page Jun 18, 2020 · 5 revisions

Profiling using hardware counters

Profiling can be done using standard Vulkan performance queries. See the query.cpp example.

Coordinate shaders

The Broadcom Videocore IV GPU has a hidden shader stage called the Coordinate Shader stage. This stage computes only the final vertex positions, so the GPU does not process the remaining vertex attributes for vertices that would be culled or clipped anyway. It is therefore advisable to supply vertex positions in their own buffer so that the Coordinate Shader stage can achieve high cache efficiency. The remaining vertex attributes should be interleaved in a second buffer.
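As a rough illustration of that layout, the sketch below splits an interleaved vertex array into a position-only stream and an interleaved attribute stream on the CPU. The `Vertex` layout (position, normal, UV) and every name in it are hypothetical; a real mesh format will differ, and the two resulting arrays would still need to be uploaded and bound as two separate vertex buffers.

```cpp
#include <cassert>
#include <vector>

// Hypothetical interleaved vertex, as an application might load it from disk.
struct Vertex {
    float position[3];
    float normal[3];
    float uv[2];
};

// Position-only element: the only data the Coordinate Shader stage needs.
struct PositionStream {
    float position[3];
};

// Remaining attributes, kept interleaved for the full vertex shader.
struct AttributeStream {
    float normal[3];
    float uv[2];
};

// Positions should be tightly packed so position fetches touch no other data.
static_assert(sizeof(PositionStream) == 3 * sizeof(float),
              "position stream is expected to be tightly packed");

struct SplitMesh {
    std::vector<PositionStream> positions;
    std::vector<AttributeStream> attributes;
};

// Split one interleaved array into two streams that share the same indices.
SplitMesh splitVertexStreams(const std::vector<Vertex>& vertices) {
    SplitMesh out;
    out.positions.reserve(vertices.size());
    out.attributes.reserve(vertices.size());
    for (const Vertex& v : vertices) {
        PositionStream p{{v.position[0], v.position[1], v.position[2]}};
        AttributeStream a{{v.normal[0], v.normal[1], v.normal[2]},
                          {v.uv[0], v.uv[1]}};
        out.positions.push_back(p);
        out.attributes.push_back(a);
    }
    // Both streams must stay index-aligned: element i describes the same vertex.
    assert(out.positions.size() == out.attributes.size());
    return out;
}
```

Because element `i` of both streams describes the same vertex, a single index buffer can address both.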

Index buffers

Indexing can be used to make sure vertices are not processed redundantly. An index buffer optimizer library such as meshoptimizer should be used to make sure index buffers achieve maximum cache efficiency. See https://github.com/zeux/meshoptimizer
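To show what indexing buys, here is a minimal sketch that collapses a non-indexed triangle list into unique vertices plus an index buffer. It is not meshoptimizer and does no cache-order optimization; `Position`, `IndexedMesh` and `buildIndexBuffer` are illustrative names, and vertices are treated as identical only when their coordinates compare exactly equal.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

struct Position {
    float x, y, z;
};

struct IndexedMesh {
    std::vector<Position> vertices;      // each distinct vertex stored once
    std::vector<std::uint32_t> indices;  // three indices per triangle
};

// Collapse a non-indexed triangle list so shared vertices are stored once.
IndexedMesh buildIndexBuffer(const std::vector<Position>& triangleList) {
    assert(triangleList.size() % 3 == 0);
    IndexedMesh mesh;
    std::map<std::tuple<float, float, float>, std::uint32_t> seen;
    for (const Position& p : triangleList) {
        const auto key = std::make_tuple(p.x, p.y, p.z);
        auto it = seen.find(key);
        if (it == seen.end()) {
            // First time this vertex appears: append it and remember its index.
            const auto index = static_cast<std::uint32_t>(mesh.vertices.size());
            it = seen.emplace(key, index).first;
            mesh.vertices.push_back(p);
        }
        mesh.indices.push_back(it->second);
    }
    return mesh;
}
```

For a quad made of two triangles, six input vertices become four unique vertices and six indices. The resulting index buffer is the kind of input an optimizer such as meshoptimizer would then reorder for cache efficiency.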

Vertex buffers

Choosing lower-precision vertex attributes (8-bit, 16-bit) can save significant bandwidth, so choose a precision that suits your meshes. Triangles that cover very few pixels (think fewer than 32) are rasterized very inefficiently, so make sure your triangles cover a large enough screen area.
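One possible way to shrink attributes is to quantize floats to normalized integers on the CPU, as sketched below. The helper names are hypothetical. The data would typically be consumed through a normalized vertex format such as `VK_FORMAT_R16G16B16A16_SNORM` or `VK_FORMAT_R8G8B8A8_UNORM`; whether a given format is accepted for vertex input on this driver is an assumption that should be checked.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Map a float in [-1, 1] to a 16-bit signed normalized integer
// (useful for normals or positions scaled into a unit box).
inline std::int16_t quantizeSnorm16(float v) {
    assert(!std::isnan(v));
    const float clamped = std::fmax(-1.0f, std::fmin(1.0f, v));
    return static_cast<std::int16_t>(std::lround(clamped * 32767.0f));
}

// Inverse mapping, matching how normalized formats are usually decoded.
inline float dequantizeSnorm16(std::int16_t q) {
    return std::fmax(static_cast<float>(q) / 32767.0f, -1.0f);
}

// Map a float in [0, 1] to an 8-bit unsigned normalized integer
// (useful for vertex colors).
inline std::uint8_t quantizeUnorm8(float v) {
    assert(!std::isnan(v));
    const float clamped = std::fmax(0.0f, std::fmin(1.0f, v));
    return static_cast<std::uint8_t>(std::lround(clamped * 255.0f));
}
```

A 16-bit normal takes half the space of a 32-bit float one, at the cost of a maximum round-trip error of roughly 1/65534 per component.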

Tile based architecture

The Broadcom Videocore IV GPU is a tile-based GPU, but not a deferred one. It is therefore important to sort your geometry front-to-back to avoid unnecessary overdraw.
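A minimal sketch of such a sort is shown below. It assumes opaque geometry, depth testing enabled so that fragments behind already drawn geometry can be rejected, and one precomputed view-space depth per draw. `DrawItem` and its fields are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical per-draw record; a real renderer would carry more state.
struct DrawItem {
    int meshId;       // illustrative handle to the mesh being drawn
    float viewDepth;  // distance from the camera along the view direction
};

// Order opaque draws nearest-first so closer geometry fills the depth buffer
// before farther geometry is shaded.
void sortFrontToBack(std::vector<DrawItem>& opaqueDraws) {
    const auto nearerFirst = [](const DrawItem& a, const DrawItem& b) {
        return a.viewDepth < b.viewDepth;
    };
    // stable_sort keeps the submission order of draws at equal depth.
    std::stable_sort(opaqueDraws.begin(), opaqueDraws.end(), nearerFirst);
    assert(std::is_sorted(opaqueDraws.begin(), opaqueDraws.end(), nearerFirst));
}
```

Sorting per draw is only an approximation for large or intersecting meshes, but it is cheap and usually removes most of the overdraw.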

ALU architecture

The Broadcom Videocore IV GPU has a dual-issue scalar FP32 ALU. This means that it can execute up to two instructions per cycle using its ADD and MUL ALUs. To maximize utilization it's important to fully saturate both ADD and MUL pipelines.

Resolution

The Broadcom Videocore IV GPU is poorly suited to rendering at 1080p, so it is advisable to render at 720p to avoid overwhelming the GPU with fragment work. This leads to a more balanced vertex/fragment workload and also a more balanced CPU/GPU workload.
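The size of the saving follows directly from the pixel counts. The short sketch below assumes the usual 1920x1080 and 1280x720 frame sizes and a roughly constant cost per fragment.

```cpp
#include <cstdint>

// Number of pixels (and, without overdraw, fragments) in one frame.
constexpr std::uint64_t pixelCount(std::uint32_t width, std::uint32_t height) {
    return static_cast<std::uint64_t>(width) * height;
}

constexpr std::uint64_t kPixels1080p = pixelCount(1920, 1080);  // 2,073,600
constexpr std::uint64_t kPixels720p = pixelCount(1280, 720);    //   921,600

// 1080p has exactly 9/4 = 2.25 times as many pixels as 720p.
static_assert(kPixels1080p * 4 == kPixels720p * 9,
              "1080p should be 2.25x the pixel count of 720p");
```

Rendering at 720p therefore cuts per-frame fragment work to about 44% of the 1080p figure, before any overdraw is counted.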

Clears

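Use render pass load/store operations to clear your textures, for example a load operation of VK_ATTACHMENT_LOAD_OP_CLEAR on the attachment. Any other method will likely result in a full-screen quad being drawn to clear part or all of a texture.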
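In Vulkan terms this is configured on the attachment description of the render pass. The fragment below is a sketch that assumes a single-sampled color attachment and the Vulkan headers being available; the function name and the chosen layouts are illustrative and need adapting to the actual render pass.

```cpp
#include <vulkan/vulkan.h>

// Describe a color attachment that is cleared when the render pass begins
// and written back when it ends. The function name is illustrative.
VkAttachmentDescription makeClearedColorAttachment(VkFormat format) {
    VkAttachmentDescription attachment{};
    attachment.format = format;
    attachment.samples = VK_SAMPLE_COUNT_1_BIT;
    attachment.loadOp = VK_ATTACHMENT_LOAD_OP_CLEAR;    // clear via load op
    attachment.storeOp = VK_ATTACHMENT_STORE_OP_STORE;  // keep the result
    attachment.stencilLoadOp = VK_ATTACHMENT_LOAD_OP_DONT_CARE;
    attachment.stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE;
    attachment.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;  // old contents unused
    attachment.finalLayout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
    return attachment;
}
```

The clear color itself is supplied through `pClearValues` in `VkRenderPassBeginInfo` when the render pass is begun.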
