[3.x] Implement adaptive multibuffer 2D rendering option #65385

kisg · 2022-09-05T22:44:35Z

The current 2D rendering algorithm assumes that the OpenGL driver
supports buffer orphaning. However, this is not always the case, e.g.
the Oculus Quest 2 does not.

This change adds an option to switch to adaptive multibuffer rendering
where a separate buffer is used for each rendering batch to avoid
implicit synchronization. The number of buffers is set adaptively based
on the number of batches required for rendering the previous frame.

Companion proposal: godotengine/godot-proposals#5348

lawnjelly · 2022-09-06T06:22:34Z

drivers/gles2/rasterizer_canvas_gles2.cpp

 		// pre fill index buffer, the indices never need to change so can be static
-		glGenBuffers(1, &bdata.gl_index_buffer);
-		glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, bdata.gl_index_buffer);
+		glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, index_buffer);


There's no need to do this when using multiple buffers I think. The index buffer can be reused, only the vertex buffer needs multiple versions.

lawnjelly · 2022-09-06T06:26:15Z

drivers/gles2/rasterizer_gles2.cpp

 	}
+
+	bool multiple_buffer_batching = GLOBAL_GET("rendering/batching/options/multiple_buffer_batching");
+	if (multiple_buffer_batching) {


GLOBAL_GET should not be called every frame; you can load the value once in the constructor / initialize and keep it in a member variable.
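As a rough illustration of that suggestion, the sketch below reads the option once at initialization and only consults a cached member per frame. `FakeProjectSettings` and `RasterizerSketch` are hypothetical stand-ins written for this example (they are not engine classes); only the setting path comes from the diff above.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the engine's project settings. It counts lookups
// so the cost of querying every frame would be visible.
struct FakeProjectSettings {
	std::map<std::string, bool> values;
	int lookups = 0;

	bool get(const std::string &p_name) {
		++lookups;
		std::map<std::string, bool>::const_iterator it = values.find(p_name);
		return it != values.end() && it->second;
	}
};

// Hypothetical rasterizer: the option is read once and cached in a member.
class RasterizerSketch {
	bool initialized = false;
	bool multiple_buffer_batching = false;
	int multibuffer_frames = 0;

public:
	void initialize(FakeProjectSettings &p_settings) {
		multiple_buffer_batching = p_settings.get("rendering/batching/options/multiple_buffer_batching");
		initialized = true;
	}

	void end_frame() {
		assert(initialized); // The cached value is only valid after initialize().
		if (multiple_buffer_batching) {
			// Per-frame multibuffer work would go here; no settings lookup needed.
			++multibuffer_frames;
		}
	}

	int get_multibuffer_frames() const { return multibuffer_frames; }
};
```

With this shape, running any number of frames performs a single settings lookup, at the cost of not picking up changes to the option at runtime.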

lawnjelly · 2022-09-06T06:31:19Z

drivers/gles3/rasterizer_canvas_gles3.h

+				hysteresis = (float)new_size;
+			}
+			resize(new_size);
+		}


Why is there hysteresis in GLES3 but not in GLES2?

kisg · 2022-09-06

We will add hysteresis for GLES2 as well.

lawnjelly · 2022-09-06T06:50:31Z

Overall imo it's not bad at all in terms of implementation, but it would be good to test it with some input data that won't batch.

If you feed it a pathological situation with e.g. lights / custom shaders / text it could potentially create thousands of buffers, and go horribly wrong. This would be well worth trying out, and it may be worth having an upper limit on the number of buffers that can be created, just in case. To be fair, afaik orphaning also effectively creates thousands of buffers in play, but it may not be the same implementation-wise in the driver etc.
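A minimal sketch of what such a cap could look like, combined with the hysteresis discussed above. This is not the PR's code: the class name, the 10% decay factor and the idea of growing immediately but shrinking gradually are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Sketch of an adaptive buffer count with an upper bound. The names, the
// decay factor and the cap are illustrative assumptions, not the PR's values.
class AdaptiveBufferCount {
	size_t count = 1;
	size_t max_count;
	float hysteresis = 1.0f; // Smoothed demand; decays slowly so the pool does not thrash.

public:
	explicit AdaptiveBufferCount(size_t p_max_count) :
			max_count(p_max_count) {
		assert(p_max_count >= 1);
	}

	// Call once per frame with the number of batches the previous frame needed.
	size_t update(size_t p_batches_last_frame) {
		// Clamp the demand to [1, max_count] so unbatchable input cannot
		// create thousands of buffers.
		size_t wanted = std::max<size_t>(p_batches_last_frame, 1);
		wanted = std::min(wanted, max_count);

		if (wanted >= count) {
			// Grow immediately so the next frame is not short of buffers.
			count = wanted;
			hysteresis = (float)wanted;
		} else {
			// Shrink gradually: move the smoothed value 10% toward the demand.
			hysteresis += ((float)wanted - hysteresis) * 0.1f;
			count = std::max(wanted, (size_t)(hysteresis + 0.5f));
		}
		return count;
	}

	size_t get() const { return count; }
};
```

A frame that needs more batches than the cap would then have to reuse buffers (and accept the implicit synchronization) rather than allocate without bound.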

That said once tidied up it could potentially be used as a stepping stone to a two-pass implementation as I described in the proposal. 🙂

If we can get the other version working, it would likely effectively replace this (as the fewer buffer uploads, the better), and we could probably remove this version. That said, I wouldn't necessarily want to hold this PR up, because it's difficult to give timescales on the 2 pass approach as I have a lot of other work on currently.

Often the official policy is not to merge stuff if it is likely to be replaced, but personally I don't have a problem with potential stop gaps, especially if they are easy to untangle as this is. This is all imo of course, I'm sure @akien-mga and @clayjohn will have opinions on whether this is a good idea. 😁

kisg · 2022-09-06T10:30:58Z

@lawnjelly Thank you very much for your review and your proposal in the proposal :) . We will definitely add an upper bound to the number of buffers to this PR and will try to collect some hard data with different test cases. Is there a test suite / test projects that you usually use for testing 2D drawing performance?

clayjohn · 2022-09-07T05:54:21Z

In 4.0 I have tried two different approaches (mind you, we do things a little differently in 4.0 and barely use any vertex buffers; most data is passed in packed UBOs). The first approach was to give each batch its own UBO from an ever-increasing pool of UBOs. I used a fence to check whether a UBO was still in use. If I came back around to the beginning of my circular buffer of UBOs and the last UBO was still in use, I would allocate a new one and insert it. In the end I never had more than a few hundred UBOs at once. This approach worked really well on newer hardware, but fell short on older hardware and mobile.
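The fence-guarded circular pool described above can be sketched without any GL calls as follows. This is only a CPU-side model: `in_flight` stands in for an unsignaled GPU fence (in real GLES3 code that would be a sync object created with `glFenceSync` and polled with `glClientWaitSync`), and all names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One pooled buffer. `in_flight` models a fence that has not signaled yet.
struct PooledBuffer {
	int id;
	bool in_flight;
};

class BufferRing {
	std::vector<PooledBuffer> buffers;
	size_t cursor = 0;
	int next_id = 0;

public:
	explicit BufferRing(size_t p_initial) {
		assert(p_initial >= 1);
		for (size_t i = 0; i < p_initial; ++i) {
			buffers.push_back({ next_id++, false });
		}
	}

	// Returns the id of a buffer that is safe to write to. If the buffer at
	// the cursor is still in use by the GPU, a new one is inserted there
	// instead of waiting on it.
	int acquire() {
		if (buffers[cursor].in_flight) {
			buffers.insert(buffers.begin() + cursor, PooledBuffer{ next_id++, false });
		}
		PooledBuffer &buffer = buffers[cursor];
		buffer.in_flight = true; // Pending until the GPU is done with it.
		cursor = (cursor + 1) % buffers.size();
		return buffer.id;
	}

	// Models all outstanding fences signaling (e.g. the GPU caught up).
	void signal_all() {
		for (size_t i = 0; i < buffers.size(); ++i) {
			buffers[i].in_flight = false;
		}
	}

	size_t size() const { return buffers.size(); }
};
```

The pool only grows when the CPU laps the GPU, which matches the observation that the number of buffers stays bounded in practice.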

The second approach, which I have tested but still haven't merged, is to allocate a few giant UBOs (at least one per frame), then run through draw commands until I have enough data to fill the UBO (or I run out of commands), recording each batch's start and end positions within the UBO. Then I fill the UBO and render the batches. If I filled that UBO, I move to another large UBO rather than orphaning and starting again. This is essentially what @lawnjelly describes as the "two-pass" approach. I found it was about 2-3 times as fast as the first approach on lower-end hardware and just as fast on newer hardware.
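The planning pass of that two-pass idea could be sketched like this. It only models the bookkeeping: each draw command is reduced to an element count, and the function records which large buffer and which range each batch lands in, so that every buffer can later be filled with one upload. The struct and function names are assumptions for illustration, not code from 4.0 or from this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Where one batch lives: which large buffer, and the range inside it.
struct BatchRange {
	size_t buffer_index;
	size_t start;
	size_t count;
};

// Pass 1: walk the draw commands (reduced to element counts here) and pack
// them into fixed-capacity buffers, moving to the next buffer when one fills.
std::vector<BatchRange> plan_batches(const std::vector<size_t> &p_sizes, size_t p_capacity) {
	std::vector<BatchRange> ranges;
	size_t buffer_index = 0;
	size_t used = 0;
	for (size_t i = 0; i < p_sizes.size(); ++i) {
		size_t size = p_sizes[i];
		assert(size <= p_capacity); // A single batch must fit in one buffer.
		if (used + size > p_capacity) {
			// Current buffer is full: continue in another large buffer
			// rather than orphaning and starting again.
			++buffer_index;
			used = 0;
		}
		ranges.push_back({ buffer_index, used, size });
		used += size;
	}
	return ranges;
}

// Pass 2 would upload each buffer once and then draw every batch using its
// recorded range; here we only report how many uploads that takes.
size_t count_uploads(const std::vector<BatchRange> &p_ranges) {
	return p_ranges.empty() ? 0 : p_ranges.back().buffer_index + 1;
}
```

For example, four batches of 40, 40, 40 and 10 elements with a capacity of 100 need two uploads instead of four.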

For the 3.x branch, I think the gains will be similar despite the fact that we would be allocating a large VBO instead of a UBO.

kisg requested a review from a team as a code owner September 5, 2022 22:44

kisg mentioned this pull request Sep 5, 2022

Optimization of the 2D batching renderer for OpenGL implementations without orphaning godotengine/godot-proposals#5348

Open

Calinou added enhancement topic:rendering topic:2d performance labels Sep 5, 2022

Calinou added this to the 3.6 milestone Sep 5, 2022

lawnjelly reviewed Sep 6, 2022

lawnjelly modified the milestones: 3.6, 3.7 Sep 11, 2024

5 participants