Merge lp://qastaging/~smspillaz/compiz/compiz.static_vertex_buffers into lp://qastaging/compiz/0.9.9

Proposed by Sam Spilsbury
Status: Work in progress
Proposed branch: lp://qastaging/~smspillaz/compiz/compiz.static_vertex_buffers
Merge into: lp://qastaging/compiz/0.9.9
Diff against target: 1098 lines (+765/-23) (has conflicts)
13 files modified
plugins/decor/src/decor.cpp (+8/-1)
plugins/decor/src/decor.h (+2/-0)
plugins/opengl/CMakeLists.txt (+13/-0)
plugins/opengl/include/opengl/opengl.h (+30/-0)
plugins/opengl/include/opengl/vertexbufferstack.h (+69/-0)
plugins/opengl/src/paint.cpp (+47/-13)
plugins/opengl/src/privates.h (+20/-2)
plugins/opengl/src/vertexbufferstack/CMakeLists.txt (+25/-0)
plugins/opengl/src/vertexbufferstack/include/vertexbufferstack-internal.h (+67/-0)
plugins/opengl/src/vertexbufferstack/src/vertex-buffer-stack.cpp (+145/-0)
plugins/opengl/src/vertexbufferstack/tests/CMakeLists.txt (+24/-0)
plugins/opengl/src/vertexbufferstack/tests/test-opengl-vertex-buffer-stack.cpp (+256/-0)
plugins/opengl/src/window.cpp (+59/-7)
Text conflict in plugins/opengl/CMakeLists.txt
To merge this branch: bzr merge lp://qastaging/~smspillaz/compiz/compiz.static_vertex_buffers
Reviewer Review Type Date Requested Status
Daniel van Vugt Needs Resubmitting
jenkins continuous-integration Pending
Review via email: mp+126879@code.qastaging.launchpad.net

This proposal supersedes a proposal from 2012-09-09.

Description of the change

Avoid doing so many vertex buffer object binds. This is more of an RFC than an actual merge proposal as I don't see it going in this cycle.

At the moment, all of the vertex buffer objects for windows are marked GL_STATIC_DRAW. However according to http://developer.apple.com/library/ios/#documentation/3DDrawing/Conceptual/OpenGLES_ProgrammingGuide/TechniquesforWorkingwithVertexData/TechniquesforWorkingwithVertexData.html , this is a bad idea because doing glBindBuffer and glBufferData on those buffers can cause GPU to copy vertex data from system memory to the GPU on for that vertex buffer every time the command queue is processed. If that's an expensive operation, then it could result in a fairly large slowdown on some platforms. Because the content of our vertex buffers can change a fair bit, I've changed the default usage hint to GL_DYNAMIC_DRAW, which appears to mean "dma data into the gpu, cache it where necessary"

In compiz, we do region updates by clipping vertex data to a clip region (and, eg, generating new vertex and tex-coord data where it makes sense to do so) and then paint only the primitives specified by those vertices. This is easier on GPUs where fill rate is a concern. However, it means that we can't have static data in our vertex buffer objects, and they effectively need to be re-uploaded every time they change.

There were two factors causing them to be changed all the time, which meant that they had to be re-uploaded all the time
 1) Plugins "share" the vertex buffer object with opengl, and clear the vertex data to add their own. This means that both sets have to be re-uploaded, and the opengl one has to be re-populated every time a plugin uses it "temporarily"
 2) A typical usage trend appears to be just calling ::begin (), adding vertex data and then calling ::end (), even if the vertex data is unchanged.

As for 1) This would best be served by breaking the API and making a GLVertexBuffer * a mandatory part of glAddGeometry and glDrawTexture, so that plugins can provide their own vertex buffers without clobbering the opengl one. Since we can't do that, I've added an API to manipulate which GLVertexBuffer the opengl plugin is using internally. This code is a separate module and covered by unit tests

As for 2) I've added a simple check to determine if we need to re-add vertex data at all, and if we don't, not to call glAddGeometry again, but to re-use the vertex buffer (where it makes sense to do). This code is untested, as it involves setting and checking flags and manipulating lots of internal state. Its also not the preferred solution either as per my comments above, which is why I haven't designed the code to be testable around this case.

The result is that for playing a simple video, in the old usecase, we called:
  * glAddGeometry 497 times
  * GLVertexBuffer::end () (eg, vertex data upload if there is vertex data, which there will be for every frame) 745 times

And for the new code:
 * glAddGeometry 13 times
 * GLVertexBuffer::end () 262 times

I'm not sure what kind of impact this has on framerates. In both cases I maxed out at 60fps.

Where to go with this?
 * The next place that is calling GLVertexBuffer::end () a lot is PrivateGLScreen::paintOutputs when drawing the fbo to the screen. This can be resolved by actually treating that like a static buffer as well (it really should be), or moving to glBlitFramebuffer (preferred
 * We need to make the decor plugin implement the same logic in 2)
 * Make the other plugins provide their own vertex buffers where they need to "attach" textures to windows rather than piggybacking the one internal to opengl.

tl;dr; This might make compiz faster at watching youtube videos. Need feedback as to whether to continue in this direction.

To post a comment you must log in.
Revision history for this message
jenkins (martin-mrazik+qa) wrote : Posted in a previous version of this proposal
review: Approve (continuous-integration)
Revision history for this message
Daniel van Vugt (vanvugt) wrote : Posted in a previous version of this proposal

Argh, so many words.

Initial observation: If a class is called "VertexBufferStack" then its method names should not contain the redundant text "VertexBuffer".

Revision history for this message
Sam Spilsbury (smspillaz) wrote : Posted in a previous version of this proposal

This doesn't need review right away. Just seeking feedback as to whether or not this is the right direction to go

Revision history for this message
Daniel van Vugt (vanvugt) wrote : Posted in a previous version of this proposal

Awesome speedup on intel. This is 15-25% faster than trunk there. And it matches the max performance levels I have ever seen on my my desktop.

I honestly never thought I would see the FBO+VBO codepath perform this well... I'd like to make this enhancement a priority.

Revision history for this message
Daniel van Vugt (vanvugt) wrote : Posted in a previous version of this proposal

Linked to bug 1024304, because I suspect this might be the answer. It certainly is for Intel Sandy Bridge.

Revision history for this message
Daniel van Vugt (vanvugt) wrote : Posted in a previous version of this proposal

This is weird. Suddenly trunk is as fast. Maybe I got prematurely excited but it's still interesting.

Revision history for this message
Daniel van Vugt (vanvugt) wrote : Posted in a previous version of this proposal

The speedup I thought I got from this was a false alarm. My whole system is suddenly faster with all branches. It wasn't anything to do with this one.

As far as I can tell, this is not required for quantal. So please repropose it after we've branched for 0.9.9.

review: Needs Resubmitting
Revision history for this message
Sam Spilsbury (smspillaz) wrote :

WIP. We can manipulate the internal vb in a much cleaner way.

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Please target all fixes to lp:compiz first (0.9.10). And then lp:compiz/0.9.9 (the new raring branch) second.

review: Needs Resubmitting
Revision history for this message
MC Return (mc-return) wrote :

Has this been forgotten ?

Revision history for this message
Sam Spilsbury (smspillaz) wrote :

It was more or less of a proof of concept. I'm not interested in
implementing it now.

On 28/05/2013 2:54 AM, "MC Return" <email address hidden> wrote:
>
> Has this been forgotten ?
> --
>
https://code.launchpad.net/~smspillaz/compiz/compiz.static_vertex_buffers/+merge/126879
> You are the owner of lp:~smspillaz/compiz/compiz.static_vertex_buffers.

Unmerged revisions

3363. By Sam Spilsbury

Don't call glAddGeometry if the geometry is
going to be exactly the same. Saves a vertex
buffer bind

3362. By Sam Spilsbury

Only remove the last vertex buffer when using the stack

3361. By Sam Spilsbury

Use GL_DYNAMIC_DRAW for the vertex buffer objects

3360. By Sam Spilsbury

Use a foreign vertex buffer to draw the decorations

3359. By Sam Spilsbury

Implement the interface publicly

3358. By Sam Spilsbury

Implement the internal vertex buffer in terms of the vertex buffer stack

3357. By Sam Spilsbury

Added a simple RAII class to manage that

3356. By Sam Spilsbury

Added a new class, VertexBufferStack.

A VertexBufferStack is a stack of saved GLVertexBufferObjects, which can have
items pushed to the back, removed and popped off the top. activeVertexBuffer
always represents the top most vertex buffer on the stack, popVertexBufferOffTop
will restore the last one.

3355. By Sam Spilsbury

Internally added the concepts of multiple vertex buffers per window

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
The diff is not available at this time. You can reload the page or download it.

Subscribers

People subscribed via source and target branches