← Back to team overview

widelands-dev team mailing list archive

Re: [Merge] lp:~widelands-dev/widelands/use_image_cache into lp:widelands

 

This is an impressive improvement! Out of curiosity, how large does the texture atlas end up being?

I have mostly looked at the actual drawing code (I did notice a merge conflict marker in world.cc though, and bunnybot probably did, too ;)). As I already said today on IRC, for desktop OpenGL you should really use instancing. You can copy the Arguments structure almost 1:1 into a buffer with per-instance data and use a simple vertex shader to reduce the CPU load. I'd understand if you don't want separate paths for OpenGL ES 2.0 though.

The next-best thing I noticed is using glMapBuffer to avoid an unnecessary copy. Right now you fill the vertices_ vector (first "copy"). Then glBufferData copies the vertex data into the buffer object (second copy). And finally the vertex data gets uploaded to/pulled by the GPU.

Since you know the number of vertices up-front, it would be better to use glBufferData(..., NULL, ...) to reserve the required amount of space, then use glMapBuffer and write the vertex data directly into the buffer object, thus avoiding the second copy. (This would also apply to 

(And now that I've written this, I notice that GLES 2.0 apparently doesn't have glMapBuffer... *sigh*)

Third, it would almost certainly be beneficial to use an element array. That would reduce the amount of vertex data you need to write by one third, and would also reduce the vertex shader load on the GPU by up to one third (Widelands almost certainly isn't shader bound, but hey... perhaps it saves some power). The element array can even be completely static - it just needs to grow as needed during application warm-up.

For the buffer handling, the following is actually best for the kind of streaming draw that Widelands is doing:
1. Determine a reasonably large buffer size N.
2. Use a single buffer object, initialized with glBufferData(..., N, NULL, GL_STREAM_DRAW);
3. As you go along drawing a frame, fill the buffer object from front to back. For all vertex data that you write, use glMapBufferRange to map _exactly_ the range that you want to write for the next draw call, then unmap, then do the draw call.
4. When the vertex data for the next draw call wouldn't fit in the remaining space of the buffer, start from the front, but (and this is VERY important, I've seen plenty of applications screw this up) add the GL_MAP_INVALIDATE_BUFFER_BIT.

What happens when you do this is that the driver magically maintains a pool of underlying buffers, all of size N. Whenever you start from 0 with the MAP_INVALIDATE_BUFFER_BIT, the driver switches the underlying buffer to one from the pool that is no longer in-flight. As far as I know, all desktop GL drivers do this.

Now you see that the patch I sent isn't actually optimal. It would be even better to do something like this:

GLsizeiptr bytes = items.size() * sizeof(T);

if (bytes > size_) {
  glBufferData(GL_ARRAY_BUFFER, bytes, items.data(), GL_STREAM_DRAW);
  size_ = bytes;
  filled_ = bytes;
} else {
  GLbitfield access = GL_MAP_WRITE_BIT;

  if (filled_ + bytes > size_) {
    filled_ = 0;
    access |= GL_MAP_INVALIDATE_BUFFER_BIT;
  }

  void *map = glMapBufferRange(GL_ARRAY_BUFFER, filled_, bytes, access);
  memcpy(map, items.data(), bytes);
  glUnmapBuffer(GL_ARRAY_BUFFER);

  filled_ = (filled_ + bytes + 3) & ~3; // for the memcpy a larger alignment might be better; the GPU doesn't really care that much as long as you're dword aligned
}

This is better because then the underlying buffer is always the same size and the re-use mechanism can work more efficiently. But then you need to check for support of GL_ARB_map_buffer_range, and apparently GLES 2.0 doesn't have it. Also, buffers don't have to be the exact same size to be re-used by the driver, so the impact is probably not too bad.

And of course, while all of the above should reduce CPU load, I have no idea by how much.
-- 
https://code.launchpad.net/~widelands-dev/widelands/use_image_cache/+merge/282106
Your team Widelands Developers is requested to review the proposed merge of lp:~widelands-dev/widelands/use_image_cache into lp:widelands.


References