This repository has been archived by the owner on Aug 8, 2023. It is now read-only.

raster performance improvements #114

Closed
wants to merge 8 commits into from

incanus
Contributor

@incanus incanus commented Mar 25, 2014

Here is one possible improvement towards #103, though it's not done yet.

It moves from a std::forward_list for tiles to a std::map, binned by zoom level and constantly sorted (with custom comparator std::greater) per normal map behavior. Could try std::unordered_map but the sort here is negligible as it's among ~20 keys, tops (zoom levels).

Pro:

  • This makes things faster, yet still not fast enough on retina iPad.

Con:

  • Tile iteration logic is a bit more complex, though still decent.
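For readers following along, the binning described above can be sketched roughly as follows. This is a minimal illustration, not the code in this branch: `TileID`, `Tile`, `TileBins`, `addTile` and `findTile` are hypothetical names, and a `std::vector` per bin is an assumption.

```cpp
#include <functional>
#include <map>
#include <vector>

// Hypothetical stand-ins for the project's tile types (assumed fields).
struct TileID {
    int z;
    int x;
    int y;
    bool operator==(const TileID& o) const {
        return z == o.z && x == o.x && y == o.y;
    }
};

struct Tile {
    TileID id;
    bool loaded = false;
};

// Tiles binned by zoom level. std::greater orders the keys descending,
// so iteration starts at the highest zoom level present.
using TileBins = std::map<int, std::vector<Tile>, std::greater<int>>;

void addTile(TileBins& bins, const Tile& tile) {
    bins[tile.id.z].push_back(tile);
}

// Scan only the bin for the requested zoom level instead of
// walking every tile in one flat list.
const Tile* findTile(const TileBins& bins, const TileID& id) {
    auto bin = bins.find(id.z);
    if (bin == bins.end()) return nullptr;
    for (const Tile& tile : bin->second) {
        if (tile.id == id) return &tile;
    }
    return nullptr;
}
```

With at most ~20 zoom levels as keys, the outer lookup is cheap; the remaining cost is the linear scan inside one bin.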

@kkaefer @springmeyer I feel like I'm hitting the limits of my C++ knowledge here and spinning my wheels so I would appreciate some eyes. @springmeyer I also tried your suggestion of std::find(tiles.begin(),tiles.end(),...), which yielded a slight improvement but not as much as the move to map.

I'm going to work on some raster binding improvements now, since binding shows up as a bigger bottleneck now that these improvements are in place, as well as grabbing #105.

@kkaefer
Member

kkaefer commented Mar 26, 2014

Under what circumstances do you see the parent/children finding becoming an issue? This never appears in my performance profiles.

@kkaefer
Member

kkaefer commented Mar 26, 2014

Why not stick all into a std::map<Tile::ID, Tile>?

@incanus
Contributor Author

incanus commented Mar 26, 2014

> Under what circumstances do you see the parent/children finding becoming an issue? This never appears in my performance profiles.

No, this looks great for me on everything except a retina iPad 3 fullscreen. The combination of other things going on to drive the display at larger size + so many textures is what causes issues, but we need to solve for this hardware. It's not just a matter of acceptable delay, either; it affects frame rate and gesture responsiveness.

@incanus
Contributor Author

incanus commented Mar 26, 2014

> Why not stick all into a std::map<Tile::ID, Tile>?

The Tile::ID is what is being compared anyway; I don't think this would help. Currently all of the tiles are iterated, checking each id against the needed tile. This would do the same thing.

The map approach at least bins them by zoom level.
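For comparison, here is a rough sketch of the keyed container suggested in the quoted question. It assumes an ordering over z/x/y, which `std::map` needs for its keys; `TileKey`, `TileRecord`, `TileMap` and `findByKey` are hypothetical names, not the project's types. With such an ordering, `std::map::find` takes a logarithmic number of key comparisons rather than visiting every element; whether that difference matters at the tile counts involved here is not measured in this thread.

```cpp
#include <map>
#include <tuple>

// Hypothetical key type; the real Tile::ID may be laid out differently.
struct TileKey {
    int z;
    int x;
    int y;
    // Strict weak ordering required by std::map: compare z, then x, then y.
    bool operator<(const TileKey& o) const {
        return std::tie(z, x, y) < std::tie(o.z, o.x, o.y);
    }
};

// Hypothetical tile payload.
struct TileRecord {
    bool loaded = false;
};

using TileMap = std::map<TileKey, TileRecord>;

// Exact-match lookup by key; returns nullptr when the tile is absent.
const TileRecord* findByKey(const TileMap& tiles, const TileKey& key) {
    auto it = tiles.find(key);
    return it == tiles.end() ? nullptr : &it->second;
}
```

Because z is compared first, tiles of the same zoom level also end up adjacent in iteration order, which is similar in spirit to the per-zoom binning.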

@incanus
Contributor Author

incanus commented Mar 26, 2014

One thing I am going to try next is to pre-set packed integers in the tile constructor, rather than on demand, and use those for the tile.id == id comparison to reduce three comparisons (z/x/y) down to one.
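A minimal sketch of that idea, under stated assumptions: the bit layout (6 bits for z, 29 bits each for x and y) and the names `packTileID`, `PackedTile` and `sameTile` are made up for illustration and are not the project's actual ID format.

```cpp
#include <cstdint>

// Assumed layout: y in bits 0..28, x in bits 29..57, z in bits 58..63.
// Valid only while x and y are below 2^29 and z is below 64.
constexpr std::uint64_t packTileID(std::uint32_t z, std::uint32_t x, std::uint32_t y) {
    return (static_cast<std::uint64_t>(z) << 58) |
           (static_cast<std::uint64_t>(x) << 29) |
            static_cast<std::uint64_t>(y);
}

// Hypothetical tile type that computes the packed value once, in the
// constructor, instead of on every comparison.
struct PackedTile {
    std::uint32_t z;
    std::uint32_t x;
    std::uint32_t y;
    std::uint64_t packed;

    PackedTile(std::uint32_t z_, std::uint32_t x_, std::uint32_t y_)
        : z(z_), x(x_), y(y_), packed(packTileID(z_, x_, y_)) {}
};

// One integer comparison instead of three (z, x, y).
inline bool sameTile(const PackedTile& a, const PackedTile& b) {
    return a.packed == b.packed;
}
```

Within the stated ranges, distinct z/x/y triples map to distinct packed values, so the single comparison is equivalent to comparing all three fields.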

@incanus
Contributor Author

incanus commented Mar 27, 2014

> Under what circumstances do you see the parent/children finding becoming an issue? This never appears in my performance profiles.

@kkaefer To give you a sense, I just made two videos with Reflector. The frame rate loss you see in the iPad is real, not the video.

iPhone: https://dl.dropboxusercontent.com/u/575564/iphone_raster.mov
iPad: https://dl.dropboxusercontent.com/u/575564/ipad_raster.mov

And here's a representative profiling:

[Image: profile_update_tiles]

Here updateTiles takes longer than the actual render calls

[Image: profile_find_loaded_children]

And breaking it down, findLoadedChildren ranks 1 and findLoadedParent ranks 3

A CSV export of the profile (Instruments does this?): https://dl.dropboxusercontent.com/u/575564/Time%20Profiler%20-%20llmr_Run1.csv

@incanus
Contributor Author

incanus commented Mar 28, 2014

My latest work here is on figuring out whether HTTP connections need to be throttled. They can max out at 25-35 simultaneous connections, depending, and although the NSURLSession system is supposed to handle them appropriately for the system load, that's still a lot, so it might be starving the main thread / tile structure updating of resources.

@incanus
Contributor Author

incanus commented Mar 28, 2014

URL loads are not a bottleneck problem. I worked with local, cached tiles and wasn't able to get satisfactory performance, either.

Now I'm thinking that despite updateTiles() frequently ranking above other things (like painting) in time profiling, it might not be what's causing the perceived performance problems. I started looking for expensive OpenGL calls, since all of them are on the main thread. The heaviest are the glTexImage2D() calls that upload raster textures to the GPU.

I experimented with not actually uploading textures and the performance is a lot better.

These uploads could be offloaded to a second OpenGL context and a background thread instead of being kept in the render loop, so that's my next approach.
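The threading half of that approach might look roughly like the sketch below. It covers only the job queue and worker thread; the OpenGL side is left as comments because it is platform-specific, and the assumption that the worker would own a second context sharing textures with the render context is not something tested here. `UploadQueue` and `enqueue` are hypothetical names.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical background queue: the render loop enqueues upload jobs
// and a single worker thread runs them in order.
class UploadQueue {
public:
    UploadQueue() : worker([this] { run(); }) {}

    // Drains any remaining jobs, then stops and joins the worker.
    ~UploadQueue() {
        {
            std::lock_guard<std::mutex> lock(mutex);
            stopping = true;
        }
        ready.notify_one();
        worker.join();
    }

    void enqueue(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(mutex);
            jobs.push(std::move(job));
        }
        ready.notify_one();
    }

private:
    void run() {
        // Assumption: a real implementation would make a second GL context
        // (sharing resources with the render context) current on this thread.
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex);
                ready.wait(lock, [this] { return stopping || !jobs.empty(); });
                if (jobs.empty()) return;  // stopping and nothing left to do
                job = std::move(jobs.front());
                jobs.pop();
            }
            job();  // e.g. a closure that would call glTexImage2D() for one tile
        }
    }

    std::mutex mutex;
    std::condition_variable ready;
    std::queue<std::function<void()>> jobs;
    bool stopping = false;
    std::thread worker;  // declared last so the other members exist before it starts
};
```

The render loop would then only enqueue a closure per decoded tile and bind the texture once the job reports completion; how that completion is signalled back is left out of this sketch.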

@incanus
Contributor Author

incanus commented Mar 31, 2014

Pausing this for a bit to get my head out of raster for a while, as this works well on smaller devices now.

@kkaefer
Member

kkaefer commented Apr 1, 2014

I'm going to have a look at this as part of #101

@kkaefer
Member

kkaefer commented May 12, 2014

Related tickets: #183, #122

@incanus
Contributor Author

incanus commented Jul 1, 2014

Closing this as stale.

> It moves from a std::forward_list for tiles to a std::map

Among other things, we do this already.

@incanus incanus closed this Jul 1, 2014
@jfirebaugh jfirebaugh deleted the raster-performance branch August 20, 2014 20:56
mikemorris added a commit that referenced this pull request Aug 21, 2015
Labels
performance Speed, stability, CPU usage, memory usage, or power usage