Skip to content
This repository has been archived by the owner on Aug 8, 2023. It is now read-only.

Introduce pause/resume API on default file source #8125

Merged
merged 2 commits into from
Mar 1, 2017

Conversation

boundsj
Copy link
Contributor

@boundsj boundsj commented Feb 20, 2017

Implementation proposal for #5236 to address #4841

This adds an API to default file source so that it can be put in a state where requests are not checked for revalidation in sqlite (or sent over the network). The iOS map view has also been updated to use this new API when it knows that the host application is going into the background or will otherwise be in a state where it is not visible by the user.

This optimization means that the SDK will not make requests for data that the user cannot see or hit edge cases on some OSs where resources (i.e. sqlite) are not available when the host app is not visible to the user.

I did not go as far down as the suggestion in #5236 (comment). I think some notion of this API on the file source will still make sense even if we do add a similar API at the thread level.

To test this on an iOS device, I ran a primary test app with these changes in the background and then a secondary app so that I could simulate location changes in Xcode. Before the changes, if the primary app was launched in the background due to location changes, there were sqliite revalidation hits. This does not happen after these changes.

This proposed implementation keeps a file source "active" by default so no platform changes are required and it is completely backwards compatible. However, the Android platform might be able to take advantage of the new API in the MapView activity lifecycle methods.

cc @jfirebaugh @1ec5 @tmpsantos

@boundsj boundsj added Core The cross-platform C++ core, aka mbgl crash iOS Mapbox Maps SDK for iOS labels Feb 20, 2017
@boundsj boundsj added this to the ios-v3.5.0 milestone Feb 20, 2017
@boundsj boundsj self-assigned this Feb 20, 2017
@mention-bot
Copy link

@boundsj, thanks for your PR! By analyzing this pull request, we identified @1ec5, @incanus and @jfirebaugh to be potential reviewers.

@boundsj boundsj force-pushed the boundsj-add-pause-resume branch from ce06fa7 to 8f3abad Compare February 20, 2017 20:57
@@ -1070,6 +1070,8 @@ - (void)sleepGL:(__unused NSNotification *)notification
{
self.dormant = YES;

[MGLOfflineStorage sharedOfflineStorage].mbglFileSource->setRevalidationState(mbgl::DefaultFileSourceRevalidationState::Inactive);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since there may be multiple MGLMapViews but only one MGLOfflineStorage, can we make MGLOfflineStorage observe these notifications to set its own revalidation state?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would also allow iOS and macOS to share the same implementation. For example, the iOS implementation of MGLMapView calls -sleepGL: in response to UIApplicationDidEnterBackgroundNotification, which probably corresponds to NSApplicationDidHideNotification and/or NSApplicationDidChangeOcclusionStateNotification on the Mac.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I'll come back to this once the core implementation is done.

@jfirebaugh
Copy link
Contributor

Adding DefaultFileSource::setRevalidationState is a start, but this implementation doesn't go deep enough into the FileSource stack to pause revalidation and thus fix what's likely the underlying cause of crashes of the type seen in #4841 and #6205. Revalidation per cache-control headers is driven by timers in OnlineFileRequest::schedule, not by calls to FileSource::request.

@tmpsantos
Copy link
Contributor

@jfirebaugh is right. The tiles we already see loaded/rendered on the screen will re-request when they expire.

I keep my suggestion on making a low level Thread::pause so we could pause on a thread synchronization primitive instead of sleeping on the main loop. This has the nice side effect of when we do Thread::unpause and go back to the main loop, the expired timers will be processed automatically.

@boundsj
Copy link
Contributor Author

boundsj commented Feb 21, 2017

Thanks @jfirebaugh @tmpsantos. I was focused on the case I was seeing while testing where the initial style request that happens at initialization when a host iOS app is launched in the background triggers the interaction with sqlite in DefaultFileSource.

I do think it is possible that some/many of the SqlLite disk I/O error errors that have been reported are because of the db calls in that scenario. The scheduled network calls in the timer may happen in iOS applications that are running for several seconds in the background from time to time (in response to push notifications, location updates, etc.), but I'm seeing the DefaultFileSource sqlite interactions happen every time an app with a Mapbox map is launched.

I do agree we should solve for the schedule case too. I will look again at the Thread::pause approach that has been proposed.

@jfirebaugh
Copy link
Contributor

jfirebaugh commented Feb 21, 2017

I'm not sure that a Thread::pause approach will work. That would mean that if you pause the FileSource thread, and then call methods like DefaultFileSource::setAccessToken that use invokeSync, the application will deadlock.

@tmpsantos
Copy link
Contributor

DefaultFileSource::setAccessToken

invokeSync could unpause the Thread if this use case is supported. DefaultFileSource::setAccessToken in fact could by async since the messages to the thread are ordered.

@boundsj
Copy link
Contributor Author

boundsj commented Feb 22, 2017

@jfirebaugh @tmpsantos as a checkpoint, I committed 60bc871 that removes references to tasks when the revalidation state is set to inactive. This removes tasks for any loaded tiles that were scheduled and avoids the revalidation noted in #8125 (comment).

I also committed a311ebe that saves the objects required to make the tasks that 60bc871 removes. Those objects are re-requested if the revalidation state is set to active so that tile expiry still works as before.

Removing all of this in favor of simply pausing the thread would be nice. I'll keep investigating that since I'm still not very familiar with the thread / run loop / scheduler implementation. If it turns out there are issues with pausing the thread, maybe we can fall back to something like 60bc871 and a311ebe?

@tmpsantos
Copy link
Contributor

I wrote a prototype of what I had in mind:
https://github.com/mapbox/mapbox-gl-native/tree/tmpsantos-pause_resume

To test with the GLFW app, just press P for pause/resume the FileSource thread.

@boundsj
Copy link
Contributor Author

boundsj commented Feb 22, 2017

Thanks so much @tmpsantos for the example #8125 (comment) with C++ promises and futures. It seems to work well.

@jfirebaugh is correct that this approach does lock the calling (main) thread when other paths that use the file source thread's invokeSync are called after the thread is stopped to wait for the resume promise to be set.

However, since we expect that pause would only be called by our iOS (and maybe Android) platform implementations when an application is in the background and not visible to the user then this may not actually matter most of the time. Also the file source APIs that would cause a deadlock are probably very unlikely to ever be called when the file source thread is paused in the background.

Still though, in our iOS SDK, there is nothing stopping a developer from calling platform methods that, in turn, call DefaultFileSource::setOfflineMapboxTileCountLimit, DefaultFileSource::getAccessToken, DefaultFileSource::setAccessToken, DefaultFileSource::getAPIBaseURL, and DefaultFileSource::setAPIBaseURL when the application is running in the background. An iOS app developer who expects their application to be able to set or read one of those values in the background and then continue to do work would be disappointed if we change the thread behavior as proposed here.

I think the most common scenario (and it probably would not be that common) would be for the app developer to set / change the base URL and access token in the background after the file source thread is paused. In 3a682d8, I went ahead and made these operations async (as discussed in #8125 (comment)) to mitigate this part of the deadlock concern. For the setters, I don't think that temporarily resuming the thread in the background would be viable since it may lead to processing of any expired tasks and revalidation (and the potential sqlite crash), right?

Do others think that this approach with the modifications to Thread is the way we should go? Because the implementation is so much cleaner (and I can't think of a way other than something like a queue that I described in #8125 (comment)), I think we should go with the Thread::pause approach. I'm not sure if this means we would need to document that calling certain APIs (I think it would probably just end up being getAccessToken and getAPIBaseURL) in the background can lock the main thread.

cc @jfirebaugh @1ec5

@tmpsantos
Copy link
Contributor

I think it would probably just end up being getAccessToken and getAPIBaseURL) in the background can lock the main thread.

You can still make this work. You could cache in the DefaultFileSource the values of the AccessToken and BaseURL when you set them (an initialize with the same default as you initialize them in the main thread). So when you get, you don't need to fetch it from the DefaultFileSource thread.

@boundsj boundsj force-pushed the boundsj-add-pause-resume branch from aee175a to cc0fc71 Compare February 23, 2017 00:38
@boundsj boundsj changed the title Introduce revalidation state API on default file source Introduce pause/resume API on default file source Feb 23, 2017
@@ -1070,6 +1070,8 @@ - (void)sleepGL:(__unused NSNotification *)notification
{
self.dormant = YES;

[MGLOfflineStorage sharedOfflineStorage].mbglFileSource->pause();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a reminder to move this code into MGLOfflineStorage itself if possible: #8125 (comment).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is done in c11da3a

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or rather, 57a6b0f

@@ -54,6 +54,32 @@ class Thread {
return future.get();
}

void pause() {
assert(!paused);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need an #include <cassert> in this file.

@boundsj boundsj force-pushed the boundsj-add-pause-resume branch 2 times, most recently from 4b1b451 to 5687497 Compare February 23, 2017 19:21
@tmpsantos
Copy link
Contributor

Please, do not forget to add the utests for Thread::pause/resume. I can help if needed.

@boundsj boundsj force-pushed the boundsj-add-pause-resume branch from 57a6b0f to dc57cc2 Compare February 24, 2017 05:27
requests.push_back(unpausedThread.invokeWithCallback(&TestObject::transferOut, [&] (std::unique_ptr<int> result) {
EXPECT_EQ(tid, std::this_thread::get_id());
EXPECT_EQ(*result, 1);
thread.resume();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thread::pause and Thread::resume are not thread safe and should only be called on thread that owns the Thread<Object>. We should add a MBGL_VERIFY_THREAD(tid) to enforce this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wrote a few more unit tests on my branch, turned out I found some bugs. The thread will hang if it gets destroyed when paused. I fixed that and made a PR. What about we land #8194 and you rebase your branch on top?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good @tmpsantos

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tmpsantos I just pushed a wip commit here that duplicates some of what you did in #8194 so that I could make sure I understood what you meant about calling resume on a different thread than the one that owns the thread object. I see now that my test still passes because the work task impl packageArgumentsAndCallback calls the lambdas such that, in my test, the callback is called on the calling (main in this case) thread so the MBGL_VERIFY_THREAD check passes.

Anyway, #8194 looks good. I will rebase this on top of that once it is merged. Thanks again.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My apologies, I was thinking of Thread::invoke. You are right, the callback in this case will fire on the thread that invoked the method.

Copy link
Contributor

@tmpsantos tmpsantos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will hang if thread gets destroyed while paused.

@boundsj boundsj force-pushed the boundsj-add-pause-resume branch from aaab24c to 7ea302e Compare February 27, 2017 20:04
@boundsj
Copy link
Contributor Author

boundsj commented Feb 27, 2017

@tmpsantos I've rebased the changes in this PR to include your work in #8194. Can you please take another look at this when you are able?

Copy link
Contributor

@tmpsantos tmpsantos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 for the core parts, just initialize the cachedBaseURL attribute.

@@ -127,6 +143,8 @@ class DefaultFileSource : public FileSource {
const std::unique_ptr<util::Thread<Impl>> thread;
const std::unique_ptr<FileSource> assetFileSource;
const std::unique_ptr<FileSource> localFileSource;
std::string cachedBaseURL;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably need to initialize this one with mbgl::util::API_BASE_URL, otherwise if you call the getter before setting, the result will be inconsistent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I rewrote e277b6e to add this initialization.

This caches the base URL and access token values when they are set
so that they can still be retrieved even when the thread is paused.
This uses the pause/resume API on the default file source to pause
network and revalidation activity when the host iOS application goes
into the background. Activity is resumed when the host application goes
into the foreground.

The intention of this change is to avoid edge cases on some OSs where
resources (i.e. sqlite) are not available when the host app is not
visible to the user.
@boundsj boundsj force-pushed the boundsj-add-pause-resume branch from 7ea302e to 8e88c31 Compare February 28, 2017 23:16
* If pause is called then no revalidation or network request activity
* will occur.
*
* Note: Calling pause and then calling getAPIBaseURL or getAccessToken
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is no longer valid.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fyi this was fixed in #8465 @tmpsantos

@boundsj
Copy link
Contributor Author

boundsj commented Mar 2, 2017

Noting that this solution ended up being sort of specific to the iOS platform because it completely shuts down the Mapbox iOS SDK's map related file system and network activity in core when the host application goes into the background. This will need to be revisited when #4291 is addressed.

This solution is not yet granular enough for issues like #7443 or to be used on Android where things like background transfers of tiles for offline use are already supported. For now, this is probably ok since the sqlite crashes this solution intends to address are specific to the iOS platform.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Core The cross-platform C++ core, aka mbgl crash iOS Mapbox Maps SDK for iOS
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants