Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Address profile data loss #590

Merged
merged 18 commits into from
Jul 4, 2022
Merged

Address profile data loss #590

merged 18 commits into from
Jul 4, 2022

Conversation

Integralist
Copy link
Collaborator

@Integralist Integralist commented Jul 1, 2022

UPDATE: This PR now only adds extra mutexes around profile data to try and resolve point 1 and not point 2 because the flock implementation doesn't work on Windows and I was unable to setup a local Windows environment to test against (the use case is also not common, and the bigger concern for us at the moment is the first scenario).


Users have reported their profile data is unexpectedly deleted (non-deterministic, thus hard to reproduce).

There are two scenarios to handle:

  1. User has a single instance of the CLI running.
  2. User has multiple instances of the CLI running.

The first scenario has been hard to debug because we expected it to be solved by using a data synchronisation primitive (e.g. a mutex) around the relevant sections of code touching the configuration data. The CLI already implemented mutex synchronisation, but we were still getting reports of data loss and subsequently, I noticed areas of improvement and places we hadn't used mutexes where maybe we should. So, in this PR I've added more mutex usage to take a 'belts and braces' approach.

The latter scenario is easier to understand why a conflict might occur, and that's because multiple instances of the CLI are potentially attempting to update a single file, and mutexes are only exclusive with regard to a single process instance. To resolve the issue of multiple CLI instances running at once we require an OS-level file lock (aka flock) to be applied to the config file whenever interacting with it. The caveat to using flock is that it can be very inconsistent across platforms. So hopefully there aren't too many repercussions from doing this. Internal discussions have indicated that to solve this scenario we probably do need flock but to treat it as a 'best effort' approach and not a completely solved problem.

Additionally, and I've not attempted to address it in this PR, I want to make mention of our use of a static config file that we use as a 'fallback' when there are io/network issues (as the CLI requires there to be a config for it be usable). Obviously, this fallback config has no profile data inside of it, just enough config to allow the CLI to be usable. But currently when that data is being used it's also being persisted back to disk and it may well be in those situations the user is seeing their profile data being lost. I'm not completely convinced that's the scenario users have found themselves in, so far, because you would get a prominent and visible interactive prompt telling you a fallback config was being used, and user feedback has not indicated this to be the case. But it's worth understanding that the static config fallback introduces its own potential problems. One way to maybe work around it is to not persist the data back to disk when using the static fallback, the complication is if the user is running a command like fastly compute publish and so we have a Service ID that would otherwise be lost if it wasn't persisted to disk, and it's unclear at this point how that scenario should be handled.

@Integralist Integralist added the bug Something isn't working label Jul 1, 2022
@Integralist Integralist requested review from JakeChampion, a team and triblondon and removed request for a team July 1, 2022 13:20
# make build GO_ARGS='--ldflags "-s -w"'
GO_ARGS ?= ""
# Allows overriding go executable.
GO_BIN ?= go
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI this lets me validate changes across different versions of go.

@Integralist Integralist force-pushed the integralist/file-lock branch from 46a77c0 to f6cd3cd Compare July 4, 2022 10:13
@Integralist Integralist merged commit 0ab53dd into main Jul 4, 2022
@Integralist Integralist deleted the integralist/file-lock branch July 4, 2022 10:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants