Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate old savepoints #6068

Merged
merged 10 commits into from
May 17, 2024
Merged

Conversation

grzesiek2010
Copy link
Member

@grzesiek2010 grzesiek2010 commented Apr 4, 2024

Closes #5997
Blocked by #6049

Why is this the best possible solution? Were any other approaches considered?

#5951 has introduced a new way of handling savepoints that relies on a database. In the older version savepoints were created as files and detected by file names. Now they are still saved as files in the same way but the mechanism of finding savepoints has been improved using the database mentioned above. Because of that, we needed to import old savepoints to our new database.

How does this change affect users? Describe intentional changes to behavior and behavior that could have accidentally been affected by code changes. In other words, what are the regression risks?

To make sure importing old savepoints works well you need to:

  • Install an APK with the old savepoints implementation eg v2024.1.3
  • Create some savepoints for blank and saved forms
  • Upgrade your app to the version with these changes
  • Make sure the old correct savepoints are available when you open the forms (blank and saved ones)

This might be tricky so please be careful. Reading the tests I have added (at least the titles) should help you understand the complexity: https://github.com/getodk/collect/pull/6068/files#diff-88b7def45e9dbbc62a204360e82832c2704e8b542d114ad5c1fba08ac145754aR25

Please also make sure that newly created savepoints (those created with the app after upgrading) work well too.

Do we need any specific form for testing your changes? If so, please attach one.

No.

Does this change require updates to documentation? If so, please file an issue here and include the link below.

No.

Before submitting this PR, please make sure you have:

  • added or modified tests for any new or changed behavior
  • run ./gradlew connectedAndroidTest (or ./gradlew testLab) and confirmed all checks still pass
  • added a comment above any new strings describing it for translators
  • verified that any code or assets from external sources are properly credited in comments and/or in the about file.
  • verified that any new UI elements use theme colors. UI Components Style guidelines

@grzesiek2010 grzesiek2010 force-pushed the COLLECT-5997_1 branch 3 times, most recently from 926ddde to 7f95611 Compare April 8, 2024 07:04
@grzesiek2010 grzesiek2010 requested a review from seadowg April 9, 2024 23:25
@grzesiek2010 grzesiek2010 marked this pull request as ready for review April 9, 2024 23:25
val formFileName = File(form.formFilePath).name.substringBeforeLast(".xml")

cacheDir.listFiles { file ->
file.name.startsWith("${formFileName}_") && file.name.endsWith(".xml.save")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's easy to import savepoints that belong to saved forms because their names match instanceFilePaths. When it comes to blank forms it's a little bit cumbersome. We need to find savepoints with names starting with formFilePaths (we don't know the date of creation of a savepoint that might belong to our blank form which is a part of the savepoint file name so we find files based on their prefix and suffix).

}

private fun importSavepointsThatBelongToBlankForms(formsRepository: FormsRepository, savepointsRepository: SavepointsRepository, cacheDir: File, instancesDir: File) {
formsRepository.all.sortedByDescending { form -> form.date }.forEach { form ->
Copy link
Member Author

@grzesiek2010 grzesiek2010 Apr 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I sort the forms here so that we start looking for savepoints for the newest ones. Thanks to that newer versions of the same forms will be taken into account first which is important because newer forms usually have the same name like the first one but we add _1, _2, etc. suffixes:
form.xml (v1)
form_1.xml (v2)
Savepoints for such forms would be named:
form_2024-04-10_01-35-41.xml.save
form_1_2024-04-10_01-45-41.xml.save
If we start looking for savepoints for older form versions first we could end up with a wrong savepoint loaded. For example, if we started with form.xml (v1) we would look for a file with a name starting with form_ then both savepoints form_2024-04-10_01-35-41.xml.save and form_1_2024-04-10_01-45-41.xml.save would match and the one that belongs to the newer version of this form could be used. If we start with the newer one we look for a file with a name starting with form_1_. In this case only one file matches and we are safe (because below we check if this file was already loaded to the database).
`

}?.forEach { savepointFile ->
if (savepointFile.lastModified() > form.date) {
val alreadyUsed = savepointsRepository.getAll().any { savepoint -> savepoint.savepointFilePath == savepointFile.absolutePath }
if (!alreadyUsed) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this possible? I don't see a test for this case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test will fail ifThereAreMultipleVersionsOfTheSameBlankFormWithSavepoints_allSavepointsShouldBeImported and it's related to what I said here #6068 (comment)
Imaging you have two versions of the same form:
form.xml (v1) with a savepoint form_2024-04-10_01-35-41.xml.save
form_1.xml (v2) with a savepoint form_1_2024-04-10_01-45-41.xml.save

during importing it will look like:

  1. We take the newer form into account first form_1.xml and look for savepoints with names that meet the requirements (the file name starts with form_1_ and ends with .xml.save). There is only one file like that so we import it to the database.
  2. Then we take the old version of that form form.xml and try to find a savepoint in the same way. In this case, both files will meet the requirements so depending on which was first on the list that one will be loaded and it might be a savepoint created for the newer version form_1_2024-04-10_01-45-41.xml.save.

That's why I need to load savepoints for blank forms starting from the newest forms and additionally checking if a given savepoint is not imported already.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's something strange happening: if I comment on this if and run the tests, that one sometimes fails and sometimes passes.

Screenshot 2024-05-13 at 13 48 53

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's completely ok. The test that is supposed to fail when you remove that if statement does not always fail because the order of files when you read them from cache is random: https://github.com/getodk/collect/pull/6068/files#diff-a00717447283323def9e47f289dc2227bc7bd76495ff73791e1fb8d6b8b54748R56 when the one that should be used is first then you save it and despite the fact you try to save the second one it will pass because you can't save another savepoint with the same form and instance ids https://github.com/getodk/collect/blob/master/collect_app/src/main/java/org/odk/collect/android/database/savepoints/DatabaseSavepointsRepository.kt#L56
If the order changes and the first one is the one that should be ignored (alreadyused) then it fails.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah because listFiles isn't ordered? We definitely want to avoid non deterministic tests at all costs. The fact that this test will sometimes pass when the code is in a failing state makes it fairly hard to reason about. A common trick here would be to use dependency inversion to extract the "random" element (passing in the files for example) or sort in the code (although this removes control from the test).

That aside, I think there's potentially a different test approach to this problem that also demonstrates the current code needs to be amended (although I might be misunderstanding). Should this following test not pass?

val blankForm = createBlankForm(project, "sampleForm", "1", "1", date = 1)
val savePointFile = createSavepointFile(project, "sampleForm_1_${System.currentTimeMillis()}.xml")

savepointsImporter.run()

val savepoints = projectDependencyProvider.savepointsRepository.getAll()
assertThat(savepoints, empty())

My understanding is that the save point files always have a name structured like <form name>_<timestamp>.xml.save, so a save point named <form name>_1_<timestamp>.xml.save should not be "matched".

Copy link
Member Author

@grzesiek2010 grzesiek2010 May 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then surely I only want to import files that exactly match that format

This is not possible because, in the case of savepoints that belong to blank forms, you don't know the timestamp part of a file name.
To avoid such issues I start importing forms from the newer ones: https://github.com/getodk/collect/pull/6068/files#diff-a00717447283323def9e47f289dc2227bc7bd76495ff73791e1fb8d6b8b54748R52 and detect already used savepoints https://github.com/getodk/collect/pull/6068/files#diff-a00717447283323def9e47f289dc2227bc7bd76495ff73791e1fb8d6b8b54748R61
thanks to that if you have sampleForm with sampleForm_1715764796206.xml.save savepoint and sampleForm_1 with sampleForm_1_1715764796207.xml.save
once we do the job for sampleForm_1 first, when we try to do the same for sampleForm even though sampleForm_1_1715764796207.xml.save is a match it won't be used thanks to https://github.com/getodk/collect/pull/6068/files#diff-a00717447283323def9e47f289dc2227bc7bd76495ff73791e1fb8d6b8b54748R61

but you are right this might be a problem when there is a savepoint for sampleForm_1 but the form does not exist then the only form we want to import a savepoint for is sampleForm and if the first savepoint file taken into account is sampleForm_1_1715764796207.xml.save it will be used.

Is the "usually" there a reference to some fuzziness I'm missing?

I said usually because I imagine people do not change form names when adding changes and bumping the version number and then forms are named with that _1, _2 etc. suffix. If a newer version of a form has a different name then it would be even better because savepoints would have different and easy-to-detect names as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not possible because, in the case of savepoints that belong to blank forms, you don't know the timestamp part of a file name.

Right, but can't we extract the "form name" part of the save point file name by just extracting everything before _<timestamp> without knowing it using a regex (or a similar fuzzy matching approach)? Then we could match save points to forms exactly.

If I have a form with a filename form.xml then I'm looking for a save point named form_<timestamp>.xml.save. I don't need to know the exact timestamp to find that file, I just need to be able to extract everything before a regex matching _<timestamp>.xml.save. The examples we've been talking about use different timestamp formats, but if it's just millis then it'd be something like _\d*.xml.save and if it's a formatted timestamp (like 2024-04-10_01-35-41) then it can be even more specific. I'm not sure if both are possible or just one of them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The timestamp is formated so it's like 2024-04-10_01-35-41 so we can try to substring after second to last underscore maybe? What do you think about this approach c910b73?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This really feels like a good place to use a regular expression (even though I'm useless at them). Playing around with regex101, it seems like we can match the timestamp format with \d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2}. As far as I'm aware, Kotlin doesn't let us do "substring before" with a regex, but we can just cheat and split on it:

val timestampRegex = """\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2}""".toRegex()
val formName = fileName.split(timestampRegex)[0]

We probably want to use capture groups (()) instead thought so we can ensure we only match on files that are actually save points (and we should have a test for that as well). Again playing around with regex101, I think that'd be something like:

val savepointFileRegex = """(.*)(\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})(.xml.save)""".toRegex()
val match = savePointFileRegex.matchEntire(fileName)
if (match != null) {
    val (formName, _, _) = match.destructured
    // Import save point file
}

Where the three groups are: any number of characters, _ followed by a timestamp, literally .xml.save. I guess it could actually just be two groups, but that's how my brain was working!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok now it uses regex. Please review. The tests are failing because of #6138

@seadowg
Copy link
Member

seadowg commented Apr 15, 2024

Just blocking this on #6049 in case we have to make any changes there that affect this.

@grzesiek2010 grzesiek2010 mentioned this pull request May 8, 2024
5 tasks
@seadowg seadowg removed the blocked label May 9, 2024
@seadowg
Copy link
Member

seadowg commented May 13, 2024

@grzesiek2010 this is unblocked again!

Copy link
Member

@seadowg seadowg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@grzesiek2010 grzesiek2010 requested a review from seadowg May 13, 2024 21:09
Copy link
Member

@seadowg seadowg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@grzesiek2010 grzesiek2010 requested a review from seadowg May 15, 2024 08:53
@grzesiek2010 grzesiek2010 force-pushed the COLLECT-5997_1 branch 2 times, most recently from 0e0d01f to cb85a97 Compare May 15, 2024 13:05
@Test
fun ifABlankFormHasASavepointCreatedEarlierThanTheForm_nothingShouldBeImported() {
// create blank forms
createBlankForm(project, "sampleForm", "1", date = System.currentTimeMillis() + TimeInMs.ONE_HOUR)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing we can't used fix times here because of we call the real lastModified on the file right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

Copy link
Member

@seadowg seadowg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! I've left a couple of comments about tweaks to the test, but this is ready for @getodk/testers!

}

@Test
fun ifInCacheThereIsAFileThatLooksLikeASavepointForBlankFormButDoesNotContainTheCorrectSuffixThatIdentifiesSavepoints_nothingShouldBeImported() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could just ifAFileExistsWithMatchingName_butIncorrectSuffix_nothingIsImported. Roughly the same goes for line 240.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

// verify import
val savepoints = projectDependencyProvider.savepointsRepository.getAll()
val expectedSavepoint1 =
Savepoint(blankForm1.dbId, null, savepointFile1.absolutePath, "${projectDependencyProvider.storagePathProvider.getOdkDirPath(StorageSubdirectory.INSTANCES, project.uuid)}/$form1Name/$form1Name.xml")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pulling StoragePathProvider, SavePointsRepository etc out to their own fields would reduce a lot of the noise in these tests.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@grzesiek2010 grzesiek2010 added the high priority Should be looked at before other PRs/issues label May 15, 2024
@dbemke
Copy link

dbemke commented May 16, 2024

@grzesiek2010 Which apk should we use to test upgrades?

@grzesiek2010
Copy link
Member Author

I would test v2024.1 and this pr. Do you have v2024.1 one or you want me to create it for you?

@dbemke
Copy link

dbemke commented May 16, 2024

Do you have v2024.1 one or you want me to create it for you?

It would be great if you could create apks so that we know that all of us test the same apks.

@grzesiek2010
Copy link
Member Author

grzesiek2010 commented May 16, 2024

@WKobus
Copy link

WKobus commented May 17, 2024

Tested with Success!

Verified on devices with Android 14

Verified cases:

  • Savepoint in blank forms and drafts
  • Creating savepoints in 2 different projects and upgrading
  • Savepoints and upgrading in different versions of forms
  • Savepoints and upgrading + deleting, softdeleting forms
  • Creating savepoints using Manual/Previously downloaded forms/Exactly match server Blank form update modes
  • Creating Savepoints using projects on Central and ONA
  • Savepoint on same form in different projects
  • Upgrade with savepoint in audit forms
  • Deleting form with savepoint before upgrade
  • Hide old versions enabled/disabled
  • Screen rotation, minimize app
  • Light and Dark theme

@dbemke
Copy link

dbemke commented May 17, 2024

Tested with Success!

Verified on devices with Android 10

@srujner
Copy link

srujner commented May 17, 2024

Tested with Success!

Verified on devices with Android 13

@grzesiek2010 grzesiek2010 merged commit df28fb6 into getodk:master May 17, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
behavior verified high priority Should be looked at before other PRs/issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement savepoints migration
5 participants