-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrate old savepoints #6068
Migrate old savepoints #6068
Conversation
926ddde
to
7f95611
Compare
val formFileName = File(form.formFilePath).name.substringBeforeLast(".xml") | ||
|
||
cacheDir.listFiles { file -> | ||
file.name.startsWith("${formFileName}_") && file.name.endsWith(".xml.save") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's easy to import savepoints that belong to saved forms because their names match instanceFilePaths
. When it comes to blank forms it's a little bit cumbersome. We need to find savepoints with names starting with formFilePaths
(we don't know the date of creation of a savepoint that might belong to our blank form which is a part of the savepoint file name so we find files based on their prefix and suffix).
} | ||
|
||
private fun importSavepointsThatBelongToBlankForms(formsRepository: FormsRepository, savepointsRepository: SavepointsRepository, cacheDir: File, instancesDir: File) { | ||
formsRepository.all.sortedByDescending { form -> form.date }.forEach { form -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I sort the forms here so that we start looking for savepoints for the newest ones. Thanks to that newer versions of the same forms will be taken into account first which is important because newer forms usually have the same name like the first one but we add _1
, _2
, etc. suffixes:
form.xml (v1)
form_1.xml (v2)
Savepoints for such forms would be named:
form_2024-04-10_01-35-41.xml.save
form_1_2024-04-10_01-45-41.xml.save
If we start looking for savepoints for older form versions first we could end up with a wrong savepoint loaded. For example, if we started with form.xml (v1)
we would look for a file with a name starting with form_
then both savepoints form_2024-04-10_01-35-41.xml.save
and form_1_2024-04-10_01-45-41.xml.save
would match and the one that belongs to the newer version of this form could be used. If we start with the newer one we look for a file with a name starting with form_1_
. In this case only one file matches and we are safe (because below we check if this file was already loaded to the database).
`
collect_app/src/main/java/org/odk/collect/android/injection/config/AppDependencyModule.java
Outdated
Show resolved
Hide resolved
...t_app/src/main/java/org/odk/collect/android/application/initialization/SavepointsImporter.kt
Outdated
Show resolved
Hide resolved
...t_app/src/main/java/org/odk/collect/android/application/initialization/SavepointsImporter.kt
Outdated
Show resolved
Hide resolved
}?.forEach { savepointFile -> | ||
if (savepointFile.lastModified() > form.date) { | ||
val alreadyUsed = savepointsRepository.getAll().any { savepoint -> savepoint.savepointFilePath == savepointFile.absolutePath } | ||
if (!alreadyUsed) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this possible? I don't see a test for this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test will fail ifThereAreMultipleVersionsOfTheSameBlankFormWithSavepoints_allSavepointsShouldBeImported
and it's related to what I said here #6068 (comment)
Imaging you have two versions of the same form:
form.xml
(v1) with a savepoint form_2024-04-10_01-35-41.xml.save
form_1.xml
(v2) with a savepoint form_1_2024-04-10_01-45-41.xml.save
during importing it will look like:
- We take the newer form into account first
form_1.xml
and look for savepoints with names that meet the requirements (the file name starts withform_1_
and ends with.xml.save
). There is only one file like that so we import it to the database. - Then we take the old version of that form
form.xml
and try to find a savepoint in the same way. In this case, both files will meet the requirements so depending on which was first on the list that one will be loaded and it might be a savepoint created for the newer versionform_1_2024-04-10_01-45-41.xml.save
.
That's why I need to load savepoints for blank forms starting from the newest forms and additionally checking if a given savepoint is not imported already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it's completely ok. The test that is supposed to fail when you remove that if statement does not always fail because the order of files when you read them from cache is random: https://github.com/getodk/collect/pull/6068/files#diff-a00717447283323def9e47f289dc2227bc7bd76495ff73791e1fb8d6b8b54748R56 when the one that should be used is first then you save it and despite the fact you try to save the second one it will pass because you can't save another savepoint with the same form and instance ids https://github.com/getodk/collect/blob/master/collect_app/src/main/java/org/odk/collect/android/database/savepoints/DatabaseSavepointsRepository.kt#L56
If the order changes and the first one is the one that should be ignored (alreadyused) then it fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah because listFiles
isn't ordered? We definitely want to avoid non deterministic tests at all costs. The fact that this test will sometimes pass when the code is in a failing state makes it fairly hard to reason about. A common trick here would be to use dependency inversion to extract the "random" element (passing in the files for example) or sort in the code (although this removes control from the test).
That aside, I think there's potentially a different test approach to this problem that also demonstrates the current code needs to be amended (although I might be misunderstanding). Should this following test not pass?
val blankForm = createBlankForm(project, "sampleForm", "1", "1", date = 1)
val savePointFile = createSavepointFile(project, "sampleForm_1_${System.currentTimeMillis()}.xml")
savepointsImporter.run()
val savepoints = projectDependencyProvider.savepointsRepository.getAll()
assertThat(savepoints, empty())
My understanding is that the save point files always have a name structured like <form name>_<timestamp>.xml.save
, so a save point named <form name>_1_<timestamp>.xml.save
should not be "matched".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then surely I only want to import files that exactly match that format
This is not possible because, in the case of savepoints that belong to blank forms, you don't know the timestamp part of a file name.
To avoid such issues I start importing forms from the newer ones: https://github.com/getodk/collect/pull/6068/files#diff-a00717447283323def9e47f289dc2227bc7bd76495ff73791e1fb8d6b8b54748R52 and detect already used savepoints https://github.com/getodk/collect/pull/6068/files#diff-a00717447283323def9e47f289dc2227bc7bd76495ff73791e1fb8d6b8b54748R61
thanks to that if you have sampleForm
with sampleForm_1715764796206.xml.save
savepoint and sampleForm_1
with sampleForm_1_1715764796207.xml.save
once we do the job for sampleForm_1
first, when we try to do the same for sampleForm
even though sampleForm_1_1715764796207.xml.save
is a match it won't be used thanks to https://github.com/getodk/collect/pull/6068/files#diff-a00717447283323def9e47f289dc2227bc7bd76495ff73791e1fb8d6b8b54748R61
but you are right this might be a problem when there is a savepoint for sampleForm_1
but the form does not exist then the only form we want to import a savepoint for is sampleForm
and if the first savepoint file taken into account is sampleForm_1_1715764796207.xml.save
it will be used.
Is the "usually" there a reference to some fuzziness I'm missing?
I said usually because I imagine people do not change form names when adding changes and bumping the version number and then forms are named with that _1, _2 etc. suffix. If a newer version of a form has a different name then it would be even better because savepoints would have different and easy-to-detect names as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not possible because, in the case of savepoints that belong to blank forms, you don't know the timestamp part of a file name.
Right, but can't we extract the "form name" part of the save point file name by just extracting everything before _<timestamp>
without knowing it using a regex (or a similar fuzzy matching approach)? Then we could match save points to forms exactly.
If I have a form with a filename form.xml
then I'm looking for a save point named form_<timestamp>.xml.save
. I don't need to know the exact timestamp to find that file, I just need to be able to extract everything before a regex matching _<timestamp>.xml.save
. The examples we've been talking about use different timestamp formats, but if it's just millis then it'd be something like _\d*.xml.save
and if it's a formatted timestamp (like 2024-04-10_01-35-41
) then it can be even more specific. I'm not sure if both are possible or just one of them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The timestamp is formated so it's like 2024-04-10_01-35-41
so we can try to substring after second to last underscore maybe? What do you think about this approach c910b73?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This really feels like a good place to use a regular expression (even though I'm useless at them). Playing around with regex101, it seems like we can match the timestamp format with \d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2}
. As far as I'm aware, Kotlin doesn't let us do "substring before" with a regex, but we can just cheat and split on it:
val timestampRegex = """\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2}""".toRegex()
val formName = fileName.split(timestampRegex)[0]
We probably want to use capture groups (()
) instead thought so we can ensure we only match on files that are actually save points (and we should have a test for that as well). Again playing around with regex101, I think that'd be something like:
val savepointFileRegex = """(.*)(\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})(.xml.save)""".toRegex()
val match = savePointFileRegex.matchEntire(fileName)
if (match != null) {
val (formName, _, _) = match.destructured
// Import save point file
}
Where the three groups are: any number of characters, _
followed by a timestamp, literally .xml.save
. I guess it could actually just be two groups, but that's how my brain was working!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok now it uses regex. Please review. The tests are failing because of #6138
forms-test/src/main/java/org/odk/collect/formstest/InstanceUtils.kt
Outdated
Show resolved
Hide resolved
Just blocking this on #6049 in case we have to make any changes there that affect this. |
@grzesiek2010 this is unblocked again! |
7f95611
to
65155df
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0e0d01f
to
cb85a97
Compare
@Test | ||
fun ifABlankFormHasASavepointCreatedEarlierThanTheForm_nothingShouldBeImported() { | ||
// create blank forms | ||
createBlankForm(project, "sampleForm", "1", date = System.currentTimeMillis() + TimeInMs.ONE_HOUR) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing we can't used fix times here because of we call the real lastModified
on the file right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! I've left a couple of comments about tweaks to the test, but this is ready for @getodk/testers!
} | ||
|
||
@Test | ||
fun ifInCacheThereIsAFileThatLooksLikeASavepointForBlankFormButDoesNotContainTheCorrectSuffixThatIdentifiesSavepoints_nothingShouldBeImported() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this could just ifAFileExistsWithMatchingName_butIncorrectSuffix_nothingIsImported
. Roughly the same goes for line 240.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
// verify import | ||
val savepoints = projectDependencyProvider.savepointsRepository.getAll() | ||
val expectedSavepoint1 = | ||
Savepoint(blankForm1.dbId, null, savepointFile1.absolutePath, "${projectDependencyProvider.storagePathProvider.getOdkDirPath(StorageSubdirectory.INSTANCES, project.uuid)}/$form1Name/$form1Name.xml") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pulling StoragePathProvider
, SavePointsRepository
etc out to their own fields would reduce a lot of the noise in these tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
cb85a97
to
16515b3
Compare
@grzesiek2010 Which apk should we use to test upgrades? |
I would test v2024.1 and this pr. Do you have v2024.1 one or you want me to create it for you? |
It would be great if you could create apks so that we know that all of us test the same apks. |
Tested with Success! Verified on devices with Android 14 Verified cases:
|
Tested with Success! Verified on devices with Android 10 |
Tested with Success! Verified on devices with Android 13 |
Closes #5997
Blocked by #6049Why is this the best possible solution? Were any other approaches considered?
#5951 has introduced a new way of handling savepoints that relies on a database. In the older version savepoints were created as files and detected by file names. Now they are still saved as files in the same way but the mechanism of finding savepoints has been improved using the database mentioned above. Because of that, we needed to import old savepoints to our new database.
How does this change affect users? Describe intentional changes to behavior and behavior that could have accidentally been affected by code changes. In other words, what are the regression risks?
To make sure importing old savepoints works well you need to:
This might be tricky so please be careful. Reading the tests I have added (at least the titles) should help you understand the complexity: https://github.com/getodk/collect/pull/6068/files#diff-88b7def45e9dbbc62a204360e82832c2704e8b542d114ad5c1fba08ac145754aR25
Please also make sure that newly created savepoints (those created with the app after upgrading) work well too.
Do we need any specific form for testing your changes? If so, please attach one.
No.
Does this change require updates to documentation? If so, please file an issue here and include the link below.
No.
Before submitting this PR, please make sure you have:
./gradlew connectedAndroidTest
(or./gradlew testLab
) and confirmed all checks still pass