nvs: possibility of losing data #34722
Comments
@Laczen do you have a fix in the works?
To tell you the truth, I don't. I don't even know if it's possible (I've been challenging myself all morning).
It'll be hard without storing metadata for the sectors. If you had that, you could record some information about each sector's state and use.
@lemrey, I agree. One possibility I see is to add a special ate (id = 0xfffe) indicating that gc of the previous sector has been completed. Another possibility would be to change the storage method so that the first ate written (at the end of a sector) marks the sector as open.
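A minimal sketch of the first idea, assuming a simplified ATE layout and hypothetical flash/crc helpers (the real NVS structures, ids and function names may differ):

```c
/*
 * Hedged sketch (not the actual Zephyr NVS code) of the "gc done" ATE idea:
 * a reserved allocation table entry written once garbage collection has
 * finished copying, so that a mount after an interrupted erase can tell the
 * previous sector was already collected and must not be re-collected.
 * The struct layout, the 0xfffe id and the flash/crc helpers below are
 * simplified and hypothetical, chosen only to show the control flow.
 */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NVS_GC_DONE_ID 0xfffeU /* reserved id proposed in the comment above */

struct simple_ate {
	uint16_t id;     /* NVS_GC_DONE_ID marks "gc of previous sector done" */
	uint16_t offset; /* unused for the marker */
	uint16_t len;    /* unused for the marker */
	uint8_t  crc8;   /* lets a torn marker write be detected and ignored */
};

/* hypothetical low-level helpers, assumed to exist in the port layer */
int flash_write_ate(uint32_t sector, uint32_t slot, const struct simple_ate *ate);
int flash_read_ate(uint32_t sector, uint32_t slot, struct simple_ate *ate);
uint8_t crc8_calc(const void *buf, size_t len);

/* One possible ordering: after all live entries have been copied into the
 * destination sector, persist the marker there *before* erasing the victim.
 */
static int gc_mark_done(uint32_t dest_sector, uint32_t slot)
{
	struct simple_ate ate = { .id = NVS_GC_DONE_ID };

	ate.crc8 = crc8_calc(&ate, offsetof(struct simple_ate, crc8));
	return flash_write_ate(dest_sector, slot, &ate);
}

/* At mount time: only restart gc on the previous sector if no valid
 * "gc done" marker is present in the current sector.
 */
static bool gc_already_done(uint32_t sector, uint32_t slot)
{
	struct simple_ate ate;

	if (flash_read_ate(sector, slot, &ate) != 0) {
		return false;
	}
	return ate.id == NVS_GC_DONE_ID &&
	       ate.crc8 == crc8_calc(&ate, offsetof(struct simple_ate, crc8));
}
```

With this ordering, finding a valid marker means the copy is complete, so a partially erased previous sector can simply be erased again instead of being re-collected.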
That is actually quite common behavior among storage systems and will certainly work. The option of an open-ATE counter also sounds reasonable.
If we add a new metadata record for marking a sector active, this can be made backward-compatible with previous NVS volumes (NVS sectors written before the patch will not have such a record); we just need to add this record before the first write to the sector. Regarding porting a new storage:
I see an opportunity to improve nvs. Zephyr has changed a lot since nvs was introduced. Regarding the start of a new sector, it is possible to replace the sector close item with a sector open item while maintaining compatibility (the sector close ate is optional, since we anyhow needed a way to recover when the close ate was badly written). In the rework I would introduce a lower-level api (nvs_ll_xxx) that would enable streaming writes, optional gc, etc., and work more like fcb. Nothing would change in the layout on flash, so compatibility with existing nvs systems would remain. There is only one thing that still bothers me: it seems that a write can be interrupted with nothing being written to flash (the bytes remain 0xff), and I don't understand how this is possible. Could this be something in the hal layer?
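To make the nvs_ll idea a bit more concrete, here is a hypothetical sketch of what such a lower-level streaming interface could look like; none of these names or signatures exist in Zephyr, they only illustrate the direction described above.

```c
/*
 * Hypothetical nvs_ll sketch: streaming writes with an explicit commit and
 * gc as an optional, caller-driven step. Not an existing Zephyr API.
 */
#include <stddef.h>
#include <stdint.h>

struct nvs_ll_fs;        /* opaque file system handle */
struct nvs_ll_write_ctx; /* tracks an in-progress streamed entry */

/* Begin a streamed write of `len` bytes for entry `id`. */
int nvs_ll_write_begin(struct nvs_ll_fs *fs, struct nvs_ll_write_ctx *ctx,
		       uint16_t id, size_t len);

/* Append a chunk; may be called repeatedly until `len` bytes are supplied. */
int nvs_ll_write_append(struct nvs_ll_write_ctx *ctx,
			const void *data, size_t data_len);

/* Commit the entry: the ate is only written once all data is on flash,
 * so a power failure mid-stream leaves no valid-looking entry behind.
 */
int nvs_ll_write_commit(struct nvs_ll_write_ctx *ctx);

/* Garbage collection becomes an explicit, optional call instead of an
 * implicit side effect of running out of space during a write.
 */
int nvs_ll_gc(struct nvs_ll_fs *fs);
```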
NVS is currently used in production in many projects, so a backwards-compatible fix to the current NVS is something that needs to happen; a filesystem needs to provide that level of certainty and maintenance over time. Perhaps the best approach would be to provide a fix for this issue in the current NVS and then start a new "nvs2" project, similar to what was done with "tcp2", that can be developed in parallel. Does that sound reasonable @Laczen?
OK, will work on a fix for the present NVS and (slowly) start working on nvs2. |
Fix the possibility of losing data after startup as a result of a badly erased sector. Fixes zephyrproject-rtos#34722. Signed-off-by: Laczen JMS <laczenjms@gmail.com>
Describe the bug
When nvs does a garbage collection to free up space for new writes, it finishes by erasing a flash sector. If that erase is interrupted by a power failure, the subsequent startup of nvs will detect data in the sector that was being erased and restart a gc operation (erasing the already copied data). Because the erase was in progress, the data found in that sector may no longer be valid.
The result is the loss of data that was valid before the interrupted erase.
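The sequence can be made concrete with a small hedged sketch; the helper name and the check below are illustrative, not the actual NVS mount code.

```c
/*
 * Hedged illustration of the failure sequence described above (helper name
 * and check are hypothetical, not the real NVS implementation):
 *
 *   1. gc copies the live entries out of the victim sector
 *   2. gc starts erasing the victim sector
 *   3. power fails mid-erase, leaving the victim only partially erased
 *   4. on the next boot the victim is not blank, so nvs assumes it still
 *      holds valid data and restarts gc, erasing the already copied data
 *   5. the data left in the partially erased sector may be corrupt, so
 *      entries that were valid before the power failure are lost
 */
#include <stdbool.h>
#include <stdint.h>

/* assumed helper: true if every byte of the sector reads back as 0xff */
bool sector_is_blank(uint32_t sector);

/* The problematic assumption: "not blank" is treated as "holds valid data
 * that still needs to be collected", which step 3 above violates.
 */
static bool naive_needs_gc(uint32_t victim_sector)
{
	return !sector_is_blank(victim_sector);
}
```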
Expected behavior
No data should be lost even if a sector erase is interrupted.
Impact
This could result in a variety of malfunctions depending on the use of the nvs file system.