Is being dependent on the Check method enough for detecting boltdb corruption? #174

Closed
arjunsingri opened this issue Aug 30, 2019 · 6 comments

Comments

@arjunsingri

If I call the Check method when I read from boltdb the first time, is that enough to ensure boltdb is not corrupted? Or do I need to call it periodically within my process that is reading from boltdb? Is there anything else that needs to be done?

https://godoc.org/go.etcd.io/bbolt#Tx.Check

@ptabor
Contributor

ptabor commented Jun 26, 2020

Tx.Check is a diagnostic method that checks consistency at a given point in time against a set of rules.
So far it focuses on the relationships between pages:

  • Whether all pages referenced from the root are reachable.
  • Whether all unreachable pages are on the free-page list.

Using only public API operations, the database should not get corrupted. But due to a bug, a hardware issue, or cosmic rays, such corruption might happen.
Usually you don't need to call Check from your business-logic application, but you might, for example, consider checking whether backups are in a consistent state and alerting if they are not.

@ptabor
Contributor

ptabor commented Jun 26, 2020

BTW: #225 is expanding the checks to also cover logical errors (unexpected key order).

@benma

benma commented Nov 7, 2022

Usually you don't need to call Check from your business-logic application

I am currently dealing with a corrupt database file in production, where iterating the keys in one bucket gives wrong data, and reading another bucket never finishes and eats up all RAM until the process is killed.

I feel forced to do such a consistency check after opening the database to mitigate such issues.

@cenkalti
Member

@benma Is it possible for you to share the corrupt database?

@benma

benma commented May 18, 2023

@benma Is it possible for you to share the corrupt database?

I found two databases from around that time, but I don't know anymore if these are the ones that ate up all the RAM or if they had different issues such as panics:

  1. rates.db.zip
  2. Crash when trying to open corrupted database #105 (comment)

@cenkalti cenkalti self-assigned this May 19, 2023
@ahrtr
Member

ahrtr commented May 31, 2023

rates.db.zip

Pages in [3655, 3715] were somehow reset: all values in these pages are zero.

FYI. #520

@github-actions github-actions bot added the stale label May 10, 2024
@ahrtr ahrtr removed the stale label May 10, 2024
@github-actions github-actions bot added the stale label Aug 14, 2024
@github-actions github-actions bot closed this as not planned Sep 4, 2024

5 participants