Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CBG-3957 add reason in all_dbs for why db is offline #6849

Merged
merged 4 commits into from
May 30, 2024
Merged

Conversation

torcolvin
Copy link
Collaborator

  • fix issue where DatabaseContext.RequiresResync was only updated when
    creating a Database. Update this value after resync completes

I think I'd rather put requires_resync field to match GET /db/ output, but I thought there was a reason that we didn't do that.

Made the fields public only to assert on them in rest.adminapitest package.

Pre-review checklist

  • Removed debug logging (fmt.Print, log.Print, ...)
  • Logging sensitive data? Make sure it's tagged (e.g. base.UD(docID), base.MD(dbName))
  • Updated relevant information in the API specifications (such as endpoint descriptions, schemas, ...) in docs/api

Integration Tests

torcolvin added 2 commits May 30, 2024 13:49
- fix issue where DatabaseContext.RequiresResync was only updated when
  creating a Database. Update this value after resync completes
Copy link

github-actions bot commented May 30, 2024

Redocly previews

@@ -242,6 +242,22 @@ func (r *ResyncManagerDCP) Run(ctx context.Context, options map[string]interface
base.InfofCtx(ctx, base.KeyAll, "[%s] resync was terminated. Docs changed: %d Docs Processed: %d", resyncLoggingID, r.DocsChanged.Value(), r.DocsProcessed.Value())
}

collectionsRequiringResync := make([]base.ScopeAndCollectionName, 0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should only do this work if we successfully updated SyncInfo for any collections (on line 223) - otherwise can skip this (which would include any resync done with regenerate_sequences=false)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, can we just remove any collections that we successfully updated on line 223 from db.RequireResync, instead of recomputing the set?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is much easier to reason through, addressed.

collectionsRequiringResync := make([]base.ScopeAndCollectionName, 0)
for _, collection := range db.CollectionByID {
dataStore := collection.dataStore
resyncRequired, err := base.InitSyncInfo(dataStore, db.DatabaseContext.Options.MetadataID)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we shouldn't be using InitSyncInfo here - I don't think there's any situation we'd want to be creating SyncInfo if for whatever reason it doesn't exist. But see comment above about maybe just updating db.RequireResync based on line 223.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed with a refactor, I didn't like this either.

rest/api.go Outdated
DBName string `json:"db_name"`
Bucket string `json:"bucket"`
State string `json:"state"`
OfflineReason string `json:"offline_reason,omitempty"`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're only in one state at a time, so I think a simple 'reason' property is preferable. (e.g. in future if we want to provide a reason for state=Resyncing, we can reuse). What do you think?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd prefer to do create a parameter that is RequireResync to match the /db/ endpoint, and there isn't ambiguity.

I think the status field offers a lot of flexibility, but it comes at some downsides, which is that it is not always populated. The reason that reason it might not be populated is opaque, so if the user calls /db/_offline or sets start_offline: true should the reason be offline in that case?

Capella doesn't use the /db/ require_resync field, and implementing this take the place of that field.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since require_resync can potentially be very large, I don't think it's necessarily appropriate for a list REST endpoint. I think it makes sense to keep the _all_dbs response relatively lightweight (to support frequent polling), and keep require_resync available on the /db/ endpoint. That gives consumers the option to fetch the data based on the reason in the list view, but not be committed to it on every polling interval.

@adamcfraser adamcfraser assigned torcolvin and unassigned adamcfraser May 30, 2024
@torcolvin torcolvin assigned adamcfraser and unassigned torcolvin May 30, 2024
@adamcfraser adamcfraser merged commit f344d65 into main May 30, 2024
39 checks passed
@adamcfraser adamcfraser deleted the CBG-3957 branch May 30, 2024 22:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants