Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CBG-3899 Write _mou on import, resync #6834

Merged
merged 5 commits into from
May 23, 2024
Merged

CBG-3899 Write _mou on import, resync #6834

merged 5 commits into from
May 23, 2024

Conversation

adamcfraser
Copy link
Collaborator

@adamcfraser adamcfraser commented May 17, 2024

Writes _mou on import and resync, populating cas and pCas. If an imported or resync'd document already has _mou where doc.cas=_mou.cas, updates _mou.cas but preserves existing pCas.

Stores MetadataOnlyUpdate struct in document but not SyncData, to avoid unmarshalling work when mou isn't needed.

CBG-3899

Integration Tests

Writes _mou on import and resync, populating cas and pCas.  If an imported or resync'd document already has _mou where doc.cas=_mou.cas, updates _mou.cas but preserves existing pCas.

Stores MetadataOnlyUpdate struct in document but not SyncData, to avoid unmarshalling work when mou isn't needed.
bbrks
bbrks previously approved these changes May 20, 2024
Copy link
Member

@bbrks bbrks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - will let @torcolvin have a pass

Copy link
Collaborator

@torcolvin torcolvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologizes if I didn't reason correctly about on demand imports, but I thought it was worth testing at least.

t.Skip("Test requires multi-xattr subdoc operations, CBS 7.6 or higher")
}

db.Options.QueryPaginationLimit = 100 // Required for principal ID query to not deadlock
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I this from copypasta? I don't even see what would do query, so I imagine it might be from query based resync?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's like the comment says - otherwise the principal ID query will deadlock when QueryPaginationLimit is zero, and we normally initialize this outside the db package. Possibly worth cleanup, but not in this PR I don't think.

Comment on lines 512 to 517
err = WaitForConditionWithOptions(t, func() bool {
var status BackgroundManagerStatus
rawStatus, _ := resyncMgr.GetStatus(ctx)
_ = json.Unmarshal(rawStatus, &status)
return status.State == BackgroundProcessStateCompleted
}, 200, 200)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to make this easier to debug for the future, so it will print the status if it does fail:

Suggested change
err = WaitForConditionWithOptions(t, func() bool {
var status BackgroundManagerStatus
rawStatus, _ := resyncMgr.GetStatus(ctx)
_ = json.Unmarshal(rawStatus, &status)
return status.State == BackgroundProcessStateCompleted
}, 200, 200)
require.EventuallyWithT(t, func(c *assert.CollectT) {
var status BackgroundManagerStatus
rawStatus, err := resyncMgr.GetStatus(ctx)
assert.NoError(t, err)
assert.Equal(t, json.Unmarshal(rawStatus, &status)
assert.Equal(t, BackgroundProcessStateComplete, status.State)
}, 40 * time.Second, 200 * time.Millisecond)

@@ -444,6 +445,93 @@ function sync(doc, oldDoc){
return db, ctx
}

// TestResyncMou ensures that resync updates create mou, and preserve pcas in mou in the case where resync is reprocessing an import
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would need a separate test and I'm not sure that this is common, but if you have resync running without auto import, this might be worth testing an inline {{_sync}} attribute.

@@ -401,6 +401,7 @@ func (c *changeCache) DocChanged(event sgbucket.FeedEvent) {
}

// If using xattrs and this isn't an SG write, we shouldn't attempt to cache.
rawUserXattr := rawXattrs[collection.userXattrKey()]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

I'd drop this assignment into this if statement since it isn't needed outside this block

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used below (line 497).


// verify mou contents
if db.Bucket.IsSupported(sgbucket.BucketStoreFeatureMultiXattrSubdocOperations) {
mouXattr, ok := xattrs[base.MouXattrName]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth making sure this xattr is not present?

db/document.go Outdated
}
if err != nil {
return nil, err
}

if len(mouXattrData) > 0 {
if err := base.JSONUnmarshal(mouXattrData, &doc.metadataOnlyUpdate); err != nil {
if err != nil {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is an extra err check here?

db/document.go Outdated
if len(mouXattrData) > 0 {
if err := base.JSONUnmarshal(mouXattrData, &doc.metadataOnlyUpdate); err != nil {
if err != nil {
base.WarnfCtx(ctx, "Failed to unmarshal mouXattr for key %v, will be ignored. Err: %v mou:%s", base.UD(docid), err, mouXattrData)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd clarify that only the _mou is ignored and not unmarshalling the document

db/document.go Outdated
// If the sync xattr is present, use that to build SyncData
if len(rawSyncXattr) > 0 {
if ok && len(rawSyncXattr) > 0 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how this could be !ok and len(rawSyncXattr) > 0 , is there something I'm missing?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it was just my instinct to be more idiomatic and check the ok, but you're right that the length check is sufficient, and any performance difference is negligible.

@@ -24,6 +24,55 @@ import (
"github.com/stretchr/testify/assert"
)

func TestFeedImport(t *testing.T) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this test should test for inline sync metadata, which I think should come with _mou, from the mou from after migrateMetadata.

I think without this that eventing will do a single echo, once after migrate metadata, and once after sync comes in.

My best guess as to how these occur in a real world is a backup/restore.

I also think that potentially there should be a test that only does on demand imports. This might not be a big deal, but we see on demand imports when there are rapid number of writes. It probably won't echo much since there will be an immediate write with different body immediately after.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since migrate isn't a metadata-only update (it modifies the document body) and doesn't modify _mou, I didn't think additional test cases for specifically for migrate (inline sync metadata) were necessary.

I agree about the on-demand import case, I'll add that.

db/document.go Outdated
// computeMetadataOnlyUpdate computes a new metadataOnlyUpdate based on the existing document's CAS and metadataOnlyUpdate
func computeMetadataOnlyUpdate(currentCas uint64, currentMou *MetadataOnlyUpdate) *MetadataOnlyUpdate {
var prevCas string
currentCasString := string(base.Uint64CASToLittleEndianHex(currentCas))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
currentCasString := string(base.Uint64CASToLittleEndianHex(currentCas))
currentCasString := base.CasToString(currentCas)

@adamcfraser adamcfraser merged commit 7359b30 into main May 23, 2024
34 checks passed
@adamcfraser adamcfraser deleted the CBG-3899 branch May 23, 2024 20:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants