
iterate returns duplicate keys #632

Open
xiang90 opened this issue Dec 16, 2016 · 9 comments

@xiang90
Contributor

xiang90 commented Dec 16, 2016

I expected iterating over a bucket with a cursor to return keys in order, but it returns duplicated keys.

Here is the data

And here is the script to reproduce the problem

Here is the output:

key="\x00\x00\x00\x00\x00\x00\xfe\x9d_\x00\x00\x00\x00\x00\x00\x00\x00"
key="\x00\x00\x00\x00\x00\x00\xfe\x9c_\x00\x00\x00\x00\x00\x00\x00\x00"
key="\x00\x00\x00\x00\x00\x00\xfe\x9b_\x00\x00\x00\x00\x00\x00\x00\x00"
key="\x00\x00\x00\x00\x00\x00\xfe\x9a_\x00\x00\x00\x00\x00\x00\x00\x00"
duplicated key="\x00\x00\x00\x00\x00\x00\xfe\x9c_\x00\x00\x00\x00\x00\x00\x00\x00"!!!!
duplicated key="\x00\x00\x00\x00\x00\x00\xfe\x9b_\x00\x00\x00\x00\x00\x00\x00\x00"!!!!
duplicated key="\x00\x00\x00\x00\x00\x00\xfe\x9a_\x00\x00\x00\x00\x00\x00\x00\x00"!!!!

Am I missing something?

The key \x00\x00\x00\x00\x00\x00\xfe\x9c_\x00\x00\x00\x00\x00\x00\x00\x00 shows up twice during cursor iteration.
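
The script itself isn't attached here, but a minimal sketch of a cursor scan that would produce output of this shape could look like the following. The db path and the bucket name "keys" are assumptions, and it walks the bucket in reverse since the keys above come out in descending order:

package main

import (
	"fmt"
	"log"

	"github.com/boltdb/bolt"
)

func main() {
	// "db" is a placeholder path for the data file linked above.
	db, err := bolt.Open("db", 0600, nil)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	if err := db.View(func(tx *bolt.Tx) error {
		c := tx.Bucket([]byte("keys")).Cursor()
		seen := make(map[string]bool)
		// Walk in reverse; the output above is in descending key order.
		for k, _ := c.Last(); k != nil; k, _ = c.Prev() {
			if seen[string(k)] {
				fmt.Printf("duplicated key=%q!!!!\n", k)
			} else {
				fmt.Printf("key=%q\n", k)
			}
			seen[string(k)] = true
		}
		return nil
	}); err != nil {
		log.Fatal(err)
	}
}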

@xiang90
Contributor Author

xiang90 commented Dec 16, 2016

I looked into it more. I think the db itself is actually corrupted during a page split (or we fail to reload the nodes correctly).

There are 3 duplicated keys in two adjacent pages:

load index 2 from page 1799
"\x00\x00\x00\x00\x00\x00\xfe\x9c_\x00\x00\x00\x00\x00\x00\x00\x00"
load index 1 from page 1799
"\x00\x00\x00\x00\x00\x00\xfe\x9b_\x00\x00\x00\x00\x00\x00\x00\x00"
load index 0 from page 1799
"\x00\x00\x00\x00\x00\x00\xfe\x9a_\x00\x00\x00\x00\x00\x00\x00\x00"
load index 32 from page 1798
"\x00\x00\x00\x00\x00\x00\xfe\x9c_\x00\x00\x00\x00\x00\x00\x00\x00"
load index 31 from page 1798
"\x00\x00\x00\x00\x00\x00\xfe\x9b_\x00\x00\x00\x00\x00\x00\x00\x00"
load index 30 from page 1798
"\x00\x00\x00\x00\x00\x00\xfe\x9a_\x00\x00\x00\x00\x00\x00\x00\x00"
load index 29 from page 1798

@benbjohnson
Member

@xiang90 Are you retaining and altering the []byte that you pass as a key or value to Bucket.Put()?
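
For context: Bolt requires the key and value slices passed to Put to stay unchanged until the transaction commits, since it holds a reference to them in the in-memory node until then. A sketch of the unsafe pattern being asked about, with hypothetical names:

package playground

import (
	"encoding/binary"

	"github.com/boltdb/bolt"
)

// unsafePuts (hypothetical) reuses one key buffer across Put calls in
// a single transaction. Bolt retains the slice until the transaction
// commits, so each iteration rewrites the previously inserted key.
func unsafePuts(tx *bolt.Tx, values [][]byte) error {
	bucket := tx.Bucket([]byte("keys"))
	key := make([]byte, 8)
	for i, v := range values {
		// Mutates the backing array Bolt is still holding.
		binary.BigEndian.PutUint64(key, uint64(i))
		if err := bucket.Put(key, v); err != nil {
			return err
		}
	}
	return nil
}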

@xiang90
Contributor Author

xiang90 commented Dec 16, 2016

@benbjohnson No. Each key is put only once without any buffering, so it is never accessed afterward. But I will double-check.

@xiang90
Contributor Author

xiang90 commented Dec 16, 2016

@benbjohnson I suspect the split code is the cause:

	// Split inodes across two nodes.
	next.inodes = n.inodes[splitIndex:]
	n.inodes = n.inodes[:splitIndex]

We do not copy the backing array, so if we append an element to n.inodes, it overwrites the first element of next.inodes.

See my example here:

https://play.golang.org/p/xjI9r6rpow

But this is just my guess from reading a limited amount of the boltdb code. I am not sure.
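
The playground link above demonstrates slice aliasing of this kind; as a standalone sketch:

package main

import "fmt"

func main() {
	// Mirror of the split: n and next share one backing array.
	inodes := []int{1, 2, 3, 4, 5, 6}
	next := inodes[3:] // [4 5 6]
	n := inodes[:3]    // [1 2 3], but cap is still 6

	// append writes into the shared backing array at index 3,
	// clobbering next[0].
	n = append(n, 99)

	fmt.Println(n)    // [1 2 3 99]
	fmt.Println(next) // [99 5 6] <- first element overwritten
}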

@xiang90
Contributor Author

xiang90 commented Dec 16, 2016

On second thought, it might not be the exact cause.

We put keys sequentially. We do not miss any keys, but there are 3 duplicates in between. Basically it looks like:

<------- |dup|------->
1 2 3 4 5 3 4 5 6 7 8 9

@benbjohnson
Member

@xiang90 Do you have an example program that writes the data?

@xiang90
Contributor Author

xiang90 commented Dec 17, 2016

@benbjohnson

package playground

import (
	"encoding/binary"
	"sync"

	"github.com/boltdb/bolt"
)

type store struct {
	sync.Mutex
	revision int
	tx       *bolt.Tx
	db       *bolt.DB
}

// intToBytes is assumed here to be a fixed-width big-endian encoding,
// so that byte order matches revision order; the original helper is
// not shown in the thread.
func intToBytes(n int) []byte {
	b := make([]byte, 8)
	binary.BigEndian.PutUint64(b, uint64(n))
	return b
}

func (s *store) put(value []byte) {
	s.Lock()
	defer s.Unlock()

	bucket := s.tx.Bucket([]byte("keys"))
	if bucket == nil {
		panic("bucket keys does not exist")
	}

	// It is useful to increase the fill percent when the workload is
	// sequential append; this delays page splits and reduces space usage.
	bucket.FillPercent = 0.9

	if err := bucket.Put(intToBytes(s.revision), value); err != nil {
		panic(err)
	}
	s.revision++
}

func (s *store) get(rev int) []byte {
	s.Lock()
	defer s.Unlock()

	bucket := s.tx.Bucket([]byte("keys"))
	if bucket == nil {
		panic("bucket keys does not exist")
	}

	// Copy the value out: the returned slice is only valid while the
	// transaction is open.
	b := bucket.Get(intToBytes(rev))
	nb := make([]byte, len(b))
	copy(nb, b)
	return nb
}

// commit is called every 100ms, or every 10000 puts.
func (s *store) commit() {
	s.Lock()
	defer s.Unlock()

	if err := s.tx.Commit(); err != nil {
		panic(err)
	}

	var err error
	s.tx, err = s.db.Begin(true)
	if err != nil {
		panic(err)
	}
}

It looks pretty much like this.

@xiang90
Contributor Author

xiang90 commented Dec 17, 2016

The put bytes are generated by marshaling a protobuf request, so they are never accessed again.

The get part is not exactly accurate: we actually unmarshal the data into a protobuf response while holding the lock. Protobuf unmarshalling should copy the data.
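
Roughly, the surrounding code looks like this. This is a sketch only; the marshaler/unmarshaler interfaces stand in for the generated protobuf types, which are not shown in the thread:

// In the same package as the store above.

type marshaler interface {
	Marshal() ([]byte, error)
}

type unmarshaler interface {
	Unmarshal(data []byte) error
}

// putRequest: Marshal returns a freshly allocated slice, so the bytes
// handed to put() are never touched again after the call.
func (s *store) putRequest(req marshaler) {
	data, err := req.Marshal()
	if err != nil {
		panic(err)
	}
	s.put(data)
}

// getResponse: get() copies out of the bolt page, and Unmarshal copies
// byte fields again into the message.
func (s *store) getResponse(rev int, resp unmarshaler) {
	if err := resp.Unmarshal(s.get(rev)); err != nil {
		panic(err)
	}
}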

@jsvisa

jsvisa commented Nov 27, 2017

@xiang90 We ran into the same issue. How did you fix it?
