Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BoltDB panics on cursor search since April update #3162

Closed
vansante opened this issue May 7, 2018 · 16 comments
Closed

BoltDB panics on cursor search since April update #3162

vansante opened this issue May 7, 2018 · 16 comments

Comments

@vansante
Copy link

vansante commented May 7, 2018

  • Your Windows build number:
    Microsoft Windows [Version 10.0.17134.1]

  • What you're doing and what's happening

  1. Make sure GO is installed
  2. run: go get github.com/boltdb/bolt
  3. Put the following code in a file named main.go:
package main

import (
	"fmt"
	"log"
	"os"

	"github.com/boltdb/bolt"
)

func main() {
	os.Remove("test.db")
	db, err := bolt.Open("test.db", 0600, nil)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	db.Update(func(tx *bolt.Tx) error {
		_, err := tx.CreateBucket([]byte("MyBucket"))
		if err != nil {
			return fmt.Errorf("create bucket: %s", err)
		}
		return err
	})

	db.View(func(tx *bolt.Tx) error {
		b := tx.Bucket([]byte("MyBucket"))

		c := b.Cursor()

		c.Seek([]byte("test"))

		return nil
	})
	os.Remove("test.db")
}
  1. Run go build main.go
  2. Run ./main
  • What's wrong / what should be happening instead:
    BoltDB panics with:
panic: invalid page type: 0: 4

goroutine 1 [running]:
github.com/boltdb/bolt.(*Cursor).search(0xc420043da8, 0xc420043e80, 0x8, 0x20, 0x4)
        /mnt/d/go/src/github.com/boltdb/bolt/cursor.go:256 +0x388
github.com/boltdb/bolt.(*Cursor).seek(0xc420043da8, 0xc420043e80, 0x8, 0x20, 0x0, 0x0, 0xc420043d80, 0x8, 0x8, 0xc42000c040, ...)
        /mnt/d/go/src/github.com/boltdb/bolt/cursor.go:159 +0xa5
github.com/boltdb/bolt.(*Bucket).Bucket(0xc4200780f8, 0xc420043e80, 0x8, 0x20, 0xc420043e80)
        /mnt/d/go/src/github.com/boltdb/bolt/bucket.go:112 +0xde
github.com/boltdb/bolt.(*Tx).Bucket(0xc4200780e0, 0xc420043e80, 0x8, 0x20, 0x8)
        /mnt/d/go/src/github.com/boltdb/bolt/tx.go:101 +0x4f
main.main.func2(0xc4200780e0, 0x4e5f08, 0xc4200780e0)
        /mnt/d/main.go:28 +0x81
github.com/boltdb/bolt.(*DB).View(0xc420072000, 0x4e5f38, 0x0, 0x0)
        /mnt/d/go/src/github.com/boltdb/bolt/db.go:629 +0x90
main.main()
        /mnt/d/main.go:27 +0x10f
exit status 2

I expect no panics for this code :)

@StianOvrevage
Copy link

Same here. This causes etcd to crash which again causes test problems with kubebuilder: kubernetes-sigs/kubebuilder#364

@ghost
Copy link

ghost commented Sep 21, 2018

This is still an ongoing issue. =/

total 4.0K
drwxr-xr-x 1 sam sam 4.0K Sep 21 15:04 .
drwxr-xr-x 1 sam sam 4.0K Sep 21 15:04 ..
-rwxr-xr-x 1 sam sam 1.1K Sep 21 15:04 justbolt.go
sam@DESKTOP-3VQFLLE:~/go/src/bolt-test$ cat justbolt.go 
package main

import (
	"log"
	"fmt"
	"github.com/boltdb/bolt"
	"crypto/rand"
	"encoding/hex"
)

var db *bolt.DB




func main() {
	// Open the my.db data file in your current directory.
	// It will be created if it doesn't exist.
	db, err := bolt.Open("my.db", 0600, nil)
	if err != nil {
		log.Println("Failure at open()")
		log.Fatal(err)
	}
	defer db.Close()


	fmt.Println("Starting....")
	for i := 0; i < 500; i++ {
		// Begin a read/write Operation.
		go func(){
			db.Update(func(tx *bolt.Tx) error {

				// Create the bucket if it doesn't exists
				b, err := tx.CreateBucketIfNotExists([]byte("MyBucket"))
				if err != nil {
					return fmt.Errorf("create bucket: %s", err)
				}

				// Insert random shit... lots of it.
				err = b.Put([]byte(RandBytes()), []byte(RandBytes()))

				if err != nil {
					fmt.Println(err)
				}
				return err
			})
		}()
	}
	fmt.Println("Done...")

}

func RandBytes() string {
	c := 10
	b := make([]byte, c)
	_, err := rand.Read(b)
	if err != nil {
		fmt.Println("error:", err)
		return ""
	}
	return hex.EncodeToString(b)
}

sam@DESKTOP-3VQFLLE:~/go/src/bolt-test$ go run justbolt.go 
Starting....
Done...
panic: invalid page type: 0: 4

goroutine 11 [running]:
github.com/boltdb/bolt.(*Cursor).search(0xc000098d38, 0xc0000c80c0, 0x8, 0x8, 0x4)
	/home/sam/go/src/github.com/boltdb/bolt/cursor.go:256 +0x371
github.com/boltdb/bolt.(*Cursor).seek(0xc000098d38, 0xc0000c80c0, 0x8, 0x8, 0x0, 0x0, 0x2, 0x1, 0x0, 0xc00003ed20, ...)
	/home/sam/go/src/github.com/boltdb/bolt/cursor.go:159 +0xa5
github.com/boltdb/bolt.(*Bucket).CreateBucket(0xc0000ba0f8, 0xc0000c80c0, 0x8, 0x8, 0x10, 0x0, 0x0)
	/home/sam/go/src/github.com/boltdb/bolt/bucket.go:172 +0x101
github.com/boltdb/bolt.(*Bucket).CreateBucketIfNotExists(0xc0000ba0f8, 0xc0000c80c0, 0x8, 0x8, 0x8, 0x8, 0xc0000c80c0)
	/home/sam/go/src/github.com/boltdb/bolt/bucket.go:206 +0x4d
github.com/boltdb/bolt.(*Tx).CreateBucketIfNotExists(0xc0000ba0e0, 0xc0000c80c0, 0x8, 0x8, 0x8, 0x8, 0x49efa1)
	/home/sam/go/src/github.com/boltdb/bolt/tx.go:115 +0x4f
main.main.func1.1(0xc0000ba0e0, 0x4fe398, 0xc0000ba0e0)
	/home/sam/go/src/bolt-test/justbolt.go:34 +0x80
github.com/boltdb/bolt.(*DB).Update(0xc000092000, 0x4fe3c8, 0x0, 0x0)
	/home/sam/go/src/github.com/boltdb/bolt/db.go:598 +0x90
main.main.func1(0xc000092000)
	/home/sam/go/src/bolt-test/justbolt.go:31 +0x37
created by main.main
	/home/sam/go/src/bolt-test/justbolt.go:30 +0x106
exit status 2
sam@DESKTOP-3VQFLLE:~/go/src/bolt-test$ ls -lah
total 36K
drwxr-xr-x 1 sam sam 4.0K Sep 21 15:05 .
drwxr-xr-x 1 sam sam 4.0K Sep 21 15:04 ..
-rwxr-xr-x 1 sam sam 1.1K Sep 21 15:04 justbolt.go
-rw------- 1 sam sam  32K Sep 21 15:05 my.db
sam@DESKTOP-3VQFLLE:~/go/src/bolt-test$ go run justbolt.go 
Starting....
Done...
panic: invalid page type: 0: 4

goroutine 57 [running]:
github.com/boltdb/bolt.(*Cursor).search(0xc000096d38, 0xc000324a18, 0x8, 0x8, 0x8)
	/home/sam/go/src/github.com/boltdb/bolt/cursor.go:256 +0x371
github.com/boltdb/bolt.(*Cursor).seek(0xc000096d38, 0xc000324a18, 0x8, 0x8, 0x0, 0x0, 0x2, 0x1, 0x40ab00, 0xc0000e7d20, ...)
	/home/sam/go/src/github.com/boltdb/bolt/cursor.go:159 +0xa5
github.com/boltdb/bolt.(*Bucket).CreateBucket(0xc000378018, 0xc000324a18, 0x8, 0x8, 0x8, 0x0, 0x0)
	/home/sam/go/src/github.com/boltdb/bolt/bucket.go:172 +0x101
github.com/boltdb/bolt.(*Bucket).CreateBucketIfNotExists(0xc000378018, 0xc000324a18, 0x8, 0x8, 0x8, 0x8, 0xc000324a18)
	/home/sam/go/src/github.com/boltdb/bolt/bucket.go:206 +0x4d
github.com/boltdb/bolt.(*Tx).CreateBucketIfNotExists(0xc000378000, 0xc000324a18, 0x8, 0x8, 0x8, 0x8, 0x49efa1)
	/home/sam/go/src/github.com/boltdb/bolt/tx.go:115 +0x4f
main.main.func1.1(0xc000378000, 0x4fe398, 0xc000378000)
	/home/sam/go/src/bolt-test/justbolt.go:34 +0x80
github.com/boltdb/bolt.(*DB).Update(0xc000092000, 0x4fe3c8, 0x0, 0x0)
	/home/sam/go/src/github.com/boltdb/bolt/db.go:598 +0x90
main.main.func1(0xc000092000)
	/home/sam/go/src/bolt-test/justbolt.go:31 +0x37
created by main.main
	/home/sam/go/src/bolt-test/justbolt.go:30 +0x106
exit status 2
sam@DESKTOP-3VQFLLE:~/go/src/bolt-test$ ls -lah
total 68K
drwxr-xr-x 1 sam sam 4.0K Sep 21 15:05 .
drwxr-xr-x 1 sam sam 4.0K Sep 21 15:04 ..
-rwxr-xr-x 1 sam sam 1.1K Sep 21 15:04 justbolt.go
-rw------- 1 sam sam  64K Sep 21 15:05 my.db
sam@DESKTOP-3VQFLLE:~/go/src/bolt-test$ go run justbolt.go 
Starting....
Done...
panic: cannot free page 0 or 1: 0

goroutine 336 [running]:
github.com/boltdb/bolt.(*freelist).free(0xc00006c2a0, 0x196, 0x7ff028ca0000)
	/home/sam/go/src/github.com/boltdb/bolt/freelist.go:113 +0x398
github.com/boltdb/bolt.(*Tx).Commit(0xc00036a8c0, 0x0, 0x0)
	/home/sam/go/src/github.com/boltdb/bolt/tx.go:176 +0x1e8
github.com/boltdb/bolt.(*DB).Update(0xc00007e000, 0x4fe3c8, 0x0, 0x0)
	/home/sam/go/src/github.com/boltdb/bolt/db.go:605 +0xe8
main.main.func1(0xc00007e000)
	/home/sam/go/src/bolt-test/justbolt.go:31 +0x37
created by main.main
	/home/sam/go/src/bolt-test/justbolt.go:30 +0x106
exit status 2
sam@DESKTOP-3VQFLLE:~/go/src/bolt-test$ ls -lah
total 132K
drwxr-xr-x 1 sam sam 4.0K Sep 21 15:05 .
drwxr-xr-x 1 sam sam 4.0K Sep 21 15:04 ..
-rwxr-xr-x 1 sam sam 1.1K Sep 21 15:04 justbolt.go
-rw------- 1 sam sam 128K Sep 21 15:05 my.db
sam@DESKTOP-3VQFLLE:~/go/src/bolt-test$ go run justbolt.go 
Starting....
Done...
sam@DESKTOP-3VQFLLE:~/go/src/bolt-test$ ls -lah
total 132K
drwxr-xr-x 1 sam sam 4.0K Sep 21 15:05 .
drwxr-xr-x 1 sam sam 4.0K Sep 21 15:04 ..
-rwxr-xr-x 1 sam sam 1.1K Sep 21 15:04 justbolt.go
-rw------- 1 sam sam 128K Sep 21 15:05 my.db
sam@DESKTOP-3VQFLLE:~/go/src/bolt-test$ go run justbolt.go 
Starting....
Done...
panic: cannot free page 0 or 1: 0

goroutine 154 [running]:
github.com/boltdb/bolt.(*freelist).free(0xc00006c2a0, 0x412, 0x7fd219c30000)
	/home/sam/go/src/github.com/boltdb/bolt/freelist.go:113 +0x398
github.com/boltdb/bolt.(*Tx).Commit(0xc000500000, 0x0, 0x0)
	/home/sam/go/src/github.com/boltdb/bolt/tx.go:176 +0x1e8
github.com/boltdb/bolt.(*DB).Update(0xc00007e000, 0x4fe3c8, 0x0, 0x0)
	/home/sam/go/src/github.com/boltdb/bolt/db.go:605 +0xe8
main.main.func1(0xc00007e000)
	/home/sam/go/src/bolt-test/justbolt.go:31 +0x37
created by main.main
	/home/sam/go/src/bolt-test/justbolt.go:30 +0x106
exit status 2
sam@DESKTOP-3VQFLLE:~/go/src/bolt-test$ go run justbolt.go 
Starting....
Done...

@oneumyvakin
Copy link

Looks like this issue in BoltDB cause to crashing docker:

panic: invalid page type: 0: 4

goroutine 1 [running]:
github.com/docker/docker/vendor/github.com/boltdb/bolt.(*Cursor).search(0xc42052ecb0, 0x7f18c9286f80, 0x7, 0x7, 0x4)
        /go/src/github.com/docker/docker/vendor/github.com/boltdb/bolt/cursor.go:256 +0x38a
github.com/docker/docker/vendor/github.com/boltdb/bolt.(*Cursor).seek(0xc42052ecb0, 0x7f18c9286f80, 0x7, 0x7, 0x0, 0x0, 0xc42052ecf8, 0x7f18c4de06c8, 0x0, 0xc42034c470, ...)
        /go/src/github.com/docker/docker/vendor/github.com/boltdb/bolt/cursor.go:159 +0xa7
github.com/docker/docker/vendor/github.com/boltdb/bolt.(*Bucket).Bucket(0xc420424558, 0x7f18c9286f80, 0x7, 0x7, 0x7f18c6fe4b98)
        /go/src/github.com/docker/docker/vendor/github.com/boltdb/bolt/bucket.go:112 +0xe0
github.com/docker/docker/vendor/github.com/boltdb/bolt.(*Tx).Bucket(0xc420424540, 0x7f18c9286f80, 0x7, 0x7, 0x0)
        /go/src/github.com/docker/docker/vendor/github.com/boltdb/bolt/tx.go:101 +0x51
github.com/docker/docker/volume/service.listMeta(0xc420424540, 0xc400000008, 0xc420324000, 0x0)
        /go/src/github.com/docker/docker/volume/service/db.go:78 +0x66
github.com/docker/docker/volume/service.(*VolumeStore).restore.func1(0xc420424540, 0x7f18c86e9e70, 0xc420424540)
        /go/src/github.com/docker/docker/volume/service/restore.go:21 +0x36
github.com/docker/docker/vendor/github.com/boltdb/bolt.(*DB).View(0xc42034e1e0, 0xc42052eeb0, 0x0, 0x0)
        /go/src/github.com/docker/docker/vendor/github.com/boltdb/bolt/db.go:626 +0x92
github.com/docker/docker/volume/service.(*VolumeStore).restore(0xc420344a00)
        /go/src/github.com/docker/docker/volume/service/restore.go:20 +0x8c
github.com/docker/docker/volume/service.NewStore(0x7f18c7ae98ab, 0xf, 0xc42054e960, 0x0, 0x0, 0x0)
        /go/src/github.com/docker/docker/volume/service/store.go:107 +0x114
github.com/docker/docker/volume/service.NewVolumeService(0x7f18c7ae98ab, 0xf, 0x7f18c8723760, 0xc42033a090, 0x0, 0x0, 0x7f18c86ff8e0, 0xc420266000, 0x0, 0xc42033a1b0, ...)
        /go/src/github.com/docker/docker/volume/service/service.go:44 +0x147
github.com/docker/docker/daemon.NewDaemon(0xc420422000, 0x7f18c8734900, 0xc420240b40, 0x7f18c8707a20, 0xc42038a000, 0xc42033a090, 0x0, 0x0, 0x0)
        /go/src/github.com/docker/docker/daemon/daemon.go:785 +0x1557
main.(*DaemonCli).start(0xc42023e6f0, 0xc4202eeea0, 0x0, 0x0)
        /go/src/github.com/docker/docker/cmd/dockerd/daemon.go:165 +0x97e
main.runDaemon(0xc4202eeea0, 0xc420488200, 0x0)
        /go/src/github.com/docker/docker/cmd/dockerd/docker_unix.go:7 +0x47
main.newDaemonCommand.func1(0xc420096000, 0xc42045fe00, 0x0, 0x2, 0x0, 0x0)
        /go/src/github.com/docker/docker/cmd/dockerd/docker.go:29 +0x5d
github.com/docker/docker/vendor/github.com/spf13/cobra.(*Command).execute(0xc420096000, 0xc42003a190, 0x2, 0x2, 0xc420096000, 0xc42003a190)
        /go/src/github.com/docker/docker/vendor/github.com/spf13/cobra/command.go:762 +0x46a
github.com/docker/docker/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0xc420096000, 0x7f18c86f9ac0, 0x7f18c82d4960, 0x7f18c86f9ad0)
        /go/src/github.com/docker/docker/vendor/github.com/spf13/cobra/command.go:852 +0x30c
github.com/docker/docker/vendor/github.com/spf13/cobra.(*Command).Execute(0xc420096000, 0xc42000e020, 0x7f18c66371cf)
        /go/src/github.com/docker/docker/vendor/github.com/spf13/cobra/command.go:800 +0x2d
main.main()
        /go/src/github.com/docker/docker/cmd/dockerd/docker.go:70 +0xa2

@ghost
Copy link

ghost commented Nov 6, 2018

I believe this is not a boltdb issue, but an issue with how WSL handles mmap calls from what I've observed.

@sempervictus
Copy link

This also apparently happens on top of ZFS, speciflcally from an inside an Ubuntu LXC container on an Arch host which actually runs the ZFS storage pool on which the container resides:

Dec 23 07:51:49 containerd[10083]: time="2018-12-23T07:51:49.560429129Z" level=info msg="starting containerd" revision=c4446665cb9c30056f4998ed953e6d4ff22c7c39 version=1.2.0
Dec 23 07:51:49 containerd[10083]: time="2018-12-23T07:51:49.560943286Z" level=info msg="loading plugin "io.containerd.content.v1.content"..." type=io.containerd.content.v1
Dec 23 07:51:49 containerd[10083]: time="2018-12-23T07:51:49.561012123Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.btrfs"..." type=io.containerd.snapshotter.v1
Dec 23 07:51:49 containerd[10083]: time="2018-12-23T07:51:49.561283392Z" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.btrfs" error="path /var/lib/containerd/io.containerd.snapshotter.v1.btrfs must be a btrfs filesystem to be used with the btrfs snapshotter"
Dec 23 07:51:49 containerd[10083]: time="2018-12-23T07:51:49.561307339Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.aufs"..." type=io.containerd.snapshotter.v1
Dec 23 07:51:49 containerd[10083]: time="2018-12-23T07:51:49.563268640Z" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.aufs" error="modprobe aufs failed: "modprobe: FATAL: Module aufs not found in directory /lib/modules/4.14.82\n": exit status 1"
Dec 23 07:51:49 containerd[10083]: time="2018-12-23T07:51:49.563297185Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.native"..." type=io.containerd.snapshotter.v1
Dec 23 07:51:49 containerd[10083]: time="2018-12-23T07:51:49.563345551Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.overlayfs"..." type=io.containerd.snapshotter.v1
Dec 23 07:51:49 containerd[10083]: time="2018-12-23T07:51:49.563536088Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.zfs"..." type=io.containerd.snapshotter.v1
Dec 23 07:51:49 containerd[10083]: time="2018-12-23T07:51:49.563857555Z" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.zfs" error="exec: "zfs": executable file not found in $PATH: "zfs zfs list -Hp -o name,origin,used,available,mountpoint,compression,type,volsize,quota,referenced,written,logicalused,usedbydataset svdc-is03-1t4sz/lxc/sv-so00-broken"
Dec 23 07:51:49 containerd[10083]: time="2018-12-23T07:51:49.563893055Z" level=info msg="loading plugin "io.containerd.metadata.v1.bolt"..." type=io.containerd.metadata.v1
Dec 23 07:51:49 containerd[10083]: time="2018-12-23T07:51:49.563923105Z" level=warning msg="could not use snapshotter btrfs in metadata plugin" error="path /var/lib/containerd/io.containerd.snapshotter.v1.btrfs must be a btrfs filesystem to be used with the btrfs snapshotter"
Dec 23 07:51:49 containerd[10083]: time="2018-12-23T07:51:49.563938615Z" level=warning msg="could not use snapshotter aufs in metadata plugin" error="modprobe aufs failed: "modprobe: FATAL: Module aufs not found in directory /lib/modules/4.14.82\n": exit status 1"
Dec 23 07:51:49 containerd[10083]: time="2018-12-23T07:51:49.563960336Z" level=warning msg="could not use snapshotter zfs in metadata plugin" error="exec: "zfs": executable file not found in $PATH: "zfs zfs list -Hp -o name,origin,used,available,mountpoint,compression,type,volsize,quota,referenced,written,logicalused,usedbydataset svdc-is03-1t4sz/lxc/sv-so00-broken" => "
Dec 23 07:51:49 containerd[10083]: panic: invalid page type: 8030604429271643239: 6570
Dec 23 07:51:49 containerd[10083]: goroutine 1 [running]:
Dec 23 07:51:49 containerd[10083]: github.com/containerd/containerd/vendor/go.etcd.io/bbolt.(*Cursor).search(0xc42050b298, 0xc42050b398, 0x2, 0x20, 0x1b)
Dec 23 07:51:49 containerd[10083]:         /go/src/github.com/containerd/containerd/vendor/go.etcd.io/bbolt/cursor.go:250 +0x38a
Dec 23 07:51:49 containerd[10083]: github.com/containerd/containerd/vendor/go.etcd.io/bbolt.(*Cursor).seek(0xc42050b298, 0xc42050b398, 0x2, 0x20, 0x0, 0x0, 0x2f6528fe680, 0x2f6530d7880, 0x0, 0x0, ...)
Dec 23 07:51:49 containerd[10083]:         /go/src/github.com/containerd/containerd/vendor/go.etcd.io/bbolt/cursor.go:159 +0xa7
Dec 23 07:51:49 containerd[10083]: github.com/containerd/containerd/vendor/go.etcd.io/bbolt.(*Bucket).Bucket(0xc420424018, 0xc42050b398, 0x2, 0x20, 0xc42050b398)
Dec 23 07:51:49 containerd[10083]:         /go/src/github.com/containerd/containerd/vendor/go.etcd.io/bbolt/bucket.go:105 +0xe0
Dec 23 07:51:49 containerd[10083]: github.com/containerd/containerd/vendor/go.etcd.io/bbolt.(*Tx).Bucket(0xc420424000, 0xc42050b398, 0x2, 0x20, 0x2)
Dec 23 07:51:49 containerd[10083]:         /go/src/github.com/containerd/containerd/vendor/go.etcd.io/bbolt/tx.go:101 +0x51
Dec 23 07:51:49 containerd[10083]: github.com/containerd/containerd/metadata.(*DB).Init.func1(0xc420424000, 0x2f6528daf68, 0xc420424000)
Dec 23 07:51:49 containerd[10083]:         /go/src/github.com/containerd/containerd/metadata/db.go:121 +0x174
Dec 23 07:51:49 containerd[10083]: github.com/containerd/containerd/vendor/go.etcd.io/bbolt.(*DB).Update(0xc420422000, 0xc42050b520, 0x0, 0x0)
Dec 23 07:51:49 containerd[10083]:         /go/src/github.com/containerd/containerd/vendor/go.etcd.io/bbolt/db.go:670 +0x92
Dec 23 07:51:49 containerd[10083]: github.com/containerd/containerd/metadata.(*DB).Init(0xc4203a26c0, 0x2f6528ff040, 0xc42003c018, 0xc42050b640, 0xc4203a26c0)
Dec 23 07:51:49 containerd[10083]:         /go/src/github.com/containerd/containerd/metadata/db.go:103 +0xb1
Dec 23 07:51:49 containerd[10083]: github.com/containerd/containerd/services/server.LoadPlugins.func2(0xc42036aa80, 0x2f652080c31, 0xf, 0xc42024f060, 0x1e)
Dec 23 07:51:49 containerd[10083]:         /go/src/github.com/containerd/containerd/services/server/server.go:269 +0x53f
Dec 23 07:51:49 containerd[10083]: github.com/containerd/containerd/plugin.(*Registration).Init(0xc4203a6550, 0xc42036aa80, 0xc4203a6550)
Dec 23 07:51:49 containerd[10083]:         /go/src/github.com/containerd/containerd/plugin/plugin.go:100 +0x3a
Dec 23 07:51:49 containerd[10083]: github.com/containerd/containerd/services/server.New(0x2f6528ff040, 0xc42003c018, 0xc4200f2000, 0x1, 0xc42050bc80, 0x0)
Dec 23 07:51:49 containerd[10083]:         /go/src/github.com/containerd/containerd/services/server/server.go:120 +0x557
Dec 23 07:51:49 containerd[10083]: github.com/containerd/containerd/cmd/containerd/command.App.func1(0xc4200e8000, 0xc4200e8000, 0xc42050bd07)
Dec 23 07:51:49 containerd[10083]:         /go/src/github.com/containerd/containerd/cmd/containerd/command/main.go:141 +0x67e
Dec 23 07:51:49 containerd[10083]: github.com/containerd/containerd/vendor/github.com/urfave/cli.HandleAction(0x2f6526bff40, 0x2f6528d9360, 0xc4200e8000, 0xc4202d6120, 0x0)
Dec 23 07:51:49 containerd[10083]:         /go/src/github.com/containerd/containerd/vendor/github.com/urfave/cli/app.go:502 +0xca
Dec 23 07:51:49 containerd[10083]: github.com/containerd/containerd/vendor/github.com/urfave/cli.(*App).Run(0xc4202c8000, 0xc42003a1b0, 0x1, 0x1, 0x0, 0x0)
Dec 23 07:51:49 containerd[10083]:         /go/src/github.com/containerd/containerd/vendor/github.com/urfave/cli/app.go:268 +0x60e
Dec 23 07:51:49 containerd[10083]: main.main()
Dec 23 07:51:49 containerd[10083]:         github.com/containerd/containerd/cmd/containerd/main.go:33 +0x51
Dec 23 07:51:49 systemd[1]: containerd.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Dec 23 07:51:49 systemd[1]: containerd.service: Unit entered failed state.
Dec 23 07:51:49 systemd[1]: containerd.service: Failed with result 'exit-code'.

Removing and reinstalling docker has no effect, its proper broken. Any way to disable the behavior via config?

@ghost
Copy link

ghost commented Dec 23, 2018

@sempervictus Just wanted to clarify, you experienced this running Arch on WSL with containers in that?

Also have you confirmed you can reproduce it in Arch on WSL as well? I've personally not considered trying different distros, because I've been fairly certain it's the compatibility layer on Windows' end.

@therealkenc
Copy link
Collaborator

Yes Occam says this is variation #3451 #902. It wasn't duped because there is no signature mmap() fail in the strace log provided in the OP. It hasn't been triaged (not by me anyway). Ref #2304.

@sempervictus
Copy link

sempervictus commented Dec 24, 2018

@petrosam: no, plain old Linux kernel, just using a filesystem originated in solaris.

Edit: in my experience golang has some issues with POSIX semantics, it makes assumptions about features like vdso, map hints, etc. ZFS on Linux has some differences from illumos and BSD in the memory management subsystem - Linux memory management is in itself somewhat different from the unix-ish OS'. So mmapped files in this context may well show errors you're seeing on other systems with not-exactly-linux semantics. I'd see how the BSD Linux containers run this, may happen there too.
Meantime, this is actually hitting a production NIDS in our environment and we've had to roll back to a November snapshot which doesn't have the newer docker-ce in it. Is there any magical incantation to disable the supposedly culprit behavior (make boltdb or its users access files in the manner they did before the change in boltdb)?

@therealkenc
Copy link
Collaborator

Edit: in my experience golang has some issues with POSIX semantics

That's possible. There's an additional challenge in that platform nuances can be worked-around in the source code with configuration knobs on other platforms like *BSD or MacOS (be it the filesystem or MM or IPC mechanisms or whatever). WSL pretends to be Real Linux, so the tiniest diverge in behavior (documented or otherwise) from the real thing can cause subtle problems. Constructive thing to do would be to try to convince someone familiar with the BoltDB (and golang) internals to take a look.

@Downchuck
Copy link

Downchuck commented Jan 31, 2019

Yeah I'm seeing this in the Ubuntu distribution on WSL as well.
I've filed an issue with the etc-d bbolt project etcd-io/bbolt#144

Retried in an admin terminal after cgroups mount and things spun up

sudo cgroupfs-mount
sudo dockerd -D

No luck with an image but at least it's a step forward

docker -H unix:///var/run/docker.sock run hello-world

@StackDoubleFlow
Copy link

This issue has been on wsl for a very long time. I'm really hope it can get fixed soon.

@therealkenc
Copy link
Collaborator

therealkenc commented May 14, 2019

This issue has been on wsl for a very long time

As an initial matter, this issue doesn't have a repro. "1. make sure GO is installed". Yes, someone probably could have looked up how to do that. Turns out someone didn't, and this went chirping crickets (along with many other issues without a repro).

Secondly, one year (May 2018) doesn't set any distance records.

But you made me look on a hunch...

$ mkdir -p $HOME/Downloads && cd Downloads
$ wget https://dl.google.com/go/go1.12.5.linux-amd64.tar.gz
$ tar xf go1.12.5.linux-amd64.tar.gz
$ sudo mv go /usr/local
$ export GOROOT=/usr/local/go
$ mkdir -p $HOME/goprojects
$ export GOPATH=$HOME/goprojects
$ export PATH=/usr/local/go/bin:$PATH
$ cd $GOPATH
$ go get github.com/boltdb/bolt
$ # put OP test case in main.go; although that should be a wget CLI too
$ go build main.go
$ cat /proc/version
Linux version 4.4.0-18890-Microsoft (Microsoft@Microsoft.com) (gcc version 5.4.0 (GCC) ) #1000-Microsoft Fri Apr 26 16:18:00 PST 2019
$ ./main
$ # No errors. 
$ # If you comment out the os.Remove("test.db")  [that's probably a thing] and recompile
$ go build main.go
$ ./main
$ ls -l test.db
-rw------- 1 ken ken 32768 May 13 21:12 test.db

Calling this #3939 addressed in 18890.

@rfikki
Copy link

rfikki commented May 15, 2019

$ cd /home/rfikki/downloads
$ wget https://dl.google.com/go/go1.12.5.linux-amd64.tar.gz
Will not apply HSTS. The HSTS database must be a regular and non-world-writable file.
ERROR: could not open HSTS store at '/home/rfikki/.wget-hsts'. HSTS will be disabled.
--2019-05-15 07:40:39-- https://dl.google.com/go/go1.12.5.linux-amd64.tar.gz
Resolving dl.google.com (dl.google.com)... 2607:f8b0:4009:805::200e, 172.217.0.14
Connecting to dl.google.com (dl.google.com)|2607:f8b0:4009:805::200e|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 127938445 (122M) [application/octet-stream]
Saving to: ‘go1.12.5.linux-amd64.tar.gz’

go1.12.5.linux-amd64.tar.gz 100%[=================================================>] 122.01M 32.2MB/s in 4.1s

2019-05-15 07:40:43 (30.1 MB/s) - ‘go1.12.5.linux-amd64.tar.gz’ saved [127938445/127938445]

$ ls
go1.12.5.linux-amd64.tar.gz
$ tar xf go1.12.5.linux-amd64.tar.gz
$ sudo mv go /usr/local
[sudo] password for rfikki:
$ export GOROOT=/usr/local/go
$ mkdir -p $HOME/goprojects
$ export GOPATH=$HOME/goprojects
$ export PATH=/usr/local/go/bin:$PATH
$ cd $GOPATH
$ go get github.com/boltdb/bolt
$ cat /proc/version
Linux version 4.4.0-17134-Microsoft (Microsoft@Microsoft.com) (gcc version 5.4.0 (GCC) ) #706-Microsoft Mon Apr 01 18:13:00 PST 2019
$ ./main
-bash: ./main: No such file or directory
$ go build main.go
stat main.go: no such file or directory
$ cd $GOPATH
$ cat /proc/version
Linux version 4.4.0-17134-Microsoft (Microsoft@Microsoft.com) (gcc version 5.4.0 (GCC) ) #706-Microsoft Mon Apr 01 18:13:00 PST 2019
$ joe maain.go
^K I A maain.go (Modified)(go) Row 38 Col 1

func main() {
os.Remove("test.db")
db, err := bolt.Open("test.db", 0600, nil)
if err != nil {
log.Fatal(err)
}
defer db.Close()

    db.Update(func(tx *bolt.Tx) error {
            _, err := tx.CreateBucket([]byte("MyBucket"))
            if err != nil {
                    return fmt.Errorf("create bucket: %s", err)
            }
            return err
    })

    db.View(func(tx *bolt.Tx) error {
            b := tx.Bucket([]byte("MyBucket"))

            c := b.Cursor()

            c.Seek([]byte("test"))

            return nil
    })
    os.Remove("test.db")

}

File maain.go saved
$ cp maain.go main.go
$ ls
maain.go main.go pkg src
$ go build main.go
$ ./main
panic: invalid page type: 0: 4

goroutine 1 [running]:
github.com/boltdb/bolt.(*Cursor).search(0xc00009ae20, 0xc00009aea8, 0x8, 0x8, 0x4)
/home/rfikki/goprojects/src/github.com/boltdb/bolt/cursor.go:256 +0x354
github.com/boltdb/bolt.(*Cursor).seek(0xc00009ae20, 0xc00009aea8, 0x8, 0x8, 0x4cb6a0, 0x7f2f206d0001, 0xc000018170, 0xc00009ae38, 0x43c892, 0xc00000e040, ...)
/home/rfikki/goprojects/src/github.com/boltdb/bolt/cursor.go:159 +0x7e
github.com/boltdb/bolt.(*Bucket).Bucket(0xc0000a00f8, 0xc00009aea8, 0x8, 0x8, 0xc00000e040)
/home/rfikki/goprojects/src/github.com/boltdb/bolt/bucket.go:112 +0xd5
github.com/boltdb/bolt.(*Tx).Bucket(...)
/home/rfikki/goprojects/src/github.com/boltdb/bolt/tx.go:101
main.main.func2(0xc0000a00e0, 0x4ecc50, 0xc0000a00e0)
/home/rfikki/goprojects/main.go:28 +0x72
github.com/boltdb/bolt.(*DB).View(0xc00000a1e0, 0x4ecc80, 0x0, 0x0)
/home/rfikki/goprojects/src/github.com/boltdb/bolt/db.go:629 +0x90
main.main()
/home/rfikki/goprojects/main.go:27 +0x10f

@therealkenc
Copy link
Collaborator

Linux version 4.4.0-17134-

Well, yeah.

I have pruned your cross post.

@tomchiverton
Copy link

tomchiverton commented Mar 3, 2020

The release version of Docker does not run under WSL on Ubuntu 18.04 because of this bug.

Is there any eta anyone knows from Microsoft of when the bug fix will be in the release version of Windows ?

@krader1961
Copy link

I'm seeing the same failure using the Elvish shell which uses BoltDb. My Windows 10 VM is up to date. I don't see any reason to think https://docs.microsoft.com/en-us/windows/wsl/release-notes#build-18890 has resolved this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

10 participants