This repository has been archived by the owner on Jul 18, 2023. It is now read-only.

Discussion: Storage engine options and defaults #37

Closed
patrickmn opened this issue Jun 7, 2017 · 10 comments

Comments

@patrickmn
Contributor

Constellation currently uses BerkeleyDB for all storage. It includes code for LevelDB and SQLite, but neither can currently be selected. The simple reason BerkeleyDB is the default is that it was faster than the other options in our testing.

Constellation was specifically designed to use stateless crypto (XSalsa20 allows for randomly generated nonces) in order to support hosting the same key pair on multiple Constellation nodes and using a shared underlying datastore like S3 without contentious nonce management; the only restriction is that the data store must have read-after-creation consistency. With S3 and similar services, thinking about redundancy and backups becomes a lot simpler, and since the stored payloads are encrypted, keeping them with a cloud provider doesn't involve much risk.

Ideally, the --storage option will work as follows:

  • constellation-node --storage=data -- use default engine (BerkeleyDB) in the folder 'data'
  • constellation-node --storage=bdb:data -- explicitly use BerkeleyDB in the folder 'data'
  • constellation-node --storage=s3:constellationstore -- use the 'constellationstore' bucket on S3 (credentials fetched from ~/.aws/credentials or env vars on startup)
  • ...
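The proposed `--storage` syntax amounts to an optional engine prefix followed by a location. A minimal sketch of how such a value could be parsed (Python used purely for illustration; the function name and the `"bdb"` default mirror the examples above, not Constellation's actual implementation):

```python
def parse_storage(spec, default_engine="bdb"):
    """Split an 'engine:location' spec into its parts.

    A bare value like 'data' has no engine prefix, so it falls back to
    the default engine; 'bdb:data' and 's3:constellationstore' name the
    engine explicitly.
    """
    engine, sep, location = spec.partition(":")
    if not sep:
        # No ':' present -- the whole spec is the location.
        return default_engine, spec
    return engine, location

print(parse_storage("data"))                   # ('bdb', 'data')
print(parse_storage("s3:constellationstore"))  # ('s3', 'constellationstore')
```

A real implementation would also validate the engine name against the set of compiled-in backends before opening the store.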

Now:

  • What should the out-of-the-box default be? Should it continue to be BerkeleyDB? Other options include BoltDB, LevelDB, RocksDB, ...
  • What other options should be supported? For example:
    • S3
    • Google Cloud DataStore (10MiB object limit) or Google Blobstore
    • Azure Blob Storage
    • Redis
    • Tahoe-LAFS
    • seaweedfs
    • ...

(These would be the options supported out of the box in the standalone version, but you would still be able to import Constellation as a library and supply anything that satisfies the Storage datatype for exotic requirements.)
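The pluggable-backend idea can be illustrated with a minimal interface sketch. Constellation itself defines a Haskell Storage datatype; the Python below is only an analogue, and the method names are assumptions:

```python
from abc import ABC, abstractmethod

class Storage(ABC):
    """Illustrative analogue of a pluggable storage backend: anything
    providing put/get with read-after-creation consistency could back
    a node."""

    @abstractmethod
    def put(self, key: bytes, payload: bytes) -> None: ...

    @abstractmethod
    def get(self, key: bytes) -> bytes: ...

class MemoryStorage(Storage):
    """Trivial in-memory backend, useful only for tests."""

    def __init__(self):
        self._data = {}

    def put(self, key, payload):
        self._data[key] = payload

    def get(self, key):
        return self._data[key]
```

An S3, Redis, or BerkeleyDB backend would be one more subclass, which is what lets the standalone binary and library users share the same machinery.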

@patrickmn
Contributor Author

After writing this, I realized a useful escape hatch is a directory storage engine that simply creates a file for each payload. That way, you can use any FUSE adapter you want (as long as it has read-after-create consistency).
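A file-per-payload engine can get read-after-create consistency on a local filesystem by writing to a temporary file and renaming it into place, so a reader never observes a partially written payload. A sketch of that maildir-style approach (naming files by content hash is an assumption, not Constellation's scheme):

```python
import hashlib
import os
import tempfile

def store_payload(directory, payload):
    """Write one file per payload, atomically.

    The payload is written to a temp file in the same directory and then
    renamed over its final name; os.replace is atomic on POSIX, so the
    file either exists complete or not at all.
    """
    name = hashlib.sha256(payload).hexdigest()
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        f.write(payload)
    os.replace(tmp_path, os.path.join(directory, name))
    return name

def load_payload(directory, name):
    """Read a previously stored payload back by name."""
    with open(os.path.join(directory, name), "rb") as f:
        return f.read()
```

Whether a given FUSE adapter preserves this atomic-rename behavior would need to be checked per backend.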

PR here: #38

@zookozcash

Coincidentally, in the Zcash project we're in the process of migrating off of BerkeleyDB (which we suspect of being unreliable) to some alternative, potentially sqlite. Here is our discussion of that: zcash/zcash#2221

If you could give us numbers from "BerkeleyDB was faster than the other options in our testing", that would be useful.

@patrickmn
Contributor Author

Funny coincidence! SQLite does seem like a very solid choice/default. It's crazy how reliable it is.

Unfortunately I don't have the numbers anymore, but I'll rerun some benchmarks on the different ones we have implemented (maildir-style, leveldb, sqlite, bdb.)

The main caveat with SQLite IIRC is concurrent write contention. Constellation is typically (at least as used in Quorum) around 50/50 read/write.

@patrickmn
Contributor Author

@zookozcash also, we are having issues with the most recent bdb API/symbols changing: #28

@camswx

camswx commented Jun 13, 2017

@patrickmn If you create a file for each payload, would something like IPFS/Ethereum Swarm/Sia/StorJ become an option for distributed storage?

@tjayrush

Hi. My name is Jay Rush. I was asked by Andy Tudhope in the Consensys topic-dev-practices slack to check this conversation out because Andy had seen my work with QuickBlocks (http://quickblocks.io -- the website is seriously out of date).

The goal of QuickBlocks is twofold: first, fast delivery of EVM data; second, fast delivery of better-than-EVM data. QuickBlocks parses the EVM data and returns it in the 'language of the originating smart contract,' and then it caches that data for significantly faster delivery than we see from the RPC interface, for example.

I'm not exactly sure how I can help, but I'm very interested in what you're discussing. I'll probably mostly lurk and listen, but I will speak up if I see somewhere where I can add to the conversation.

@conor10
Contributor

conor10 commented Jun 17, 2017

@patrickmn - some initial thoughts:

A default of SQLite sounds sensible if the performance is up to scratch. Reading the linked issue and potential loss of funds caused by BDB is scary stuff. Can’t afford to see that happen in Constellation either.

There’s definitely value in supporting AWS S3/Azure Blobstore/Google Cloud Datastore/Redis, however, is anyone requesting them at this point? You’re probably fine with the directory approach for now (will review code shortly). Further integrations could end up being time-consuming to support.

Likewise, I don’t think it’s necessary to support additional in-memory stores at this stage (beyond what’s already there). Chances are if someone is deploying Quorum in an HA environment they’re going to want to throw one of the cloud data stores into the mix, rather than need further in-memory store options.

The storage option semantics make sense. It would be good to allow users to make use of the implementations that have already been written in Constellation 😃.

@tylobban

(@tjayrush btw I reached out via mail but may have gone to spam..)

@tjayrush

tjayrush commented Jun 21, 2017 via email

@cartazio
Contributor

Closing this issue for now, because the solution is going to be getting SQLite support online, as tracked in another ticket.
