Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What are the plans for SQLeet? #39

Open
johnfound opened this issue Oct 28, 2020 · 17 comments
Open

What are the plans for SQLeet? #39

johnfound opened this issue Oct 28, 2020 · 17 comments

Comments

@johnfound
Copy link

Is this project still alive? Are there plans to accept the new API of SQLite and to allow updates beyond v3.32 of the engine?

I am using SQLeet in my project and it is good to know the plans of the project. It is not an urgent issue, but stalling the version of SQLite to 3.32 is serious reason to look for a replacement.

On the other hand, SQLeet is the most compact encryption solution I was able to found, so I will look for replacement with а great regret.

@resilar
Copy link
Owner

resilar commented Oct 29, 2020

Yes, there are plans to continue development of sqleet.

The problem is that SQLite after version 3.32 dropped codec encryption support that sqleet uses, so most of the project needs to be rewritten as a VFS plugin. I'm working on this, but I have been recently busy with my real life job. You can expect a new improved version of sqleet maybe sometime in January. If it is not finished in late January, I'll at least push a somewhat-usable preliminary development branch to GitHub.

Meanwhile, I recommend considering using wxSQLite3 if you need the latest version of SQLite3. It has already been rewritten to use the VFS plugin interface instead of the old deprecated codec interface.

Sorry for the delay. I just want the VFS-plugin rewrite to be at least the same quality as the current codec implementation, and there's some big hurdles to avoid some nasty security and performance issues.

@utelle
Copy link
Contributor

utelle commented Oct 29, 2020

Meanwhile, I recommend considering using wxSQLite3 if you need the latest version of SQLite3. It has already been rewritten to use the VFS plugin interface instead of the old deprecated codec interface.

Just for your information: the SQLite encryption extension of wxSQLite3 has been separated into its own GitHub project SQLite3 Multiple Ciphers. The latest SQLite version is supported based on a VFS implementation of the encryption extension. Various encryption schemes are supported: wxSQLite3, sqleet, SQLCipher, and System.Data.SQLite.

In case the announced VFS implementation of @resilar should reveal flaws in my VFS implementation I will gladly adjust it. Up to now I haven't heard of any major bugs in my implementation.

@johnfound
Copy link
Author

Nice to know the project is not abandoned.

As a rule, the SQLite update is never an urgent task. Because the old versions are stable and almost bug-free.

For me it is only important to know, that in the foreseeable future, there will be an update. Thanks for your work.

@brodycj
Copy link

brodycj commented Dec 21, 2020

FYI there is a nice architectural diagram of how ZIPVFS, Pager, and OS interface work here: https://www.sqlite.org/zipvfs/doc/trunk/www/howitworks.wiki

I am definitely looking forward to the "VFS-plugin rewrite" assuming it happens.

@brodycj
Copy link

brodycj commented Dec 21, 2020

I really hope the new VFS implementation can move forward in the near future, even if it is in rough and breaking form. If this can be posted as a gist, I think it would help many people. I have a customer who has been suffering from performance issues between SQLCipher and FTS on Android, and SQLeet seems to overcome these performance issues. I did try the SQLite3MultipleCiphers project as well and it does seem OK through a bit slower than SQLeet when SELECTing data with the MATCH patterns.

@utelle
Copy link
Contributor

utelle commented Dec 22, 2020

I really hope the new VFS implementation can move forward in the near future, even if it is in rough and breaking form.

A VFS implementation should not break anything. It is simply acting at another processing stage than the previous SQLite codec API. While the SQLite codec API solely operated on page data in memory, after the default VFS had fetched them from disk or before the default VFS stored them to disk, an encryption VFS acts as a VFS shim intercepting the calls to the underlying default VFS. BTW, the situation for an encryption VFS is a lot easier than for the ZIPVFS you mentioned in your previous post, because the page size stays fixed - that is, no need for a second level pager.

By now the VFS implementation of SQLite3MultipleCiphers seems to be in a rather stable state - there were a few issues in conjunction with setting or changing the encryption key (all resolved in the meantime).

I'm also looking forward to the VFS implementation of @resilar to learn about the actual or supposed flaws in the VFS implementation of SQLite3MultipleCiphers.

I have a customer who has been suffering from performance issues between SQLCipher and FTS on Android, and SQLeet seems to overcome these performance issues.

SQLCipher uses AES-256+SHA256 encryption in its latest version, while SQLeet uses ChaCha20-Poly1305 encryption. The latter is inherently faster than AES encryption. However, SQLite performance heavily depends also on cache size and other factors.

I did try the SQLite3MultipleCiphers project as well and it does seem OK though a bit slower than SQLeet when SELECTing data with the MATCH patterns.

SQLite3MultipleCiphers uses as default the same cipher scheme as SQLeet, and the underlying implementation of the crypto algorithms is exactly the same as that of SQLeet. Of course, the option of being able to select the cipher scheme at runtime incurs a small runtime penalty, because there is one more level of indirection. However, compared to the computation effort required for encrypting or decrypting a database page this additional level of indirection should be neglectable. So, how much slower is a bit slower? Changes in performance for FTS could also be linked to changes in the FTS implementation, of which there were plenty since the beginning of this year.

@brodycj
Copy link

brodycj commented Dec 22, 2020

Thanks @utelle. Unfortunately I do not have any hard data about SQLeet vs SQLite3MultipleCiphers, my testing was done in an experimental software stack with some private customer data. I am also wondering if SQLeet and/or SQLite3MultipleCiphers will be affected by the "sqleet v1 cryptosystem" discussed in issue #35.

Assuming that things will change with the "sqleet v1 cryptosystem", I think it would be ideal if SQLeet and SQLite3MultipleCiphers would be 100% compatible by default when that happens.

I am maintaining software that provides SQLite storage capability for the Apache Cordova mobile framework (cross-platform mobile apps in JavaScript with some native plugins). SQLeet seems to meet the needs of my customer; I am starting to wonder if some others wouls be better off with SQLeet or SQLite3MultipleCiphers.

@utelle
Copy link
Contributor

utelle commented Dec 22, 2020

Unfortunately I do not have any hard data about SQLeet vs SQLite3MultipleCiphers, my testing was done in an experimental software stack with some private customer data.

As long as SQLeet has not adopted the latest SQLite version comparing performance results between SQLeet and SQLite3MultipleCiphers will be difficult, because it will not be clear which effects are related to SQLite itself and which to the encryption extensions.

Nevertheless I'm of course interested in hearing about any serious performance problems related to the encryption extension. Only issues I hear of can be possibly fixed.

I am also wondering if SQLeet and/or SQLite3MultipleCiphers will be affected by the "sqleet v1 cryptosystem" discussed in issue #35.

I'm not a certified crypto expert, but I'm open to participate in discussions and to drop in my 2ct of knowledge. If and when "sqleet v1 cryptosystem" will be implemented, it will introduce a new SQLite encryption cipher which will not be compatible with prior versions. For SQLite3MultipleCiphers this will simply require to implement an additional cipher.

Assuming that things will change with the "sqleet v1 cryptosystem", I think it would be ideal if SQLeet and SQLite3MultipleCiphers would be 100% compatible by default when that happens.

SQLite3MultipleCiphers currently offers a SQLeet compatible cipher mode, and I intend to keep it that way.

I am maintaining software that provides SQLite storage capability for the Apache Cordova mobile framework (cross-platform mobile apps in JavaScript with some native plugins). SQLeet seems to meet the needs of my customer; I am starting to wonder if some others would be better off with SQLeet or SQLite3MultipleCiphers.

Well, the SQLeet cipher mode of SQLite3MultipleCiphers should behave identically to the original SQLeet. That is, in an ideal world the user should not experience any noteworthy differences, since the encryption process itself is transparent for the user anyway.

The main difference is that SQLeet offers a single cipher mode, while SQLite3MultipleCiphers offers multiple cipher modes. If an application just needs and uses a single cipher mode, choosing SQLeet may certainly be the right thing to do. The motivation to implement SQLite3MultipleCiphers was to establish compatibility with legacy applications which were bound to older cipher modes, without requiring to link new applications with several different SQLite encryption extensions (which can be a difficult task to accomplish).

@resilar
Copy link
Owner

resilar commented Dec 25, 2020

I really hope the new VFS implementation can move forward in the near future, even if it is in rough and breaking form. If this can be posted as a gist, I think it would help many people.

Working on it. I will most definitely have something usable ready to publish in mid to late January.

Some great progress has been made. It seems possible to rewrite sqleet as a backwards-compatible VFS extension using only the public VFS interface (that is, without depending on SQLite3 private functions or any internals whatsoever). Most notably, rekeying a database file and controlling the page size and reserved bytes fields in the database header can be accomplished with a little VFS trickery. This means we can finally get rid of rekeyvacuum.c and compile sqleet VFS extension against sqlite3.h and sqlite3ext.h headers without requiring full (amalgamated) SQLite3 source code.

However, I am currently considering breaking backwards compatibility with codec-based sqleet journal and WAL files in order to encrypt them fully and simplify the VFS implementation. Let's hope this does not cause major headache to users (it should not because having hot journals/WALs lying around is the exception to the rule). The idea is to encrypt journal and WAL metadata against a single ChaCha20 stream without authentication, but still encrypt and authenticate database page contents using the file offset in place of page number to derive the ChaCha20/Poly1305 one-time keys. This improves security and simplifies the implementation by removing the need to track page numbers when writing to or reading from encrypted journal/WAL files.

I have a customer who has been suffering from performance issues between SQLCipher and FTS on Android, and SQLeet seems to overcome these performance issues. I did try the SQLite3MultipleCiphers project as well and it does seem OK through a bit slower than SQLeet when SELECTing data with the MATCH patterns.

I'd expect sqleet to be little faster than SQLCipher on Android but, as Utelle mentioned, there should not be significant difference between sqleet and default-configured SQLite3MC since both use the same crypto primitives with identical C implementations. The performance hit of the indirection introduced by the VFS layer should be negligible in my estimation. Are you sure SQLite3MC is compiled with the same optimizations enabled? Concurrency might be another reason for the discrepancies if your application is heavily multi-threaded because locking mechanisms of sqleet, SQLCipher and SQLite3MC are not exactly identical AFAIK.

Assuming that things will change with the "sqleet v1 cryptosystem", I think it would be ideal if SQLeet and SQLite3MultipleCiphers would be 100% compatible by default when that happens.

The proposed sqleet v1 cryptosystem was designed prior to the sudden deprecation of the SQLite3 codec API and should probably be redesigned to fully exploit the new VFS-based architecture, which is not going to happen anytime soon (in late 2021 if lucky). Before that sqleet and SQLite3MC will definitely be compatible with each other. Perhaps the only exception being journal and WAL files.

I am starting to wonder if some others wouls be better off with SQLeet or SQLite3MultipleCiphers.

Switching from one to the other should not be too much hassle because the encrypted databases are compatible. If the latest SQLite version (3.31.1) supported by sqleet is too old and you cannot wait a couple of months for production-ready VFS-based implementation of sqleet, then the best option is to switch to SQLite3MC (you can always switch back later if you do not like it as much). SQLite3MC is also the way to go if support for SQLCipher-encrypted databases is a useful feature for you.

@utelle
Copy link
Contributor

utelle commented Dec 25, 2020

Some great progress has been made. It seems possible to rewrite sqleet as a backwards-compatible VFS extension using only the public VFS interface (that is, without depending on SQLite3 private functions or any internals whatsoever).

Of course it would be ideal to be independent of SQLite3's private/internal functions. However, in the SQLite3MC implementation a few internals of SQLite3 are used in the end, although at first I had also intended to get away without accessing any internals.

For example, pragma handling for own additional pragma commands is difficult, because pragmas are forwarded to a VFS only, if there actually is an associated database file (and that's not the case if there is only a transient database open - like in the SQLite3 shell, if no database file name is given on the command line). If there is no associated database file pragmas are silently ignored.

Most notably, rekeying a database file and controlling the page size and reserved bytes fields in the database header can be accomplished with a little VFS trickery. This means we can finally get rid of rekeyvacuum.c and compile sqleet VFS extension against sqlite3.h and sqlite3ext.h headers without requiring full (amalgamated) SQLite3 source code.

I'm really looking forward to take a look at your implementation, especially, how you feedback to SQLite3 information about changed page size and/or changed number of reserved bytes per page - in fact, the btree and pager instances need to know these things.

However, I am currently considering breaking backwards compatibility with codec-based sqleet journal and WAL files in order to encrypt them fully and simplify the VFS implementation.

With the SQLite3 Codec API only page data are encrypted. In the SQLite3MC implementation this is also the case. Nevertheless, at least for WAL the implementation is not backwards compatible. The reason is simple: the WAL mode calculates certain checksums that are written to file in the meta data. With the codec API the checksums are calculated for already encrypted data, while with the VFS implementation they are calculated for the still unencrypted data.

Let's hope this does not cause major headache to users (it should not because having hot journals/WALs lying around is the exception to the rule).

Existing normal journal files should not impose a problem, but existing WAL files (created by codec-based version) could cause trouble. Most likely it should be recommended to stop WAL mode operation before making the transition to the VFS implementation, and to reenable WAL mode later on.

The idea is to encrypt journal and WAL metadata against a single ChaCha20 stream without authentication, but still encrypt and authenticate database page contents using the file offset in place of page number to derive the ChaCha20/Poly1305 one-time keys. This improves security and simplifies the implementation by removing the need to track page numbers when writing to or reading from encrypted journal/WAL files.

Encrypting the metadata, too, would certainly be a nice feature. However, it may not be trivial to implement it for WAL mode, because WAL metadata are often accessed and processed only partially.

@brodycj
Copy link

brodycj commented Jan 11, 2021

I have also been wondering if this library has been reviewed for security. Should I ask this in a new issue?

@resilar
Copy link
Owner

resilar commented Jan 12, 2021

I have also been wondering if this library has been reviewed for security. Should I ask this in a new issue?

The current cryptosystem of sqleet is a minor simplification of the ChaCha20-Poly1305 AEAD construction applied in TLSv1.3. RFC7905 and RFC7539 discuss the security properties of ChaCha20-Poly1305 AEAD in the context of TLSv1.3. IIRC there also exists a paper which provides a formal security proof of ChaCha20-Poly1305 AEAD but I cannot find it right now,

In sqleet, TLSv1.3's ChaCha20-Poly1305 AEAD has been modified to make it more suitable for a database page based encryption. Since we encrypt only PAGE_SIZE-byte (4096 by default) messages, it makes no sense to include the length information in encrypted messages. We also drop the "associated data" part of the AEAD construction because we do not need it either. The final change is in the Poly1305 one-time key (OTK) derivation part of RFC7905 Secrtion 2.6.1. Instead of using only the lower 32-bytes as a Poly1305 OTK, sqleet also utilizes the (unused) upper 32-bytes as a one-time ChaCha20 key. This improves the security considerably because each database page is encrypted against a different ChaCha20 stream (in comparison, TLSv1.3 uses a single contiguous ChaCha20 stream to encrypt all traffic).

Nevertheless, a formal security proof of sqleet would be nice. But that's not a high priority right now since sqleet v1 cryptosystem is in the planning. I believe we can make sqleet more efficient and more secure against certain theoretical EM side channel attacks, and maybe more versatile to optionally allow TPM/TrustZone based encryption that stores the master key in secure hardware instead of user-process memory.

@resilar
Copy link
Owner

resilar commented Jan 19, 2021

I'm afraid the VFS-extension rewrite will be delayed for a few weeks. ETA: February

UPDATE ~2 years later: Well, didn't happen lol.

@m4heshd
Copy link

m4heshd commented Feb 14, 2021

@resilar Your hard work is much appreciated and thank you for not giving up. Same goes for @utelle 🙏🏻

@danshearer
Copy link

This was really interesting to find. I had not heard of SQLeet, and many of the problems SQLeet has are the same ones we have at LumoSQL, see https://lumosql.org/src/lumosql/doc/trunk/README.md . LumoSQL adds privacy and security features to SQLite, without forking the original. We're working on our next version, and feel free to look at our code just like we're looking at yours :-)

We are working on page-based encryption as well as row-based, and we are avoiding the use of a VFS at present. That's what SQLCipher have concluded as well, however, @resilar you make some good points.

Best,

Dan Shearer

@chliang
Copy link

chliang commented Dec 31, 2022

Are you all right? No news on github for a whole year!@resilar

@resilar
Copy link
Owner

resilar commented Apr 28, 2023

Sorry for keeping you guys hanging. Anyways, I still have plans to resurrect sqleet as an VFS extension at some time in the indefinite future. Please do not wait too much.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants