Question: Would it be possible to make the key salt accessible? #13

Open
utelle opened this issue Feb 24, 2018 · 16 comments

Comments

@utelle
Contributor

utelle commented Feb 24, 2018

While I analyzed the encryption scheme of SQLCipher, I found out that SQLCipher stores the key salt in the first 16 bytes of the database header just as sqleet. However, SQLCipher does not include the key salt in the HMAC calculation of database page 1, while sqleet does.

In the end it seems to be irrelevant whether the key salt is included in the HMAC calculation or not. If the key salt bytes in the first 16 bytes of the database file are modified - either accidentally or on purpose -, this will make the database file inaccessible.

The problem I see is that there is almost no chance to recover a database corrupted in such a way, since the user only knows the passphrase he used, but he does not know the random key salt.

IMHO it would make sense to implement an option to access the generated key salt, so that it could be stored in a safe place, and/or to let the user set a specific key salt instead of generating a random one.

This would make it possible to create a tool to recover a corrupted encrypted database - at least partially.
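
For illustration, here is a minimal Python sketch of what such a backup and restore tool could look like. It relies only on the fact stated above that the salt sits in the first 16 bytes of the database file. The function names `read_salt` and `restore_salt` are hypothetical and not part of sqleet.

```python
SALT_SIZE = 16  # per the discussion above: the salt is the first 16 bytes of the file


def read_salt(db_path):
    """Return the first 16 bytes of the database file (the stored key salt)."""
    with open(db_path, "rb") as f:
        salt = f.read(SALT_SIZE)
    if len(salt) != SALT_SIZE:
        raise ValueError("file is shorter than 16 bytes; no salt to read")
    return salt


def restore_salt(db_path, salt):
    """Overwrite the first 16 bytes of the file with a previously saved salt."""
    if len(salt) != SALT_SIZE:
        raise ValueError("salt must be exactly 16 bytes")
    with open(db_path, "r+b") as f:  # r+b: modify in place, do not truncate
        f.seek(0)
        f.write(salt)
```

Restoring the salt this way can only help when the damage is limited to those 16 bytes; corruption elsewhere in the file needs other tooling.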

@resilar resilar closed this as completed Oct 22, 2018
@resilar resilar reopened this Oct 22, 2018
@resilar
Owner

resilar commented Oct 22, 2018

Sorry about the accidental close...

My initial reaction was that adding a C API for accessing the salt would be unjustified because it is such a rare operation. Moreover, competent people with hex editors can already access and modify the salt quite easily because it is stored, in cleartext, in the first 16 bytes of the database file. Average users should not have to care about salts at all (except for maybe fixing data corruption -- see my response to those issues at the end of this comment).

However, I have started to think that we need to have some control of the salt but for another reason. SQLCipher (and other libraries too?) support specifying a raw key which bypasses the key derivation function. I like the idea because it allows program developers, who use sqleet as a library, to take full responsibility for the key derivation process. This would address the problem that sqleet's PBKDF2-HMAC-SHA256 is far from ideal, especially since I want to keep the configuration of KDF parameters strictly compile-time only (for religious design reasons).

Raw key support has the consequence that sqleet no longer needs the salt (when raw keys are being used), whereas the external program handling the key derivation might do. Hence, I have been considering something like this:

diff --git a/README.md b/README.md
index d8cd6af..5fa7c53 100644
--- a/README.md
+++ b/README.md
@@ -135,6 +135,15 @@ In addition, there are `sqlite3_key_v2()` and `sqlite3_rekey_v2()` functions
 that accept the target database name as the second parameter. By default, the
 main database is used.
 
+The above functions pass the provided key string (password) to a key derivation
+algorithm (i.e., PBKDF2-HMAC-SHA256 with a 16-byte salt and 12345 iterations).
+Optionally, the user can bypass the key derivation by specifying a raw key in
+format `raw:X` where `X` is a 32-byte binary string or a 64-digit hex-encoded
+string. This can be useful in programs that use sqleet as a library and want to
+handle key derivation by themselves. Additionally, the raw key string can also
+be followed by a 16-byte (or 32-hexdigit) salt which is stored in the beginning
+of the database file.
+
 
 Android support
 ---------------
diff --git a/sqleet.c b/sqleet.c
index 06e6561..3695a83 100644
--- a/sqleet.c
+++ b/sqleet.c
@@ -42,10 +42,62 @@ Codec *codec_dup(Codec *src)
     return codec;
 }
 
+static int hex_decode(const char *hex, unsigned int n, char *out)
+{
+    int i;
+    for (i = 0; i < n; i++) {
+        char c = hex[i];
+        if (c >= '0' && c <= '9') {
+            c = c - '0';
+        } else if (c >= 'A' && c <= 'F') {
+            c = c - 'A' + 10;
+        } else if (c >= 'a' && c <= 'f') {
+            c = c - 'a' + 10;
+        } else {
+            int j = (i+1) / 2;
+            for (i = 0; i < j; i++) {
+                ((volatile char *)out)[i] = 0;
+            }
+            return 0;
+        }
+        out[i/2] = (out[i/2] << 4) | c;
+    }
+    return 1;
+}
+
 void codec_kdf(Codec *codec)
 {
+    /* Skip key derivation if the given key starts with "raw:" */
+    if (codec->nKey > 4 && !memcmp(codec->zKey, "raw:", 4)) {
+        switch (codec->nKey) {
+        /* Binary key (and salt) */
+        case 4 + 32 + 16:
+            memcpy(codec->salt, (const char*)codec->zKey + 4 + 32, 16);
+            /* fall-through */
+        case 4 + 32:
+            memcpy(codec->key, (const char*)codec->zKey + 4, 32);
+            goto done;
+
+        /* Hex-encoded key (and salt) */
+        case 4 + 64 + 32:
+            if (!hex_decode((const char*)codec->zKey + 4 + 64, 32, codec->salt))
+                break;
+            /* fall-through */
+        case 4 + 64:
+            if (!hex_decode((const char*)codec->zKey + 4, 64, codec->key))
+                break;
+            goto done;
+
+        default:
+            break;
+        }
+    }
+
+    /* Run key-derivation algorithm on the key string */
     pbkdf2_hmac_sha256(codec->zKey, codec->nKey, codec->salt, 16, 12345,
                        codec->key, 32);
+
+done:
     codec->zKey = NULL;
     codec->nKey = 0;
 }

This patch would allow specifying a raw key and, optionally, a salt which becomes the first 16 bytes of the database (no sqleet API for reading it though). The proposed patch does not specifically support key derivation from a passphrase with a specified salt. However, it would still offer at least a partial solution to the problem: the user could perform the KDF himself using PBKDF2-HMAC-SHA256 with 12345 iterations and any salt he wants, and then access the database with the resulting raw key (although this fails with a Poly1305 MAC error if the database salt is corrupted).
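
As a rough sketch of that workflow, the following Python derives the key externally and formats it as a hex-encoded `raw:` key string followed by the salt, matching the `4 + 64 + 32` case of the patch above. The KDF parameters (PBKDF2-HMAC-SHA256, 16-byte salt, 12345 iterations, 32-byte key) are taken from the README diff. The function name `derive_raw_key` is hypothetical, and it is an assumption that sqleet feeds the passphrase bytes to PBKDF2 unchanged.

```python
import hashlib


def derive_raw_key(passphrase: bytes, salt: bytes) -> str:
    """Derive a 32-byte key with PBKDF2-HMAC-SHA256 and format it as
    'raw:' + 64 hex digits (key) + 32 hex digits (salt)."""
    if len(salt) != 16:
        raise ValueError("salt must be exactly 16 bytes")
    # Parameters as quoted in the README diff: 12345 iterations, 32-byte output.
    key = hashlib.pbkdf2_hmac("sha256", passphrase, salt, 12345, dklen=32)
    return "raw:" + key.hex() + salt.hex()


# Example: a fixed salt chosen by the user instead of a random one.
key_string = derive_raw_key(b"my passphrase", bytes(range(16)))
```

The resulting string would then be passed wherever the key is normally given, for example via `sqlite3_key()` or `PRAGMA key`.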

Finally,

The problem I see is that there is almost no chance to recover a database corrupted in such a way, since the user only knows the passphrase he used, but he does not know the random key salt.

IMHO it would make sense to implement an option to access the generated key salt, so that it could be stored in a safe place, and/or to let the user set a specific key salt instead of generating a random one.

This would make it possible to create a tool to recover a corrupted encrypted database - at least partially.

Data corruption (due to hardware failure or whatever) is a major issue and needs to be addressed, but IMO preferably with special tooling instead of tackling potential issues at library-level in sqleet. I think the way to go is to write a bunch of Python scripts under scripts/ directory in the repository, which handle, e.g., key derivation, full encrypts and decrypts, potential version migrations, as well as dumping and manipulating internal information such as the database salt, nonces and Poly1305 tags. Most importantly, the scripts should support something like forcibly decrypting a given database ignoring any MAC errors etc.

Scripts are the next thing on my TODO list after the raw key support ...

TLDR Raw key support will help a bit in the near future, but external scripts are needed for proper data rescue and salt manipulation.

@utelle
Contributor Author

utelle commented Oct 23, 2018

My initial reaction was that adding a C API for accessing the salt would be unjustified because it is such a rare operation.

It might be a rare operation, because in fact it would be only necessary on creating a new database, for example to be able to save the salt in a safe place for recovery in case of database corruption.

Moreover, competent people with hex editors can already access and modify the salt quite easily because it is stored, in cleartext, in the first 16 bytes of the database file.

And that's exactly why I find it at least useful to offer a safe API method instead of forcing a user to use a hex editor with which one can easily corrupt the database accidentally. A single modified salt bit is enough to corrupt the database and make it inaccessible.
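
The effect of a single flipped salt bit can be illustrated with a small Python sketch. It uses the KDF parameters quoted earlier in this thread (PBKDF2-HMAC-SHA256, 12345 iterations, 32-byte key); the helper names and the example passphrase are made up for the demonstration.

```python
import hashlib


def derive_key(passphrase: bytes, salt: bytes) -> bytes:
    """PBKDF2-HMAC-SHA256 with the parameters quoted in this thread."""
    return hashlib.pbkdf2_hmac("sha256", passphrase, salt, 12345, dklen=32)


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    out = bytearray(data)
    out[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(out)


salt = bytes(range(16))
good_key = derive_key(b"correct horse", salt)
bad_key = derive_key(b"correct horse", flip_bit(salt, 0))

# Count how many of the 256 key bits changed because of one flipped salt bit.
differing = sum(bin(a ^ b).count("1") for a, b in zip(good_key, bad_key))
print(f"{differing} of 256 key bits differ")
```

Because the derived key changes completely, the correct passphrase alone no longer opens the database once the stored salt is damaged.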

Average users should not have to care about salts at all (except for maybe fixing data corruption -- see my response to those issues at the end of this comment).

A user knows the key (password phrase) per definition, but he doesn't know the random salt. That is, in case of a corrupted salt the user has almost no chance to recover his database.

However, I have started to think that we need to have some control of the salt but for another reason. SQLCipher (and other libraries too?) support specifying a raw key which bypasses the key derivation function.

SQLCipher definitely allows specifying the key and/or the salt as raw values without using a key derivation function. I don't know about other libraries.

I like the idea because it allows program developers, who use sqleet as a library, to take full responsibility for the key derivation process. This would address the problem that sqleet's PBKDF2-HMAC-SHA256 is far from ideal, especially since I want to keep the configuration of KDF parameters strictly compile-time only (for religious design reasons).

The reasoning is somewhat illogical. On the one hand you want to forbid users from changing configuration parameters at runtime, on the other hand you want to allow complete freedom at runtime. Why not allow changing configuration parameters at runtime as a compromise?

Raw key support has the consequence that sqleet no longer needs the salt (when raw keys are being used), whereas the external program handling the key derivation might do. Hence, I have been considering something like this: [...]

Looks OK to me.

Data corruption (due to hardware failure or whatever) is a major issue and needs to be addressed,

Right. And exactly for this purpose you will need the salt, because ...

but IMO preferably with special tooling instead of tackling potential issues at library-level in sqleet.

... whatever tools you develop, without an intact salt you are fighting a lost battle, if the salt was affected by the data corruption.

I think the way to go is to write a bunch of Python scripts under scripts/ directory in the repository, which handle, e.g., key derivation, full encrypts and decrypts, potential version migrations, as well as dumping and manipulating internal information such as the database salt, nonces and Poly1305 tags. Most importantly, the scripts should support something like forcibly decrypting a given database ignoring any MAC errors etc.

To be able to try to recover a corrupted database the latter is definitely necessary.

Scripts are the next thing on my TODO list after the raw key support ...

TLDR Raw key support will help a bit in the near future, but external scripts are needed for proper data rescue and salt manipulation.

IMHO there is nothing wrong in adding methods to the library itself for those purposes.

@resilar
Owner

resilar commented Oct 26, 2018

Commit 3d2c17b adds raw keys and support for specifying a salt. I'm leaving this issue open until we implement a way to read the salt (be it a C function or a Python script).

I like the idea because it allows program developers, who use sqleet as a library, to take full responsibility for the key derivation process.

The reasoning is somewhat illogical. On the one hand you want to forbid users from changing configuration parameters at runtime, on the other hand you want to allow complete freedom at runtime. Why not allow changing configuration parameters at runtime as a compromise?

I do not want the average user to select crypto parameters because it introduces the risk of insecure configurations and compatibility issues (opening a database created with unknown parameters would be difficult even when the correct key is known). Compile-time configuration is OK because it requires some effort to modify, so mostly advanced users who know what they are doing will use that for good reasons. I think that raw keys, although easily misused, fall into the latter category.

@utelle
Contributor Author

utelle commented Oct 26, 2018

Commit 3d2c17b adds raw keys and support for specifying a salt.

I will incorporate this option into my sqleet variant.

I'm leaving this issue open until we implement a way to read the salt (be it a C function or a Python script).

I would prefer a C function as an extension to the API.

I do not want the average user to select crypto parameters because it introduces the risk of insecure configurations and compatibility issues (opening a database created with unknown parameters would be difficult even when the correct key is known). Compile-time configuration is OK because it requires some effort to modify, so mostly advanced users who know what they are doing will use that for good reasons. I think that raw keys, although easily misused, fall into the latter category.

My experience is that the average user doesn't tamper with any configuration options, neither at compile time nor at runtime. Offering a raw key/salt option makes it extremely easy for such users to misuse this option. However, since libraries like SQLCipher offer raw keys, adding this option to sqleet as well is certainly reasonable.

For the experienced user it may be convenient to be able to use configuration options at runtime as an alternative to implementing the complete key derivation process himself. However, I respect your decision not to follow this route.

@utelle
Contributor Author

utelle commented Dec 11, 2018

I'd like to get back to this issue about accessing the key salt.

Just recently SQLCipher 4.0.0 was released. In addition to now using SHA512 for key derivation and HMAC, a new pragma cipher_plaintext_header_size was introduced. This new pragma addresses SQLCipher issue 255 which is related to a certain iOS use case where iOS applies special handling if an application uses a SQLite database in WAL mode. However, iOS identifies SQLite database files by inspecting the first 16 bytes of the database header and looking for the SQLite format identification string (SQLite format 3\000), and therefore these bytes have to be unencrypted. In default operation mode SQLCipher uses the first 16 bytes of the database header for storing the key salt - just as sqleet. However, if the new pragma is applied, the key salt can no longer be stored in the database file. Instead the user has to retrieve the key salt using another pragma (cipher_salt), and to set it each time the database is opened.
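
To make the file identification step concrete, here is a small Python sketch of the kind of check described above: it compares the first 16 bytes of a file with the SQLite format identification string. How iOS actually implements its check is not known here; the function name `has_plaintext_sqlite_header` is hypothetical.

```python
# The 16-byte identification string at the start of every plain SQLite database:
# the 15 characters "SQLite format 3" followed by a NUL byte.
SQLITE_MAGIC = b"SQLite format 3\x00"


def has_plaintext_sqlite_header(path):
    """Return True if the file starts with the SQLite identification string."""
    with open(path, "rb") as f:
        return f.read(len(SQLITE_MAGIC)) == SQLITE_MAGIC
```

A database whose first 16 bytes hold a random key salt fails this check, which is why the salt has to be stored elsewhere when the header must stay in plaintext.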

I can't judge how important it would be to support this iOS use case, but obviously the SQLCipher team found it important enough to introduce additional pragmas.

If one wants to support this iOS use case then IMHO it would be necessary to provide methods to retrieve and set the key salt independently from the key itself. Currently sqleet supports only a combination of raw key plus salt.

What's your opinion?

@resilar
Owner

resilar commented Dec 18, 2018

Hmm... That's an interesting problem that I was unaware of. Thanks for notifying me.

AFAIK the iOS use-case of sqleet is not very popular and there already exist (poor) workarounds, so the issue is not critical. I'd prefer not to add a new PRAGMA like SQLCipher mainly due to the maintenance burden (IIRC SQLite3 offers no public/stable API to add custom PRAGMAs). Cloning sqleet master and replacing old SQLite3 amalgamation with the latest version should ideally work "out-of-the-box" (so far this has been true except for a couple rekeyvacuum.c updates and added sqlite3BtreeBeginTrans() NULL parameter in 3.25.1). The recent SQLite3 <v3.26.0 RCE vulnerability shows how important this "feature" is in security-sensitive applications.

I think there are other feasible solutions. For example, a more powerful interface for specifying salt and key using existing PRAGMAs:

PRAGMA key='header:SQLite format 3\x00'     # File header
PRAGMA key='salt:cipher_salt_of_any_length' # PBKDF2 salt (not stored in the header)
PRAGMA key='password_for_key_derivation'    # PBKDF2 password

I agree that setting PRAGMA key three times can look a bit confusing. Alternatively we could provide the various parameters via query parameters of URI filenames. This would also have the advantage of automatically handling URI-encoded hex values in a standard way. However, escaping of special characters (such as &) in query parameter values could become a problem.

Another approach would be to write a compatibility-VFS shim for iOS (see The SQLite OS Interface or "VFS", and in particular appendvfs.c).

There are probably more solutions as well. I will have to think more about this issue before deciding which approach to take. Your (and everyone else's) input is welcome, as always.

@utelle
Contributor Author

utelle commented Dec 18, 2018

AFAIK the iOS use-case of sqleet is not very popular and there already exist (poor) workarounds, so the issue is not critical.

AFAICT this iOS issue affects all SQLite encryption extensions (even the official one), if the database file signature (first 16 bytes) plus the first part of the database header (next 8 bytes) are not in plain text.

I have no idea whether anybody is actually using my own SQLite encryption extension (wxSQLite3) in an iOS environment and whether this issue poses a real problem. At least up to now no one has contacted me in this regard.

So, I agree that it seems not to be a critical issue.

I'd prefer not to add a new PRAGMA like SQLCipher mainly due to the maintenance burden (IIRC SQLite3 offers no public/stable API to add custom PRAGMAs).

Indeed, adding custom PRAGMAs requires changes to the SQLite source code. And I, too, prefer to not touch the original SQLite code.

However, it is not very difficult to accomplish the tasks of a PRAGMA command using a SELECT command and adding a user-defined function. This is the approach I chose for wxSQLite3.
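
The general mechanism can be sketched with Python's standard `sqlite3` module, which exposes user-defined SQL functions through `create_function` (the C counterpart is `sqlite3_create_function()`). This is not the wxSQLite3 implementation; the SQL function name `codec_config` and the settings dictionary are hypothetical stand-ins.

```python
import sqlite3


def attach_config_function(conn, settings):
    """Register a hypothetical SQL function codec_config(name[, value]).

    With one argument it returns the stored value (or NULL if unset);
    with two arguments it stores the value and returns it.
    """
    def codec_config(*args):
        if len(args) == 1:
            return settings.get(args[0])
        if len(args) == 2:
            settings[args[0]] = args[1]
            return args[1]
        raise ValueError("codec_config expects 1 or 2 arguments")

    # narg=-1 lets the SQL function accept a variable number of arguments.
    conn.create_function("codec_config", -1, codec_config)


conn = sqlite3.connect(":memory:")
settings = {}
attach_config_function(conn, settings)
conn.execute("SELECT codec_config('salt', '00112233445566778899aabbccddeeff')")
stored = conn.execute("SELECT codec_config('salt')").fetchone()[0]
```

Since the function is registered through the public API, no change to the SQLite source is needed, which is the point made above.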

Cloning sqleet master and replacing old SQLite3 amalgamation with the latest version should ideally work "out-of-the-box" (so far this has been true except for a couple rekeyvacuum.c updates and added sqlite3BtreeBeginTrans() NULL parameter in 3.25.1). The recent SQLite3 <v3.26.0 RCE vulnerability shows how important this "feature" is in security-sensitive applications.

I fully agree. Being able to update the underlying SQLite implementation easily and fast is critical.

I think there are other feasible solutions. For example, a more powerful interface for specifying salt and key using existing PRAGMAs:

PRAGMA key='header:SQLite format 3\x00'     # File header
PRAGMA key='salt:cipher_salt_of_any_length' # PBKDF2 salt (not stored in the header)
PRAGMA key='password_for_key_derivation'    # PBKDF2 password

Hm, yes, it could be done that way, but this is sort of hijacking PRAGMA key=... and could easily cause mistakes in using it. For example, this approach requires some extra housekeeping, because one has to remember all pieces of information until the underlying codec is actually attached to the database - that is, the order of the PRAGMAs could matter. Of course, the housekeeping can't be avoided even with different PRAGMA names, but it would be a bit less error-prone.

I agree that setting PRAGMA key three times can look a bit confusing. Alternatively we could provide the various parameters via query parameters of URI filenames. This would also have the advantage of automatically handling URI-encoded hex values in a standard way. However, escaping of special characters (such as &) in query parameter values could become a problem.

In the latest version of wxSQLite3 I added support for passing codec parameters via URI query parameters. This works very smoothly and is less cumbersome than using a bunch of PRAGMAs (or SELECTs with user-defined functions). IMHO it should therefore be the preferred method, because it eliminates the need for extra housekeeping - all URI parameters are directly accessible when attaching the codec.

IMHO escaping special characters is an almost negligible problem. Or at least, it doesn't pose a new problem, because one has to escape certain special characters, too, when using a PRAGMA. Regarding raw key and salt values it would be best to pass them as hex values anyway.
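
A short Python sketch of the escaping in question, using only `urllib.parse`. The parameter names `salt` and `key` are placeholders for illustration and are not a confirmed sqleet or wxSQLite3 interface; the salt is passed as a hex string as suggested above, so only the free-form value needs escaping.

```python
from urllib.parse import quote, unquote


def build_db_uri(path, params):
    """Build a 'file:' URI, percent-encoding every parameter name and value."""
    query = "&".join(
        f"{quote(name, safe='')}={quote(value, safe='')}"
        for name, value in params.items()
    )
    return "file:" + quote(path) + ("?" + query if query else "")


def parse_db_uri_params(uri):
    """Split the query part on '&' and '=' and undo the percent-encoding."""
    _, _, query = uri.partition("?")
    params = {}
    for pair in filter(None, query.split("&")):
        name, _, value = pair.partition("=")
        params[unquote(name)] = unquote(value)
    return params


# Hypothetical parameter names; the value of 'key' contains '&', '=' and '?'.
uri = build_db_uri(
    "secret.db",
    {"salt": "00112233445566778899aabbccddeeff", "key": "p&ss=w?rd"},
)
```

Percent-encoding keeps `&` and `=` inside a value from being mistaken for separators, and hex-encoded values avoid the question entirely.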

Another approach would be to write a compatibility-VFS shim for iOS (see The SQLite OS Interface or "VFS", and in particular appendvfs.c).

At the moment I don't see how this would help to overcome the mentioned iOS issue (where iOS simply looks at the database file itself without using any SQLite code). The best solution to the iOS issue would be to change the way iOS identifies SQLite database files. However, this is way beyond our scope.

There are probably more solutions as well. I will have to think more about this issue before deciding which approach to take. Your (and everyone else's) input is welcome, as always.

For the SQLCipher support in wxSQLite3 I added measures to handle the database format variant that was introduced to overcome the iOS issue. However, I didn't add any special handling yet for the other cipher schemes supported by wxSQLite3.

While implementing support for the latest SQLCipher version I stumbled across other issues. For example, SQLCipher adds its own memory handling methods. These methods do 2 things: (1) allocated SQLite memory is locked, so that the OS doesn't swap it to disk, and (2) freed SQLite memory is overwritten with zeros. I doubt that (1) really works under all circumstances, since the OS usually limits the amount of memory that can be locked.

@resilar
Owner

resilar commented Dec 20, 2018

Hm, yes, it could be done that way, but this is sort of hijacking PRAGMA key=... and could easily cause mistakes in using it. For example, this approach requires some extra housekeeping, because one has to remember all pieces of information until the underlying codec is actually attached to the database - that is, the order of the PRAGMAs could matter

The extra housekeeping does not necessarily require too much extra logic if implemented carefully and the basic usage would remain simple, i.e., only setting the key once. The PRAGMA ordering is indeed mistake-prone, but IMO the order requirements are intuitive enough (the only thing that matters is setting the password after other settings). However, I'm still unsure about how rekey should work under this model.

However, it is not very difficult to accomplish the tasks of a PRAGMA command using a SELECT command and adding a user-defined function. This is the approach I chose for wxSQLite3.

Sounds reasonable. I will have to look into this. Maintaining UX compatibility between sqleet and wxSQLite3 would also be nice.

In the latest version of wxSQLite3 I added support for passing codec parameters via URI query parameters. This works very smoothly and is less cumbersome than using a bunch of PRAGMAs (or SELECTs with user-defined functions). IMHO it should therefore be the preferred method, because it eliminates the need for extra housekeeping - all URI parameters are directly accessible when attaching the codec.

IMHO escaping special characters is an almost negligible problem. Or at least, it doesn't pose a new problem, because one has to escape certain special characters, too, when using a PRAGMA. Regarding raw key and salt values it would be best to pass them as hex values anyway.

... and this sounds even more reasonable. :)

Another approach would be to write a compatibility-VFS shim for iOS (see The SQLite OS Interface or "VFS", and in particular appendvfs.c).

At the moment I don't see how this would help to overcome the mentioned iOS issue (where iOS simply looks at the database file itself without using any SQLite code)

The idea is to use a VFS shim to add an extra SQLite format 3\x00 header at the beginning of the encrypted database file. The VFS shim could then redirect SQLite3's reads & writes by an offset of +16, basically skipping the fake header. As a result, iOS would recognize the database file as a SQLite3 database, but from the perspective of SQLite3/sqleet, the file would look like a regular encrypted database when accessed through the VFS shim.

The 16-byte offset may incur a noticeable performance cost though. Extending the fake header to database page boundary might help in that regard (or not).

But yeah, this feels like a dirty iOS-specific hack and not a proper solution.
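
The offset translation itself can be modelled in a few lines of Python. This is not a SQLite VFS (a real shim would implement the `xRead`/`xWrite` methods of `sqlite3_io_methods` in C); the class `OffsetFile` is a hypothetical stand-in that only shows how reads and writes would be shifted past a fake 16-byte header.

```python
import io

FAKE_HEADER = b"SQLite format 3\x00"  # 16-byte plaintext identification string


class OffsetFile:
    """Wrap a seekable binary file and hide a fake 16-byte header at its start."""

    def __init__(self, raw):
        self.raw = raw
        raw.seek(0, io.SEEK_END)
        if raw.tell() == 0:
            raw.write(FAKE_HEADER)  # new file: write the fake header first
        else:
            raw.seek(0)
            if raw.read(len(FAKE_HEADER)) != FAKE_HEADER:
                raise ValueError("file does not start with the fake header")

    def read_at(self, offset, length):
        """Read as if the fake header did not exist."""
        self.raw.seek(offset + len(FAKE_HEADER))
        return self.raw.read(length)

    def write_at(self, offset, data):
        """Write as if the fake header did not exist."""
        self.raw.seek(offset + len(FAKE_HEADER))
        self.raw.write(data)

    def size(self):
        """Logical size of the wrapped content, excluding the fake header."""
        self.raw.seek(0, io.SEEK_END)
        return self.raw.tell() - len(FAKE_HEADER)
```

The sketch covers only the 16-byte identification string and says nothing about the performance cost of the misaligned offset mentioned above.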

The best solution to the iOS issue would be to change the way iOS identifies SQLite database files. However, this is way beyond our scope.

I agree. The goal here is to improve the library interface in general instead of introducing special logic for supporting broken platforms like iOS which do weird things.

While implementing support for the latest SQLCipher version I stumbled across other issues. For example, SQLCipher adds its own memory handling methods. These methods do 2 things: (1) allocated SQLite memory is locked, so that the OS doesn't swap it to disk, and (2) freed SQLite memory is overwritten with zeros.

sqleet defines SQLITE_TEMP_STORE=2 to avoid storing temporary files on disk. Obviously sensitive information such as derived encryption keys is overwritten with zeros when no longer used.

SQLCipher's more thorough approach is common in the security industry, but I personally do not think that the security-performance trade-off is worth it in the context of SQLite3. If the attacker has access to the swap file, (s)he can usually also read the SQLite3 process memory directly in one way or another. And besides encryption keys & decrypted database pages, there should not be any security-critical information lying around in unused heap memory (however, I'm not an expert when it comes to SQLite3's internal memory management so I might be very wrong).

... I just now realized that sqleet does not encrypt database pages in-place, which may leave cleartext copies of database pages in SQLite3's (heap) memory. My reasoning behind this design decision was that SQLite3 might try to erroneously access the encrypted buffer, which could crash the program. Of course this is a very unlikely case (and would indicate a bug in SQLite3's code), and certainly not a good enough reason to leave unencrypted copies of database pages in memory. I'm fixing this ASAP (although the possible information leak is minuscule and requires a use-after-free or memory leak vulnerability to actually exploit). EDIT: Apparently in-place encryption is not as trivial and straightforward as it should be. It seems to at least break rekeying an encrypted database with a new key. I'll investigate this further when I have time... But it looks like a limitation in the SQLite3 codec API. I'll try contacting upstream if there is no easy workaround for this.

Defining custom memory handling methods is easy though (see Memory Allocation Routines). So making SQLCipher-style memory allocation behavior optional with compile-time or run-time configuration setting would be a "Pareto-optimal" solution, which I might consider in the future. The potential benefit is insignificant in my estimation, but if some users want it, I'm not against implementing custom memory routines.


Anyway, commit 957fbb8 flags the current raw key/salt interface as experimental, which I should've done when first introducing it. I might replace the interface with something better based on our discussions if preserving it does not make sense after the changes. Currently URI query parameters feel like the best method for passing extra settings.

Implementation is coming perhaps sometime in January. Happy holiday season, don't spend it all in GitHub. 🌲

@utelle
Contributor Author

utelle commented Jan 2, 2019

The extra housekeeping does not necessarily require too much extra logic if implemented carefully and the basic usage would remain simple, i.e., only setting the key once. The PRAGMA ordering is indeed mistake-prone, but IMO the order requirements are intuitive enough (the only thing that matters is setting the password after other settings).

Instead of introducing multiple use of PRAGMA key= with various prefixes, one could introduce a special syntax to specify all parameters within a single invocation of the pragma. However, even with this simplification I don't think that this approach is a good one. Especially, the error handling would be cumbersome to say the least, because the SQLite code invoking the sqlite3_key interface does not check the return code, unfortunately.

However, I'm still unsure about how rekey should work under this model.

AFAICT the original purpose of sqlite3_rekey is only to allow changing the password/passphrase of the current encryption or removing encryption. However, allowing the key salt to be set or modified is certainly desirable, too.

Maintaining UX compatibility between sqleet and wxSQLite3 would also be nice.

Yes, indeed.

The idea is to use a VFS shim to add an extra SQLite format 3\x00 header at the beginning of the encrypted database file. The VFS shim could then redirect SQLite3's reads & writes by an offset of +16, basically skipping the fake header. As a result, iOS would recognize the database file as a SQLite3 database, but from the perspective of SQLite3/sqleet, the file would look like a regular encrypted database when accessed through the VFS shim.

Just adding an extra SQLite format 3\x00 header would not be enough, since at least the next 8 bytes are also inspected by iOS. And it would be necessary to keep these 8 bytes in sync with the actual database header.

The 16-byte offset may incur a noticeable performance cost though. Extending the fake header to database page boundary might help in that regard (or not).

But yeah, this feels like a dirty iOS-specific hack and not a proper solution.

Definitely.

sqleet defines SQLITE_TEMP_STORE=2 to avoid storing temporary files on disk.

This is the default for wxSQLite3 as well.

SQLCipher's more thorough approach is common in the security industry, but I personally do not think that the security-performance trade-off is worth it in the context of SQLite3. If the attacker has access to the swap file, (s)he can usually also read the SQLite3 process memory directly in one way or another. And besides encryption keys & decrypted database pages, there should not be any security-critical information lying around in unused heap memory (however, I'm not an expert when it comes to SQLite3's internal memory management so I might be very wrong).

Well, SQLite caches decrypted database pages in memory. That is, if these cached pages are written to a swap file, an attacker with access to the swap file could read the unencrypted database content at least partially.

However, even SQLCipher's approach isn't foolproof, since OSs usually limit the amount of memory that can be locked.

... I just now realized that sqleet does not encrypt database pages in-place, which may leave cleartext copies of database pages in SQLite3's (heap) memory.

IMHO this doesn't seem to add a major attack surface, since SQLite caches cleartext database pages anyway.

Apparently in-place encryption is not as trivial and straightforward as it should be.

I really don't think it's worth the effort to implement in-place encryption, unless one prevents caching cleartext database pages as well. However, the latter would incur a major performance decrease.

Defining custom memory handling methods is easy though (see Memory Allocation Routines). So making SQLCipher-style memory allocation behavior optional with compile-time or run-time configuration setting would be a "Pareto-optimal" solution, which I might consider in the future. The potential benefit is insignificant in my estimation, but if some users want it, I'm not against implementing custom memory routines.

Well, yes, adding custom memory handling seems to be rather straight-forward. However, it might make sense to ask the user base whether there is an actual demand for such a feature.

Anyway, commit 957fbb8 flags the current raw key/salt interface as experimental, which I should've done when first introducing it.

Yeah, I should have done that, too, with (at least some of) the changes I introduced with version 4 of wxSQLite3.

Currently URI query parameters feel like the best method for passing extra settings.

In any case using URI query parameters is an established method to pass configuration settings to VFS implementations or the like. So, using this mechanism for encryption configuration is definitely reasonable.

Implementation is coming perhaps sometime in January. Happy holiday season, don't spend it all in GitHub. 🌲

I followed your advice. 😄

resilar added a commit that referenced this issue Mar 30, 2019
Supported URI query parameters by sqleet:

  * salt: Provides 16-byte salt for the key derivation function (KDF).
  * header: Overrides 16-bytes in the beginning of the database header.
  * kdf: Specifies KDF algorithm (only `none` value supported for now).
  * skip: Run-time setting overriding compile-time SKIP_HEADER_BYTES.
  * page_size: Equivalent to 'page_size' PRAGMA setting.

Parameters `salt` and `header` have corresponding hex-prefixed versions.
Both expect 16-byte values and shorter strings are 0-padded to 16 bytes.
The KDF salt is stored in the first 16 bytes of the database header if
`header` is undefined. Otherwise the value of `header` overwrites the 16
header bytes (the KDF salt can thus be hidden from the database file).

Old raw key interface is deprecated by `kdf=none` that disables the KDF.
When KDF is disabled, `key` and `rekey` PRAGMAs (and corresponding URI
parameters) accept a 32-byte value that becomes the "master" key which
is normally derived by the KDF. This allows the users of the library to
take full control of the key derivation process if needed.

There is not yet a user-friendly method to retrieve the currently used
salt or other settings. Moreover, changing the settings of an existing
database after its creation remains cumbersome. It can however be done
by first fully decrypting the database (with `PRAGMA rekey=''`) and then
encrypting it again (with `PRAGMA rekey='password'`) after re-opening
the database file with new URI parameters. Addressing these shortcomings
is a low-priority long-term goal given the workarounds and the lack of
low-complexity solutions to them without a major code refactoring.

README.md documentation of the URI API is left for a later commit.
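
With `kdf=none`, the caller is responsible for producing the 32-byte master key. A minimal Python sketch of doing that outside sqleet follows. It assumes PBKDF2-HMAC-SHA256 from the standard library as the caller's own KDF; the helper name `derive_master_key` and the iteration count are illustrative choices and are not taken from sqleet. How the resulting bytes should be encoded when passed to the `key` parameter is not stated in the commit message, so that step is left out.

```python
import hashlib

MASTER_KEY_LEN = 32  # kdf=none expects a 32-byte value (per the commit message)


def derive_master_key(passphrase: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """Derive a 32-byte key with a caller-chosen KDF (hypothetical helper).

    The KDF and its parameters here are assumptions of this sketch,
    not a description of what sqleet does internally.
    """
    if len(salt) != 16:
        raise ValueError("this sketch expects a 16-byte salt")
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=MASTER_KEY_LEN
    )


key = derive_master_key("swordfish", b"SodiumChloride42")
print(len(key), key.hex())
```

Since the derivation happens in the application, the salt can also be stored wherever the application likes, which is one way to address the recovery concern raised at the top of this issue.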
@resilar
Owner

resilar commented Mar 30, 2019

Oops, sorry for the reference spam.

Anyway, this took longer than expected. The implemented URI API in uri branch is experimental. It is reasonably tested and in a usable state, but some feedback and minor adjustments are needed before merging it to master (README.md documentation is also still missing). The commit message of 2de2f50 gives a good overview of the chosen approach.

Implement URI API for run-time configuration (#13)

Supported URI query parameters by sqleet:

  * salt: Provides 16-byte salt for the key derivation function (KDF)
  * header: Overrides 16-bytes in the beginning of the database header
  * kdf: Specifies KDF algorithm (only `none` value supported for now)
  * skip: Run-time setting overriding compile-time SKIP_HEADER_BYTES
  * page_size: Equivalent to `page_size` PRAGMA setting

Parameters `salt` and `header` have corresponding hex-prefixed versions.
Both expect 16-byte values and shorter strings are 0-padded to 16 bytes.
The KDF salt is stored in the first 16 bytes of the database header if
`header` is undefined. Otherwise the value of `header` overwrites the 16
header bytes (the KDF salt can thus be hidden from the database file).

Old raw key interface is deprecated by `kdf=none` that disables the KDF.
When KDF is disabled, `key` and `rekey` PRAGMAs (and corresponding URI
parameters) accept a 32-byte value that becomes the "master" key which
is normally derived by the KDF. This allows the users of the library to
take full control of the key derivation process if needed.

There is not yet a user-friendly method to retrieve the currently used
salt or other settings. Moreover, changing the settings of an existing
database after its creation remains cumbersome. It can however be done
by first fully decrypting the database (with `PRAGMA rekey=''`) and then
encrypting it again (with `PRAGMA rekey='password'`) after re-opening
the database file with new URI parameters. Addressing these shortcomings
is a low-priority long-term goal given the workarounds and the lack of
low-complexity solutions to them without a major code refactoring.

README.md documentation of the URI API is left for a later commit.
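
The parameter rules in the commit message can be sketched in Python as follows. The helpers `pad16` and `sqleet_uri` are hypothetical names introduced here for illustration. Rejecting values longer than 16 bytes is a choice of this sketch, since the commit message only describes what happens to shorter values.

```python
from urllib.parse import quote, urlencode


def pad16(value: bytes) -> bytes:
    """Zero-pad a salt or header value to 16 bytes (hypothetical helper)."""
    if len(value) > 16:
        raise ValueError("value is longer than 16 bytes")
    return value.ljust(16, b"\x00")


def sqleet_uri(path: str, **params) -> str:
    """Build a 'file:' URI with percent-encoded query parameters."""
    # quote_via=quote encodes spaces as %20 rather than '+'.
    query = urlencode(params, quote_via=quote)
    return f"file:{quote(path)}?{query}"


uri = sqleet_uri(
    "secrets.db",
    key="swordfish",
    salt="SodiumChloride42",
    header="SQLite format 3",
    skip=32,
)
print(uri)
print(pad16(b"SQLite format 3"))
```

Note that the 15-character string `SQLite format 3` padded with one zero byte is exactly the 16-byte magic string of an unencrypted SQLite database file.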

I agree that this is a bit different and less powerful than existing solutions in SQLCipher/wxSQLite3, but I believe it is adequate for most common use-cases. For example, iOS applications can use sqleet as follows:

[sqleet]% ./sqleet 'file:secrets.db?key=swordfish&salt=SodiumChloride42&header=SQLite%20format%203&skip=32'
SQLite version 3.27.2 2019-02-25 16:06:06
Enter ".help" for usage hints.
sqlite> CREATE TABLE f(x,y);
sqlite> .quit
[sqleet]% xxd secrets.db | head -n5
00000000: 5351 4c69 7465 2066 6f72 6d61 7420 3300  SQLite format 3.
00000010: 1000 0101 2040 2020 0000 0001 0000 0002  .... @  ........
00000020: 9d96 552c 3b57 e8fd 70e6 4631 a6c7 b0ce  ..U,;W..p.F1....
00000030: 0f9c f4f6 ef41 bc9e 926d f55e 7d7d 8835  .....A...m.^}}.5
00000040: d01b 5d08 a9fb 1c62 3db0 e814 6948 088c  ..]....b=...iH..

For illustration purposes, the same without header:

[sqleet]% ./sqleet 'file:secrets.db?key=swordfish&salt=SodiumChloride42&skip=32'
SQLite version 3.27.2 2019-02-25 16:06:06
Enter ".help" for usage hints.
sqlite> CREATE TABLE f(x,y);
sqlite> .quit
[sqleet]% xxd secrets.db | head -n5
00000000: 536f 6469 756d 4368 6c6f 7269 6465 3432  SodiumChloride42
00000010: 1000 0101 2040 2020 0000 0001 0000 0002  .... @  ........
00000020: cd80 04f7 33dc 482c 75a0 b3d6 7c7c 1662  ....3.H,u...||.b
00000030: d7fc 2267 c33b 78bf 18cf 60fe e9da e27a  .."g.;x...`....z
00000040: b5b2 b355 4c96 9428 19b8 cb6d f3c8 4c12  ...UL..(...m..L.
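
With `skip=32`, the second row of both dumps (file offsets 16 to 31) is left unencrypted. A small Python sketch decoding that row according to the standard SQLite database header layout; the function name `decode_header_fields` and the dictionary keys are made up for this example.

```python
import struct


def decode_header_fields(raw: bytes) -> dict:
    """Decode file offsets 16..31 of a SQLite database header (big-endian)."""
    if len(raw) != 16:
        raise ValueError("expected exactly the 16 bytes at offsets 16..31")
    (page_size, write_version, read_version, reserved,
     max_payload, min_payload, leaf_payload,
     change_counter, page_count) = struct.unpack(">HBBBBBBII", raw)
    return {
        # A stored value of 1 denotes a page size of 65536 bytes.
        "page_size": 65536 if page_size == 1 else page_size,
        "write_version": write_version,
        "read_version": read_version,
        "reserved_bytes_per_page": reserved,
        "max_payload_fraction": max_payload,
        "min_payload_fraction": min_payload,
        "leaf_payload_fraction": leaf_payload,
        "file_change_counter": change_counter,
        "database_size_pages": page_count,
    }


# Second row of the xxd output above.
fields = decode_header_fields(bytes.fromhex("1000 0101 2040 2020 0000 0001 0000 0002"))
for name, value in fields.items():
    print(f"{name}: {value}")
```

For the dumps above this reports a 4096-byte page size, 32 reserved bytes per page, and a two-page database.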

Accessing the active salt is still unsupported by sqleet (the obvious workaround is to directly read the 16-byte header of the database file). Changing the URI configuration of an existing database is also practically an unsupported operation, although it is possible with enough effort as described in the commit message. Ideally, sqleet would make these things easier. However, I'm uncertain about the cost-benefit ratio because I'd like to keep sqleet code as simple and auditable as possible (offering read access to the salt would be relatively easy, but updating the salt and other database configuration "on-the-fly" would be very tricky).
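
The workaround of reading the header directly could look like the following Python sketch. The function name `read_first_16_bytes` is hypothetical, and the demo runs on a stand-in file rather than a real sqleet database. The result is the KDF salt only if the database was created without the `header` parameter; otherwise these bytes hold the override value and the salt is not in the file.

```python
import os
import tempfile


def read_first_16_bytes(db_path: str) -> bytes:
    """Return the first 16 bytes of a database file (hypothetical helper)."""
    with open(db_path, "rb") as f:
        head = f.read(16)
    if len(head) != 16:
        raise ValueError("file is shorter than 16 bytes")
    return head


# Stand-in file whose first 16 bytes mimic the second dump above.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"SodiumChloride42" + bytes(16))
try:
    salt = read_first_16_bytes(tmp.name)
finally:
    os.remove(tmp.name)

print(salt.hex())
```

Keeping a copy of these 16 bytes in a safe place is enough to restore a damaged salt later with a hex editor.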

The code base is becoming increasingly complex. It is not yet too bad, but I'll try to do some refactoring after merging the URI API and before implementing more new features.

resilar added a commit that referenced this issue Apr 3, 2019
@resilar
Owner

resilar commented Apr 17, 2019

Documentation and refactoring of the URI interface are still in progress. I have been busy lately.


I just realized that the new VACUUM INTO command can be used to change URI settings of an existing database fairly easily. The filename argument to INTO can be a URI filename with a new set of URI parameters. This is just an untested idea and most likely requires small adjustments to the current code (e.g., the logic that duplicates the main database's codec when attaching a database without a key should overwrite duplicated URI settings with newly parsed values).

resilar added a commit that referenced this issue Apr 24, 2019
@resilar
Owner

resilar commented Apr 24, 2019

Commit 3fc3fc4 in uri branch adds README.md documentation of the URI interface.

Some work is still needed before merging the feature into master. In particular, support for changing URI settings with VACUUM INTO should be implemented and tested properly.

@resilar
Owner

resilar commented Apr 29, 2019

Based on initial testing, the VACUUM INTO thing seems to work perfectly for changing URI settings of an existing database.

@resilar
Owner

resilar commented Jun 4, 2019

The current uri branch now supports changing database URI settings via the VACUUM INTO command.

$ rm -f secrets.db skipped.db
$ ./sqleet 'file:secrets.db?key=hunter2&salt=SALTsaltSALTsalt' 'CREATE TABLE f(x,y)'
$ xxd secrets.db | head -n3
00000000: 5341 4c54 7361 6c74 5341 4c54 7361 6c74  SALTsaltSALTsalt
00000010: dc01 5e20 8fc5 e8ae 52c3 83fb 44b0 1550  ..^ ....R...D..P
00000020: ea0b 8237 94b4 d99f 74af e80b 7bbd c537  ...7....t...{..7
$ ./sqleet 'file:secrets.db?key=hunter2' 'VACUUM INTO "file:skipped.db?skip=32"'
$ xxd skipped.db | head -n3
00000000: 5341 4c54 7361 6c74 5341 4c54 7361 6c74  SALTsaltSALTsalt
00000010: 1000 0101 2040 2020 0000 0001 0000 0002  .... @  ........
00000020: 93f7 e5f7 e31e bc9c ae57 7f48 39fc 8498  .........W.H9...

Salt is stored in the DB header, so the salt parameter can be omitted in the latter command. Source database values of the URI parameters salt, header, page_size, skip and kdf are preserved unless explicitly overridden in the INTO URI filename.

URI parameter key allows (re-)encrypting & decrypting databases. Unfortunately, changing page_size of an existing database is unsupported: VACUUM INTO calls sqlite3RunVacuum() which overwrites page size and reserved size without our control. This is basically the same issue that occurs in rekeying, but in this case we cannot solve it with sqlite3RekeyVacuum() since VACUUM INTO calls sqlite3RunVacuum() directly. However, this is only a minor limitation and most people can probably live with fixed database page size (or change it by various other means).
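
The inheritance rule and the page size limitation described above can be modelled in a few lines of Python. This is only a sketch of the described behaviour, not sqleet code; `effective_settings` and `vacuum_into_sql` are hypothetical helpers, and raising an error on a `page_size` change is how this model chooses to represent "unsupported".

```python
from urllib.parse import quote, urlencode

# Parameters whose source database values carry over to the target.
INHERITED = ("salt", "header", "page_size", "skip", "kdf")


def effective_settings(source: dict, overrides: dict) -> dict:
    """Settings of the VACUUM INTO target: source values unless overridden."""
    if "page_size" in overrides and overrides["page_size"] != source.get("page_size"):
        raise ValueError("changing page_size via VACUUM INTO is unsupported")
    merged = {name: source[name] for name in INHERITED if name in source}
    merged.update(overrides)
    return merged


def vacuum_into_sql(target_path: str, **overrides) -> str:
    """Build the VACUUM INTO statement with a URI filename as its target."""
    uri = f"file:{quote(target_path)}"
    if overrides:
        uri += "?" + urlencode(overrides, quote_via=quote)
    return f'VACUUM INTO "{uri}"'


source = {"salt": "SALTsaltSALTsalt", "skip": 0}
print(effective_settings(source, {"skip": 32}))
print(vacuum_into_sql("skipped.db", skip=32))
```

The second call reproduces the statement used in the transcript above, with `skip` overridden and the salt carried over from the source database.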


URI-based configuration is finally getting pretty close to the finish line. No further code refactoring is needed in my opinion. Only README.md needs some updating.

resilar added a commit that referenced this issue Jun 6, 2019
Supported URI query parameters by sqleet:

  * salt: Provides 16-byte salt for the key derivation function (KDF)
  * header: Overrides 16-bytes in the beginning of the database header
  * kdf: Specifies KDF algorithm (only `none` value supported for now)
  * skip: Run-time setting overriding compile-time SKIP_HEADER_BYTES
  * page_size: Equivalent to `page_size` PRAGMA setting

Parameters `salt` and `header` have corresponding hex-prefixed versions.
Both expect 16-byte values and shorter strings are 0-padded to 16 bytes.
The KDF salt is stored in the first 16 bytes of the database header if
`header` is undefined. Otherwise the value of `header` overwrites the 16
header bytes (the KDF salt can thus be hidden from the database file).

Old raw key interface is deprecated by `kdf=none` that disables the KDF.
When KDF is disabled, `key` and `rekey` PRAGMAs (and corresponding URI
parameters) accept a 32-byte value that becomes the "master" key which
is normally derived by the KDF. This allows the user of the library to
take full control of the key derivation process.

`VACUUM INTO` (introduced in SQLite 3.27.0) supports changing the URI
settings of an existing database by overriding the URI parameter values
in the `INTO` URI filename. Unspecified parameters inherit their values
from the main database vacuumed by the command.

There is not yet a user-friendly method to retrieve the currently used
salt or other settings as required by #13. This is a low-priority task.
@thecodrr

@utelle we are facing this very problem while using SQLiteMultipleCiphers. I know this project is discontinued but were you able to or think of adding support for plaintext header like SQLCipher in SQLiteMultipleCiphers?

@utelle
Contributor Author

utelle commented Feb 12, 2024

@utelle we are facing this very problem while using SQLiteMultipleCiphers. I know this project is discontinued but were you able to or think of adding support for plaintext header like SQLCipher in SQLiteMultipleCiphers?

Up to now no one has asked for a "plaintext header" feature in SQLite3 Multiple Ciphers for other cipher schemes than SQLCipher. That is, this feature is currently only implemented for the SQLCipher cipher scheme.

If you need this feature for other cipher schemes, please open a SQLite3 Multiple Ciphers issue, and I will see if and when it can be added.
