-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SQLite API #11657
Comments
This has been discussed extensively by the core team in recent weeks. We won't expose bindings directly on the In addition to that |
Will the move into |
@waynebloss it will not, what @bartlomieju commented on is moving https://deno.land/x/sqlite@v3.0.0 under the std namespace. For what its worth, if there was a vote on whether we should expose bindings or move a wasm implementation into std, I would choose the former. I am not a contributor though, so I respect that this is a decision of the owners of deno 🙂. If sqlite were a standard web interface, this would be a much simpler conversation |
I think its tricky. What’s really nice about the WASM based solution is that it has a clear runtime-independent API it expects. This API can then be provided in e.g. deno deploy, but its also possible to use the same library in a web browser etc. On the other hand, piping all the IO through a JS translation layer is pretty expensive. |
Thanks. Yeah I was just wondering - I suppose it won't much matter where I'm importing it from. I'm glad the native binding is there when it's necessary and it seems like |
Yes sqlite3 has a nice approach. But I’d suggest to use the WASM bindings unless you run into performance problems in your use-case. They have the major advantages of a) preserving the deno security model and b) preserving consistency across deploys / environments (ie they provide a specific version of SQLite not just whichever happens to be installed by your system) |
Alright, so whether or not the following information is useful, I am dumping it here anyways because I have done a bit of reading on past, present and future web db stories. The pastIt looks like there was an attempt to expose sqlite in the browser back in 2010, but the specification flopped because sqlite was a single implementation, not a standard interface https://www.w3.org/TR/webdatabase/
The presentSo node.js has sql.js (another wasm sqlite lib) and absurd-sql, which use IndexedDB as a backend. @dyedgreen's Personally, I do not have a need for browser-compatible database drivers. If I am using sqlite in a deno/nodejs project, it is because I want the data to be portable to other tools, not the implementation. It is important therefore that my data be persisted to the filesystem, not stored in IndexedDB. benchmarksThe ffi lib wins out in speed in most cases, except select many queries that return objects (likely because there isnt a native solution yet, it just uses benchmark results
benchmark codeimport * as fs from "https://deno.land/std@0.138.0/fs/mod.ts";
import * as SQLITE_WASM from "https://deno.land/x/sqlite/mod.ts";
import * as SQLITE_FFI from "https://deno.land/x/sqlite3@0.4.2/mod.ts";
const SELECT_MANY_SEED_COUNT = 10000
const deno_bench = Deno.bench
export interface Statement {
execute(...params: unknown[]): void;
query_one(...params: unknown[]): any
query_many(...params: unknown[]): unknown[]
finalize(): void;
}
type Params = {[key: string]: any}
abstract class Backend<DB> {
public abstract backend_name: string
protected db: DB
constructor(protected db_path: string) {
if (fs.existsSync(db_path)) Deno.removeSync(db_path)
if (fs.existsSync(db_path+"-wal")) Deno.removeSync(db_path+"-wal")
if (fs.existsSync(db_path+"-shm")) Deno.removeSync(db_path+"-shm")
this.db = this.create_db(db_path)
}
abstract create_db(db_path: string): DB
abstract prepare(sql: string): Statement
abstract execute(sql: string): void
abstract query(sql: string): unknown[]
abstract close(): void
}
class SqliteWASM extends Backend<SQLITE_WASM.DB> {
backend_name = "deno.land/x/sqlite (wasm)"
create_db(db_path: string) { return new SQLITE_WASM.DB(db_path) }
prepare(sql: string) {
const stmt = this.db.prepareQuery(sql);
return {
execute: (params: Params) => { stmt.execute(params as any) },
query_one: (params: Params) => stmt.allEntries(params)[0],
query_many: (params: Params) => stmt.allEntries(params),
finalize: () => stmt.finalize(),
};
}
execute(sql: string) {
this.db.query(sql)
}
query(sql: string) {
return this.db.query(sql)
}
close() { this.db.close() }
}
class SqliteFFI extends Backend<SQLITE_FFI.Database> {
backend_name = "deno.land/x/sqlite3 (ffi)"
create_db(db_path: string) { return new SQLITE_FFI.Database(db_path) }
prepare(sql: string) {
const stmt = this.db.prepare(sql)
return {
execute: (params: Params) => {
stmt.execute(params)
},
query_one: (params: Params) => {
stmt.bindAllNamed(params)
stmt.step()
const row = stmt.row.asObject()
stmt.reset()
return row
},
query_many: (params: Params) => {
return this.db.queryObject(sql, params)
// const rows = []
// stmt.bindAllNamed(params)
// let row
// while (row = stmt.step()) {
// rows.push(row.asObject())
// }
// stmt.reset()
// return rows
},
finalize:() => stmt.finalize()
}
}
execute(sql: string) {
this.db.execute(sql)
}
query(sql: string) {
return this.db.queryArray(sql)
}
close() { this.db.close() }
}
function benchmark(db: Backend<unknown>) {
db.execute(`
CREATE TABLE basic (
id INTEGER PRIMARY KEY NOT NULL,
float REAL NOT NULL,
big_chunk BLOB NOT NULL
);
`)
db.execute(`CREATE INDEX float_index ON basic(float);`)
db.execute('PRAGMA journal_mode = WAL')
db.execute('PRAGMA synchronous = normal')
db.execute('PRAGMA temp_store = memory')
db.execute('PRAGMA page_size = 512')
db.execute('PRAGMA cache_size = 4000')
const insert_stmt = db.prepare(`INSERT INTO basic (float, big_chunk) VALUES (:float, :big_chunk)`)
const select_many_stmt = db.prepare(`SELECT * FROM basic`)
const select_one_stmt = db.prepare(`SELECT * FROM basic WHERE float <= :float LIMIT 1`)
const count_stmt = db.prepare(`SELECT COUNT(*) AS count FROM basic`)
db.execute(`DELETE FROM basic`)
for (let i = 0; i < SELECT_MANY_SEED_COUNT; i++) insert_stmt.execute({ float: Math.random(), big_chunk: '' })
deno_bench({
name: `${db.backend_name} - select count`,
group: "select count",
fn: () => {
const {count} = count_stmt.query_one()
if (count !== SELECT_MANY_SEED_COUNT) throw new Error(`unexpected count ${count}`)
}
})
deno_bench({
name: `${db.backend_name} - select many`,
group: "select many",
fn: () => {
const rows = select_many_stmt.query_many()
}
})
deno_bench({
name: `${db.backend_name} - select one`,
group: "select one",
fn: () => {
const row = select_one_stmt.query_one({ float: Math.random() })
}
})
deno_bench({
name: `${db.backend_name} - insert one`,
group: "insert one",
fn: () => {
insert_stmt.execute({float: Math.random(), big_chunk: ''})
}
})
}
const sqlite_wasm = new SqliteWASM('sqlite_wasm.db')
const sqlite_ffi = new SqliteFFI('sqlite_ffi.db')
benchmark(sqlite_wasm)
benchmark(sqlite_ffi) I dont have a great way to benchmark Part of this may have to do with WebAssembly being limited to 4GB of memory, though I suspect that WASM just runs slower than native in a lot of cases. There is a writeup on why the 4GB limitation exists (currently, but not forever!) here https://v8.dev/blog/4gb-wasm-memory. The futureIt looks like some database-friendly file system apis are coming to the browser soon https://web.dev/file-system-access/#accessing-files-optimized-for-performance-from-the-origin-private-file-system, which could give developers the best of both worlds (e.g. portability of implementation, and of data)
WebAssembly also has a proposal for essentially unbounded RAM usage in Stage 3 https://github.com/WebAssembly/proposals#phase-3---implementation-phase-cg--wg (it is called memory64) My Two CentsBoth portability and performance are important pillars for building resilient, easily distributed js apps. Currently, a wasm approach gives us portability at a cost to performance. The FFI approach gives us the opposite, performance at the cost of portability. A native-builtin deno solution would give us the best of both worlds, but it comes with its own pain points for the core deno team. E.g. (maintenance, polluting the codebase, lack of flexibility in sqlite extensions & versions). |
To add a note to this: The index db is used purely as a persistent layer, basically as a key-value store from 'file name' to an opaque Uint8Array, which holds the sqlite database. This representation is fully compatible, i.e. all you have to do is load the data out of the indexed db and into a file to read it elsewhere. (E.g. https://dyedgreen.github.io/tasks/dist/ has a backup button, which just downloads the database file for further use) |
Maybe another issue to mention with having a built-in SQLite / using a native SQLite with potentially untrusted code, is that it would break Deno's sandbox: https://sqlite.org/appfunc.html#security_implications |
dang that isnt something I considered...though I guess it depends on what kind of api is exposed. I imagine if registered functions were restricted to js function, then deno would still own the runtime? |
Yeah, but generally, having SQLite (i.e. a more or less opaque blob of C code) interact with the file-system and os means that there is a lot more surface for permissions etc to go wrong. E.g. how do you prevent people from making a database file have write access. How do you make sure only files allow-listed directories are read, etc ... (also see https://sqlite.org/security.html) |
That is a fair point that I hadnt considered. I imagine code execution is a compiler flag though. I think outside of that deno could handle read/write permissions on the few well defined sqlite files that it needs to access (e.g. |
I and @bartlomieju are supportive of native SQLite bindings. We've already discussed this internally multiple times and ended up not doing it but there is enough community feedback to assume that this is wanted. I'm going to open it again for discussion. |
Regarding performance, I'm currently working on leveraging fast FFI calls in @DjDeveloperr's Initial results show that querying data are almost 5x faster than Node's The |
And I am not supportive. We have for a long time provided a runtime where any non standard APIs are kept in the Adding non-standard APIs only fractures the ecosystem. It causes people to write code that only works on a single runtime. We shouldn't further fracture the ecosystem. I can understand a desire to do cool things, but history over the last 20 years of the JavaScript ecosystem demonstrates that these ideas never end up going well. We should do things that are part of a standard based approach. Also, while we shouldn't make a decision purely based on Deno Deploy, we need to honestly consider if this is something we introduce that fractures the workloads you can run on Deploy versus CLI that are built-in. Keeping it as an official maintained WASM module, or a FFI plugin is in my opinion the best approach. It decoupled the runtime from a specific version of SQLite as well as makes it totally opt-in. There is a web standard, IndexedDB, that we have failed to deliver. Delivering IndexedDB natively in CLI and Deploy would allow a lot of existing code to just work. We really should be pulling out all stops to deliver it. Yes, SQLite is more popular, yes WebSQL was a better idea, but that argument was had a decade a go and the web standard is IndexedDB, something we should absolutely have in Deno. |
This sounds interesting, since a standard library module can be updated standalone instead of a next Deno release 🤔 however not pretty sure which alternative could have more performance; if WASM of FFI? Also FFI currently is unstable so it will require using the flag while using the module. |
Since the discussion is opened up again, I will make my case again. addressing design pains
I understand the argument against non-standard apis. It is fair to say that adding sqlite natively will cause a decent amount of bikeshedding regarding the "right" sqlite version to expose from deno. I dont think that should be a blocker though. I think reaching a consensus on a sqlite version that covers 90% of the use cases of sqlite should be pretty easy. Deno also has an unstable flag, so it should be easy enough to expose sqlite for a few releases, gather feedback, and then harden the api.
One more followup here, the last 10% of the developer community will always still have the option to roll their own sqlite ffi lib or wasm lib if they want some very specific sqlite features that deno doesnt provide standard. addressing fracturing the runtime
so, I understand this argument, but at the same time, the runtime is already fractured, and its a good thing. I can do things like addressing security
I dont like the idea of breaking deno's sandbox, but both advocating based on prior arthttps://bun.sh Personally, I think that this project just wont have the same impact that deno has on the javascript community. It seems like there are less hard rules about its design, and instead it just picks and chooses what works well. That said, there could be something to be said that sqlite is a native addition to the runtime. Another runtime that includes sqlite natively is I feel like this is a strong argument against some of the web standards argument. A big strength of deno is its "batteries included" nature. The standard lib is invaluable, and certainly makes some assumptions about what good library apis look like. The difference here is just the location of this standard tool, being inside the runtime itself rather than in a standard lib advocating for performanceIll be honest, I think that we have people working hard at both ffi and wasm implementations of sqlite, but I just dont see a way that those will reach the same speeds that a maintained internal sqlite will reach. I want to use deno for extremely low latency tools for websocket/udp/tcp applications. The better performance on the database layer, the better the application. You would be amazed how much traffic you can get away with on a single web server with a good sqlite driver. ok, do we need a standard?This is really the most obnoxious thought here, but if it is going to take a web standard to get this feature into deno, could deno back the idea of a sqlite web standard? I believe I remember a blog post by deno that they are now part of the standards committee. Its been a long time since WebSQL was originally proposed, maybe the web committees are ready for this now? ok, do we need a poll?
if this is the case, is there any way to measure community desire for this feature? Is it just number of reactions/comments on this issue? Should there be a discussion poll? Apologies for the long, long post. I find sqlite very useful and would love to see it in deno. |
Correct. There are a few distinctions here:
The existing wasm sqlite integration builds on top of these bindings, even the FFI would go through these bindings. Keeping the bindings low level and as limited as possible should be the goal here.
Why are not those solutions acceptable for everyone? Especially if the FFI solution, as it is expected to be, will be similar performance to a fully native solution. It becomes totally opt-in, specific to users workloads.
It isn't a good thing. History has demonstrated it isn't. Part of Deno's participation WinterCG is to help ensure that server runtimes can move towards a standards based approach for things that only apply to server runtimes. Almost every challenge we face in adoption of Deno can be tied back to challenges working in the larger eco-system and people's expectations that "well it was ESM code so it should just work".
Which is one of the things we think is different about Deno. It is more intentional about its design, advocating for a standards based approach.
There are several standards committees. TC39 governs ECMAScript, the language syntax and behaviour. W3C runs the WHATWG, which produces the higher order web API standards, like IndexedDB. The WinterCG is the server side focused working group. Deno is a member of TC39 and WinterCG. Deno participates in the WHATWG on specific proposals. There were valid reasons why WebSQL got killed off, mostly that it was too coupled to the implementation. It is a good goal of the WHATWG to ensure that APIs are not coupled to closely to an implementation, because they should be more generalised. What is needed in my opinion is straight forward key value storage that isn't IndexedDB and has a modern API that works well in the sandboxed and security model that works in browsers. That wouldn't be SQLite either. A few years ago Google tried to advocate for KVStorage but abandoned it. I personally wouldn't want to advocate for something that I don't believe would be good for the web platform. SQLite is popular, SQLite is good, but baking the bindings directly into the runtime like the way Bun has means it becomes only a server side solution. A good solution in this space would be able to work well in a browser or in a server runtime, which means it needs to work well in the browser security model (supporting origin based quota management) and other things which are not instrinsic to SQLite. SQLite, like a lot of persistence solutions, is very workload specific. Why not Redis for example? Deno should focus on ensuring the low-level APIs are available and that people can create libraries that perform well across the spectrum, while trying to be as conservative about exposing APIs for everyone, especially ones that are not part of the web platform. |
Hello - just want to chime in with some opinions from the point of view of one common developer:
These are my opinions. No offense meant to anyone. Thanks! [0] - Speaking of finding third party modules - I wish they would switch back to displaying the number of stars over at |
Just chiming in, since I created the original issue. I still think a native option would be best for a lot of use cases. That said, when I made the request was before FFI was really even an option, and it may be good enough at this point, while native would still be faster. In the end, the ability to have multiple processes working against the same db file would still be a real nice to have (don't think the wasm will ever be able to do this). Even then, fracturing isn't something I'm so worried about, as there isn't a single standard Sqlite client interface as it stands, and even with the exteral modules, there still isn't and won't be. As to the runtime size, if the library itself is already baked in, exposing another interface to it would be negligible in terms of additional size. In the end, I think the FFI implementation is probably good enough, but in the box would be better. edit: Also, in the box would allow compliance with the existing read/write limitations while not opening a gaping hole in the security model for FFI permissions. |
I still think that its fair to say that deno has server-specific apis. Deno is being built around server-side workflows. Sure its possible that the deno team plans to support more interesting platforms in the future, but that shouldnt discount how important these are. What about I will grant you that SQLite is not a one-size-fits-all solution. There will be plenty of real world apps that need more powerful databases, like elasticsearch, redis, postgres, mongo, etc. We still need drivers for those. The case for native sqlite however, comes down to the fact that A) rusqlite is already bundled with deno, interfaces are the only thing that need to change |
rusqlite currently being in the binary is an implementation detail. It currently being in the binary is not a reason to expose this as part of Deno's runtime API IMO because it's possible we might remove it for a better solution/api in the future (though unlikely). I agree with Kitson's comments above. |
FYI https://github.com/denodrivers/sqlite3 is currently looking good for high performance needs, almost on par with C. I'm fairly confident that a native SQLite won't be faster than this. |
Imo at this point ffi sqlite speed is pretty darn good, but what we are lacking here is a deno mechanism for delivering native binaries (or even compiling them some day). Currently, deno implements npm's binary retrieval mechanism, which solves a lot of these issues for npm modules, but not for deno modules. If I want to use https://deno.land/x/sqlite3 for instance, I need to I think the general explanation here is that "if youre going to allow native code, youre breaking the security sandbox anyways". I think there should be a better way to say "Im allowing this bit of untrusted code to run" rather than just opening the floodgates to anything whenever we use ffi. |
https://docs.deno.com/kv/manual/on_deploy |
With node:sqlite landing in Node.js, it's probably time to re-evaluate the stance on this one... |
I had to try 😅 |
Since SQLite is being used to back WebStorage, it would be nice if this were exposed. Since the engine is already embedded, and the process for including binary modules is pretty cumbersome, it would be nice to have an access for SQLite closer to directly. Would also be able to keep within the read-write permission constraints this way as well, where a binary module could/would be more relaxed.
While deprecated, the WebSQL API might be a good option for this. I have made this suggestion previously, there still isn't a good story, especially if you need multi-process access to a database on disk, so the wasm module is inappropriate for many scenarios.
All query/exec methods are template string processors.
The text was updated successfully, but these errors were encountered: