Prepared statements #14
Scoped Prepare
I remember them mentioning that they hook calls to Looking through their code, looks like they hook into the garbage collector by extending The call to I think this model works well with Scoped Prepare since as long as the
SQL Cache Map
This can be problematic for a few reasons. If we ever allow users to "step" through statements (get one row at a time rather than all rows at once), this caching setup could easily break.
const query1 = sql`SELECT * FROM foo`;
const query2 = sql`SELECT * FROM foo`;
// under the hood, query1 and query2 will both point to the same prepared statement
while (query1.step()) {
  while (query2.step()) {
    // query1 and query2 are both stepping through the same result set rather than through independent result sets!
  }
}
Even if we never exposed If we can figure out how to clone a prepared statement when giving it out from the cache then I think this would be a good approach since:
That or some sort of ref counting. Each If we can't clone statements and users need to use scoped prepare and start throwing things into global scope to keep prepared statements around your example:
const getUserByIDQuery = sql`SELECT * FROM users WHERE id = ${/* what do we put here? */}`
would look like:
const getUserByIDQuery = sql`SELECT * FROM users WHERE id = ?`
Observable Parameters
I like the idea (setting bind params with an index is not very fun) although it is orthogonal to statement preparation, correct? Seems like you'd still implement statement preparation through one of the two options above.
Rusqlite has a statement cache. They solve the above issues by removing a statement from the cache while it is in use and putting it back when the caller is done. https://docs.rs/rusqlite/0.13.0/src/rusqlite/cache.rs.html#127
Idk how well this'll work in the browser where all access to SQLite is async. Seems like you could have many components concurrently asking for the same statement and having to prepare it since it would be removed from the cache by the first caller.
We could get clever with queueing and waiting your turn before using the cached statement. I.e., if the statement you need is in use, you just queue up for it. Since SQLite in the browser is single-threaded it won't slow stuff down to do this.
The TCL bindings also have a statement cache that, according to forum posts (https://users.rust-lang.org/t/sqlite-caching-prepared-statements-again/15626/19), does the same thing as the Rust impl. The Rust impl being based on the TCL one -- https://www.sqlite.org/src/artifact?ci=trunk&filename=src/tclsqlite.c
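To make the "remove while in use, put back" idea concrete, here is a minimal synchronous sketch. `Stmt`, `CheckoutCache`, `acquire`, and `release` are hypothetical names made up for illustration, not the rusqlite or typed-sql API, and finalizing a duplicate on return is just one possible choice.

```typescript
// Hypothetical stand-in for a driver-level prepared statement.
interface Stmt {
  sql: string;
  finalize(): void;
}

class CheckoutCache {
  // Only idle statements live here; a checked-out statement is absent.
  private idle = new Map<string, Stmt>();
  prepareCount = 0;

  constructor(private prepare: (sql: string) => Stmt) {}

  // Remove the statement from the cache while it is in use.
  acquire(sql: string): Stmt {
    const cached = this.idle.get(sql);
    if (cached !== undefined) {
      this.idle.delete(sql);
      return cached;
    }
    this.prepareCount++;
    return this.prepare(sql);
  }

  // Put the statement back once the caller is done with it.
  release(stmt: Stmt): void {
    if (this.idle.has(stmt.sql)) {
      // Another copy was returned first (assumed policy: drop the duplicate).
      stmt.finalize();
    } else {
      this.idle.set(stmt.sql, stmt);
    }
  }
}
```

With this shape, two overlapping callers asking for the same SQL get two independent statements, which avoids the shared result set problem at the cost of one extra prepare.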
Queueing sounds interesting, but it would result in a deadlock with the example you showed above: while (query1.step()) while(query2.step()) Removing statements from cache during execution sounds very clever, though.
If only we had more data on how it behaves in a real scenario... I suggest we mark statements as busy instead of removing them from the cache. When a new call comes in, we can race the unlocking of the currently cached statement against preparing a new one. This would result in the fastest execution, but not necessarily the least amount of work being done.
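A rough sketch of what that race could look like, under some loud assumptions: `Stmt` and `raceForStatement` are hypothetical names, `unlocked` is assumed to be a promise that resolves when the busy cached statement becomes free, and the sketch ignores multiple waiters and rejected prepares.

```typescript
// Hypothetical stand-in for a driver-level prepared statement.
interface Stmt {
  sql: string;
  finalize(): void;
}

// Race "the cached statement gets unlocked" against "a new statement is prepared".
async function raceForStatement(
  unlocked: Promise<Stmt>,
  prepareNew: () => Promise<Stmt>,
): Promise<Stmt> {
  const fresh = prepareNew();
  const winner = await Promise.race([
    unlocked.then((stmt) => ({ stmt, from: "cache" as const })),
    fresh.then((stmt) => ({ stmt, from: "prepare" as const })),
  ]);
  if (winner.from === "cache") {
    // The freshly prepared statement lost: finalize it when it arrives
    // so it does not leak. This is the "extra work" cost of racing.
    fresh.then((stmt) => stmt.finalize()).catch(() => {});
  }
  return winner.stmt;
}
```

Note that the losing prepare still runs to completion here; nothing in this sketch cancels it before it starts.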
Right :( This is only in the scenario where we expose a Maybe for Or hopefully I can figure out how to safely clone these things.
Had a chat with @schickling today. He's not too interested in anything related to statement execution at the moment given they already have a specific execution model and framework (https://riffle.systems/essays/prelude/) so I don't think he'll be weighing in here. This is also part of the reason I pushed back on execution since Riffle is the first target use case. Looks like we can't clone statements so if we want to cache them we'll need to:
So, I take it you don't like racing the queueing against preparing a new statement?
Well I think we've entered the territory of making lots of assumptions about how things work without measuring or testing :) For synchronous backends, like better-sqlite3 or in-memory browser DBs, the normal caching mechanism of "remove while in use, put back" shouldn't be an issue. For browser backends we're assuming the "remove, put back" approach will cause lots of re-preparations of the same statement due to the async interface. Maybe we should test to see how true that is? I think racing will end up acting almost the same as queueing with the one advantage of being able to expose The one risk with racing is doing extra work so I'd like to know if we can avoid that. In theory we can cancel the losers of the race before they actually start executing.
I put execution-related stuff into https://github.com/vlcn-io/typed-sql/tree/main/incubator so I can start publishing the type-related packages since those are ready for early adopters. The repo is also public now so the normal forking flow will work again.
Sounds great, going public is a big step. :) I'm still working on prepared statements. I've been a bit busy lately, so things don't move as fast as I would want. I'll make a PR as soon as I have caching fully working (without eviction). Then we can do some tests in different environments and discuss the eviction strategy.
Previously discussed here: #4 (review)
We need to think about ways to implement prepared statements in typed-sql. A good API should be:
Here I want to explicitly discuss prepared statements in the execution part of typed-sql. Meaning, if a user only uses query generation, they can already do prepared statements the way they want:
We want to provide an opinionated way to prepare and run SQL queries that makes common use cases simple and complex cases possible.
The API
Since a statement has to be prepared before it can be executed, we can keep the prepared statement around so that this work is not done twice.
This satisfies the simplicity requirement, since from the user's perspective the code looks the same with or without prepared statements. To satisfy flexibility, an advanced user might explicitly request preparing a statement:
Scoped Prepare
In this approach prepared statements are tied to their query and its scope.
This assumes that better-sqlite3 or whatever provider is used does the same. Currently I have found no evidence of this being the case.
sqlite3_finalize should be called to free the memory from a prepared statement. From looking at the code I've found that it is called only when using exec, closing the database, or incorrectly supplying multiple queries. So, maybe SQLite does some kind of automatic cleanup? Or maybe the memory impact of prepared statements is so negligible that it can be overlooked? If this is the case, the next approach might be favoured over this one.
SQL Cache Map
Another thing we can do is to maintain a cache map of all the queries used so far. This way we can quickly look up the correct prepared statement from the cache and use it across multiple scopes.
Let's look at a common way people write their DB functions:
Note that here the query is always in its own scope and is recreated on every call. Putting stuff in the global scope can get really ugly really fast.
In this case having a global cache that is handled internally by typed-sql can bring huge benefits.
What shall we do to satisfy the efficiency requirement then? If the cache is global, when should we evict entries from it (or should we at all)? Maybe LRU or MFU or similar approaches would be appropriate here?
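As one possible shape for the LRU option, here is a minimal sketch built on the insertion order of a JavaScript `Map`. `Stmt`, `LruStatementCache`, and the `prepare` callback are hypothetical names for illustration only; nothing here is the actual typed-sql or better-sqlite3 API, and the sketch assumes an evicted statement is not in use at the moment it is finalized.

```typescript
// Hypothetical stand-in for a driver-level prepared statement.
interface Stmt {
  sql: string;
  finalize(): void;
}

class LruStatementCache {
  // Map iterates in insertion order, so the first key is the least recently used.
  private entries = new Map<string, Stmt>();

  constructor(
    private capacity: number,
    private prepare: (sql: string) => Stmt,
  ) {}

  get(sql: string): Stmt {
    let stmt = this.entries.get(sql);
    if (stmt !== undefined) {
      // Delete so the set() below moves it to the most recently used position.
      this.entries.delete(sql);
    } else {
      stmt = this.prepare(sql);
    }
    this.entries.set(sql, stmt);
    if (this.entries.size > this.capacity) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) {
        // Free the evicted statement's memory before dropping it.
        this.entries.get(oldest)?.finalize();
        this.entries.delete(oldest);
      }
    }
    return stmt;
  }

  get size(): number {
    return this.entries.size;
  }
}
```

The capacity bounds memory use, and a query that is recreated on every call still hits the cache as long as its SQL text is the same.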
Observable Parameters
Given the current state of JS, observable and reactive patterns are everywhere, and for a good reason. We need to think about how we can integrate this behaviour with query parameters. This needs to be framework/library agnostic, yet we don't want to write our own RxJS just for this feature.
I think we should integrate with the most common observable interfaces without implementing our own (similar to what we do with coercers). Some implementations we should definitely consider supporting: Solid Stores, Svelte Stores, RxJS observables, plain JS functions and maybe there is something similar in React?
This declarative approach would play much nicer than the imperative call to the .bind function in many modern scenarios. We will support .bind though. The observables would be in addition to that.
So, @tantaman, what do you think? Maybe we should invite more people to discuss this?
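For the observable parameters idea above, a small adapter could normalize the different sources without implementing a reactive library. This is only a sketch under assumptions: `ParamSource`, `SubscribableLike`, and `watchParam` are hypothetical names, and the only contracts relied on are that a store-like `subscribe` returns an unsubscribe function while an RxJS-like `subscribe` returns an object with `unsubscribe()`.

```typescript
type Unsubscribe = () => void;

// Hypothetical minimal shape covering store-like and RxJS-like sources.
interface SubscribableLike<T> {
  subscribe(listener: (value: T) => void): Unsubscribe | { unsubscribe(): void };
}

// A parameter can be a constant, a plain getter function, or a subscribable.
type ParamSource<T> = T | (() => T) | SubscribableLike<T>;

function isSubscribable<T>(src: ParamSource<T>): src is SubscribableLike<T> {
  return (
    typeof src === "object" &&
    src !== null &&
    typeof (src as SubscribableLike<T>).subscribe === "function"
  );
}

// Calls onValue with every value the source produces and returns a cleanup function.
function watchParam<T>(src: ParamSource<T>, onValue: (value: T) => void): Unsubscribe {
  if (isSubscribable(src)) {
    const sub = src.subscribe(onValue);
    return typeof sub === "function" ? sub : () => sub.unsubscribe();
  }
  // Getters and constants are read once; they never emit again.
  if (typeof src === "function") {
    onValue((src as unknown as () => T)());
  } else {
    onValue(src as T);
  }
  return () => {};
}
```

One known ambiguity in this sketch: a parameter whose value is itself a function would be treated as a getter.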