-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Solve the prepared statement automatic-release difficulty #159
Comments
Ok, so this is unfortunately stalled again for now. I think I could fix this, as I indicated above, by using a plain array, rejiggering things so allocations only occur upon registration or actual release, keeping duplicates out, and doing a bunch of string comparisons. But in terms of big-O, I don't like the algorithmic complexity that involves. Especially for something that seems to have questionable merit anyway. As above, associative arrays are out, because apperently, merely indexing them can somehow cause GC allocation (from automatic re-hashing, maybe?). So before I can move forward with this I'll need one of two things: A. The right data structure. It must have good time complexity for this use-case and never allocate on lookup. or better yet: B. Compelling evidence that releasing prepared statements within a connection's lifetime actually does have practical benefit (and that closing and re-opening the connection isn't good enough). |
Hm... you don't need the GC to store an array of integers (which is really all this is). Why not C malloc/realloc/free? In any case, there is a separate problem besides the nogc problem, the GC can be running in any thread, not just the one that allocated the connection. So you potentially could be running into race conditions. The way phobos solves this is... Don't put RC structs in heap objects :) Yeah, it's that dumb. But without help from the GC (setting thread affinity for a heap object for example), there's no good way to solve this without resorting to defensive locking for everything. My recommendation: just use a C array, and avoid the GC altogether, and warn people about the dangers of putting the RC object in a GC destructed heap object.
It's not the way it currently works, only adding or removing elements may use the GC. I suppose that's to allow for future implementation details. But in any case, using an AA I think is overkill. I can't imagine cases where you are going to have more than a few stale prepared statements. |
Ah, ok, I see. I guess in some ways that does simplify the problem, then.
True, but generally speaking, I don't like leaving open the possibility of pathological cases when I can avoid it. They just increase the number of dank, dark corners in a design. But, it may not matter anyway: With the Prepared redesign in #157, I already need to use an AA anyway to lookup per-connection prepared statement info (such as the statement ID handle). So I just tossed another field into the payload to indicate "queued for release?", and ripped out all the old statement queue stuff. Based on everything you've said above, it sounds like that (along with the "don't put RC structs in heap objects" warning) should be suficcient. |
Ooh, nice idea. You can store a pointer to the actual bool in the prepared statement object, then you don't need to bother with the AA from the |
I'm pretty sure this is a wontfix. The current system I have been using for a while, and it works awesomely. I don't think there's a good reason to proactively release prepared statements, but even if there is, it's possible to do it synchronously without RAII help. |
Note: This is not an especially high priority unless someone discovers a real need to release prepared statements from the server. Since the server treats prepared statements as being specific to a connection (and connections eventually end), this seems unlikely.
Suppose we want to use automatic resource management (ie, a scoped/RAII/refcounted/etc. wrapper) for
Prepared
to automatically release the statement from a connection when it's done being used (again, AFAIK, this shouldn't be necessary, but suppose we want to do it). This poses a problem:What if the prepared statement becomes ready for release while there is already data pending on the connection? That is, while a result set input range has not yet been completely read. (This has been demonstrated as a realistic possibility, by @schveiguy.)
Commands cannot be sent to the server while data is pending for retreival. The usual solution mysql-native takes (as of v1.1.4) is to auto-purge any pending data when a command needs to be issued. For the most part, this has worked out well, because:
But the problem is, if we use automatic resource management to release a prepared statement, then issuing the command "release this prepared statement" becomes implicit and hidden. In situations such as the demonstration mentioned above, this hidden auto-purge would cause surprising problems that can be difficult or messy to deal with.
In an attempt to avoid this problem, ever since v1.1.4, mysql-native has implemented a "queued for release" system. When mysql-native is told to release a prepared statement, it doesn't release it immediately. Instead, it saves it in a queue (but mysql-native still treats it as having been released, from the user code's perspecive). Then, mysql-native waits until the next time an explicit command is issued. At that point, any pending data is auto-purged and any prepared statements queued for release are finally released for real.
Unfortunately, this create a new problem when the cleanup happens to be triggered during a GC cycle. We don't know ahead of time how many prepared statements might need to be released. So queuing one for release involves appending an item (ie which statement to release) to an array. This may trigger an allocation, but...
Allocations cannot be performed during garbage collection (it will throw an InvalidMemoryOperationError). If the cleanup (and thus, queueing it for release) happens to be triggered during a GC cycle, boom: InvalidMemoryOperationError.
So this cycle needs to be broken. I'm hesistant to hardcode the size of the "to be released" queue, because for anything short of soaking up several gigabytes (ie, not realistic) it still leaves the difficult question of: What should happen when the hardcoded limit's exceeded? I'm not sure I like any of the possible ways to answer that question.
I was just going to post this as a "to be mulled over", but in the process of writing this I think I may have actually come up with a solution:
As long as we never queue duplicates, nor statements that haven't been registered on the given connection, then we're quaranteed the release queue will never exceed the number of statements actually registered. So if the queue is resized upon registration of prepared statements, then it should never need to allocate when queueing for release. Thus, marking it @nogc should be possible, thereby enabling safe, reliable automatic cleanup.
Also, switching the array to an AA should prevent O(n) complexity when checking the queue for duplicates.(no good: "indexing an associative array...may cause GC allocation")Hooray for pausing to take a step back.
Have I missed anything?
The text was updated successfully, but these errors were encountered: