-
Notifications
You must be signed in to change notification settings - Fork 115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue 268: Cache insertions and retractions until the rules are fired… #269
Conversation
@@ -59,10 +59,15 @@ | |||
session (-> empty-session | |||
(insert (->Temperature 30 "MCI")) | |||
(insert (->Temperature 10 "MCI")) | |||
(insert (->Temperature 80 "MCI"))) | |||
(insert (->Temperature 80 "MCI")) | |||
fire-rules) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a lot of tests to update to add a fire-rules call
do we have concerns around how this would impact consumers necessitating a call to fire rules?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Short version: I do have concerns, but the current behavior isn't reliable anyway as discussed in #268 (comment) specifically the first bullet, isn't a public contract, and simply changing it seems like the lesser of evils to me. Making using this sort of caching or non-caching a user-facing option would be one possibility, but I hesitate to permanently add the complication of another user-facing option solely for passivity when the original behavior was unreliable anyway, wasn't a part of the API contract, and the uplift for anyone who runs into the issue should be pretty trivial. That said, it is possible that others think differently/see other concerns which is a major reason why I solicited input on the parent issue #268.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, the behavior has never been reliable. I don't like adding more options and tricky semantics for things like this either really. If it is any consolation, prior to Drools 6, the entire engine was very eager in computing LHS logic and accumulators. When Drools 6 released, it went nearly the complete opposite direction and it was necessary to fire rules to observe almost any behavior.
Drools has a large user-base too. I didn't really observe too much drama around the change either. That's from my own experiences going through that upgrade too.
Firing rules is already the documented pattern in all of the examples for Clara (also was for Drools).
I don't like making "breaking" changes, but this is really undocumented and unreliable behavior prior to these changes. If anything, the documentation should make it clearly stated that prior to firing rules, the state of working memory is undefined. The engine reserves this as an implementation detail to facilitate better runtime optimizations.
|
||
min-retracted (retract session (->Temperature 10 "MCI")) | ||
max-retracted (retract session (->Temperature 80 "MCI"))] | ||
min-retracted (-> session |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there are no assertions in this test on this
assuming we'd want an assertion on min-retracted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are correct, looks like we don't have an assertion on that currently at https://github.com/cerner/clara-rules/blob/0.13.0/src/test/clojure/clara/test_accumulators.clj#L64 That should be fixed (it can be with this or another quick commit if we don't make these changes)
@@ -5187,3 +5260,30 @@ | |||
(fire-rules) | |||
(query temperature-query))))) | |||
|
|||
(deftest test-stored-insertion-retraction-ordering |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we add a more robust somewhere for these? testing multiple insertions/retractions prior to firing rules and then repeating multiple insertions and retractions is what I'd be looking for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK.
(external-retract-loop get-alphas-fn transient-memory transport transient-listener)) | ||
(let [new-pending-operations (conj pending-operations (->PendingUpdate :insertion (if (coll? facts) | ||
facts | ||
(into [] facts))))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the purpose of putting the facts
into a vector here? It seems like it could just end up being a waste of time. Are you concerned with the mutability of the collection? I'm not sure if that is a concern that should be paid for with a perf penalty like this. If the caller is mutating the same collection of facts
they passed to the session, the expected semantics would be mysterious anyways.
I'd say do seq
instead, but it'd still be affected by the backing collection beyond the first (32 element) chunk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are correct that I'm defending against a mutable collection here. Currently if a user passes in a mutable collection and then mutates it later it doesn't matter since Clara will insert/retract the facts in it before returning from the insert/retract call. Now that this is delayed mutation of the collection would change the behavior unless we specifically do something to be passive. Note that we only do this if (coll? facts) returns false. From the docs this should return true for any Clojure persistent collection. Basically, we'd only incur the copying costs if a user passes a non-Clojure collection; otherwise we'd just incur what amounts to an instance check. Particularly since Clara has a top-level Java API being passive to Java usage seems desirable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, my concern was from the Java side that this could end up being excessive deep-copying. Maybe it doesn't matter though.
;; but this is presumably less significant than the cost of memory transitions. | ||
;; | ||
;; We perform the insertions and retractions in the same order as they were applied to the session since | ||
;; if a fact is not in the session, retracted, and then subsequently inserted it should be in the session at |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dislike these semantics still. I think the order-dependence is brittle and it is limiting as far as engine optimizations. However, I accept you are just preserving what has been done before.
Just wanted to add a rant'y comment here. 😛
;; but this had the downside of making a pattern like "Retract facts, insert other facts, and fire the rules" | ||
;; perform at least three transitions between a persistent and transient memory. Delaying the actual execution | ||
;; of the insertions and retractions until firing the rules allows us to cut this down to a single transition | ||
;; between persistent and transient memory. There is some cost to the runtime dispatch on operation types here, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "runtime dispatch" cost could be minimized here. I don't think it is worth it right now though. It is probably very minimal in nearly all cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it would be almost nonexistent unless you were, say, inserting/retractions tons of facts one at a time which will already perform poorly anyway. It is basically just a keyword equality check. That can add up if you do enough of them; we've had cases where doing (identical? some-arg :some-keyword) rather than (= some-arg :some-keyword) in a hotspot sped things up considerably. I doubt this is a meaningful concern in this particular case though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well on the keyword equality thing, see #269 (comment)
case
would alleviate that anyways. It creates a constant time "jump table".
For clj keywords this actually ends up a clojure.lang.Util.hash(Object)
on the runtime objects being checked against the cases, followed by an identity-based check on the keywords objects for that hash if the hash matches.
So not as fast as just an identity check perhaps, but still reasonable.
I still agree it is a not terribly important thing. I just favor using case
when it is applicable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I commented below, but I have updated this to use case.
;; the end. | ||
(doseq [{op-type :type facts :facts} pending-operations] | ||
|
||
(condp = op-type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be faster as case
(case op-type
:insertion
<etc>)
case
is more restrictive than cond
or condp
, but I don't see any reason to not get the benefits out of it when it works (have compile-time constant literal dispatch values).
(doseq [[alpha-roots fact-group] (get-alphas-fn facts) | ||
root alpha-roots] | ||
(alpha-activate root fact-group transient-memory transport transient-listener)) | ||
(external-retract-loop get-alphas-fn transient-memory transport transient-listener))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any reason not to do the *pending-external-retractions*
along with the other given :retraction
s?
Perhaps it'd just be too tricky to implement for the sake of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly, along with various other changes here to group things together. However, we'd have to carefully evaluate the performance implications since you'd be altering how things propagate through the Rete network, potentially significantly. We'd also need to assess the implications on unconditional operations of such grouping. My intent was basically to preserve existing behavior for now apart from removing the memory persistent/transient transitions and any entirely new options added (such as the one discussed at #249). I don't see any current decisions with regards to the API that depend on the outcome of such optimizations and thus I'd prefer to defer them until a later time if we do want them.
+1 I'm certainly a big fan of this change. |
agreed +1 |
… to avoid unnecessary transitions between persistent and transient memory
600a85d
to
2698b6f
Compare
I have updated the pull request to
As regards #269 (comment) it looks there is some other cleanup to do in that file e.g. removing duplication. It also probably makes sense to have that be in a different commit since it is logically unrelated to this change. I thus don't think it should block this change although I'd like to address it in the near future. |
Also, I discussed this off-thread with @rbrush ; as of then he was pretty busy and hadn't had time to look at the details here but was OK with the general idea of the change and the requirement that users call fire-rules for querying to work. |
Looks good. |
… to avoid unnecessary transitions between persistent and transient memory