-
Notifications
You must be signed in to change notification settings - Fork 115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue 275: Thow exception on infinite loops #425
Changes from 3 commits
24e17a3
9d9e9cd
d5ade2b
13ef0df
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1719,7 +1719,7 @@ | |
|
||
(defn fire-rules* | ||
"Fire rules for the given nodes." | ||
[rulebase nodes transient-memory transport listener get-alphas-fn update-cache] | ||
[rulebase nodes transient-memory transport listener get-alphas-fn update-cache max-cycles] | ||
(binding [*current-session* {:rulebase rulebase | ||
:transient-memory transient-memory | ||
:transport transport | ||
|
@@ -1729,7 +1729,25 @@ | |
:listener listener}] | ||
|
||
(loop [next-group (mem/next-activation-group transient-memory) | ||
last-group nil] | ||
last-group nil | ||
flushed-updates-count 0] | ||
|
||
(when (> flushed-updates-count max-cycles) | ||
(throw (ex-info | ||
(str "It appears that the rules are in an infinite loop." | ||
" Incorrect use of truth maintenance is a frequent source of such loops; see the website." | ||
" If the rules are not in fact in a loop it is possible to increase the ceiling by use of the :max-cycles setting." | ||
" See the clara.rules/fire-rules docstring for details.") | ||
{:clara-rules/infinite-loop-suspected true | ||
:listeners (try | ||
(let [p-listener (l/to-persistent! listener)] | ||
(if (l/null-listener? p-listener) | ||
[] | ||
(l/get-children p-listener))) | ||
(catch #?(:clj Exception :cljs :default) | ||
listener-exception | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Indentation makes this confusing. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How would you indent it? This looks fairly normal to me but open to suggestions. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
is probably how i would write it, but that just because i am used to seeing:
though the reader conditional makes this a bit different. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it was weird to see the first |
||
listener-exception))}))) | ||
|
||
|
||
(if next-group | ||
|
||
|
@@ -1739,7 +1757,7 @@ | |
;; group before continuing. | ||
(do | ||
(flush-updates *current-session*) | ||
(recur (mem/next-activation-group transient-memory) next-group)) | ||
(recur (mem/next-activation-group transient-memory) next-group (inc flushed-updates-count))) | ||
|
||
(do | ||
|
||
|
@@ -1817,13 +1835,13 @@ | |
(when (some-> node :production :props :no-loop) | ||
(flush-updates *current-session*))))) | ||
|
||
(recur (mem/next-activation-group transient-memory) next-group))) | ||
(recur (mem/next-activation-group transient-memory) next-group flushed-updates-count))) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why no There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My reasoning here was that this path is reached every time a :no-loop rule is fired, while the other paths where I incremented the count are only reached upon the end of an activation group. I expect activation group switches to be much less numerous than individual rules firing, and I thought the risk of throwing in ultimately terminating cases here to be too high. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok, that explanation makes sense. However, you may want to note that in a comment above this since I think it could be hard to understand looking at it later. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is this easier to understand now @mrrodriguez ? 13ef0df |
||
|
||
;; There were no items to be activated, so flush any pending | ||
;; updates and recur with a potential new activation group | ||
;; since a flushed item may have triggered one. | ||
(when (flush-updates *current-session*) | ||
(recur (mem/next-activation-group transient-memory) next-group)))))) | ||
(recur (mem/next-activation-group transient-memory) next-group (inc flushed-updates-count))))))) | ||
|
||
(deftype LocalSession [rulebase memory transport listener get-alphas-fn pending-operations] | ||
ISession | ||
|
@@ -1867,7 +1885,8 @@ | |
(fire-rules [session opts] | ||
|
||
(let [transient-memory (mem/to-transient memory) | ||
transient-listener (l/to-transient listener)] | ||
transient-listener (l/to-transient listener) | ||
max-cycles (get opts :max-cycles 600000)] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure on the default chosen here. It could be breaking. I haven't thought it completely through, but is the counting no specifically counting cycles of rules happening, but instead just steps happening during fire-rules? Perhaps that is not granular enough. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The count is intended to be incremented each time an activation group is fully fired. I'm expecting that to be much smaller than the number of times a rule is fired, and so safer in terms of avoiding false positives and incorrectly throwing. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'll have to post more details later on how I chose the count and maybe change it - I measured the count for 30 seconds of a simple case on my device machine. Open to discussion on what the count should be. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think I agree with Mike here:
Given the fabricated set of rules
This will break due to loop detection, when in fact it is a finite execution. While I won't expect to find this exact pattern in rules that we(cerner) have, I wouldn't be surprised if someone was doing something similar or using unconditional inserts to break loops. On one hand, i would think that the addition of this would maintain the previous functionality of the rules engine by default(loop forever, or till done), but on the other I understand that someone just starting clara might not be aware of the functionality thus miss out on the detection option. Perhaps a large default is correct, but also allowing a sentinel value to disable loop detection to maintain passivity. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My reasoning here was that it just didn't seem likely to hit it, but I see your points that it is always going to be possible and there is enough usage of Clara for odd use cases like this to exist. How would we feel about something like the following?
If we go down the logging path, tools.logging might be an option. I'm not delighted about adding dependencies, but it looks pretty lightweight and has no dependencies itself other than Clojure. In essence it inspects the environment at runtime and uses whatever JVM logging framework is available, falling back to java.util.logging if nothing else is available. In ClojureScript I think we'd probably need to just print to the console, but I don't think the JS console is exposed to website users who aren't using developer tools so perhaps that isn't a major issue. Sorry to take so long to respond here... I've been a bit preoccupied lately. Thoughts @EthanEChristian @mrrodriguez ? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think that approach seems reasonable to me, particularly:
I feel like this allows us to have this functionality and still maintain passivity for consumers that have "odd" rules. I don't think I have any objections to the addition of logging to clara's dependencies. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @WilliamParker I like the default to be logging, however I don't think this should require a new dependency. I think it may actually be best to expose 2 args for this feature:
If they do not provide anything, perhaps we do a "print once" default fn. We can also provide a fn that can be used to throw the exception if that is what someone wants. These can be exposed as plain fn's or "special" keywords the argument accepts. :max-cycles 60000
:max-cycles-exceeded-fn r/max-cycles-exceeded-print-once
;; Alternatives:
;; Special kw :clara-rules/max-cycles-exceeded-print-once
;; r/max-cycles-exceeded-throw If no If we go this path, we should consider what args to this fn would be useful. I think giving a handler to the consumer for cases like this is a common clj pattern. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Per an offline conversation with @mrrodriguez, for my use-case the addition of the dependency wouldn't be an issue. After that conversation, I think that i can agree with Mike's hesitance on clara mandating a logger on it's consumers. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks for summarizing @EthanEChristian I'll re-iterate my concern is that I'm not a particular fan of adding dependencies in general in a general-use library - unless they are providing enough value to justify. Logging in a single case doesn't seem too compelling in that area. Also, I know that There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK it sounds like there is a consensus for a default behaviour of printing to standard out then. I'll proceed with that. |
||
|
||
(if-not (:cancelling opts) | ||
;; We originally performed insertions and retractions immediately after the insert and retract calls, | ||
|
@@ -1912,7 +1931,8 @@ | |
transport | ||
transient-listener | ||
get-alphas-fn | ||
(uc/get-ordered-update-cache))) | ||
(uc/get-ordered-update-cache) | ||
max-cycles)) | ||
|
||
#?(:cljs (throw (ex-info "The :cancelling option is not supported in ClojureScript" | ||
{:session session :opts opts})) | ||
|
@@ -1961,7 +1981,8 @@ | |
get-alphas-fn | ||
;; This continues to use the cancelling cache after the first batch of insertions and retractions. | ||
;; If this is suboptimal for some workflows we can revisit this. | ||
update-cache))))) | ||
update-cache | ||
max-cycles))))) | ||
|
||
(LocalSession. rulebase | ||
(mem/to-persistent! transient-memory) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -152,3 +152,20 @@ | |
(str "Actual mean value: " mean)) | ||
{:mean mean | ||
:std std})) | ||
|
||
#?(:clj | ||
(defn ex-data-maps | ||
"Given a Throwable, return in order the ExceptionInfo data maps for all items in the | ||
chain that implement IExceptionInfo and have nonempty data maps." | ||
[t] | ||
(let [throwables ((fn append-self | ||
[prior t1] | ||
(if t1 | ||
(append-self (conj prior t1) (.getCause ^Throwable t1)) | ||
prior)) | ||
[] | ||
t)] | ||
(into [] | ||
(comp (map ex-data) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think you can just use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I wanted to filter out empty but non-nil exception data too, but I agree that using keep can simplify this a bit. |
||
(filter not-empty)) | ||
throwables)))) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,173 @@ | ||
(ns clara.test-infinite-loops | ||
(:require [clojure.test :refer :all] | ||
[clara.rules :refer :all] | ||
[clara.rules.testfacts :refer [->Cold ->Hot ->First ->Second]] | ||
[clara.tools.testing-utils :refer [def-rules-test | ||
ex-data-maps | ||
side-effect-holder-fixture | ||
side-effect-holder]] | ||
[clara.test-rules :refer [assert-ex-data]] | ||
[clara.tools.tracing :as tr] | ||
[clara.rules.accumulators :as acc]) | ||
(:import [clara.rules.testfacts Cold Hot First Second] | ||
[clara.tools.tracing | ||
PersistentTracingListener])) | ||
|
||
(use-fixtures :each side-effect-holder-fixture) | ||
|
||
(def-rules-test test-truth-maintenance-loop | ||
|
||
;; Test of an infinite loop to an endless cycle caused by truth maintenance, | ||
;; that is an insertion that causes its support to be retracted. | ||
|
||
{:rules [hot-rule [[[:not [Hot]]] | ||
(insert! (->Cold nil))] | ||
|
||
cold-rule [[[Cold]] | ||
(insert! (->Hot nil))]] | ||
|
||
:sessions [empty-session [hot-rule cold-rule] {}]} | ||
|
||
(assert-ex-data {:clara-rules/infinite-loop-suspected true} (fire-rules empty-session {:max-cycles 3000}))) | ||
|
||
(def-rules-test test-truth-maintenance-loop-with-salience | ||
|
||
;; Test of an infinite loop to an endless cycle caused by truth maintenance, | ||
;; that is an insertion that causes its support to be retracted, when the | ||
;; two rules that form the loop have salience and are thus parts of different | ||
;; activation groups. | ||
|
||
|
||
{:rules [hot-rule [[[:not [Hot]]] | ||
(insert! (->Cold nil)) | ||
{:salience 1}] | ||
|
||
cold-rule [[[Cold]] | ||
(insert! (->Hot nil)) | ||
{:salience 2}]] | ||
|
||
:sessions [empty-session [hot-rule cold-rule] {} | ||
empty-session-negated-salience [hot-rule cold-rule] {:activation-group-fn (comp - :salience :props)}]} | ||
|
||
(doseq [session [empty-session empty-session-negated-salience]] | ||
;; Validate that the results are the same in either rule ordering. | ||
(assert-ex-data {:clara-rules/infinite-loop-suspected true} (fire-rules session {:max-cycles 3000})))) | ||
|
||
|
||
(def-rules-test test-recursive-insertion | ||
|
||
;; Test of an infinite loop due to runaway insertions without retractions. | ||
|
||
{:rules [cold-rule [[[Cold]] | ||
(insert! (->Cold "ORD"))]] | ||
|
||
:sessions [empty-session [cold-rule] {}]} | ||
|
||
(assert-ex-data {:clara-rules/infinite-loop-suspected true} | ||
(-> empty-session | ||
(insert (->Cold "ARN")) | ||
;; Use a small value here to ensure that we don't run out of memory before throwing. | ||
;; There is unfortunately a tradeoff where making the default number of cycles allowed | ||
;; high enough for some use cases will allow others to create OutOfMemory errors in cases | ||
;; like these but we can at least throw when there is enough memory available to hold the facts | ||
;; created in the loop. | ||
(fire-rules {:max-cycles 10})))) | ||
|
||
(def-rules-test test-recursive-insertion-loop-no-salience | ||
|
||
{:rules [first-rule [[[First]] | ||
(insert! (->Second))] | ||
|
||
second-rule [[[Second]] | ||
(insert! (->First))]] | ||
|
||
:sessions [empty-session [first-rule second-rule] {}]} | ||
|
||
(assert-ex-data {:clara-rules/infinite-loop-suspected true} | ||
(-> empty-session | ||
(insert (->First)) | ||
(fire-rules {:max-cycles 10})))) | ||
|
||
(def-rules-test test-recursive-insertion-loop-with-salience | ||
|
||
{:rules [first-rule [[[First]] | ||
(insert! (->Second)) | ||
{:salience 1}] | ||
|
||
second-rule [[[Second]] | ||
(insert! (->First)) | ||
{:salience 2}]] | ||
|
||
:sessions [empty-session [first-rule second-rule] {} | ||
empty-session-negated-salience [first-rule second-rule] {:activation-group-fn (comp - :salience :props)}]} | ||
|
||
(doseq [session [empty-session empty-session-negated-salience]] | ||
(assert-ex-data {:clara-rules/infinite-loop-suspected true} | ||
(-> session | ||
(insert (->First)) | ||
(fire-rules {:max-cycles 10}))))) | ||
|
||
(def-rules-test test-tracing-infinite-loop | ||
|
||
{:rules [cold-rule [[[Cold]] | ||
(insert! (->Cold "ORD"))]] | ||
|
||
:sessions [empty-session [cold-rule] {}]} | ||
|
||
(try | ||
(do (-> empty-session | ||
tr/with-tracing | ||
(insert (->Cold "ARN")) | ||
;; Use a small value here to ensure that we don't run out of memory before throwing. | ||
;; There is unfortunately a tradeoff where making the default number of cycles allowed | ||
;; high enough for some use cases will allow others to create OutOfMemory errors in cases | ||
;; like these but we can at least throw when there is enough memory available to hold the facts | ||
;; created in the loop. | ||
(fire-rules {:max-cycles 10})) | ||
(is false "The infinite loop did not throw an exception.")) | ||
(catch Exception e | ||
(let [data-maps (ex-data-maps e) | ||
loop-data (filter :clara-rules/infinite-loop-suspected data-maps)] | ||
(is (= (count loop-data) | ||
1) | ||
"There should only be one exception in the chain from infinite rules loops.") | ||
(is (= (-> loop-data first :listeners count) 1) | ||
"There should only be one listener.") | ||
(is (-> loop-data first :listeners ^PersistentTracingListener first .-trace not-empty) | ||
"There should be tracing data available when a traced session throws an exception on an infinite loop."))))) | ||
|
||
(def-rules-test test-max-cycles-respected | ||
|
||
;; As the name suggests, this is a test to validate that setting the max-cycles to different | ||
;; values actually works. The others tests are focused on validating that infinite loops | ||
;; throw exceptions, and the max-cycles values there are simply set to restrict resource usage, | ||
;; whether CPU or memory, by the test. Here, on the other hand, we construct a rule where we control | ||
;; how many rule cycles will be executed, and then validate that an exception is thrown when that number is | ||
;; greater than the max-cycles and is not when it isn't. For good measure, we validate that the non-exception-throwing | ||
;; case has queryable output to make sure that the logic was actually run, not ommitted because of a typo or similar. | ||
|
||
{:rules [recursive-rule-with-end [[[First]] | ||
(when (< @side-effect-holder 20) | ||
(do | ||
(insert-unconditional! (->First)) | ||
(swap! side-effect-holder inc)))]] | ||
|
||
:queries [first-query [[] [[?f <- First]]]] | ||
|
||
:sessions [empty-session [recursive-rule-with-end first-query] {}]} | ||
|
||
|
||
(reset! side-effect-holder 0) | ||
|
||
(assert-ex-data {:clara-rules/infinite-loop-suspected true} | ||
(-> empty-session | ||
(insert (->First)) | ||
(fire-rules {:max-cycles 10}))) | ||
|
||
(reset! side-effect-holder 0) | ||
|
||
(is (= (count (-> empty-session | ||
(insert (->First)) | ||
(fire-rules {:max-cycles 30}) | ||
(query first-query))) | ||
21))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It really is indicating that the rules may be in an infinite loop. I think that it'd be better stated that way.