Issue 275: Thow exception on infinite loops #425

WilliamParker · 2019-03-11T23:47:53Z

…s not addressed yet.

This is still a fairly rough draft at this point. Please don't merge it; I expect the durability tests to fail. I wanted to give some visibility into my thinking here though.

Non-exhaustive lists of items for consideration:

The value of the :max-cycles setting is just a number, which means we could serialize it along with the session in durability. Should we? Pro: it is one less thing for a user to think about upon deserialization. Con: The Fressian implementation doesn't serialize other options that can't be reliably serialized. Also this option doesn't impact the semantics of the ruleset, while other options do, so it arguably is different in kind.
What should the default value be?

…not addressed yet.

… thrown on infinite loops

WilliamParker · 2019-03-18T22:51:24Z

After some reflection I think the :max-cycles setting actually makes more sense as an option to fire-rules. We already added options to fire-rules for issue 249. The maximum cycles that are plausible is obviously partly dependent on the rules in question, but the specific input data and environment where the rules are run (i.e. prod, dev, or anything that impacts the cost of a false positive versus a false negative in detection of infinite cycles) are also important. If users want to have a session always have the same max-cycles option for all fire-rules calls it should be readily possible for them to wrap the fire-rules function to always have the option given.

WilliamParker · 2019-03-18T22:52:22Z

As usual I'll want to take a second look with fresh eyes before merging, but this can be reviewed as a candidate PR for merge now. @mrrodriguez @EthanEChristian

mrrodriguez

I've only done a fast preliminary review. I am posting since I had a few questions right away.

mrrodriguez · 2019-03-18T23:52:07Z

src/main/clojure/clara/rules/engine.cljc

+                                  []
+                                  (l/get-children p-listener)))
+                              (catch #?(:clj Exception :cljs :default)
+                                  listener-exception


Indentation makes this confusing.

How would you indent it? This looks fairly normal to me but open to suggestions.

(catch #?(:clj Exception :cljs :default) listener-exception listener-exception)

is probably how i would write it, but that just because i am used to seeing:

(try <some logic> (catch Exception ex <do something>))

though the reader conditional makes this a bit different.

I think it was weird to see the first listener-exception on the next line. It just makes it blend with the body.
I don't consider it to be a critical point though if others thing it is fine how it is. I know reader-conditionals can make things get a bit weird.

mrrodriguez · 2019-03-18T23:53:11Z

src/main/clojure/clara/rules/engine.cljc

@@ -1817,13 +1835,13 @@
                  (when (some-> node :production :props :no-loop)
                    (flush-updates *current-session*)))))

-            (recur (mem/next-activation-group transient-memory) next-group)))
+            (recur (mem/next-activation-group transient-memory) next-group flushed-updates-count)))


Why no inc this time?'
Should it be set back to 0?

My reasoning here was that this path is reached every time a :no-loop rule is fired, while the other paths where I incremented the count are only reached upon the end of an activation group. I expect activation group switches to be much less numerous than individual rules firing, and I thought the risk of throwing in ultimately terminating cases here to be too high.

Ok, that explanation makes sense. However, you may want to note that in a comment above this since I think it could be hard to understand looking at it later.

Is this easier to understand now @mrrodriguez ? 13ef0df

mrrodriguez · 2019-03-18T23:54:35Z

src/main/clojure/clara/rules/engine.cljc

@@ -1867,7 +1885,8 @@
  (fire-rules [session opts]

    (let [transient-memory (mem/to-transient memory)
-          transient-listener (l/to-transient listener)]
+          transient-listener (l/to-transient listener)
+          max-cycles (get opts :max-cycles 600000)]


I'm not sure on the default chosen here. It could be breaking.

I haven't thought it completely through, but is the counting no specifically counting cycles of rules happening, but instead just steps happening during fire-rules? Perhaps that is not granular enough.

The count is intended to be incremented each time an activation group is fully fired. I'm expecting that to be much smaller than the number of times a rule is fired, and so safer in terms of avoiding false positives and incorrectly throwing.

I'll have to post more details later on how I chose the count and maybe change it - I measured the count for 30 seconds of a simple case on my device machine. Open to discussion on what the count should be.

I think I agree with Mike here:

I'm not sure on the default chosen here. It could be breaking.

Given the fabricated set of rules

(ns clara.loop-detection (:require [clojure.test :refer :all] [clara.rules :as r] [clara.rules.accumulators :as acc])) (defrecord Total [x]) (defrecord Subs [y]) (defrecord Sum [z]) (r/defrule sum-rule [Total (= ?total x)] [?s <- (acc/sum :y) :from [Subs]] [:test (> ?total ?s)] => (r/insert! (->Sum (- ?total ?s)))) (r/defrule subs-rule [Sum] => (r/insert-unconditional! (->Subs 1))) (r/defquery sum-query [] [?s <- (acc/count) :from [Subs]]) (-> (r/mk-session) (r/insert (->Total 600000)) (r/fire-rules) (r/query sum-query))

This will break due to loop detection, when in fact it is a finite execution. While I won't expect to find this exact pattern in rules that we(cerner) have, I wouldn't be surprised if someone was doing something similar or using unconditional inserts to break loops.

On one hand, i would think that the addition of this would maintain the previous functionality of the rules engine by default(loop forever, or till done), but on the other I understand that someone just starting clara might not be aware of the functionality thus miss out on the detection option.

Perhaps a large default is correct, but also allowing a sentinel value to disable loop detection to maintain passivity.

My reasoning here was that it just didn't seem likely to hit it, but I see your points that it is always going to be possible and there is enough usage of Clara for odd use cases like this to exist.

How would we feel about something like the following?

Having the limit be configurable, as is. We could revisit the limit as well.

Making the behaviour on reaching the limit configurable, with the options being to throw an exception, log a warning, or do nothing, with the log option being the default. If a warning just went to standard out that would probably take care of most of the confusion on the part of users who misunderstand looping behaviour in Clara, while being less likely to cause problems for users than simply throwing an exception. That said, the problems wouldn't go down to zero since if it bubbled up to the end-user (not a developer) in an app of some kind it could be confusing, as well as if Clara was in an app that had specific requirements around printing to standard out.

In the future possibly adding a comprehensive "development mode" that would do things like enable throwing here, enable tracing, etc. This would be an easier thing to explain in the docs than lots of different options requiring an understanding of Clara to make good choices.

If we go down the logging path, tools.logging might be an option. I'm not delighted about adding dependencies, but it looks pretty lightweight and has no dependencies itself other than Clojure. In essence it inspects the environment at runtime and uses whatever JVM logging framework is available, falling back to java.util.logging if nothing else is available. In ClojureScript I think we'd probably need to just print to the console, but I don't think the JS console is exposed to website users who aren't using developer tools so perhaps that isn't a major issue.

Sorry to take so long to respond here... I've been a bit preoccupied lately.

Thoughts @EthanEChristian @mrrodriguez ?

I think that approach seems reasonable to me, particularly:

Making the behavior on reaching the limit configurable

I feel like this allows us to have this functionality and still maintain passivity for consumers that have "odd" rules.

I don't think I have any objections to the addition of logging to clara's dependencies.

@WilliamParker I like the default to be logging, however I don't think this should require a new dependency.

I think it may actually be best to expose 2 args for this feature:

max-cycles - what you already have
max-cycles-exceeded-fn - tentative name, but it gives the consumer the ability to decide what to do

If they do not provide anything, perhaps we do a "print once" default fn. We can also provide a fn that can be used to throw the exception if that is what someone wants. These can be exposed as plain fn's or "special" keywords the argument accepts.

:max-cycles 60000 :max-cycles-exceeded-fn r/max-cycles-exceeded-print-once ;; Alternatives: ;; Special kw :clara-rules/max-cycles-exceeded-print-once ;; r/max-cycles-exceeded-throw

If no :max-cycles-exceeded-fn is given, use a default.

If we go this path, we should consider what args to this fn would be useful.

I think giving a handler to the consumer for cases like this is a common clj pattern.

@WilliamParker @EthanEChristian

Per an offline conversation with @mrrodriguez, for my use-case the addition of the dependency wouldn't be an issue. After that conversation, I think that i can agree with Mike's hesitance on clara mandating a logger on it's consumers.

Thanks for summarizing @EthanEChristian

I'll re-iterate my concern is that I'm not a particular fan of adding dependencies in general in a general-use library - unless they are providing enough value to justify. Logging in a single case doesn't seem too compelling in that area.

Also, I know that org.clojure/tools.logging is quite flexible (I use it fairly often), but it can still lead to issues when added to any given classpath. It does do some implicit logger impl searching/loading and I think it could get particularly awkward if someone was using a logger framework that doesn't fit into org.clojure/tools.logging's pattern at all. It may end up logging to it's own logger instead of their chosen one/may be annoying to try to silence or integrate with.

OK it sounds like there is a consensus for a default behaviour of printing to standard out then. I'll proceed with that.

mrrodriguez · 2019-03-18T23:55:33Z

src/main/clojure/clara/tools/testing_utils.cljc

+                        []
+                        t)]
+       (into []
+             (comp (map ex-data)


I think you can just use (keep ex-data) here and remove the comp and filter line.

I wanted to filter out empty but non-nil exception data too, but I agree that using keep can simplify this a bit.

mrrodriguez · 2019-04-22T15:05:52Z

src/main/clojure/clara/rules.cljc

@@ -35,6 +35,12 @@

   This take an additional map of options as a second argument.  Current options:

+   :max-cycles (positive integer): Each time Clara determines that there are no more insertions process in a given activation group (rules with the same salience)
+    it records that one cycle of processing rules has been performed.  When the max-cycles limit is reached Clara throws an exception indicating that the rules
+    are in an infinite loop.  This exception can be regarded as analogous to a StackOverflowError.  The default is 600000, which is a level at which it is unlikely that


It really is indicating that the rules may be in an infinite loop. I think that it'd be better stated that way.

mrrodriguez · 2019-04-22T15:08:27Z

src/main/clojure/clara/rules/engine.cljc

+                                  []
+                                  (l/get-children p-listener)))
+                              (catch #?(:clj Exception :cljs :default)
+                                  listener-exception


I think it was weird to see the first listener-exception on the next line. It just makes it blend with the body.
I don't consider it to be a critical point though if others thing it is fine how it is. I know reader-conditionals can make things get a bit weird.

mrrodriguez · 2019-04-22T15:09:15Z

src/main/clojure/clara/rules/engine.cljc

@@ -1817,13 +1835,13 @@
                  (when (some-> node :production :props :no-loop)
                    (flush-updates *current-session*)))))

-            (recur (mem/next-activation-group transient-memory) next-group)))
+            (recur (mem/next-activation-group transient-memory) next-group flushed-updates-count)))


Ok, that explanation makes sense. However, you may want to note that in a comment above this since I think it could be hard to understand looking at it later.

mrrodriguez · 2019-04-22T15:20:15Z

src/main/clojure/clara/rules/engine.cljc

@@ -1867,7 +1885,8 @@
  (fire-rules [session opts]

    (let [transient-memory (mem/to-transient memory)
-          transient-listener (l/to-transient listener)]
+          transient-listener (l/to-transient listener)
+          max-cycles (get opts :max-cycles 600000)]


@WilliamParker I like the default to be logging, however I don't think this should require a new dependency.

I think it may actually be best to expose 2 args for this feature:

max-cycles - what you already have
max-cycles-exceeded-fn - tentative name, but it gives the consumer the ability to decide what to do

If they do not provide anything, perhaps we do a "print once" default fn. We can also provide a fn that can be used to throw the exception if that is what someone wants. These can be exposed as plain fn's or "special" keywords the argument accepts.

:max-cycles 60000 :max-cycles-exceeded-fn r/max-cycles-exceeded-print-once ;; Alternatives: ;; Special kw :clara-rules/max-cycles-exceeded-print-once ;; r/max-cycles-exceeded-throw

If no :max-cycles-exceeded-fn is given, use a default.

If we go this path, we should consider what args to this fn would be useful.

I think giving a handler to the consumer for cases like this is a common clj pattern.

@WilliamParker @EthanEChristian

…ual rule activations

WilliamParker · 2019-07-05T00:33:24Z

Travis seems to be confused because I pushed the branch to cerner/clara-rules first, realized my mistake, then pushed it to my fork instead, but failed to build the branch that I deleted. A build on the PR did succeed though.

WilliamParker · 2020-06-29T22:11:08Z

This has been superseded by a listener-based approach in #447

Closing without merge.

Issue 275: First draft of exception on infinite loops. Durability is …

24e17a3

…not addressed yet.

WilliamParker requested review from mrrodriguez and EthanEChristian March 12, 2019 00:00

WilliamParker added 2 commits March 18, 2019 21:10

Make max-cycles a fire-rules option and add listeners to the data map…

9d9e9cd

… thrown on infinite loops

Add more tests, including tests where the rules have salience

d5ade2b

WilliamParker changed the title ~~Issue 275: First draft of exception on infinite loops. Durability i…~~ Issue 275: Thow exception on infinite loops Mar 18, 2019

mrrodriguez reviewed Mar 18, 2019

View reviewed changes

mrrodriguez reviewed Apr 22, 2019

View reviewed changes

Clarify that the counting is of activation group changes, not individ…

13ef0df

…ual rule activations

mrrodriguez assigned WilliamParker Oct 24, 2019

mrrodriguez added enhancement error-reporting labels Oct 24, 2019

WilliamParker force-pushed the issue275 branch from aca13ad to 13ef0df Compare October 26, 2019 13:27

WilliamParker mentioned this pull request Jan 13, 2020

Issue 275: Alternative approach using a listener #447

Closed

mrrodriguez mentioned this pull request Jan 23, 2020

warning the user at compile time about the unused bindings #448

Open

WilliamParker closed this Jun 29, 2020

WilliamParker mentioned this pull request Aug 24, 2020

Issue 275: Max-cycles listener using a listener-based approach #458

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Issue 275: Thow exception on infinite loops #425

Issue 275: Thow exception on infinite loops #425

WilliamParker commented Mar 11, 2019

WilliamParker commented Mar 18, 2019

WilliamParker commented Mar 18, 2019

mrrodriguez left a comment

mrrodriguez Mar 18, 2019

WilliamParker Mar 20, 2019

EthanEChristian Mar 21, 2019

mrrodriguez Apr 22, 2019

mrrodriguez Mar 18, 2019

WilliamParker Mar 20, 2019

mrrodriguez Apr 22, 2019

WilliamParker Jul 5, 2019

mrrodriguez Mar 18, 2019

WilliamParker Mar 20, 2019

WilliamParker Mar 20, 2019

EthanEChristian Mar 21, 2019

WilliamParker Apr 3, 2019 •

edited

Loading

EthanEChristian Apr 22, 2019

mrrodriguez Apr 22, 2019

EthanEChristian Apr 22, 2019

mrrodriguez Apr 22, 2019

WilliamParker Nov 5, 2019

mrrodriguez Mar 18, 2019

WilliamParker Mar 20, 2019

mrrodriguez Apr 22, 2019

mrrodriguez Apr 22, 2019

mrrodriguez Apr 22, 2019

mrrodriguez Apr 22, 2019

WilliamParker commented Jul 5, 2019

WilliamParker commented Jun 29, 2020

Issue 275: Thow exception on infinite loops #425

Issue 275: Thow exception on infinite loops #425

Conversation

WilliamParker commented Mar 11, 2019

WilliamParker commented Mar 18, 2019

WilliamParker commented Mar 18, 2019

mrrodriguez left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

WilliamParker Apr 3, 2019 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

WilliamParker commented Jul 5, 2019

WilliamParker commented Jun 29, 2020

WilliamParker Apr 3, 2019 •

edited

Loading