-
Notifications
You must be signed in to change notification settings - Fork 115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue 275: Thow exception on infinite loops #425
Conversation
…not addressed yet.
After some reflection I think the :max-cycles setting actually makes more sense as an option to fire-rules. We already added options to fire-rules for issue 249. The maximum cycles that are plausible is obviously partly dependent on the rules in question, but the specific input data and environment where the rules are run (i.e. prod, dev, or anything that impacts the cost of a false positive versus a false negative in detection of infinite cycles) are also important. If users want to have a session always have the same max-cycles option for all fire-rules calls it should be readily possible for them to wrap the fire-rules function to always have the option given. |
As usual I'll want to take a second look with fresh eyes before merging, but this can be reviewed as a candidate PR for merge now. @mrrodriguez @EthanEChristian |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've only done a fast preliminary review. I am posting since I had a few questions right away.
[] | ||
(l/get-children p-listener))) | ||
(catch #?(:clj Exception :cljs :default) | ||
listener-exception |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indentation makes this confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would you indent it? This looks fairly normal to me but open to suggestions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(catch #?(:clj Exception :cljs :default) listener-exception
listener-exception)
is probably how i would write it, but that just because i am used to seeing:
(try
<some logic>
(catch Exception ex
<do something>))
though the reader conditional makes this a bit different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it was weird to see the first listener-exception
on the next line. It just makes it blend with the body.
I don't consider it to be a critical point though if others thing it is fine how it is. I know reader-conditionals can make things get a bit weird.
@@ -1817,13 +1835,13 @@ | |||
(when (some-> node :production :props :no-loop) | |||
(flush-updates *current-session*))))) | |||
|
|||
(recur (mem/next-activation-group transient-memory) next-group))) | |||
(recur (mem/next-activation-group transient-memory) next-group flushed-updates-count))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why no inc
this time?'
Should it be set back to 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My reasoning here was that this path is reached every time a :no-loop rule is fired, while the other paths where I incremented the count are only reached upon the end of an activation group. I expect activation group switches to be much less numerous than individual rules firing, and I thought the risk of throwing in ultimately terminating cases here to be too high.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, that explanation makes sense. However, you may want to note that in a comment above this since I think it could be hard to understand looking at it later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this easier to understand now @mrrodriguez ? 13ef0df
@@ -1867,7 +1885,8 @@ | |||
(fire-rules [session opts] | |||
|
|||
(let [transient-memory (mem/to-transient memory) | |||
transient-listener (l/to-transient listener)] | |||
transient-listener (l/to-transient listener) | |||
max-cycles (get opts :max-cycles 600000)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure on the default chosen here. It could be breaking.
I haven't thought it completely through, but is the counting no specifically counting cycles of rules happening, but instead just steps happening during fire-rules? Perhaps that is not granular enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The count is intended to be incremented each time an activation group is fully fired. I'm expecting that to be much smaller than the number of times a rule is fired, and so safer in terms of avoiding false positives and incorrectly throwing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll have to post more details later on how I chose the count and maybe change it - I measured the count for 30 seconds of a simple case on my device machine. Open to discussion on what the count should be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I agree with Mike here:
I'm not sure on the default chosen here. It could be breaking.
Given the fabricated set of rules
(ns clara.loop-detection
(:require [clojure.test :refer :all]
[clara.rules :as r]
[clara.rules.accumulators :as acc]))
(defrecord Total [x])
(defrecord Subs [y])
(defrecord Sum [z])
(r/defrule sum-rule
[Total (= ?total x)]
[?s <- (acc/sum :y) :from [Subs]]
[:test (> ?total ?s)]
=>
(r/insert! (->Sum (- ?total ?s))))
(r/defrule subs-rule
[Sum]
=>
(r/insert-unconditional! (->Subs 1)))
(r/defquery sum-query
[]
[?s <- (acc/count) :from [Subs]])
(-> (r/mk-session)
(r/insert (->Total 600000))
(r/fire-rules)
(r/query sum-query))
This will break due to loop detection, when in fact it is a finite execution. While I won't expect to find this exact pattern in rules that we(cerner) have, I wouldn't be surprised if someone was doing something similar or using unconditional inserts to break loops.
On one hand, i would think that the addition of this would maintain the previous functionality of the rules engine by default(loop forever, or till done), but on the other I understand that someone just starting clara might not be aware of the functionality thus miss out on the detection option.
Perhaps a large default is correct, but also allowing a sentinel value to disable loop detection to maintain passivity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My reasoning here was that it just didn't seem likely to hit it, but I see your points that it is always going to be possible and there is enough usage of Clara for odd use cases like this to exist.
How would we feel about something like the following?
- Having the limit be configurable, as is. We could revisit the limit as well.
- Making the behaviour on reaching the limit configurable, with the options being to throw an exception, log a warning, or do nothing, with the log option being the default. If a warning just went to standard out that would probably take care of most of the confusion on the part of users who misunderstand looping behaviour in Clara, while being less likely to cause problems for users than simply throwing an exception. That said, the problems wouldn't go down to zero since if it bubbled up to the end-user (not a developer) in an app of some kind it could be confusing, as well as if Clara was in an app that had specific requirements around printing to standard out.
- In the future possibly adding a comprehensive "development mode" that would do things like enable throwing here, enable tracing, etc. This would be an easier thing to explain in the docs than lots of different options requiring an understanding of Clara to make good choices.
If we go down the logging path, tools.logging might be an option. I'm not delighted about adding dependencies, but it looks pretty lightweight and has no dependencies itself other than Clojure. In essence it inspects the environment at runtime and uses whatever JVM logging framework is available, falling back to java.util.logging if nothing else is available. In ClojureScript I think we'd probably need to just print to the console, but I don't think the JS console is exposed to website users who aren't using developer tools so perhaps that isn't a major issue.
Sorry to take so long to respond here... I've been a bit preoccupied lately.
Thoughts @EthanEChristian @mrrodriguez ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that approach seems reasonable to me, particularly:
Making the behavior on reaching the limit configurable
I feel like this allows us to have this functionality and still maintain passivity for consumers that have "odd" rules.
I don't think I have any objections to the addition of logging to clara's dependencies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@WilliamParker I like the default to be logging, however I don't think this should require a new dependency.
I think it may actually be best to expose 2 args for this feature:
max-cycles
- what you already have
max-cycles-exceeded-fn
- tentative name, but it gives the consumer the ability to decide what to do
If they do not provide anything, perhaps we do a "print once" default fn. We can also provide a fn that can be used to throw the exception if that is what someone wants. These can be exposed as plain fn's or "special" keywords the argument accepts.
:max-cycles 60000
:max-cycles-exceeded-fn r/max-cycles-exceeded-print-once
;; Alternatives:
;; Special kw :clara-rules/max-cycles-exceeded-print-once
;; r/max-cycles-exceeded-throw
If no :max-cycles-exceeded-fn
is given, use a default.
If we go this path, we should consider what args to this fn would be useful.
I think giving a handler to the consumer for cases like this is a common clj pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per an offline conversation with @mrrodriguez, for my use-case the addition of the dependency wouldn't be an issue. After that conversation, I think that i can agree with Mike's hesitance on clara mandating a logger on it's consumers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for summarizing @EthanEChristian
I'll re-iterate my concern is that I'm not a particular fan of adding dependencies in general in a general-use library - unless they are providing enough value to justify. Logging in a single case doesn't seem too compelling in that area.
Also, I know that org.clojure/tools.logging
is quite flexible (I use it fairly often), but it can still lead to issues when added to any given classpath. It does do some implicit logger impl searching/loading and I think it could get particularly awkward if someone was using a logger framework that doesn't fit into org.clojure/tools.logging
's pattern at all. It may end up logging to it's own logger instead of their chosen one/may be annoying to try to silence or integrate with.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK it sounds like there is a consensus for a default behaviour of printing to standard out then. I'll proceed with that.
[] | ||
t)] | ||
(into [] | ||
(comp (map ex-data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can just use (keep ex-data)
here and remove the comp
and filter
line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to filter out empty but non-nil exception data too, but I agree that using keep can simplify this a bit.
@@ -35,6 +35,12 @@ | |||
|
|||
This take an additional map of options as a second argument. Current options: | |||
|
|||
:max-cycles (positive integer): Each time Clara determines that there are no more insertions process in a given activation group (rules with the same salience) | |||
it records that one cycle of processing rules has been performed. When the max-cycles limit is reached Clara throws an exception indicating that the rules | |||
are in an infinite loop. This exception can be regarded as analogous to a StackOverflowError. The default is 600000, which is a level at which it is unlikely that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It really is indicating that the rules may be in an infinite loop. I think that it'd be better stated that way.
[] | ||
(l/get-children p-listener))) | ||
(catch #?(:clj Exception :cljs :default) | ||
listener-exception |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it was weird to see the first listener-exception
on the next line. It just makes it blend with the body.
I don't consider it to be a critical point though if others thing it is fine how it is. I know reader-conditionals can make things get a bit weird.
@@ -1817,13 +1835,13 @@ | |||
(when (some-> node :production :props :no-loop) | |||
(flush-updates *current-session*))))) | |||
|
|||
(recur (mem/next-activation-group transient-memory) next-group))) | |||
(recur (mem/next-activation-group transient-memory) next-group flushed-updates-count))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, that explanation makes sense. However, you may want to note that in a comment above this since I think it could be hard to understand looking at it later.
@@ -1867,7 +1885,8 @@ | |||
(fire-rules [session opts] | |||
|
|||
(let [transient-memory (mem/to-transient memory) | |||
transient-listener (l/to-transient listener)] | |||
transient-listener (l/to-transient listener) | |||
max-cycles (get opts :max-cycles 600000)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@WilliamParker I like the default to be logging, however I don't think this should require a new dependency.
I think it may actually be best to expose 2 args for this feature:
max-cycles
- what you already have
max-cycles-exceeded-fn
- tentative name, but it gives the consumer the ability to decide what to do
If they do not provide anything, perhaps we do a "print once" default fn. We can also provide a fn that can be used to throw the exception if that is what someone wants. These can be exposed as plain fn's or "special" keywords the argument accepts.
:max-cycles 60000
:max-cycles-exceeded-fn r/max-cycles-exceeded-print-once
;; Alternatives:
;; Special kw :clara-rules/max-cycles-exceeded-print-once
;; r/max-cycles-exceeded-throw
If no :max-cycles-exceeded-fn
is given, use a default.
If we go this path, we should consider what args to this fn would be useful.
I think giving a handler to the consumer for cases like this is a common clj pattern.
…ual rule activations
Travis seems to be confused because I pushed the branch to cerner/clara-rules first, realized my mistake, then pushed it to my fork instead, but failed to build the branch that I deleted. A build on the PR did succeed though. |
aca13ad
to
13ef0df
Compare
This has been superseded by a listener-based approach in #447 Closing without merge. |
…s not addressed yet.
This is still a fairly rough draft at this point. Please don't merge it; I expect the durability tests to fail. I wanted to give some visibility into my thinking here though.
Non-exhaustive lists of items for consideration: