Rules [re]run before session updates #469
Comments
@polvoblanco,
It looks like you are inserting duplicates. I don't believe that Clara makes any provisions to deduplicate similar facts being inserted, as duplicate facts might be intended by consumers.
Hi, thanks for the quick response. The data that is passed into Clara will contain duplicates. In the above example they are obviously redundant, but that example was created purely to show the problem; in the real system they are taken from much larger records where other parts of the data are distinct. The issue I'm having is that the value is inserted unconditionally into ProcData, and that is then used to prevent the rule either from firing (in the first example) or from re-inserting (although the RHS will fire) in the second example. The second example, I feel, is more telling, as it shows that when the rule is run for the second, third, fourth... time, the records have not been updated from the previous runs. Thanks again, Paul
There is no attempt by the rules engine to determine what a "duplicate" is, or whether that does or doesn't make sense for your domain. I also, am generally not a fan of using That said.
A common pattern to "remove duplicates" is to accept that some rule matches will result in duplicates, possibly from the input data. Have the rule that creates the possible duplicates insert an "intermediate type fact", which is then accumulated by a second rule. The accumulation rule can then choose how to take the possible duplicates and "narrow them down" to a single, duplicate-free fact that other rules can use. This essentially means that the rule writer decides what a duplicate means and how to "de-duplicate" it if they have that need (e.g. the "cardinality of matches" isn't important for their rule processing).
Here is a rough example (note I haven't run it, so sorry if there are small issues):
;; Assumes the namespace refers clara.rules and requires clara.rules.accumulators as acc.
(defrecord RawData [value])
(defrecord ProcData [value])

;; "intermediate fact" that may have "duplicates"
(defrecord ProcDataMatch [value])

(defrule proc-data-match
  [RawData (= ?value value)]
  ;; Some criteria to join/enhance a `RawData` into a (possibly duplicated) `ProcDataMatch`
  =>
  (insert! (->ProcDataMatch ?value)))

(defrule proc-data
  ;; Binding `?value` groups the accumulation, so this fires once per distinct value.
  [?proc-matches <- (acc/all) :from [ProcDataMatch (= ?value value)]]
  =>
  ;; Assuming any `ProcDataMatch` with the same `value` can be treated the same, just use `first` to
  ;; take one, then pull out its `value` so `ProcData` holds the value rather than the whole record.
  (insert! (->ProcData (:value (first ?proc-matches)))))

(comment
  (-> (mk-session 'clara-test.core)
      (insert-all (mapv #(->RawData %) ["ab"
                                        "abc"
                                        "abcd"
                                        "abc"
                                        "abcde"
                                        "abcd"
                                        "abcdef"
                                        "abc"
                                        "abcdef"
                                        "abcdef"
                                        "abcdef"]))
      (insert (->ProcData "abc"))
      (fire-rules))
  )
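To make the "accumulate, then narrow down" step concrete, here is a minimal plain-Clojure sketch of what the second rule is assumed to compute for each group of matches. It does not use Clara at all; `dedupe-matches`, `raw-values`, and `deduped` are hypothetical names introduced only for illustration, and the records are redefined so that the snippet stands on its own.

```clojure
;; Plain-Clojure sketch with no Clara dependency.
;; `dedupe-matches` is a hypothetical helper, not part of Clara or of the code above.
(defrecord ProcDataMatch [value])
(defrecord ProcData [value])

(defn dedupe-matches
  "Groups intermediate matches by :value and keeps one ProcData per group."
  [matches]
  (->> matches
       (group-by :value)                          ; one group per distinct value
       (map (fn [[_ group]]
              ;; Any match in the group is as good as another, so take the first.
              (->ProcData (:value (first group)))))
       set))

;; Input with repeated values, similar in shape to the thread's example data.
(def raw-values ["ab" "abc" "abcd" "abc" "abcde" "abcd" "abcdef"])

(def deduped (dedupe-matches (map ->ProcDataMatch raw-values)))
```

In the rule version, the grouping is done by the `?value` binding in the accumulator condition rather than by `group-by`, so the rule writer still decides what counts as "the same" fact.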
Hi, thanks @mrrodriguez, that does the trick. I guess I still have a lot more to get my head around. Cheers, Paul
Hi,
It appears that rules are run multiple times before insertions made in a previous run have had time to impact the conditionals.
Below is a simple example that shows, despite conditions that should prevent the rule from running more than once, the rules run every time:
The results I am getting are:
"SET" "ab"
"SET" "abcdef"
"SET" "abcde"
"SET" "abcd"
"SET" "abcd"
"SET" "abcdef"
"SET" "abcdef"
"SET" "abcdef"
Where I would expect to get:
"SET" "ab"
"SET" "abcdef"
"SET" "abcde"
"SET" "abcd"
I have also tried it with the rule:
And get:
"SET2" "ab"
"SET2" "abcdef"
"SET2" "abcdef"
"SET2" "abcdef"
"SET2" "abcdef"
"SET2" "abcd"
"SET2" "abcde"
"SET2" "abcd"
Again, I would not expect to see any duplicates. So, it seems that despite the RHS inserting a record, that insertion is not present by the time the rule runs again.
Any ideas?
Thanks,
Paul