Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

graph.reachable_paths does not return all reachable paths #5871

Open
ericjkao opened this issue Apr 26, 2023 · 12 comments
Open

graph.reachable_paths does not return all reachable paths #5871

ericjkao opened this issue Apr 26, 2023 · 12 comments

Comments

@ericjkao
Copy link
Contributor

Short description

In OPA 0.51.0, the graph.reachable_paths builtin function does not seem to return all reachable paths or even all maximal reachable paths in many cases.

Steps To Reproduce

  1. Evaluate the following Rego package (or use this playground link):
package play
import future.keywords.if
import future.keywords.in

testing_graph:= {
  "1": [],
  "2": [
    "1",
  ],
  "3": [
    "2",
  ],
  "4": [
    "3",
    "2"
  ],
  "5": [
    "4",
  ]
}

paths := (graph.reachable_paths(testing_graph, ["5"]))
  1. See that paths evaluates to:
[ 
  ["5", "4"],
  ["5", "4", "3", "2", "1"]
]

Expected behavior

Based on the doc and the original PR, I would expect the path ["5", "4", "2", "1"] to be included.

Additional context

Seems to work for shorter paths:
https://play.openpolicyagent.org/p/fpy0cgnYjb

@ericjkao ericjkao added the bug label Apr 26, 2023
@ericjkao
Copy link
Contributor Author

Mentioning the authors of this builtin to see if they have thoughts or comments 😀 @justinlindh-wf @fredallen-wk

@stale
Copy link

stale bot commented May 31, 2023

This issue has been automatically marked as inactive because it has not had any activity in the last 30 days. Although currently inactive, the issue could still be considered and actively worked on in the future. More details about the use-case this issue attempts to address, the value provided by completing it or possible solutions to resolve it would help to prioritize the issue.

@stale stale bot added the inactive label May 31, 2023
@charlieegan3
Copy link
Contributor

I've been having a think about this one and it's not 100% clear what the 'correct' result for reachable_paths. Is it:

A) all paths from nodes in the initial set to leaf nodes?
B) all paths of any length from the initial set to any other node with no cycles?
C) or what we have now, paths discovered while visiting all reachable nodes?

B) feels like an improvement on C)'s results, while being worse on performance. I think that we could implement B) using IDDFS if we wanted to do that.

@stale stale bot removed the inactive label Jul 10, 2023
@ericjkao
Copy link
Contributor Author

ericjkao commented Jul 10, 2023 via email

@charlieegan3
Copy link
Contributor

charlieegan3 commented Jul 10, 2023

Indeed, I'm just imagining what we might actually do here if we were to make a change.

I think the options are:

X) update docs only
Y) edit the existing function
Z) add a new function

X seems pretty likely, Y seems pretty unlikely, Z also feels pretty unlikely without a clearer use case.

@benweint
Copy link

@charlieegan3 regarding whether to just update the docs or make a behavior change / new function, I want to point out that in #6128, which was marked as a dup of this issue, there's a test case which demonstrates that the current implementation not only omits some paths, but also concocts nonsensical paths (where some path steps don't correspond to actual edges in the input graph) in some cases.

To me, this is a strong indicator that this warrants an actual behavior change for this builtin.

@anderseknert
Copy link
Member

Yeah, that should just be considered a bug and get fixed @benweint 👍 Perhaps it should be treated as a different issue and be re-opened, but OTOH it also makes sense that we agree on what the expected behavior of this built-in function should be before we do that. Do you have any opinion on #5871 (comment) ?

@benweint
Copy link

@anderseknert agreed - we should figure out the desired behavior first!

Option B (all paths of any length from the initial set to any other node with no cycles) is what I expected / wanted the function to do.

A would be fine, except for the caveat that there may not be any leaf nodes in the case of cycle-containing graphs.

@benweint
Copy link

For my specific use case at the moment, it actually would be sufficient to have a function that takes a graph, a set of initial nodes, and a target node, and returns a path (or set of paths) from the initial set to the target, but I'm not sure that that's generally useful enough to be it's own function. (It's also easy enough to implement from option B by just filtering the set of returned paths)

@anderseknert
Copy link
Member

Agreed, and it seems most of us do, so perhaps we should go with that. While there are valid concerns about backwards compatibility, I would consider the current behavior a bug. @ashutosh-narkar WDYT?

@ashutosh-narkar
Copy link
Member

I would have expected option B as well. So this seems to be a bug. Both not returning all the paths as well as returning invalid paths.

@stale
Copy link

stale bot commented Sep 3, 2023

This issue has been automatically marked as inactive because it has not had any activity in the last 30 days. Although currently inactive, the issue could still be considered and actively worked on in the future. More details about the use-case this issue attempts to address, the value provided by completing it or possible solutions to resolve it would help to prioritize the issue.

@stale stale bot added the inactive label Sep 3, 2023
anakrish added a commit to anakrish/regorus that referenced this issue Dec 31, 2023
- graph.reachable, graph.reachable_paths builtins.
- In progress walk builtin
Note: graph.reachable_paths is buggy in upstream OPA and its semantics
  are not well defined.
  open-policy-agent/opa#5871
  open-policy-agent/opa#6128
- Use correct feature for regex
- Ensure that regorus complies with all features disabled.

Signed-off-by: Anand Krishnamoorthi <anakrish@microsoft.com>
anakrish added a commit to microsoft/regorus that referenced this issue Dec 31, 2023
- Disable default features of jsonschema to reduce binary size (#85)
- graph.reachable, graph.reachable_paths builtins.
- In progress walk builtin
Note: graph.reachable_paths is buggy in upstream OPA and its semantics
  are not well defined.
  open-policy-agent/opa#5871
  open-policy-agent/opa#6128
- Use correct feature for regex
- Ensure that regorus complies with all features disabled.

Signed-off-by: Anand Krishnamoorthi <anakrish@microsoft.com>
@ashutosh-narkar ashutosh-narkar moved this to Backlog in Open Policy Agent Aug 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Backlog
Development

No branches or pull requests

5 participants