Bugfix/schema and timeout #3

Merged
merged 7 commits into develop from bugfix/schema-and-timeout
Jul 29, 2021

YaphetKG
Collaborator

  • Fixes the "List out of Index" error that shows up when redis is empty and the computed schema graph is also empty, for example with:
SELECT a:gene->b:disease
FROM "/schema"
WHERE a="curie:ismissing"

The fix makes this case return an empty response together with the query graph evaluated from the query.

  • Adds timeouts to queries. The timeout is configurable per deployment via the
    "REDIS_QUERY_TIMEOUT" env variable; the value is in milliseconds.

  • Adds a GET parameter to the /schema endpoint so we can force a refresh through an API call. This could be useful when the roger pipeline finishes building the graph: we can let tranql recompute the schema right away and don't need to wait until the next update interval.
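
A rough sketch of how the three changes above might fit together. Only the `REDIS_QUERY_TIMEOUT` variable and its millisecond unit come from this description; the names `query_timeout_ms`, `SchemaCache`, `answer_from_schema`, the `force_update` flag, and the response keys are illustrative assumptions and are not taken from the tranql code.

```python
import os
import time


def query_timeout_ms(default_ms=0):
    """Read the per-deployment timeout (milliseconds) from REDIS_QUERY_TIMEOUT."""
    raw = os.environ.get("REDIS_QUERY_TIMEOUT")
    return int(raw) if raw else default_ms


class SchemaCache:
    """Hypothetical cache that recomputes the schema on expiry or on demand."""

    def __init__(self, compute, interval_s):
        self._compute = compute        # callable that builds the schema
        self._interval_s = interval_s  # normal update interval in seconds
        self._schema = None
        self._stamp = None

    def get(self, force_update=False):
        now = time.monotonic()
        expired = self._stamp is None or now - self._stamp >= self._interval_s
        if force_update or expired:
            self._schema = self._compute()
            self._stamp = now
        return self._schema


def answer_from_schema(schema_edges, query_graph):
    """Return an empty result instead of indexing into an empty schema."""
    if not schema_edges:
        # Assumed response shape: the query graph plus an empty knowledge graph.
        return {"query_graph": query_graph,
                "knowledge_graph": {"nodes": [], "edges": []}}
    return {"query_graph": query_graph, "knowledge_graph": schema_edges[0]}
```

With a cache like this, a request that sets the force flag recomputes immediately, while ordinary requests keep reusing the cached schema until the interval elapses.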

@YaphetKG YaphetKG requested review from waTeim and stevencox July 28, 2021 19:20

@waTeim waTeim left a comment

A minor thing: if, in query_redis, question doesn't contain options, then the get on line 628 will throw an exception.

I really wish there was a more efficient way to update the cache only if a change occurs rather than blindly recalculating every x seconds.

@YaphetKG
Collaborator Author

Hi @waTeim, thanks for the review. Just added a default to options as an empty dict.
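
For illustration, such a default could look like the sketch below. The function name `get_options` and the dict shape of `question` are assumptions; the actual `query_redis` code is not shown in this thread.

```python
def get_options(question):
    """Return the question's options, falling back to an empty dict."""
    # `or {}` also covers an explicit `"options": None` entry.
    return question.get("options") or {}
```

Later `.get(...)` calls on the result are then safe even when the question carries no options.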
WRT schema computation, I think the main reason for redoing the schema is when the graph has changed in redisgraph.

  • One way we could detect this change is by using counts of nodes and edges per type.
    Something like:
grab node types and counts for each,
grab edge types and counts for each, 
make a hash value
if hash value is diff from previous one recalc schema
  • Another way may be to have a meta node in the graph that the roger pipeline would write out, something with the version and the time the graph was built. That way we can check whether this node has changed.

  • A third option is increasing the interval at which we do this calc. This wouldn't change the efficiency but would reduce the frequency of schema calc cypher query calls to redisgraph. Since we have the api endpoint to force update the schema, we can make an http request to that endpoint. In this scenario, the roger pipeline on airflow would build the graph and make a force_update schema call to tranql once the graph is done building.
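
The first option above (hashing per-type counts) could be sketched as follows. This is only an illustration: `schema_fingerprint` and `needs_recalc` are hypothetical names, and the per-type counts are assumed to already be available as plain dicts; how they are fetched from redisgraph is left out.

```python
import hashlib
import json


def schema_fingerprint(node_counts, edge_counts):
    """Hash per-type node and edge counts into one stable hex digest."""
    # sort_keys makes the digest independent of dict insertion order.
    payload = json.dumps({"nodes": node_counts, "edges": edge_counts},
                         sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def needs_recalc(previous, node_counts, edge_counts):
    """Return (changed, fingerprint) for the current counts."""
    current = schema_fingerprint(node_counts, edge_counts)
    return current != previous, current
```

One caveat of this approach: a change that leaves every per-type count the same (for example, swapping one node for another of the same type) would not be detected, which is where the meta node or the explicit force update call may be more reliable.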

@stevencox stevencox merged commit 06ae535 into develop Jul 29, 2021
@stevencox

@YaphetKG - please let me know when this is tested in heal-dev - i.e. ensuring search/indexing/etc is working as expected.

@stevencox stevencox deleted the bugfix/schema-and-timeout branch September 4, 2021 00:05
3 participants