REST API: POST endpoint for QueryBuilder queryhelp JSON payloads #4337
Conversation
Codecov Report
@@ Coverage Diff @@
## develop #4337 +/- ##
===========================================
+ Coverage 79.46% 79.52% +0.06%
===========================================
Files 484 484
Lines 35775 35820 +45
===========================================
+ Hits 28426 28481 +55
+ Misses 7349 7339 -10
Thanks! One quick comment - queries might become very expensive. Should we put in some safeguards to avoid users creating (inadvertently or on purpose) DoS attacks?
Very interesting! And good catch!
Easily done, and probably a good idea to include in combination with some actual timeout (or other) handling when it's enabled.
Do you mean for number of returned results?
I have looked quickly into this; it seems there is indeed a way to set a PostgreSQL timeout for handling a query.
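A possible safeguard along these lines (purely a sketch, not part of this PR) would be to set PostgreSQL's `statement_timeout` on the connections the REST API uses, so runaway queries are cancelled server-side. The connection URL and the 30-second value below are illustrative assumptions:

```python
# Hypothetical sketch: cap query execution time at the database level.
# The DSN and the timeout value are placeholders, not aiida-core configuration.
from sqlalchemy import create_engine

engine = create_engine(
    'postgresql://aiida:secret@localhost:5432/aiidadb',
    connect_args={'options': '-c statement_timeout=30000'},  # value in milliseconds
)
```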
Thanks @CasperWA - I think this is potentially extremely useful, and so I would encourage you to go further in this direction, e.g. by adding some basic tests (no need to go overboard, just a few is fine).
Force-pushed from 1694515 to e1020b4
@CasperWA I've just met with @flavianojs who is working on the REST API for the aiida-ginestra plugin. Please let him know what you think would be the best way forward to get this PR merged.
Force-pushed from 7a60fce to f7471fd
Update
Hmm, I'm not averse to the solution here. What are your thoughts on your implementation vs the GraphQL one?
The GraphQL one introduces a different kind of query language and query methodology. I have already noted the connection to the GraphQL implementation in the related issue #3646. In summary, I don't see any issue with having both implementations, especially since this is a natural extension of the existing REST API to utilize the existing Python functionality directly.
Sorry for not participating much in the discussion before. With regard to general interfaces like this one, I think one argument against it in REST was always that it breaks the possibility to have (an effective) intermediate cache of the queries: most REST queries could potentially be cached on the document level via a reverse proxy, whereas with such a general interface you always have to hit the DB (or an object cache).

A GraphQL approach shares some of the drawbacks when it comes to cacheability (see link below), but you still maintain an intermediate layer between the internal data representation and the query (which also has its downsides, of course). Due to its generic nature one can employ a GraphQL-specific caching mechanism for the objects (which is again a caching on the returned data representation rather than on the source objects).

If this is about developing clients which might access the objects/data directly, I am slowly turning towards classical server-side generated sites again. The main reason being the latency and the browser's capability of progressive rendering: a new client will first hit you for the web application, which once it has loaded has to hit the server again for data = 2 round trips plus application start-up time, while the server-side generated web app works in 1. Authentication makes it even more complex. Unless of course you need really live data streams, for which GraphQL again has a solution; the question would only be how to wire this up to the backend data model again.

Now, many of those points apply to general-purpose designs, while AiiDA operates in a rather narrow and specialized field, in which some of the points may be weighted differently.
thanks @dev-zero; what he said ☝️ 😆
I don't believe that this is true. Even in the response by @dev-zero there are clear differences and use cases for the two approaches. Furthermore, the
This is not a case of implementing a functionality similar to GraphQL; however, it is indeed inspired by it. Specifically @dev-zero's work with the

And thanks @dev-zero for sharing your comments here as well 👍

The caching issue is something I have considered, but not for more than a second, since this in my eyes is a first "test" implementation, to see whether this functionality is useful or not. One can always create a plugin and extend the REST API with similar functionality if needed, but starting to implement it in
Happy for @dev-zero to set me straight, but to my eyes this is exactly the same use case as GraphQL: querying a database via a web-orientated API. But anyhow, I've said my piece and however we do it I think this will certainly be very useful 👍
I would consider that the same as saying Windows and OS X are exactly the same due to the same use case of creating a graphical operating system... 😅
Are you trying to make my point for me 😝 because yes I would say they're pretty much the same these days; sure I can use VS Code, Firefox and Zoom on both, which takes care of ~90% of all I do 😂
Hehe, fair enough 😅. In an attempt to be painfully clear though, my point is more that the specific use case will change things substantially. In the most current use case a Java GUI application already exists that uses the
Further update:
Final, final note; this is exactly how you can use GraphQL: you can literally just add it as an endpoint to Flask: https://strawberry.rocks/docs/flask
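For reference, a minimal sketch of that approach, assuming the `strawberry-graphql` package and an illustrative `Query` type (not part of this PR):

```python
# Minimal Flask app exposing a GraphQL endpoint via strawberry, following the
# pattern in the linked documentation. The Query type here is a toy example.
import strawberry
from flask import Flask
from strawberry.flask.views import GraphQLView


@strawberry.type
class Query:
    @strawberry.field
    def hello(self) -> str:
        return 'Hello from GraphQL'


schema = strawberry.Schema(query=Query)

app = Flask(__name__)
app.add_url_rule('/graphql', view_func=GraphQLView.as_view('graphql_view', schema=schema))
```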
I know. I believe this is the approach used in the

This was not a matter of implementation but of use case and query language.
thanks @CasperWA - just one more question regarding the `full_type`:
node_entry['full_type'] = (
    construct_full_type(node_entry.get('node_type'), node_entry.get('process_type'))
    if node_entry.get('node_type') or node_entry.get('process_type') else None
)
Oh, I see, so here is where the `full_type` is set?

Just for me to understand: this is the `NodeTranslator` class, but for some reason it also seems to be used to translate objects that are not nodes - is that the source of the problem?

And, finally, I guess you tried before, instead of setting `full_type` to `None`, to simply not set it here, but then tests break?
Yeah, so this is what I tried to explain in previous comment answers.

The only way to get all the various REST API-specific information in the response (and make it the most similar to other REST API responses) is to use `NodeTranslator` as the translator class for the `/querybuilder`-endpoint.

This is because the `NodeTranslator` is special, adding `full_type`, which is not a property that otherwise exists in AiiDA. It's a quirk of the REST API.

As far as I know, this is the only addition to an AiiDA entity, and as such, if I use `NodeTranslator` for all entities, I make sure I also include the special REST API properties. The need for the subsequent removal of `full_type` should now be clear - it's not a property of any AiiDA entity other than `Node`s. And even then, to define `full_type`, both `node_type` and `process_type` are needed. If these properties are not requested in the POSTed queryhelp, they will not be available.

I could ensure here that they're always requested, but that would demand a lot of logic to go through the POSTed queryhelp - something I didn't feel was necessary for this PR at this point. So instead I've opted for the current solution.

It's worth noting that the `construct_full_type` utility function actually had a bug: it used `process_type` twice (for both the `node_type` and `process_type` parts of `full_type`). This PR fixes that bug, as well as removing the necessity for `try/except` in the highlighted code.
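A minimal sketch of the fixed behaviour described above (the real implementation lives in the REST API utilities and may differ in detail; the `|` separator is an assumption about the `full_type` format):

```python
# Sketch of the corrected construct_full_type: the bug was that process_type
# was used for both parts of full_type; the fix uses node_type for the first part.
def construct_full_type(node_type, process_type):
    """Return the REST-API-only full_type string built from node_type and process_type."""
    node_type = node_type or ''
    process_type = process_type or ''
    return f'{node_type}|{process_type}'
```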
> And, finally, I guess you tried before, instead of setting `full_type` to `None`, to simply not set it here, but then tests break?
It was more of a conscious choice to make the response as AiiDA REST API-like as possible, including all the extra properties expected from any other AiiDA REST API response for the various entities.
The main issue here relates to the fact that the REST API was not built to return multiple types of entities (this has been mentioned in issue #4676 as well). So I need a "catch-em-all"/"works-for-all"-translator :)
Force-pushed from 31c40dd to 428e5aa
docs still failing?
Yeah. I didn't touch any docs, so I'm confused. Also the pre-commit is failing due to files that I didn't touch either.
Force-pushed from 428e5aa to 74441c9
Alright @ltalirz, I've completed splitting this up into a first commit that fixes some minor things in the REST API code base:

The second commit concerns the purpose of this PR specifically.
Both are due to the release of
I'll just merge this with rebase+merge then - is that fine @sphuber, @ltalirz, @chrisjsewell? Or do you want to wait for #4669 and/or do a different kind of merge?
I am giving it a final pass now
Please wait until #4669 is merged and then rebase. We still want the tests to pass before merging.
indeed, tests should always pass before merging; cough @ramirezfranciscof cough #4675 😉
Alright, the other PR is merged. You can rebase this and the tests should pass. Unless there were other failures of course.
Force-pushed from 74441c9 to f95df2c
Right. So the pre-commit is still failing due to files I didn't touch. @sphuber, @chrisjsewell?
Yeah, it's a known issue that has cropped up before. I thought it had been fixed, but obviously not.
Force-pushed from 65046e1 to fda9197
- Use node_type in construct_full_type().
- Don't use try/except for determining full_type.
- Remove unnecessary try/except in App for catch_internal_server.
- Use proper API_CONFIG for configure_api.
The POST endpoint returns what the QueryBuilder would return when provided with a proper queryhelp dictionary. Furthermore, it returns the entities/results in the "standard" REST API format - with the exception of the `link_type` and `link_label` keys for links. However, these particular keys are still present as `type` and `label`, respectively.

The special Node property `full_type` will be removed from any entity if its value is `None`. There are two cases where this will be true:
- the entity is not a `Node`; and
- neither `node_type` nor `process_type` is among the projected properties for the given `Node`.

Concerning security: the /querybuilder-endpoint can be toggled on/off with the configuration parameter `CLI_DEFAULTS['POSTING']`. This was added to `verdi restapi` as a `--posting/--no-posting` option. The option is hidden by default, as the naming may be changed in the future.

Reviewed by @ltalirz.
Force-pushed from fe7c387 to 4c9d44a
@chrisjsewell / @sphuber feel free to review the PR based on the latest commit 4c9d44a.
Closes #3646
This is the first pass at implementing a `/query`-endpoint with an HTTP POST functionality to pass a `QueryBuilder` queryhelp JSON object and receive the results back in the "standard" AiiDA REST API data format.

A few things differ for the returned "standard" data format:

- All entities will have the `full_type` key (see comment below). This is a small price to pay (in my opinion) in order to get the value for this for custom plugin `Node`s when relevant (the `NodeTranslator` is used for this endpoint to be as general as possible).
- Links will not have the `link_type` and `link_label` keys, but these will still be present as the regular `type` and `label` keys, respectively, which are returned normally from `QueryBuilder`. We may consider also removing these extra keys from the regular output in other endpoints with links, but that's for another time.
So far, I have tested this with a variety of `Node`s, trying to get incoming and outgoing, projecting various and differing things. The latter has led to the code's current state, which ensures that what is returned is equivalent to what the `QueryBuilder` would normally return.

I have also tested retrieving different entities, e.g., the `User` related to a `Node`. This also works fine.

For now, I have implemented it such that the payload (the POSTed queryhelp JSON) will be taken at face value and passed on. This is done for simplicity and because I expect most will use this by setting up a `QueryBuilder` instance and passing the instance's `queryhelp` property as a payload. However, manually written queryhelp dictionaries dumped as JSON should work as well, as long as they work for the `QueryBuilder`...

One could create a parser, re-defining the queryhelp keys; however, I don't see the point of this right now.
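A hypothetical end-to-end sketch of that workflow (the host, port and `/api/v4` prefix are assumptions based on the default `verdi restapi` settings):

```python
# Build a QueryBuilder, take its queryhelp property and POST it to the new endpoint.
import requests
from aiida import load_profile
from aiida.orm import Dict, QueryBuilder

load_profile()

qb = QueryBuilder()
qb.append(Dict, project=['id', 'node_type', 'attributes'])

response = requests.post(
    'http://127.0.0.1:5000/api/v4/querybuilder',
    json=qb.queryhelp,  # the queryhelp dictionary serialises directly to JSON
)
response.raise_for_status()
print(response.json())
```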
Currently missing in this PR:

- Queries that work for the `QueryBuilder`, but may fail in the POST functionality. (This may include custom projection values.) Somewhat fixed, specifically in terms of the `project` keyword (using the wildcard (`*`) or not including the keyword); see the sketch after this list.
- Whether a special `Translator` class should be created for this endpoint, or if it's overkill. This will, e.g., make it able to accommodate the differences in the returned "standard" data format mentioned above, but also adds a lot of extra code, which may obscure the functionality (or make it more transparent, depending on the code-reader). Edit: This does not seem to be necessary. It can still be done in the future, but using `NodeTranslator` or similar is fine.
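As a small illustration of the `project` wildcard mentioned in the list above (a sketch only - the exact shape of the dictionary is whatever `QueryBuilder.queryhelp` produces for your AiiDA version):

```python
# Projecting '*' returns whole nodes rather than individual columns; printing the
# queryhelp shows the JSON-serialisable dictionary that would be POSTed.
from aiida import load_profile
from aiida.orm import Dict, QueryBuilder

load_profile()

qb = QueryBuilder()
qb.append(Dict, project='*')  # wildcard projection
print(qb.queryhelp)  # hand-written dictionaries with the same structure work as well
```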