-
Notifications
You must be signed in to change notification settings - Fork 143
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unbound query through associations #917
Comments
My initial thought is to potentially augment our That said:
|
I think my issue is that we have no way of paging those associations even if we did add a limit, and then we also have no way of telling the end user that the limit was applied and they only received a partial payload. My gut feeling is just to straight disallow has_manys via attributes unless there are specific columns defined. For example
Some associations, like I'm almost wondering if we should just remove |
My concern is if we disallow has_manys, we're breaking compatibility, https://www.manageiq.org/docs/reference/latest/api/overview/query.html while I'm not opposed to doing this for big ticket items (vms, etc.), it's the smaller associations that folks have been querying and expect to get back. expand=<subcollection> allows api callers to fetch multiple subcollections in a single query to help reduce roundtrips (api predating GraphQL 😄 ) expand=<subcollection> would get properly href's items vs just an association query. I do like the idea of requiring specific columns, like vms.name. Maybe an option is to have a blacklist of associations that we know can tend to be large (vms, events, notificatioons, etc), and only apply the new logic to those ? |
This made me think that possibly what we could do instead of trying to "take things away" is instead add something to make tracking this issues down easier. For us, looking through the logs is fine enough (assuming we can get access to them), but would possibly providing an automated process for analyzing those logs, and presenting them via the UI/API be a better solution? My thinking is that the problem stems from users not knowing they are making poor choices with the API, and generally we have ways of mitigating those performance problems already and they just need to be used. By taking things away, we are just making it more complex to use the API, not easier.
This might work, though, I think with my thought above, possibly we make this configurable instead of something that is hardcoded in the app. We can set it with "sane defaults", but allow the users to shoot themselves in the foot if they so choose. |
I agree, but I can't see anything we can do that could prevent that, besides, well, actually preventing that. If we leave the capability intact, people are going to use it, which leads to killing the workers, and I think we have to prioritize a user shooting themselves in the foot (by killing the app) over a user's convenience. If we could find a way to keep the capability, but not make the workers overload themselves, I'd be all for it (for example, maybe find_in_batches might help, but even so, JSON dumping an enormous payload will likely always kill memory) |
I'm kind of on the fence. While I like the idea, it does have some level of inconsistency (some things are queryable, others are not). If we make it configurable, on a minor note, it makes documenting these inconsistencies difficult/confusing, in that we'd have to say "by default, vms are not queryable, but the administrator can enable it) |
While we have the max_results_per_page, which can keep memory limits down, associations do not abide by that value, allowing for an unbounded query.
So, for example, let's say you have 1 provider with 50K vms. If you hit
/api/vms
or/api/providers/1/vms
you will be limited to 1000 vms per page. However if you hit/api/providers?expand=resources&attributes=vms
, then you will get back all 50K vms fully expanded.I'm not sure what approach we can take here, but I'm thinking maybe we just disallow full expansion of association attributes, and instead force selecting specific columns (via #871)? Or perhaps disallow when the size of the association is greater than the max_results_per_page?
cc @NickLaMuro @abellotti @jrafanie
The text was updated successfully, but these errors were encountered: