How do I implement efficient data fetches for nested objects? #167

Closed
brandur opened this issue Apr 12, 2018 · 6 comments


brandur commented Apr 12, 2018

Hello, great work with the project. I have a question about how to build an efficient implementation for nested objects.

To demonstrate, take a sample blog schema where you have articles that have comments, and comments that have favorites. Favorites are their own relation with an associated user_id to the user that favorited that comment.

A query to fetch all the data you need to render an article would go two relations deep like this:

article(article_id: 123) {
    comments {
        favorites {
            user_id
        }
    }
}

The system is implemented on a relational store, and a simplified version of your Juniper article object looks like this:

graphql_object!(<'a> &'a Article: Context |&self| {
    field comments(&executor) -> Vec<Comment> {
        // SELECT * FROM comments WHERE article_id = self.article_id
    }
});

Comments are similar, with a nested favorites field:

graphql_object!(<'a> &'a Comment: Context |&self| {
    field favorites(&executor) -> Vec<Favorite> {
        // SELECT * FROM favorites WHERE comment_id = self.comment_id
    }
});

The trouble is that if we execute the query above, it will resolve successfully, but it will do so in a way that's degenerately inefficient. To get all our favorites we'll execute N + 1 total queries (1 to get comments, and then N to get favorites for each comment), and every further level of nesting will multiply the total number of queries by another N.

Ideally, when fetching favorites, we'd use the naive implementation above if we're only fetching them for a single comment, but when we're fetching them for a set of comments, a higher level would execute only a single query:

SELECT * FROM favorites
WHERE comment_id IN (comment_id1, comment_id2, comment_id3)
ORDER BY comment_id;

And then partition the results and distribute them to the underlying "favorites" objects being resolved.
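As a rough illustration of that partitioning step, here is a minimal sketch in plain Rust with no Juniper or database code involved. The `Favorite` struct and the `partition_favorites` function are hypothetical names made up for this example; the rows are assumed to have already come back from the single batched query above.

```rust
use std::collections::HashMap;

// Hypothetical row type standing in for one row of the `favorites` table.
#[derive(Debug, Clone, PartialEq)]
struct Favorite {
    comment_id: i32,
    user_id: i32,
}

// Groups the rows of one batched query by `comment_id` and returns one Vec
// per requested comment, in the same order as `comment_ids`. Comments with
// no favorites get an empty Vec.
fn partition_favorites(comment_ids: &[i32], rows: Vec<Favorite>) -> Vec<Vec<Favorite>> {
    let mut by_comment: HashMap<i32, Vec<Favorite>> = HashMap::new();
    for row in rows {
        by_comment.entry(row.comment_id).or_default().push(row);
    }
    comment_ids
        .iter()
        .map(|id| by_comment.get(id).cloned().unwrap_or_default())
        .collect()
}
```

Each inner `Vec` could then be handed to the corresponding comment being resolved, so the number of queries stays at one per nesting level rather than one per comment.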

Do you have any recommendations for how to make this sort of pattern possible in a relatively performant and sustainable (in the sense of code complexity) way? I've found some references to "context switching", and looking at the source code suggests it might be what I'm looking for, but the documentation for it is light. Is that what I should be using?

Thanks!

@LegNeato
Member

Generally Facebook suggests using something like https://github.com/cksac/dataloader-rs, which is what they do internally

@brandur
Author

brandur commented Apr 13, 2018

@LegNeato Ah, thank you. Yeah, I saw that Data Loader seems to be common in GraphQL implementations in other languages. If this is the right way, it might be helpful to have an example of its integration with Juniper. Given the heavy reliance on futures, etc., it's somewhat non-trivial to integrate.

@theduke
Member

theduke commented Apr 22, 2018

There are essentially two approaches to this.

One is a futures- and dataloader-style async approach, tracked in #2.
With the large changes currently happening in the Futures/Tokio ecosystem, I'm afraid this is still a bit further off.

The other option is to inspect the requested schema and smartly determine what to fetch in a root resolver.
There is a PR for this, and the tracking issue is #124.
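To make the second option a little more concrete, here is a minimal sketch of the idea in plain Rust. It does not use Juniper's API: `Selection`, `requests`, and `plan_queries` are hypothetical names invented for this illustration, and the SQL strings are placeholders modelled on the queries from the original question. The point is only that a root resolver which can see the requested fields can decide up front on one batched query per nesting level.

```rust
// Hypothetical, simplified stand-in for the parsed selection set of a query.
// This is NOT Juniper's API; it only models which fields were requested.
struct Selection {
    name: &'static str,
    children: Vec<Selection>,
}

impl Selection {
    // Returns true if the nested field path (e.g. ["comments", "favorites"])
    // was requested under this selection. An empty path always matches.
    fn requests(&self, path: &[&str]) -> bool {
        match path.split_first() {
            None => true,
            Some((head, rest)) => self
                .children
                .iter()
                .any(|child| child.name == *head && child.requests(rest)),
        }
    }
}

// Decides, at the root, which batched queries are needed: at most one per
// nesting level instead of one per parent row. The SQL is illustrative only.
fn plan_queries(article: &Selection) -> Vec<&'static str> {
    let mut plan = vec!["SELECT * FROM articles WHERE article_id = ?"];
    if article.requests(&["comments"]) {
        plan.push("SELECT * FROM comments WHERE article_id = ?");
    }
    if article.requests(&["comments", "favorites"]) {
        plan.push("SELECT * FROM favorites WHERE comment_id IN (<ids of fetched comments>)");
    }
    plan
}
```

For the article query in the original question this plans three queries in total, regardless of how many comments the article has.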

@theduke
Member

theduke commented Apr 22, 2018

Closing this issue here; feel free to discuss further in one of the two other issues.

@theduke theduke closed this as completed Apr 22, 2018
@brandur
Author

brandur commented Apr 22, 2018

Thanks @theduke. #16 turns out to be exactly what I'm looking for here.

@brokenthorn

AKA the GraphQL N+1 Problem, #387.
