Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Customize loading behavior #14

Closed
willcosgrove opened this issue Feb 9, 2019 · 8 comments
Closed

Customize loading behavior #14

willcosgrove opened this issue Feb 9, 2019 · 8 comments

Comments

@willcosgrove
Copy link
Contributor

We just recently started using ams_lazy_relationships on a project I've been working on, and it's been great!

We have lots of relationships though then end up being modified in the serializer for one reason or another, and that prevented us from being able to use the lazy association helpers for them. I started writing custom loaders (unrelated to this gem's concept of a Loader) that use BatchLoader to fetch these records like we need. Unfortunately I discovered that just using BatchLoader was not enough because AMS was trying to serialize the related object before it had called the loader for each nested record.

I would love to be able to hook into the breadth-first loading that you've got in this gem, but I haven't been able to work out what the best way to do that would be.

If I was dreaming about what ams_lazy_relationships could do to make this situation better, it would be great if I could just define my own method that returns a BatchLoader object. Something like this:

class UserSerializer < BaseSerializer
  lazy_has_many :friends

  def friends
    BatchLoader.for(object.id).batch(default_value: []) do |user_ids, loader|
      # ...
    end
  end
end

And ams_lazy_relationships would handle calling that method at the right time.

What do you think?

@Bajena
Copy link
Owner

Bajena commented Feb 9, 2019

Unfortunately I discovered that just using BatchLoader was not enough because AMS was trying to serialize the related object before it had called the loader for each nested record.

Haha, that was exactly my main reason for writing this gem 💃

If I was dreaming about what ams_lazy_relationships could do to make this situation better, it would be great if I could just define my own method that returns a BatchLoader object.

Yeah, it'd be perfect, but I'm afraid it might be difficult to achieve due to how AMS serializes data. You could however achieve a similar result by writing custom lazy loader class.
The only additional thing you'll have to do is to yield all the data collected by batch loader (check out e.g. how SimpleHasMany loader does it).

So your serializer and loader would look somewhat like this:

class UserSerializer < BaseSerializer
  lazy_has_many :friends, loader: UserFriendsLoader.new
end

class UserFriendsLoader
  def load(user, &block)
    BatchLoader.for(user.id) do |user_ids, loader|
      all_friends = load_all_friends(user_ids)
      
      # Yield all loaded records so that the 
      # gem can prepare deeper lazy relationships for each of them
      block&.call(all_friends)

      resolve(user_ids, all_friends, loader)
    end
  end
end

Btw, maybe it'd be possible to somehow extract a base loader class to help others, because I clearly see a following repeating pattern.

  1. Gather records in batch loader
  2. Load the association records
  3. Yield loaded association records array
  4. Resolve using BatchLoader's loader mechanism

If you'd have any ideas based on your use cases let me know :)

Also make sure to check the available loader classes, because it sounds like the gem might already have a loader that'd fit your needs.

@willcosgrove
Copy link
Contributor Author

Ah, ok. That makes sense, I was missing why the block was necessary, but I think I'm following now.

I think the main thing that is currently preventing us from being able to just use a loader class is that we need access to the scope, in addition to the object, for many of our loaders. Our use of scope is wrapped in a couple of methods added to our base serializer class, so it would actually be most beneficial to have access to an instance context. I wonder if it would be possible to pass self either in place of, or in addition to object here:

module Initializer
def initialize(*)
super
self.class.send(:load_all_lazy_relationships, object)
end
end

If that is too heavy handed, then we would be able to make do with just also passing in scope.

@Bajena
Copy link
Owner

Bajena commented Feb 11, 2019

I see your point. The latter idea sounds cleaner to me. Would you be able to provide one or two real-life use cases for batch loading with serialization context? Maybe we'd be able to invent something here :)

@willcosgrove
Copy link
Contributor Author

Sorry for my radio silence. I was on a deadline to get this work done and my sense was that what would work best for us was not something that would necessarily be what you wanted for this gem. So we ultimately went with using a fork of this gem.

That being said, there were a few changes that we made that might be able to be useful to you. You can see the changes we made here: master...willcosgrove:instance-access-patch

I'll briefly walk through each commit's story:

  • a104501 - This was our initial motivation for making any change. We wanted access to the instance of the serializer because it allowed us to call methods that operate on the instance's object and scope. While I was making this change, I noticed that during the deep_load_for_yielded_records method, the associations were being loaded without initializing the serializer. This meant that to create an instance, I would need to pass the current level down into the initializer which I wasn't too thrilled about, so I decided to just do away with the depth limiting all together (I would later learn why that was there 😄)
  • 77ca84d - I discovered what might could be classified as a bug. If you have a serializer that inherits from another serializer, lazy relationships defined on the superclass serializer do not get copied over into the child serializer. This commit fixes that.
  • 5e4ef33 - Here's when I discovered what the depth limit was for. I added it back, but fishing the level through the initializer.
  • 286964c - We ultimately weren't too excited by the idea that 3 levels of associations would be fetched no matter what. We really wished it would just preload the specific associations that were requested, so this commit makes it so that only relationships that were requested by the JSONAPI includes directive are fetched. This obviously only works if you're using the JSONAPI adapter, which is fine for us because that's all we use, but it would probably be best to fallback to the previous strategy.

Now I'll show you how we're using these changes in our app. We wrote our own loader definition that looks like this:

class CustomLoader
  def initialize(&loader_definition)
    @loader_definition = loader_definition
  end

  def load(instance, _load_for, &preload)
    instance.instance_exec(preload, &@loader_definition)
  end
end

So this allows us to write something like this in the serializer

class PersonSerializer < BaseSerializer
  lazy_has_many :received_challenges, loader: CustomLoader.new { |prefetch|
    BatchLoader.for(person.id).batch(key: user_scoper, default_value: []) do |person_ids, loader|
      challenges = V4::AcceptedChallenge.where(
        receiver_id: person_ids,
        owner_id: user_scoper.person.id
      )

      prefetch.call(challenges)

      challenges.each do |challenge|
        loader.call(challenge.receiver_id) do |received_challenges|
          received_challenges << challenge
        end
      end
    end
  }
end

In the code above user_scoper is a method on BaseSerializer which fetches an object out of the scope object. user_scoper is an object that helps us filter records based on the current user. It's definitely not ideal that so much data manipulation has to happen in our serializers, but this is a big application with a lot of history, and incremental improvements take a while to propagate through the code base.

In our actual use, we never use the load_for option of the lazy relationships, so we actually redefined the lazy relationship helpers to shorten the above code. So we actually have this:

class PersonSerializer < BaseSerializer
  lazy_has_many :received_challenges do |prefetch|
    # ...
  end
end

Additionally, we don't write the call to BatchLoader in line like I showed, we have those in a service module. So here's the actual code in our code base for the above example:

class PersonSerializer < BaseSerializer
  lazy_has_many :received_challenges do |prefetch|
    BatchLoaders::AcceptedChallengeLoader
      .load_received_challenges_using_person_and_user_scoper(object, user_scoper, &prefetch)
  end
end

We would love to give back some of our changes to this project if they could be helpful. Let me know what you think about these changes, and if I could help by opening a PR implementing any or all of these changes. If you don't want any of them, that's no problem either. We're content with using our fork 😄

Thanks so much for making this gem, it's been very helpful!

@Bajena
Copy link
Owner

Bajena commented Feb 15, 2019

Awesome work 👏👏👏 I quickly read your ideas and they look interesting. I'll take a deeper look during the weekend and try to figure out the details.

Thanks for sharing your code 🙇

@Bajena
Copy link
Owner

Bajena commented Feb 17, 2019

Hey, I've read all your stuff and I have a few comments/questions:

  1. jsonapi-rb seems to already be a part of AMS. I suppose you added it to gemspec just to avoid any problems if AMS authors ever decide to abandon this dependency, right?
  2. Good catch with the inheritance bug. Would you like to post a PR with the fix?
  3. There's one problem with the approach based on JSONAPI::IncludeDirective - it doesn't take into account that JSON API renders by default ids of relations which requires the records to be loaded.
    Example:
class UserSerializer < BaseSerializer
  has_many :friends
end

In this case even if you don't explicitly send ?includes="friends" friend records will be loaded just to render their IDs.
That's the main reason why I decided that the gem will be default traverse the tree 3 (3 felt like a number big enough to handle most cases ;) ) levels down - to prepare batch loaders for the relationships that just need their ids to be rendered. It shouldn't be a big performance impact as the records that are not rendered will never get loaded (thanks to how BatchLoader works).
4. I added your change to initializer and ran the core_spec tests. I've noticed that self.class.include_directive_from_options(@instance_options) is always empty even if I pass an includes statement. Do you think I might be initializing my serializers wrongly in the specs?

From: /Users/bajena/Code/priv/ams_lazy_relationships/lib/ams_lazy_relationships/core.rb @ line 41 AmsLazyRelationships::Core::Initializer#initialize:

    38: def initialize(object, options = {})
    39:   super
    40:
 => 41:   binding.pry
    42:
    43:   prefetch = options.fetch(:prefetch) do
    44:     self.class.include_directive_from_options(@instance_options)
    45:   end
    46:
    47:   self.class.send(:load_all_lazy_relationships, object)
    48: end

[1] pry(#<Level1Serializer0>)> self.class.include_directive_from_options(@instance_options)
=> #<JSONAPI::IncludeDirective:0x00007fda146a2d40 @hash={:*=>#<JSONAPI::IncludeDirective:0x00007fda146a2a48 @hash={}, @options={:allow_wildcard=>true}>}, @options={:allow_wildcard=>true}>
  1. While working on the gem I didn't think about being able to access scope objects so I get why you changed the definition of lazy loaders to operate on serializer objects and not on records themselves. I'm however not sure if it's good that there's a need to initialize a serializer object just to prefetch the nested lazy relationships. It's hard to predict what people will put in their serializer constructors ;) Unfortunately I don't have any better idea how this problem can be solved in a different way :(
  2. What's prefetch.call(challenges) doing in your example?

If you have time let me know what you think about those points


I love how creative people can get when solving their problems. I'm happy that somebody managed to understand the 🍝 I wrote ;)

Btw, I really like the way you solve problems and how quickly catch the concepts. Are you working on any other open source projects? Maybe you'd like to work on something together?

Jan

@willcosgrove
Copy link
Contributor Author

Hey Jan!

Thanks for taking a thorough look at this! I'll respond in order:

  1. Yes, I just added it because with my addition, we're now using that dependency directly rather than indirectly, so I felt like it should be added to our list of dependencies. But you are right, it doesn't need to be there, because AMS lists it as a dependency, and if AMS ever changed that, more would break than just that class not being available any more.

  2. Thanks! Sure thing: Fix superclass lazy relationships not loading properly on subclass #15

  3. Oh wow, yes you're right! Good catch! Hmm, that would be tricky to fix. It would require an extension/replacement of the includes directive class that I'm using. I didn't realize before you pointed it out that the unused levels of relationships won't get loaded if they aren't serialized. That makes me feel better about using the levels/depth approach as opposed to the includes. I think I will reverse this commit on our fork.

  4. It's possible that it doesn't work 😳 It registered in my brain as something that I should test to confirm that it worked as I expected, but then I never did. All the more reason to reverse this change 😄

  5. I don't think it would be too hard to also pass the scope down, since the scope comes in from the top, it never changes as you go deeper and deeper into relationships. The main reason I went with creating an instance of the serializer, as opposed to just passing down object and scope, is that we have methods defined on our serializer that use scope to fetch specific values out of it. It would have required either a substantial rewrite of our serializers, or duplication of code to work with the raw scope. I was trying to minimize the number of changes to our code base, and so went with the instance strategy. Though I could accept that that might not be the best option for this gem to take, it was definitely the easiest option for us.

  6. prefetch.call(challenges) is running this block:

    lrm.loader.load(load_for_object) do |batch_records|
    deep_load_for_yielded_records(
    batch_records,
    lrm,
    level
    )
    end
    In your first message to me, you talked about this list of steps:

    1. Gather records in batch loader
    2. Load the association records
    3. Yield loaded association records array
    4. Resolve using BatchLoader's loader mechanism

    This would be step 3.


😂 This gem is great! Definitely not 🍝

Thank you! No I haven't been. I opened sourced a few things several years back when I was first getting started developing. So they probably aren't very good 😄

Maybe you'd like to work on something together?

Definitely! I'm always interested in solving a good problem!

@Bajena
Copy link
Owner

Bajena commented May 17, 2019

I'm closing this issue - I'll reopen if I decide to change the way how the gem works. Thanks for all your help @willcosgrove!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants