Refactor RecordAccessors to ResourceFinders #1045
Conversation
I'm definitely in favor of this architecture change. 👍 :-D. Regarding caching: so if I'm following correctly, what's currently called the "resource cache" would be split into two parts, a "model cache" and a "serialization cache", in order to decouple as much of the caching code as possible from any particular ORM, and allow it to work at least partially with unusual ORMs and on POROs. My initial impression is that this is doable and makes sense. One thing though is that I'm not sure a model cache (that is, a cache that returns actual ORM instances) would provide much speed benefit in most cases. I suggest using proxy models instead:
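For illustration, a proxy model could be a thin stand-in that holds only the plucked id and cache field, and lazily loads the real record on any other access. A minimal sketch, assuming a hypothetical `ProxyRecord` class (the name and fallback behavior are illustrative, not from this thread):

```ruby
# Sketch of a proxy model: holds only the plucked id and cache field, and
# falls back to a (slow) lazy load of the real record for anything else.
class ProxyRecord
  attr_reader :id, :cache_field_value

  def initialize(model_class, id, cache_field_value)
    @model_class = model_class
    @id = id
    @cache_field_value = cache_field_value
  end

  # Any other attribute access loads the real record once, then delegates.
  def method_missing(name, *args, &block)
    @real_record ||= @model_class.find(@id)
    @real_record.public_send(name, *args, &block)
  end

  def respond_to_missing?(name, include_private = false)
    @model_class.method_defined?(name) || super
  end
end
```

Code that only needs the id and cache field never triggers a query; anything else degrades gracefully to a per-record load.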
Given all that, the serializer could be in charge of interacting with the actual cache of JSON.
(I've edited the above comment after posting it; apologies if you already read through it before those changes.)
@DavidMikeSimon I would like to think that, if we get the architecture correct, this proxy model concept could be implemented without knowledge of the resource. I would rather expand the interface of the `RecordAccessor`.
I propose we keep the same methodology used in v0.9, where the id and cache_field are plucked from the database and cache hits (serialized resource fragments) for those are retrieved. Then the remaining resources are fetched in a second call. This can be seen in the updated diagram. The model cache would be a secondary cache to the resource fragments, and could save time when the model hasn't changed but a serialization parameter such as fields has. I'm not sure if JR needs to implement a model caching solution or if we can let users leverage existing solutions. Thoughts?
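A rough sketch of that flow (assuming a `Post` model with `updated_at` as the cache field; `filters` stands for the request's filter conditions, and the cache-key format is invented):

```ruby
# Pluck only primary keys and cache fields, without instantiating models.
keyed = Post.where(filters).pluck(:id, :updated_at)

# Build cache keys from (id, cache field) and look up serialized fragments.
keys = keyed.map { |id, ts| "posts/#{id}-#{ts.to_i}" }
hits = Rails.cache.read_multi(*keys)

# Fetch full models only for the misses, in a single second query.
miss_ids = keyed.reject { |id, ts| hits.key?("posts/#{id}-#{ts.to_i}") }.map(&:first)
missed_records = Post.where(id: miss_ids)
```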
@dgeb Agreed, Resources and other code outside of the accessor itself shouldn't have to know or care if a record is real or a proxy. My apologies if that was unclear; my comment was a bit of a raw brain core-dump. :-)
@DavidMikeSimon Haha ... no worries. Some aspects weren't clear to me, but I'm glad we agree on that core constraint. @lgebhardt I like the idea of moving the code that handles caching for serialized fragments into the direct client of the serializer - the operations processor. This will keep caching concepts from leaking into adjacent layers 👍
@lgebhardt I like the idea of a pluck method on accessors; that's an elegant place to add support for cache-key plucking for other ORMs, and it is also potentially much less complex than either the current approach in 0.9 or my proxy models proposal. A couple of questions:
However, an advantage of the plucking approach is that it would probably be faster. I'm not sure how much faster; perhaps only slightly, perhaps significantly.
@lgebhardt A possible problem with doing cache stuff outside the serializer: the data values in the … So, the serializer has to be able to manipulate the fetched cache fragments before they are sent, not just generate new ones. This may make it complicated to remove caching code from the serializer.
@DavidMikeSimon I might not understand the Proxy Model idea entirely, but I don't see a clean way around the potential for N+1 queries, which with includes could be quite a large N. So I've been thinking of an idea that is similar to the v0.9 approach. In the case of a request without any includes, the processor can pluck the id and cache field. When we have includes it gets more complicated.

In a normal Rails active relation call with includes, behind the scenes the models will be fetched with one call, and then the related models will be fetched with an additional call based on the ids from the primary model (I think there are a few cases where Rails manages to do all this in one call, but I'm not positive). This is a nice and simple trade-off between database calls and simplicity of the SQL. You will end up with your number of includes, NI+1 queries.

I've been poking at the includes and the number of queries run with includes. There's a chance for a large number of queries based on the dataset. I hadn't noticed this before because I'm guessing the test data was hiding it.

I think we can solve all this with a strategy using pluck for the primary resource and a separate pluck for each level of the includes, as sketched below. This will give us an array of related ids for each level of the include directives. This can then be resolved to cache hits and a call to resolve misses. This means when caching is on we'd have a pluck and a retrieve-misses query per level of include, plus a pair of queries for the primary resource. We might be able to extend this to non-cached resources as well. I think it will play nicely with the eager loading, but I'm not sure.

A final benefit is this approach may provide an opportunity to paginate included resources. We could default to tossing out the extra ids from the pluck_includes, but some DBs would allow us to group and limit by the parent id. Tossing ids without instantiating models isn't ideal, but it's likely a workable solution (at least better than returning them all as we currently do). I hope I'm not missing anything major.
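A hypothetical sketch of the per-level plucking for `GET /posts?include=comments` (model, column, and association names are invented):

```ruby
# Level 0: pluck id and cache field for the primary resources.
post_keys = Post.where(filters).pluck(:id, :updated_at)

# Level 1: one pluck per include directive, through the association's
# foreign key, yielding related ids (with cache fields) per parent id.
comment_keys = Comment.where(post_id: post_keys.map(&:first))
                      .pluck(:post_id, :id, :updated_at)

# Each level's (id, cache field) pairs are then resolved against the cache,
# with at most one extra query per level to fetch the misses.
```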
@lgebhardt Your description actually matches the current system's behavior pretty closely. The intention of proxy models is to (a) formalize the structure "an id and a cache field" and (b) allow for a safe (albeit slow) fallback in weird cases. Also, in the future, possibly (c) allow fields other than id and cache field to be plucked, to allow Resources to implement unusual security behaviors and such without triggering slow lazy loads.

To avoid N+1 on cache misses, the serializer will have to be a little aware of the general existence of proxy models; however, it will not have to care about whether any given record is real or a proxy. It just needs a method to ask the accessor to lazy load multiple records (which may or may not be proxies) in one swoop. This (the calling code inside the serializer) might look something like:

```ruby
# Assume that either proxy or real records already exist for every included
# association from source, including indirect associations.
def preload_from_cache(source)
  # Assume get_all_records_by_type returns a hash, where each key is a resource
  # type and each value is an array of all the records (proxy or real) anywhere
  # in the inclusion tree that belong to resources of that type.
  types = get_all_records_by_type(source)
  types.each do |t, records|
    hits, misses = cache.get(records)
    # Hits is an array of CachedFragments for some of the records.
    # Misses is an array of records with no up-to-date cached serialization.
    @preloaded_fragments[t] ||= {}
    hits.each do |fragment|
      @preloaded_fragments[t][fragment.id] = fragment
    end
    if misses.length > 0
      # Assume load_all has the same effect as lazy loading each individual
      # miss, but in only one query (e.g. it does a find_by_ids), and that it
      # is a no-op on real records. On accessors which do not support proxies,
      # the implementation of this method would be empty.
      t._record_accessor.load_all(misses)
      misses.each do |record|
        fragment = CachedFragment.new(object_hash(record))
        cache.insert(fragment)
        @preloaded_fragments[t][fragment.id] = fragment
      end
    end
  end
end
```

After calling `preload_from_cache`, every record in the inclusion tree has a corresponding entry in `@preloaded_fragments`.
Also @lgebhardt I am very 👍 on paginating included resources; it plugs a big potential resource-hog scenario.
I've updated this PR with the following changes:
ToDo:
It's likely I've missed some breaking changes. I'd like to get those documented, so if you encounter problems where this breaks your implementation, please note it here.
This is a massive but much needed refactor. I left a few minor questions / suggestions. Looking forward to merging!
end
# classes.each do |klass|
#   klass.caching(false)
# end
Can we just delete instead of commenting these out?
test/unit/resource/resource_test.rb
end
end
end
# def test_to_many_relationship_pagination
Add a `ToDo:` since (I think) we want to use this eventually.
# require File.expand_path('../../../test_helper', __FILE__)
# require 'jsonapi-resources'
# require 'json'
#
Another `ToDo` needed.
links: {
  self: "/posts/2/relationships/comments",
  related: "/posts/2/comments"
Another `ToDo` needed.
end
end
rescue => e
handle_exceptions(e)
end
render_response_document
end

def run_in_transaction(transactional)
This method `run_in_transaction` sounds like it should always be transactional. Perhaps the method should be renamed, or the `transactional` param should be dropped?
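A guess at the shape being discussed, to make the naming concern concrete (not the PR's actual code):

```ruby
# The name promises a transaction, but the flag can opt out of one, which is
# what makes the current signature read oddly.
def run_in_transaction(transactional)
  if transactional
    ActiveRecord::Base.transaction { yield }
  else
    yield
  end
end
```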
lib/jsonapi/configuration.rb
@@ -1,7 +1,7 @@
require 'jsonapi/formatter'
require 'jsonapi/processor'
require 'jsonapi/record_accessor'
require 'jsonapi/active_record_accessor'
# require 'jsonapi/resource_finder'
Delete instead of comment out?
lib/jsonapi/relationship.rb
# else
#   type
# end
# end
Can be deleted?
lib/jsonapi/relationship.rb
@@ -68,6 +88,10 @@ def redefined_pkey?
belongs_to? && primary_key != resource_klass._default_primary_key
end

def inverse_relationship
  @inverse_relationship
end
Perhaps use `attr_reader`?
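That is, the explicit reader in the diff above could collapse to:

```ruby
attr_reader :inverse_relationship
```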
Woohoo! :)
next unless Module === klass
if ActiveRecord::Base > klass
klass.reflect_on_all_associations(:has_many).select{|r| r.options[:as] }.each do |reflection|
(hash[reflection.options[:as]] ||= []) << klass.name.downcase
Context: causing trouble on a 0.9.10 -> 0.10.2 upgrade. I happen to have an ActiveRecord base class `Class.new(User)` that is being found in object space, and since it doesn't have a class name, it is crashing on `klass.name.downcase`.
...
A few questions:
- Does this need to be a lookup in object space vs. `ActiveRecord::Base.descendants`? (See the sketch below.)
- Could/should it call `model_name.name.downcase` instead of `.name.downcase`? That at least gives an `ArgumentError: Class name cannot be blank. You need to supply a name argument when anonymous class given`.
- Would you accept a PR to make it easier to customize the various parts of this method: 1) the selection of candidate classes, 2) the selection of record classes from item 1, and 3) the processing of the AR record classes that got through 1 and 2?
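A sketch of the `ActiveRecord::Base.descendants` alternative raised in the first question (assumed, not this PR's code):

```ruby
# Walk only known ActiveRecord subclasses instead of all of ObjectSpace, and
# skip anonymous classes (e.g. Class.new(User)), which have no name and would
# crash klass.name.downcase.
hash = {}
ActiveRecord::Base.descendants.each do |klass|
  next if klass.name.nil?
  klass.reflect_on_all_associations(:has_many).select { |r| r.options[:as] }.each do |reflection|
    (hash[reflection.options[:as]] ||= []) << klass.name.downcase
  end
end
```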
Noting a few areas of change, such as inferring the class, which changes from 0.9 -> 0.10 (and 0.11 dev at time of writing).
(To be honest, I accidentally included too many comments in this review; sorry about that. I had some pending when I made it.)
@@ -379,115 +342,56 @@ def related_link(source, relationship)
link_builder.relationships_related_link(source, relationship)
end

def to_one_linkage(source, relationship)
linkage_id = foreign_key_value(source, relationship)
linkage_type = format_key(relationship.type_for_source(source))
upgrade-polymorphic
def type_for_source(source)
if polymorphic?
resource = source.public_send(name)
resource.class._type if resource
upgrade-polymorphic
if _model_class && _model_class.ancestors.collect { |ancestor| ancestor.name }.include?('ActiveRecord::Base')
model_association = _model_class.reflect_on_association(relationship_name)
if model_association
options = options.reverse_merge(class_name: model_association.class_name)
upgrade:inferred class_name
}

related_resources = source_resource.public_send(relationship_type, rel_opts)
source_resource = source_klass.find_by_key(source_id, context: context, fields: fields)
upgrade change
filters: { resource_klass._primary_key => resource.id }
}

resource_set = find_resource_set(resource_klass,
upgrade
options[:cache] = resource_klass.caching?
resources = {}

identities = resource_klass.find_fragments(find_options[:filters], options)
upgrade
relationship = _relationship(relationship_name)

if relationship.polymorphic? && relationship.foreign_key_on == :self
find_related_polymorphic_fragments(source_rids, relationship, options)
:upgrade
pluck_fields = [primary_key, related_key, related_type]

relations = relationship.polymorphic_relations
upgrade:find_related_polymorphic_fragments
if relationship.polymorphic?
table_alias = relationship.parent_resource._table_name

relation_name = polymorphic_relation_name
upgrade
def serialize_to_hash(source)
@top_level_sources = Set.new([source].flatten(1).compact.map {|s| top_level_source_key(s) })
# Converts a resource_set to a hash, conforming to the JSONAPI structure
def serialize_resource_set_to_hash(result_set)
upgrade
Background
I originally intended for JR to be able to support different ORMs. However, the architecture of JR has slowly drifted off course, and now needs a correction to meet this goal.
Originally, the `Resource` class was heavily tied to SQL through `ActiveRecord::Relation`. With that in mind I created #977, which broke out the calls to the DB layer into a `RecordAccessor` class. However, there were several problems with this implementation (see #1030 for some context). One, it actually returns resources, so it's not really a `RecordAccessor` (not sure where my head was). Two, I think it's the wrong abstraction: a `RecordAccessor` should return records (i.e. models), not resources. Finally, the `ActiveRecordAccessor` attempts to handle caching as well, by returning either resources or serialized JSON from the cache.

Luckily, this implementation has not seen the light of day in a release, and I'd like to take the time needed to straighten it out.
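That distinction suggests an interface shape roughly like the following; the method names are illustrative, not this PR's final API:

```ruby
# A RecordAccessor returns records (models), leaving Resource construction
# and caching to the layers above it.
class RecordAccessor
  def initialize(resource_klass)
    @resource_klass = resource_klass
  end

  # Returns model instances matching the filters, not Resource instances.
  def find_records(filters, options = {})
    raise NotImplementedError, 'subclasses must implement find_records'
  end

  # Returns the models related to a record through a named relationship.
  def related_records(record, relationship_name, options = {})
    raise NotImplementedError, 'subclasses must implement related_records'
  end
end
```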
Goals
The broad goals of this PR are to:
What's in this PR
- Renamed `ActiveRecordAccessor` to `ActiveRelationRecordAccessor`
- `ActiveModel::Model` used for `ActiveRecord` compliance

What's left to do?
Caching needs to be reworked to fit in the desired layers. A lot of this is TBD. One proposal is to split the caching into two types: a model cache (caching ORM instances) and a serializer cache (caching serialized resource fragments).
This approach to caching would allow Serializer caching to work across all ORMs, even if model caching isn't implemented for a particular ORM.
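To illustrate the split (cache names and keys are invented; `Rails.cache`-style `fetch` semantics are assumed):

```ruby
# Serializer cache: keyed on id, cache field, and serialization params,
# storing serialized fragments. Needs only an id and a cache field from the
# ORM, so it can work across ORMs.
fragment = serializer_cache.fetch([type, id, cache_value, fields_key]) do
  object_hash(record)
end

# Optional model cache: keyed on id and cache field only, storing model
# instances. Saves re-instantiating the model when it hasn't changed but a
# serialization parameter (e.g. fields) has.
record = model_cache.fetch([type, id, cache_value]) do
  record_accessor.find_record(id)
end
```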
Here's a diagram to illustrate the layers and the caching locations I'm proposing: