Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache Param For get_json() Leading to Contradictory Behavior #1698

Closed
AndTheDaysGoBy opened this issue Jan 8, 2020 · 7 comments · Fixed by #2003
Closed

Cache Param For get_json() Leading to Contradictory Behavior #1698

AndTheDaysGoBy opened this issue Jan 8, 2020 · 7 comments · Fixed by #2003
Milestone

Comments

@AndTheDaysGoBy
Copy link

The location of interest:

class JSONMixin(object):

Example:

from flask import Response
r = Response(response=’{“detail”:”Bad Gateway”}’, status=502, content_type=’application/json’)

# Initial
assert r.data === b’{“detail”:”Bad Gateway”}’
assert r.get_data() === b’{“detail”:”Bad Gateway”}’
assert r.json == {‘detail’: ‘Bad Gateway’}
assert r.get_json() == {‘detail’: ‘Bad Gateway’}

# Set
r.set_data(‘{“a”:”b”}’)

# Data field updated
assert r.data == b‘{“a”:”b”}’
assert r.get_data() == b'{“a”:”b”}’

# JSON field non-cached is correct
assert r.get_json(cache=False) == {‘a’: ‘b’}

# Cache not refreshed (Fails)
assert r.get_json() == {'a': 'b'}

In the above example, due to the set_data, the data changes. Using get_json(cache=False) works because it'll re-parses the data. However, since cache = False, this value will not be placed into the cache for the future. If I hadn't used get_json(cache=False), it would've just returned the cached value (which is wrong).
Therefore, it seems that performing set_data, makes it so that every subsequent call would require get_json(cache=False). My how would be something like get_json(reparse=True) and then all subsequently calls can just be from the cache.

Now, what the cache parameter represents becomes important, does it mean "cache the current value" or "return the currently cached value". The issue I see is that it's playing both roles (See:

if cache and self._cached_json[silent] is not Ellipsis:
and )
Consequently, should the data change, if you used cache = True, you'll just return the currently cached value, but if you use cache = False, you wont set the cache to the new value because of line:

Therefore, I recommend separating out the use cache and set cache functionalities (i.e. two different kwargs), or, have cache=True solely be used for obtaining the value from the cache, not for setting it. I.e. it is performing get_json(cache=False) that perform the set (I.e. remove if cache check here:

@davidism
Copy link
Member

davidism commented Jan 9, 2020

I'd recommend not mixing cache in the first place. Under what real circumstances is this an issue?

@AndTheDaysGoBy
Copy link
Author

AndTheDaysGoBy commented Jan 9, 2020

@davidism Wouldn't it be any time you perform set_data and then perform get_json()? E.g. suppose you modify a response in a flask after_request and then have a further manipulation on it as a dictionary, thus requiring you to perform get_json(). This would fail.

I guess I'm just trying to say, set_data manipulations aren't truly respected by get_json() due to its caching functionality.

@davidism
Copy link
Member

davidism commented Jan 9, 2020

But why are you performing that set/get on a response object multiple times? Response.get_json is pretty much intended for use during testing.

@AndTheDaysGoBy
Copy link
Author

If it's purely for testing, then it makes more sense. Although, for behavior to change based off of whether something was called or not seems a bit weird. And, if it's going to cache (as it does) one would hope it'd update along with the underlying data. That's why I put the word contradictory in the title.

As for why I'm performing the multiple gets with a set in between, it's actually a single get in the actual code, a single set in the actual code, but then a third get in the unit test.
I.e. the code performs

@app.after_request
def custom_response(response):
    a = response.get_json()
    new_a = {"something": a['b'], "another": a['c']}
    response.set_data(json.dumps(new_a))
    return response

However, it's then my unit test which will perform a get_json() on the response of this function (to assert the change did indeed occur), but will fail.

@davidism
Copy link
Member

It probably doesn't make sense to cache get_json() at all on Response, since data isn't "cached" either, it just is what it is. I was getting confused because on Request caching makes sense because there is no set_data().

@AndTheDaysGoBy
Copy link
Author

So, from a purist perspective, that JSONMixin (which houses the caching version of get_json) shouldn't be a parent of Response? Although, then, if you still wanted JSON operations on the response, you'd have to have two versions of JSONMixin, i.e. one for requests and one for responses.

@davidism
Copy link
Member

This will be fixed as part of #1963, I'll move the JSON behavior into the Request and Response classes and remove the caching for Response.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 30, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants