Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

beresp 304 header merging makes impossible to distinguish between headers set by varnish and backend #3102

Open
nigoroll opened this issue Oct 21, 2019 · 22 comments
Assignees
Labels
a=feedback please a=NextVCL things which need a VCL bump r=7.0

Comments

@nigoroll
Copy link
Member

nigoroll commented Oct 21, 2019

Reported by @martin-uplex in https://varnish-cache.org/lists/pipermail/varnish-dev/2019-October/009469.html

Basically, the way we merge headers for 304 responses makes it impossible to distinguish headers originating from the backend from ones which we set in VCL (for the previous response).

@bsdphk
Copy link
Contributor

bsdphk commented Oct 21, 2019

Just to make sure I understand the scenario: The Cache-Control: nocache is created in VCL ?

@bsdphk
Copy link
Contributor

bsdphk commented Oct 21, 2019

We talked at one point about 304 responses getting their own dedicated VCL method, where both beresp.* (RW) and obj.* (RO) were available.

I wonder how much of the current C-lang 304 logic we could implement in builtin_vcl if we did that ?

@martin-uplex
Copy link
Contributor

Just to make sure I understand the scenario: The Cache-Control: nocache is created in VCL ?

Yes, correct.

@nigoroll
Copy link
Member Author

nigoroll commented Oct 21, 2019

@bsdphk yes to the first question: The scenario is to modify the headers of the cache-object with respect to the downstream behavior, but have a differing ttl/cacheable status.
And yes, I think that an additional VCL method would help and IIUC this was my preferred option when we initially discussed the 304 implementation.

@bsdphk
Copy link
Contributor

bsdphk commented Oct 21, 2019

But didnt we find out back then that we would still need C-magic to make cond-fetch work ?

@nigoroll
Copy link
Member Author

nigoroll commented Oct 21, 2019

Discussed on IRC: I think this model should work:

  • before vcl_backend_response {} , for a 304, call vcl_backend_refresh {} with beresp.* (r/w) and obj.* (r/o)
  • builtin.vcl would just contain sub vcl_backend_refresh { return(deliver); }
  • the header merge, as exists now, would run after vcl_backend_refresh {}:
    • beresp. headers always replace obj. headers
  • 304-aware code would basically do almost all of the work (except deletion, see below) in vcl_backend_refresh {} and skip most of the processing in vcl_backend_response {} based on beresp.was_304
  • The only special case (which I see) would be header deletion, which requires cooperation between vcl_backend_refresh {} and vcl_backend_response {}: _refresh would set some marker (e.g. in a header or variable) and _response would do the actual deletion.

Unless I miss something, this solution should be 100% compatible with existing VCL and allow for advanced handling of 304s only where needed.

If there was a need, we could facilitate header deletion also by marking headers not to be merged (as a vmod function in tandem with some core code addition).

@martin-uplex can you please review if this suggestion would work for you?

@martin-uplex
Copy link
Contributor

discussed with @nigoroll, he will follow up on this

suggestion in short: no vcl_backend_refresh, but make obj.* (r/o) available in vcl_backend_response, filled with same values as beresp.* if not 304

@dridi
Copy link
Member

dridi commented Oct 22, 2019

sub vcl_backend_revalidate {
    if (beresp.backend.name ~ "legacy") {
        return (replace);
    }
}

# built-in

sub vcl_backend_revalidate {
    beresp.merge(obj);
    return (reuse);
}

I have been confronted to misbehaving backends in the past, being able to ignore them would be a plus. You can already ignore them in v_b_response by removing condition headers. Nit-picking on the "v_b_refresh" name, and especially the return (deliver) transition if it's meant to lead to v_b_response.

@nigoroll
Copy link
Member Author

@dridi how would you handle a misbehaving backend when you already have a 304 response (with no body)? I do not see how return (replace) vs. return (reuse) would work, if that is referring to the body.

regarding the header merge, making it explicit would be another option, which would also facilitate the header deletion, yes.

@dridi
Copy link
Member

dridi commented Oct 22, 2019

Right, forget that part of my suggestion...

@slimhazard
Copy link
Contributor

Qs about beresp.merge(obj):

  • Can it be called with any other object besides obj? If so, then what other object could make sense as the argument? If not, then wouldn't it be better if there is no argument? Say beresp.merge() or beresp.merge_obj()?

  • What happens if it doesn't get called? Then varnishd does the default merging? Or does that mean you don't want the merge? In the latter case, the documentation should probably point out that the HTTP standard has some requirements about merging headers of the object to be validated with headers from the validating response, so it's a "use at your own risk" option. I would assume then that the default in builtin.vcl would be to call the merge.

@dridi
Copy link
Member

dridi commented Oct 22, 2019

Can it be called with any other object besides obj?

Yes, it can be called in a different context with different arguments.

sub vcl_init {
    my_replicator.request(...).merge(req);
    my_replicator.send();
    # where the request() function returns an HTTP
}

What happens if it doesn't get called?

My suggestion to move (and expose) this to VCL introduces a risk. Much like I can break a bereq or a resp today by messing with it in pure VCL code. I'm definitely leaning towards the "use at your own risk" side.

@hermunn
Copy link
Member

hermunn commented Oct 23, 2019

* What happens if it doesn't get called?

We have to choose if the default behavior is to remain in the core or be moved to the builtin VCL. In my humble opinion, if we introduce a .merge(), then we should not do any merging in the core, but do it in VCL.

On a return (reuse) we have the option of stealing the body and headers, RFC be damned.

... the documentation should probably point out that the HTTP standard has some requirements about merging headers of the object to be validated with headers from the validating response, so it's a "use at your own risk" option.

Yeah, this is true, and for many app writers, not well understood. Having the option to amend that in VCL is a natural request.

@dridi
Copy link
Member

dridi commented Oct 23, 2019

One use case we might also consider is not merging surrogate keys for example:

sub vcl_backend_revalidate {
    if (beresp.http.xkey) {
        # don't keep track of stale surrogate keys
        set beresp.http.new_xkey = beresp.http.xkey;
        beresp.merge(obj);
        set beresp.http.xkey = beresp.http.new_xkey;
        unset beresp.http.new_xkey;
        return (TBD); # my reuse vs replace transition didn't make sense
    }
}

@hermunn
Copy link
Member

hermunn commented Oct 23, 2019

        set beresp.http.new_xkey = beresp.http.xkey;

Well, a std.collect is in order here, and this illustrates a maybe sore point in all of this - Varnish does not work great with repeated headers.

@dridi
Copy link
Member

dridi commented Oct 23, 2019

set beresp.http.new_xkey = beresp.http.xkey.collect();

# or

beresp.http.new_xkey.clone(beresp.http.xkey);

Now that we have type properties and methods, we can make vmod_header redundant by enhancing the VCL_HEADER and VCL_HTTP types.

@nigoroll
Copy link
Member Author

bugwash: @dridi and myself agree that sub vcl_backend_refresh preferred

@nigoroll
Copy link
Member Author

nigoroll commented Feb 7, 2020

Discussed with @dridi and @bsdphk

  • We agree that a new sub vcl_backend_refresh is the best option
    • return values: ok/retry/abandon/fail
  • The 304 handling should move to vcl, by default calling merge_304;

I will prepare some mock up vcl illustrating how the relevant use cases will look like

Also:

@bsdphk bsdphk added a=NextVCL things which need a VCL bump r=7.0 labels Feb 24, 2020
@nigoroll
Copy link
Member Author

FYI, working on the promised VCL mockup proposal made me realize that not even the current master version of the cache rfc can possibly work:

For each stored response identified for update, the cache MUST use the header fields provided in the 304 (Not Modified) response to replace all instances of the corresponding header fields in the stored response.

As noticed here and in #3169, if origin servers wrongly update some headers (like Content-Length, Content-Encoding or maybe Content-Type), bad things might happen.

The corresponding discussion seems to take place in httpwg/http-core#165 and the latest sensible proposal seems to be httpwg/http-core#337

I will follow this proposal and maybe participate in the discussion if need to.

@nigoroll
Copy link
Member Author

TODO also: consider https://cache-tests.fyi/#conditional when writing the VIP

@dridi
Copy link
Member

dridi commented Sep 27, 2023

Bugwash suggestion:

# built-in
sub vcl_backend_refresh {
        return (merge);
}

In this subroutine we have access to beresp (read-write) and obj_stale (read-only) and the available transitions are:

  • merge (merge obj_stale with beresp)
  • replace (take beresp as-is)
  • fail
  • abandon
  • error?

@nigoroll
Copy link
Member Author

nigoroll commented Sep 27, 2023

VDD CONSENSUS:

1.Do not add new functionality unless an implementor cannot complete
a real application without it.

  • vcl_backend_refresh
  • beresp r/w access
  • obj_stale r/o access
  • return (merge, obj_stale, beresp, fail, abandon, error, retry);

merge: existing logic

  • change status to 200 if beresp.status == 304
  • header merge logic as is

obj_stale: undo the result of the sub, just copy everything from obj_stale

beresp: use beresp as it is (including status)

beresp.was_304 is going to stay true

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
a=feedback please a=NextVCL things which need a VCL bump r=7.0
Projects
None yet
Development

No branches or pull requests

7 participants