Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jetty-12 Review Handler/Processor design #8929

Closed
gregw opened this issue Nov 23, 2022 · 10 comments
Closed

Jetty-12 Review Handler/Processor design #8929

gregw opened this issue Nov 23, 2022 · 10 comments

Comments

@gregw
Copy link
Contributor

gregw commented Nov 23, 2022

Jetty version(s)
12

Background

This issue is to discuss some alternatives to #8793, which whilst initially inspired by reusable Processors, was ultimately a PR about cleaning up some strange aspects of the current jetty-12 API.

The Current Design

The current design of Handler and Request.Processor is designed to:

  • allow for a blocking phase of calling the Handler tree to determine which handler (if any) would handle the request. Instead of calling request.setHandled(true) as in previous jetty versions, acceptance of request handling is now indicated by returning a Request.Processor.
  • only enable the full request API (e.g. read and demand) only once the request has been accepted (by returning a Request.Processor)
  • only provide the response and callback once the request is accepted, via an async call to process(request, response, callback);

The Problem(s)

  • Any Handler that needs to wrap the request, must also wrap the Request.Processor only so it can capture and re-use the request (wrapped in handle()) in process(). The extra allocation can be avoided by having a WrapperProcessor that is-a Request.Wrapper that also implements Request.Processor, but this results in further complication and obscure code.
  • Because the request instance is passed to both Handler.handle(Request) and Request.Processor(Request, Response, Callback), there are often cases where there are two request instances and one must be ignored. We have already seen bugs where the wrong request has been used.
  • To avoid the allocation of a Request.Processor wrapper, some handlers use the WrapperProcessor the processors they return are one use only and cannot be reused for other requests (even though the signature looks like it can be). This results in two classes of Processor, those that must be able to be reused (like those used as error processors) and those that must not be reused (like some returned from handle()).
  • It is often unclear what handling of the request should be done in handle() vs process(). For example, GzipHandler does some handling in handle(), more in a processor wrapper and more in the wrapped request/response/callback.

Discussion

There are at least two styles of Handler: ones that wish to process the request; ones that wish to wrap the request that will be processed by another handler:

    public class ExampleProcessingHandler extends Handler.Abstract implements Request.Processor
    {
        @Override
        public Request.Processor handle(Request ignored) throws Exception
        {
            return this;
        }

        @Override
        public void process(Request request, Response response, Callback callback) throws Exception
        {
            // process the request, produce the response and complete the callback.
            callback.succeeded();
        }
    }

    public class ExampleWrappingHandler extends Handler.Wrapper
    {
        @Override
        public Request.Processor handle(Request request) throws Exception
        {
            Request.Wrapper wrapper = new Request.Wrapper(request);
            Request.Processor processor = super.handle(wrapper);
            if (processor == null)
                return null;
            // need to wrap/capture processor, just to remember the wrapper
            return (ignored, response, callback) ->
            {
                assert wrapper.getWrapped() == ignored;
                processor.process(wrapper, response, callback);
            };
        }
    }

Observations:

  • The ExampleProcessingHandler is simple enough, although it ignores the request passed to handle().
  • The ExampleWrappingHandler must wrap the Request.Processor even though it has no processing of its own to do. The processor is wrapped only so that the wrapped request instance can be remembered and substituted for the ignored request passed to process(). We could do the assert, but we never do (and that is a band-aid anyway).
  • We have replaced the simple setHandled(true) as a means of accepting the request with the return of a Request.Processor, which is a new entity that needs to be wrapped that is in addition to the Handler, Request, Response and Callback, which all can be wrapped. In an attempt to reduce allocations, we make either the Handler or the Request implement Processor, but each results in an extra request object that needs to be ignored.

Alternative 0
Keep the design, but improve convenience implementations to hide the duplicate request:

    public class ExampleProcessingHandler extends Handler.Processor
    {
        @Override
        public void process(Request request, Response response, Callback callback) throws Exception
        {
            // process the request, produce the response and complete the callback.
            callback.succeeded();
        }
    }

    public class ExampleWrappingHandler extends Handler.Wrapper
    {
        @Override
        public Request.Processor handle(Request request) throws Exception
        {
            Request.WrapperProcessor wrapper = new Request.WrapperProcessor(request)
            return wrapper.wrapProcessor(super.handle(wrapper));
        }
    }

This has already been done to some extent, yet it doesn't need an example to be much more complex than these simple ones before the duplicate re-appears. Perhaps more can be done.

Alternative 1

One way to avoid having an ignored request is to get rid of the second request object by simply not passing it in process:

    public class ExampleProcessingHandler extends Handler.Abstract implements Request.Processor
    {
        @Override
        public Request.Processor handle(Request request) throws Exception
        {
            // We must wrap/capture the request to use it in process
            return (response, callback) ->
            {
                // process the request, produce the response and complete the callback.
                callback.succeeded();
            };
        }
    }

    public class ExampleWrappingHandler extends Handler.Wrapper
    {
        @Override
        public Request.Processor handle(Request request) throws Exception
        {
            Request.Wrapper wrapper = new Request.Wrapper(request);
            Request.Processor processor = super.handle(request);
            if (processor == null)
                return null;
            // need to wrap/capture processor, just to remember the wrapper
            return (response, callback) ->
            {
                processor.process(wrapper, response, callback);
            };
        }
    }

This now avoid the ignored requests entirely but:

  • Now ExampleProcessingHandler must always allocate a processor just to capture the request. This is now an expensive way to avoid a boolean acceptance.
  • The ExampleWrappingHandler is more or less as before.
  • A Request.Processor can now no longer be used for error handling, as no request is provided. Either a new interface with the request is needed for error handling, or we need to go back to an ErrorHandler and call both handle and process
  • We have found elsewhere with methods that only take the response, that it is better to also pass the request as it is often needed. It will be strange that we pass request where it might sometimes be needed, but don't pass it where it is always needed, simply because there is another source of the request.

Alternative 2
Go back to a single method on Handler with a mechanism to explicitly accept the request. We remove the whole concept of Request.Processor and instead only have Handler

    public class ExampleProcessingHandler extends Handler.Abstract
    {
        @Override
        public void handle(Request request, Response response, Callback callback) throws Exception
        {
            // accept the request
            request.accept();
            // process the request, produce the response and complete the callback.
            callback.succeeded();
        }
    }

    public class ExampleWrappingHandler extends Handler.Wrapper
    {
        @Override
        public void handle(Request request, Response response, Callback callback) throws Exception
        {
            Request.Wrapper wrapper = new Request.Wrapper(request);
            super.handle(wrapper, response, callback);
        }
    }

Note that:

  • Aspects of the request API will be inactive until accepted (e.g. reading and demanding).
  • The response and callback can also be made inactive until the request is accepted.
  • acceptance cannot be indicated by a boolean return, else there would be a race between any async processing started and the returned true value being used to activate the request.
  • The only allocation is for the request wrapper because we want to wrap it. There are no allocations just to hold a value not provided by the API.
  • This is similar to our jetty<12 handlers, with accept() replacing setHandled(true)
  • The handler tree is navigated only once per request.
  • There are no Processors to be reused. Handlers by their very nature are reusable.
  • There is no question of where to put the handling (handle vs process) as there is only one.

Alternative N
You tell me!

Conclusion
We have introduced a whole new entity that needs to be instantiated/wrapped/defined/documented simply in order to avoid a call like request.accept() and to avoid exposing response and callback objects before they are active. I think that made more sense when we tried to have different request APIs for the two calls, but it no longer makes sense now they are the same (to allow only one wrapping). I think we need to think about this a bit more.

@lorban
Copy link
Contributor

lorban commented Nov 23, 2022

Throwing in some other API idea:

    public class ExampleProcessingHandler extends Handler.Abstract
    {
        @Override
        public void handle(Request request) throws Exception
        {
            // accept the request
            Response response = request.accept();
            // process the request, produce the response and complete the callback.
            response.getCallback().succeeded();
        }
    }

    public class ExampleWrappingHandler extends Handler.Wrapper
    {
        @Override
        public void handle(Request request) throws Exception
        {
            Request.Wrapper wrapper = new Request.Wrapper(request);
            super.handle(wrapper);
        }
    }

@sbordet
Copy link
Contributor

sbordet commented Nov 23, 2022

@lorban I think we tried the idea Response r = request.accept() at a certain point, but we discarded it.
Also, it is going to be a pain to wrap the callback (happens often) because you now need to wrap the response (happens rarely), which means that you have to wrap the request to override Request.accept().

@sbordet
Copy link
Contributor

sbordet commented Nov 23, 2022

Another problem.
A handler that wants to add response headers would need to store the additional headers into a HttpFields, then wrap the response to "merge" those new headers with existing ones, which means that Response.getHeaders() must be called before the request is accepted.

And so the response can now be "touched" before the request is accepted, but only certain methods like Response.getHeaders(), which is a model that has exceptions over exceptions to the rule "don't touch the response before the request is accepted".

As an example see SniSslConnectionFactoryTest.start(...).

The split of the calls between handle() and process() allows for wrapping at different times, rather than trying to follow a rule ("don't touch response"), but then having to make an exception ("oh, but you can call response.getHeaders()") to implement a use case.

@sbordet
Copy link
Contributor

sbordet commented Nov 23, 2022

Current model

Pros

  • More flexible API
    • Allows to do things between handle() and process()
  • API results in cheaper implementations
    • For example GzipHandler does not do all the work if no descendant Handler handles the request
  • Less error prone about using the Response and Callback -- they are only available in Processor
  • Clear non-racy model between handle() and process()
  • Request completion only depends on completing the Callback

Cons

  • When wrapping request, must wrap Processor too
    • Possible confusion about what request wrapper needs to be passed to the Processor
  • Possibly more allocating
    • For example, ResourceHandler needs to wrap the Processor only which is a capturing lambda.
  • Not clear if Processors can be reused

Back to Square 11 Model

Pros:

  • No Processor so no need to wrap it -- only requests needs to be wrapped
  • Possibly less allocating -- when there is no need to wrap the request
  • Handlers are by definition reusable

Cons

  • Response and Callback are untouchable until Request.accept() is called.
  • Either racy or implicitly synchronous "accept" when returning from handle(R,R,C)
    • For example, after returning from handle(RRC) we may need to know if the request has been accepted, so we need Request.isAccepted() which is either racy (if called asynchronously) or it must have an implicit contract that it must be called within the context of the handle(RRC) call.
  • Request.accept() and Request.isAccepted() can be wrapped and lie about whether the request is accepted
  • More expensive implementations
    • For example, GZipHandler must do all the work upfront, wrap RRC (all 3) but then no descendant Handler is handling the request, so all that work for nothing.
  • Now every Handler must remember that it has to call accept() and complete the Callback

@gregw
Copy link
Contributor Author

gregw commented Nov 23, 2022

Response and Callback are untouchable until Request.accept() is called

Sure it it better to hide these instances before they are used, but even with separate handle/process they can be misused.
We already have the state inside the HttpChannel to enforce they are not used if we want.

Note that in current jetty<12, we do have handlers that set response headers without producing a full response. Allowing later handlers to complete. I guess the only difference is that currently in 12 we can know if something later has accepted the request before we start messing with the response..... however, it only needs a DelayedHandler or similar to be installed and then we always accept and do the actual work in process.

  • More expensive implementations

    • For example, GZipHandler must do all the work upfront, wrap RRC (all 3) but then no descendant Handler is handling the request, so all that work for nothing.

Ah so because we currently only need to wrap the request, then look for a handler, we can avoid wrapping response and callback if there is no process (i.e. no accept).

For GzipHandler, that is not such a big problem, as it tends t o be deployed high up the tree and thus there will always be something accepting it (unless it is a 404).

But if it were a problem, that could speak to a solution that makes the response/callback available from a return from accept, so accept could then be wrapped to wrap them (not that I like wrapping to wrap other things).

@gregw
Copy link
Contributor Author

gregw commented Nov 24, 2022

@sbordet @lorban So thinking about how to implement handlers that want to modify response headers I can see a couple of options:

  • We can do as in jetty 9/10/11 and just let handlers modify response headers even if they don't know if subsequent handlers have accepted the request or not.
  • We can do a little be better than 9/10/11 and if not accepted those handlers could reverse their changes.
  • The handler could wrap the Request#accept method and only set the headers after calling super.accept. This means that the headers are only modified if the request is accepted and we would do so after any switch to make the headers mutable.
  • The handler could also wrap the response and give out a wrapped response headers if it wanted to pretend the header had already been set
  • The handler could wrap the response and callback so that the headers are modified as the request is committed. Unfortunately this requires both write and succeeded to be wrapped, thus two wrappers are needed (although nothing stopping us having combined wrappers of request and/or response and/or callback).

So whilst it is a bit of an issue, it is no worse than 9/10/11, but there are several things we can do to be better than 9/10/11

@gregw
Copy link
Contributor Author

gregw commented Nov 24, 2022

I think there is also an interesting use-case for wrapping accept in StatisticsHandler, as by taking a time stamp in a wrapped accept, we can differentiate between latency in navigating the Handler tree looking for something to accept the request and the latency actually processing and producing the response. This is analogous to the time in handle vs time in process.

So I'm not thinking that wrapping accept() is a CON, but rather a necessary feature.

@gregw
Copy link
Contributor Author

gregw commented Nov 26, 2022

@sbordet I'm experimenting with this in jetty-12.0.x...jetty-12-handler-as-processor
I've kept Request.Processor as an interface that is now implemented by Handler. This has meant a trivial adaption of all Handler.Processor extensions (which could actually be just Handler.Abstract if they did their own accepting). I've started converting the more complex handlers such as ContextHandler, StatisticsHandler, GzipHandler and so far I've not found anything particularly difficult, complex or expensive.

There is a lot more work to do to make this even testable... but I'm wondering what are the handlers you are most concerned with for this approach? I'll tackle those first.

@sbordet
Copy link
Contributor

sbordet commented Nov 28, 2022

@gregw we know that handlers can be implemented in both ways, so I'm not concerned about whether they can be implemented or not.

However, I think you should look into the cross-origin handler, delayed handlers, statistics handler, gzip handler, and the capability of re-handling.

The cross-origin handler (which is not implemented yet) is perhaps an important use case, since it needs to modify response headers upfront before calling the application, which may result in a 404.

Also, the ResourceHandler is going to be different between the 2 models, because it won't need to split the calls to ResourceService in getHttpContent() and doGet().

@gregw
Copy link
Contributor Author

gregw commented Nov 28, 2022

A big CON of the current model is in ContextHandler, as whenever a thread calls into the context we need to:

  • Set the context thread local
  • Get and save the current Thread context classloader
  • Set the the current Thread context classloader
  • Call any listeners wanting to know about context entry (often non null for establishing authentication)
  • Call the method in the context
  • Call any listeners wanting to know about context exit
  • reset the current Thread context classloader
  • reset the context ThreadLocal

Currently this has to be done on the handle call, then do it all again on the process call. Avoiding the double entry will save a bit of work.... but benchmarking will show if it is significant or not.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants