Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[4.x] WebClient should have a mode that is resilient to bad media/content types #6774

Closed
trentjeff opened this issue May 4, 2023 · 9 comments · Fixed by #6999
Closed

[4.x] WebClient should have a mode that is resilient to bad media/content types #6774

trentjeff opened this issue May 4, 2023 · 9 comments · Fixed by #6999
Assignees
Labels
4.x Version 4.x bug Something isn't working P2

Comments

@trentjeff
Copy link
Member

trentjeff commented May 4, 2023

Here is a snippet of an HTTP response from a server:

> User-Agent: curl/7.79.1
> Accept: */*
> 
* Mark bundle as not supporting multiuse
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Content-Type: text
< Content-Length: 280
< 

Some parts were removed for privacy issues

As you will notice the server is returning an invalid content type of text - see https://www.iana.org/assignments/media-types/media-types.xhtml#text for a list of all valid media/content types.

Our WebClient should support a mode (e.g., relaxed mode vs the current mode which I'll call strict) that will log a warning about these invalid usages, but should not throw an exception. Even nicer would be to translate text -> text/plain for this specific case in relaxed mode.

I was able to reproduce this exception behavior in the following unit test of MediaTypeTest on main:

    @Test
    void parseEmptySubtypeParameterValue() {
        assertThat(HttpMediaType.create("text/html").parameters(), is(Map.of()));
        // vvv this one will fail
        assertThat(HttpMediaType.create("text").parameters(), is(Map.of()));
        assertThat(HttpMediaType.create("text; o1=; o2=v2").parameters(), is(Map.of("o2", "v2")));
    }

If we are in a relaxed mode, the WebClient should not throw an exception - we should just log a warning. Doing so will make Helidon more resilient for users to consume since they currently have no other workaround other than to switch to HttpClient, etc.

Here is the full exception reported by the user from 3.1.0:

java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: Could not parse 'text'
        at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396)
        at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073)
        at io.helidon.common.reactive.Single.get(Single.java:730)
        ... a bunch more
Caused by: java.lang.IllegalArgumentException: Could not parse 'text'
        at io.helidon.common.http.MediaType.parse(MediaType.java:360)
        at java.base/java.util.Optional.map(Optional.java:260)
        at io.helidon.webclient.WebClientResponseHeadersImpl.contentType(WebClientResponseHeadersImpl.java:79)
        ... a bunch more
Caused by: java.lang.IllegalStateException: No more elements!
        at io.helidon.common.http.Tokenizer.checkState(Tokenizer.java:173)
        at io.helidon.common.http.Tokenizer.checkState(Tokenizer.java:158)
        at io.helidon.common.http.Tokenizer.consumeCharacter(Tokenizer.java:108)
        at io.helidon.common.http.MediaType.parse(MediaType.java:326)
        ... 59 more
@trentjeff trentjeff added 3.x Issues for 3.x version branch 4.x Version 4.x labels May 4, 2023
@trentjeff
Copy link
Member Author

Related to #3956

@m0mus m0mus added bug Something isn't working P2 labels May 11, 2023
@Tomas-Kraus
Copy link
Member

I'll have a look a this one.

@Tomas-Kraus
Copy link
Member

Tomas-Kraus commented Jun 12, 2023

@tomas-langer @trentjeff Just small design note.
Jeff mentioned mode keyword with 2 values: strict and relaxed. I'll start with adding this into ClientConfig, marking strict as default. And when media-type won't be defined or will be of unknown type and client mode is configured as relaxed, it will just use text/plain as default and write something ugly into logs.

@romain-grecourt
Copy link
Contributor

The default setup of the WebClient must work out of the box.
If it works with curl or HttpURLConnection, it should work with WebClient.

@Tomas-Kraus
Copy link
Member

Tomas-Kraus commented Jun 13, 2023

@romain-grecourt You are right.
I added following test:

package io.helidon.nima.webclient.http1;

import io.helidon.common.http.Headers;
import io.helidon.common.http.Http;
import io.helidon.nima.webclient.ClientResponse;
import io.helidon.nima.webclient.HttpClient;
import io.helidon.nima.webclient.WebClient;
import io.helidon.nima.webserver.WebServer;
import io.helidon.nima.webserver.http.HttpRules;
import io.helidon.nima.webserver.http.HttpService;
import io.helidon.nima.webserver.http.ServerRequest;
import io.helidon.nima.webserver.http.ServerResponse;

import org.junit.jupiter.api.AfterAll;
import org.junit.jupiter.api.BeforeAll;
import org.junit.jupiter.api.Test;

import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.is;

class HeadersTest {

    private static WebServer server;

    @BeforeAll
    static void beforeAll() {
        server = WebServer.builder()
                .routing(routing -> routing.register("/test", new TestService()).build())
                .start();
    }

    @AfterAll
    static void afterAll() {
        server.stop();
    }

    @Test
    public void testInvalidContentType() {
        HttpClient client = WebClient.builder()
                .baseUri("http://localhost:" + server.port() + "/test")
                .build();
        ClientResponse res = client.method(Http.Method.GET)
                .path("/invalidContentType")
                .request();
        Headers h = res.headers();
        Http.HeaderValue contentType = h.get(Http.Header.CONTENT_TYPE);
        assertThat(res.status(), is(Http.Status.OK_200));
        assertThat(contentType.value(), is("invalid header name"));
    }

    static final class TestService implements HttpService {

        TestService() {
        }

        @Override
        public void routing(HttpRules rules) {
            rules.get("/invalidContentType", this::invalidContentType);
        }

        private void invalidContentType(ServerRequest request, ServerResponse response) {
            response.header(Http.Header.CONTENT_TYPE, "invalid header name")
                    .send();
        }

    }


}

And it passes without any modifications of the Nima WebClient.
And it returns the same Content-Type header String that I set on the server side.

@trentjeff So now I'm not sure what this relaxed mode shall exactly do. Client does not throw anything and header String contains value from server. Do we really need to rewrite this header value to text/plain? I don't think it's a good idea.

@Tomas-Kraus
Copy link
Member

@trentjeff Now I got it from Tomas L. message. It's just about the simple text -> text/plain rule. I was thinking too much abstract.

@trentjeff
Copy link
Member Author

Sounds good to me.

Tomas-Kraus added a commit to Tomas-Kraus/helidon that referenced this issue Jun 14, 2023
…t to bad media/content types

Signed-off-by: Tomáš Kraus <tomas.kraus@oracle.com>
Tomas-Kraus added a commit to Tomas-Kraus/helidon that referenced this issue Jun 14, 2023
…t to bad media/content types

Signed-off-by: Tomáš Kraus <tomas.kraus@oracle.com>
Tomas-Kraus added a commit to Tomas-Kraus/helidon that referenced this issue Jun 15, 2023
…t to bad media/content types

Signed-off-by: Tomáš Kraus <tomas.kraus@oracle.com>
Tomas-Kraus added a commit that referenced this issue Jun 16, 2023
…edia/content types

Signed-off-by: Tomáš Kraus <tomas.kraus@oracle.com>
@romain-grecourt
Copy link
Contributor

Seems like we might also want the enhancement for the server, see #7053

@romain-grecourt
Copy link
Contributor

And, we might want to consider backporting this to 3.x and 2.x

@romain-grecourt romain-grecourt removed the 3.x Issues for 3.x version branch label Jun 26, 2023
@romain-grecourt romain-grecourt changed the title WebClient should have a mode that is resilient to bad media/content types [4.x] WebClient should have a mode that is resilient to bad media/content types Jun 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
4.x Version 4.x bug Something isn't working P2
Projects
Archived in project
5 participants