
buffer for chunked streaming breaks queries thru chproxy #417

Closed
bobelev opened this issue Nov 1, 2024 · 1 comment · Fixed by #421
Labels
bug Something isn't working

Comments

@bobelev

bobelev commented Nov 1, 2024

0.8.0 introduced a buffer for streaming/chunked HTTP responses. This feature breaks integration with some types of ClickHouse proxies. For example, the popular chproxy is affected.

The reason is simple: chproxy returns a non-chunked response with a valid Content-Length and no Transfer-Encoding header.

urllib3.exceptions.ResponseNotChunked: Response is not chunked. Header 'transfer-encoding: chunked' is missing.

There's no way to work around this buffer.

I understand that this lib is for ClickHouse and chproxy might be out of scope, but there's some hope :)
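
For context, here's a minimal way to see the header difference with urllib3. Assumptions on my side: ClickHouse itself on 127.0.0.1:8123, chproxy on 127.0.0.1:9090, the default user with no password, and ClickHouse streaming its HTTP responses as chunked (which it typically does):

import urllib3

http = urllib3.PoolManager()

# Directly against ClickHouse: the result is typically streamed with
# Transfer-Encoding: chunked and no Content-Length.
direct = http.request('GET', 'http://127.0.0.1:8123/?query=SELECT+1')
print(direct.headers.get('Transfer-Encoding'))   # 'chunked'
print(direct.headers.get('Content-Length'))      # None

# Through chproxy: the proxy buffers the body and returns it with a valid
# Content-Length and no Transfer-Encoding header, so read_chunked() raises.
proxied = http.request('GET', 'http://127.0.0.1:9090/?query=SELECT+1')
print(proxied.headers.get('Transfer-Encoding'))  # None
print(proxied.headers.get('Content-Length'))     # e.g. '2'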

Steps to reproduce

  1. use clickhouse-connect >= 0.8.0
  2. use chproxy
  3. get exception

Expected behaviour

Code example

from clickhouse_connect.driver import create_client

def test_chproxy_connection():
    test_client = create_client(host="127.0.0.1", port=9090, username='default', password="")
    query = f"SELECT count() from system.one"
    res = test_client.query(query)
    assert res.result_rows[0][0] == 1

clickhouse-connect logs

../../clickhouse_connect/driver/client.py:218: in query
    return self._query_with_context(query_context)
../../clickhouse_connect/driver/httpclient.py:237: in _query_with_context
    query_result = self._transform.parse_response(byte_source, context)
../../clickhouse_connect/driver/transform.py:68: in parse_response
    first_block = get_block()
../../clickhouse_connect/driver/transform.py:33: in get_block
    num_cols = source.read_leb128()
../../clickhouse_connect/driver/buffer.py:65: in read_leb128
    b = self.read_byte()
../../clickhouse_connect/driver/buffer.py:51: in read_byte
    chunk = next(self.gen, None)
../../clickhouse_connect/driver/httputil.py:232: in buffered
    chunk = next(read_gen, None) # Always try to read at least one chunk if there are any left
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <urllib3.response.HTTPResponse object at 0x12840c970>, amt = 1048576
decode_content = False

    def read_chunked(
        self, amt: int | None = None, decode_content: bool | None = None
    ) -> typing.Generator[bytes, None, None]:
        self._init_decoder()
        # FIXME: Rewrite this method and make it a class with a better structured logic.
        if not self.chunked:
>           raise ResponseNotChunked(
                "Response is not chunked. "
                "Header 'transfer-encoding: chunked' is missing."
            )
E           urllib3.exceptions.ResponseNotChunked: Response is not chunked. Header 'transfer-encoding: chunked' is missing.
bobelev added the bug label on Nov 1, 2024
@genzgd
Collaborator

genzgd commented Nov 1, 2024

Thanks for the detailed report. I suspect that chproxy "unchunks" the responses as part of its caching approach and the library should definitely support that. My apologies for missing that use case. We should have a fix in the next day or two.
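
In case it helps, a minimal sketch of one possible fallback on the library side (just an illustration of the idea, not necessarily what the actual fix will do), assuming the buffering code has the urllib3 HTTPResponse at hand:

def response_chunks(response, chunk_size=1024 * 1024):
    # Hypothetical helper: choose the read strategy based on how the
    # upstream response was actually delivered.
    if response.chunked:
        # Transfer-Encoding: chunked -- read chunk by chunk, as 0.8.0 does now.
        yield from response.read_chunked(chunk_size, decode_content=False)
    else:
        # Plain Content-Length body (e.g. behind chproxy) -- stream() reads it
        # in chunk_size pieces without requiring chunked transfer encoding.
        yield from response.stream(chunk_size, decode_content=False)

(urllib3's stream() already dispatches to read_chunked() when the response is chunked, so this could also collapse to a single stream() call.)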
