
buffer for chunked streaming breaks queries thru chproxy #417

Closed
bobelev opened this issue Nov 1, 2024 · 1 comment · Fixed by #421
Labels
bug Something isn't working

Comments

@bobelev

bobelev commented Nov 1, 2024

0.8.0 introduced a buffer for streaming/chunked HTTP responses. This feature breaks integration with some types of ClickHouse proxies. For example, the popular chproxy is affected.

The reason is simple: chproxy returns a non-chunked response with a valid Content-Length and no Transfer-Encoding header.

urllib3.exceptions.ResponseNotChunked: Response is not chunked. Header 'transfer-encoding: chunked' is missing.

There's no way to work around this buffer.

I understand that this lib is for ClickHouse and chproxy might be out of scope, but there's some hope :)
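
For context, here's a minimal way to see the header difference with urllib3. Assumptions on my side: ClickHouse itself on 127.0.0.1:8123, chproxy on 127.0.0.1:9090, the default user with no password, and ClickHouse streaming its HTTP responses as chunked (which it typically does):

import urllib3

http = urllib3.PoolManager()

# Directly against ClickHouse: the result is typically streamed with
# Transfer-Encoding: chunked and no Content-Length.
direct = http.request('GET', 'http://127.0.0.1:8123/?query=SELECT+1')
print(direct.headers.get('Transfer-Encoding'))   # 'chunked'
print(direct.headers.get('Content-Length'))      # None

# Through chproxy: the proxy buffers the body and returns it with a valid
# Content-Length and no Transfer-Encoding header, so read_chunked() raises.
proxied = http.request('GET', 'http://127.0.0.1:9090/?query=SELECT+1')
print(proxied.headers.get('Transfer-Encoding'))  # None
print(proxied.headers.get('Content-Length'))     # e.g. '2'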

Steps to reproduce

  1. use clickhouse-connect >= 0.8.0
  2. use chproxy
  3. get exception

Expected behaviour

Code example

from clickhouse_connect.driver import create_client

def test_chproxy_connection():
    test_client = create_client(host="127.0.0.1", port=9090, username='default', password="")
    query = f"SELECT count() from system.one"
    res = test_client.query(query)
    assert res.result_rows[0][0] == 1

clickhouse-connect logs

../../clickhouse_connect/driver/client.py:218: in query
    return self._query_with_context(query_context)
../../clickhouse_connect/driver/httpclient.py:237: in _query_with_context
    query_result = self._transform.parse_response(byte_source, context)
../../clickhouse_connect/driver/transform.py:68: in parse_response
    first_block = get_block()
../../clickhouse_connect/driver/transform.py:33: in get_block
    num_cols = source.read_leb128()
../../clickhouse_connect/driver/buffer.py:65: in read_leb128
    b = self.read_byte()
../../clickhouse_connect/driver/buffer.py:51: in read_byte
    chunk = next(self.gen, None)
../../clickhouse_connect/driver/httputil.py:232: in buffered
    chunk = next(read_gen, None) # Always try to read at least one chunk if there are any left
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <urllib3.response.HTTPResponse object at 0x12840c970>, amt = 1048576
decode_content = False

    def read_chunked(
        self, amt: int | None = None, decode_content: bool | None = None
    ) -> typing.Generator[bytes, None, None]:
        self._init_decoder()
        # FIXME: Rewrite this method and make it a class with a better structured logic.
        if not self.chunked:
>           raise ResponseNotChunked(
                "Response is not chunked. "
                "Header 'transfer-encoding: chunked' is missing."
            )
E           urllib3.exceptions.ResponseNotChunked: Response is not chunked. Header 'transfer-encoding: chunked' is missing.
bobelev added the bug label on Nov 1, 2024
@genzgd
Collaborator

genzgd commented Nov 1, 2024

Thanks for the detailed report. I suspect that chproxy "unchunks" the responses as part of its caching approach and the library should definitely support that. My apologies for missing that use case. We should have a fix in the next day or two.
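
In case it helps, a minimal sketch of one possible fallback on the library side (just an illustration of the idea, not necessarily what the actual fix will do), assuming the buffering code has the urllib3 HTTPResponse at hand:

def response_chunks(response, chunk_size=1024 * 1024):
    # Hypothetical helper: choose the read strategy based on how the
    # upstream response was actually delivered.
    if response.chunked:
        # Transfer-Encoding: chunked -- read chunk by chunk, as 0.8.0 does now.
        yield from response.read_chunked(chunk_size, decode_content=False)
    else:
        # Plain Content-Length body (e.g. behind chproxy) -- stream() reads it
        # in chunk_size pieces without requiring chunked transfer encoding.
        yield from response.stream(chunk_size, decode_content=False)

(urllib3's stream() already dispatches to read_chunked() when the response is chunked, so this could also collapse to a single stream() call.)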
