Streaming response wrapping for custom transport layers #3078

psymbio · 2024-02-01T18:10:53Z

psymbio
Feb 1, 2024

Not sure if this should be raised in urllib3 or this library, apologies in advance.

I wanted to understand how to wrap a streaming response back into httpx if we use urllib3 - particularly for the openai library. So, for the simple non-streaming response the code is as follows. But since urllib3 supports streaming, I would like to wrap the response back in httpx and return it.

Without streaming:

class URLLib3Transport(httpx.BaseTransport):
    def __init__(self):
        self.pool = urllib3.PoolManager()

    def handle_request(self, request: httpx.Request):
        payload = json.loads(request.content.decode("utf-8").replace("'",'"'))
        urllib3_response = self.pool.request(request.method, str(request.url), headers=request.headers, json=payload)
        content = json.loads(urllib3_response.data.decode('utf-8'))
        stream = httpx.ByteStream(json.dumps(content).encode("utf-8"))
        headers = [(b"content-type", b"application/json")]
        return httpx.Response(200, headers=headers, stream=stream)

With streaming (incomplete code):

class URLLib3Transport(httpx.BaseTransport):
    def __init__(self):
        pass

    def handle_request(self, request: httpx.Request):
        payload = json.loads(request.content.decode("utf-8").replace("'",'"'))
        urllib3_response = urllib3.request(request.method, str(request.url), headers=request.headers, json=payload, preload_content=False)
        for chunk in urllib3_response.stream():
            chunk_dict = json.loads(chunk.decode('utf-8'))
            # print(chunk_dict)
            print(chunk_dict['choices'][0]['message']['content'])

This results in the following output:

(Verse 1)
In a world full of chaos, where troubles never cease,
I'll sing a little melody, to bring you some peace.

(Verse 1)
In a world full of wonder, I'll sing you a song,
Where melodies dance and the words flow along.
With every
Sure! Here's a short song for you:

(Verse 1)
In the morning light, a brand new day,
I'll sing a song

Which, is not a truly a streamed response which would look like: (Verse 1)Oh, there once was a time when dreams were just out of sight, But now I realize they can come true with all coming word by word using this code - instead of printing Verse(1) three times like the above code.

Also, I wanted to understand how well this integrates with openai code (@RobertCraigie, @rattrayalex), i.e., the stream parameter in openai_client.chat.completions.with_raw_response.create() to able to switch in the custom transport layer based on this parameter's value. This would add value to the openai library capability when we move from a server to a client-based solution which can be supported on the web browser.

Link for the entire code - which fails for streaming.

tomchristie · 2024-02-01T21:51:48Z

tomchristie
Feb 1, 2024
Maintainer

Something like this...

class URLLib3Stream(httpx.SyncByteStream):
    def __init__(self, stream):
        self._stream = stream

    def __iter__(self):
        ...

    def close(self):
        ...


class URLLib3Transport(httpx.BaseTransport):
    def __init__(self):
        self._pool = urllib3.ConnectionPool()

    def handle_request(self, request: httpx.Request):
        urllib3_response = self._pool.request(
            method=...,
            url=...,
            headers=...,
            stream=...,
        )
        return Response(
            status=...,
            headers=...,
            stream=URLLib3Stream(urllib3_response.stream)
        )

    def close(self):
        self._pool.close()

0 replies

psymbio · 2024-02-08T05:44:08Z

psymbio
Feb 8, 2024
Author

So, my current code in the JupyterLite environment looks like:

import micropip
await micropip.install('https://raw.githubusercontent.com/psymbio/pyodide_wheels/main/multidict/multidict-4.7.6-py3-none-any.whl', keep_going=True)
await micropip.install('https://raw.githubusercontent.com/psymbio/pyodide_wheels/main/frozenlist/frozenlist-1.4.0-py3-none-any.whl', keep_going=True)
await micropip.install('https://raw.githubusercontent.com/psymbio/pyodide_wheels/main/aiohttp/aiohttp-3.9.1-py3-none-any.whl', keep_going=True)
await micropip.install('https://raw.githubusercontent.com/psymbio/pyodide_wheels/main/openai/openai-1.3.7-py3-none-any.whl', keep_going=True)
await micropip.install('https://raw.githubusercontent.com/psymbio/pyodide_wheels/main/urllib3/urllib3-2.1.0-py3-none-any.whl', keep_going=True)

await micropip.install("ssl")
import ssl
await micropip.install("httpx", keep_going=True)
import httpx
await micropip.install('https://raw.githubusercontent.com/psymbio/pyodide_wheels/main/urllib3/urllib3-2.1.0-py3-none-any.whl', keep_going=True)
import urllib3
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
import json

class URLLib3Stream(httpx.SyncByteStream):
    def __init__(self, stream):
        self._stream = stream

    def __iter__(self):
        with httpx._transports.default.map_httpcore_exceptions():
            for part in self._stream:
                yield part

    def close(self):
        if hasattr(self._stream, "close"):
            self._stream.close()


class URLLib3Transport(httpx.BaseTransport):
    def __init__(self):
        # self._pool = urllib3.PoolManager()
        self._pool = urllib3.connectionpool.ConnectionPool(host="localhost")

    def handle_request(self, request: httpx.Request):
        payload = json.loads(request.content.decode("utf-8").replace("'",'"'))
        urllib3_response = self._pool.request(
            method=request.method,
            url=str(request.url),
            headers=request.headers,
            stream=True,
        )
        headers = [(b"content-type", b"application/json")]
        return Response(
            status=200,
            headers=headers,
            stream=URLLib3Stream(urllib3_response.stream)
        )

client = httpx.Client(transport=URLLib3Transport())

from openai import OpenAI
from openai import AzureOpenAI
import openai
import os

os.environ['AZURE_OPENAI_API_KEY'] = "xxx"
openai_client = AzureOpenAI(
    api_version="2023-07-01-preview",
    azure_endpoint="https://xxx.openai.azure.com/",
    http_client=client
)

response = openai_client.chat.completions.with_raw_response.create(
    messages=[{
        "role": "user",
        "content": "sing me a song",
    }],
    model="gpt-35-turbo",
    max_tokens=30,
    temperature=0.7,
    # stream=True
)

completion = response.parse()
print(completion)

And get the following error: APIConnectionError: Connection error., complete error message.

And I've also tried another variation (because AttributeError: 'ConnectionPool' object has no attribute 'request'), but this didn't work out as well:

class URLLib3Stream(httpx.SyncByteStream):
    def __init__(self, stream):
        self._stream = stream

    def __iter__(self):
        with httpx._transports.default.map_httpcore_exceptions():
            for part in self._stream:
                yield part

    def close(self):
        if hasattr(self._stream, "close"):
            self._stream.close()

class URLLib3Transport(httpx.BaseTransport):
    def __init__(self):
        self._pool = urllib3.connectionpool.ConnectionPool(host="localhost")
        self._http = urllib3.PoolManager(self._pool)

    def handle_request(self, request: httpx.Request):
        payload = json.loads(request.content.decode("utf-8").replace("'",'"'))
        urllib3_request = self._http.request(
            method=request.method,
            url=str(request.url),
            headers=request.headers,
            body=payload,
            fields=None,
            stream=True,
        )
        headers = [(b"content-type", b"application/json")]
        return Response(
            status=200,
            headers=headers,
            stream=URLLib3Stream(urllib3_request.stream)
        )

client = httpx.Client(transport=URLLib3Transport())

This results in the error message: APIConnectionError: Connection error. and the complete error message.

0 replies

psymbio · 2024-02-21T08:56:00Z

psymbio
Feb 21, 2024
Author

Hi @tomchristie,

I rectified some more of the above code but have noticed some issues in the way that streaming works:

Particularly this part of the code:

class URLLib3Stream(httpx.SyncByteStream):
    def __init__(self, stream):
        self._stream = stream

    def __iter__(self):
        print(self._stream.__self__.data)
        with httpx._transports.default.map_httpcore_exceptions():
            for part in self._stream:
                yield part

    def close(self):
        if hasattr(self._stream, "close"):
            self._stream.close()

printing out the data in the before yielding the output give me:

b'{"id":"chatcmpl-8ucZ1UsLPxKy6SMYpE1JkJ4SsaSwT","object":"chat.completion","created":1708505299,"model":"gpt-35-turbo","prompt_filter_results":[{"prompt_index":0,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}}}],"choices":[{"finish_reason":"stop","index":0,"message":{"role":"assistant","content":"I can\'t sing, but I can certainly help you find a song to listen to! What kind of music are you in the mood for?"},"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}}}],"usage":{"prompt_tokens":11,"completion_tokens":29,"total_tokens":40},"system_fingerprint":"fp_68a7d165bf"}\n'
b'{"id":"chatcmpl-8ucZ2tRwvhcyTgINUlrsZScHuWIXu","object":"chat.completion","created":1708505300,"model":"gpt-35-turbo","prompt_filter_results":[{"prompt_index":0,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}}}],"choices":[{"finish_reason":"length","index":0,"message":{"role":"assistant","content":"I\'m just a digital assistant, so I can\'t sing, but I can certainly help you find the lyrics to a song or recommend some good tunes"},"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}}}],"usage":{"prompt_tokens":11,"completion_tokens":30,"total_tokens":41},"system_fingerprint":"fp_68a7d165bf"}\n'
b'{"id":"chatcmpl-8ucZ5to0cm2ZVFGtIDK9RcvuN3iRL","object":"chat.completion","created":1708505303,"model":"gpt-35-turbo","prompt_filter_results":[{"prompt_index":0,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}}}],"choices":[{"finish_reason":"length","index":0,"message":{"role":"assistant","content":"I\'m just a virtual assistant, so I can\'t sing, but I can help you find the lyrics to a song or recommend some music for you"},"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}}}],"usage":{"prompt_tokens":11,"completion_tokens":30,"total_tokens":41},"system_fingerprint":"fp_68a7d165bf"}\n'

Which is three different responses for a single request.

Complete code can be found here.

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Streaming response wrapping for custom transport layers #3078

{{title}}

Replies: 3 comments

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

Select a reply

Streaming response wrapping for custom transport layers #3078

psymbio Feb 1, 2024

Replies: 3 comments

tomchristie Feb 1, 2024 Maintainer

psymbio Feb 8, 2024 Author

psymbio Feb 21, 2024 Author

psymbio
Feb 1, 2024

tomchristie
Feb 1, 2024
Maintainer

psymbio
Feb 8, 2024
Author

psymbio
Feb 21, 2024
Author