Rewrite AsyncHttp\Client for cleaner API and Transfer-Encoding support #113

Merged: 25 commits, Jul 15, 2024

@adamziel (Collaborator) commented Jul 14, 2024

Refactors the AsyncHttp\Client to simplify the usage and the internal implementation. This will be helpful for rewriting URLs in WordPress posts and downloading the related assets.

As a reminder, AsyncHttp\Client is a PHP HTTP client that can do asynchronous processing of multiple requests without curl or any other dependencies.

Changes

  • Handle errors at each step of the HTTP request lifecycle.
  • Drop support for PHP 7.0 and 7.1 since WordPress is dropping that support, too.
  • Provide await_next_event() as a single, filterable interface for consuming all the HTTP activity. Remove the onProgress callback and various other ways of waiting for information on specific requests.
  • Introduce an internal event_loop_tick() function that runs all the available non-blocking operations.
  • Move all the logic from functions into the Client class. It is now less generic, but I'd argue it already wasn't that generic, and at least now we can avoid going back and forth between functions and that class.
  • Support Transfer-Encoding: chunked, Transfer-Encoding: gzip, and Content-Encoding: gzip via stream wrappers.
  • Remove most of the complexity associated with making PHP streams central to how the library works. In this version, the focus is on the Client object, so we no longer have to go out of our way to store data in the stream context, struggle with stream filters, pass data through layers of stream wrappers, etc.
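
To illustrate what Transfer-Encoding: chunked support involves, here is a minimal, self-contained sketch of decoding a complete chunked body. It is not the implementation from this PR (which decodes incrementally via stream wrappers); `decode_chunked_body()` is a hypothetical helper that assumes the whole raw body is already in memory and ignores trailers.

```php
<?php
/**
 * Decodes a complete `Transfer-Encoding: chunked` body.
 * Hypothetical helper for illustration; not part of AsyncHttp\Client.
 */
function decode_chunked_body( string $raw ): string {
	$decoded = '';
	$offset  = 0;
	$length  = strlen( $raw );

	while ( $offset < $length ) {
		// Each chunk starts with a hexadecimal size line ending in CRLF.
		$line_end = strpos( $raw, "\r\n", $offset );
		if ( false === $line_end ) {
			throw new InvalidArgumentException( 'Missing chunk size line.' );
		}
		$size_line = substr( $raw, $offset, $line_end - $offset );
		// Ignore optional chunk extensions that follow a ";".
		$size_hex = trim( explode( ';', $size_line, 2 )[0] );
		if ( '' === $size_hex || ! ctype_xdigit( $size_hex ) ) {
			throw new InvalidArgumentException( 'Invalid chunk size.' );
		}
		$size   = (int) hexdec( $size_hex );
		$offset = $line_end + 2;

		// A zero-sized chunk terminates the body; trailers are ignored here.
		if ( 0 === $size ) {
			return $decoded;
		}
		if ( $offset + $size + 2 > $length ) {
			throw new InvalidArgumentException( 'Truncated chunk.' );
		}
		$decoded .= substr( $raw, $offset, $size );
		// Skip the chunk data and the CRLF that follows it.
		$offset += $size + 2;
	}
	throw new InvalidArgumentException( 'Missing terminating chunk.' );
}

echo decode_chunked_body( "4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n" ), "\n";
```

A streaming decoder would keep the same state (bytes remaining in the current chunk) across reads instead of requiring the full body up front.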

This PR also ships an implementation of an HTTP proxy built with this client library. It could come in handy for running an in-browser Git client:

https://github.com/WordPress/blueprints-library/blob/http-client-api-refactir/http_proxy.php

Usage example

$requests = [
	new Request( "https://wordpress.org/latest.zip" ),
	new Request( "https://raw.githubusercontent.com/wpaccessibility/a11y-theme-unit-test/master/a11y-theme-unit-test-data.xml" ),
];

$client = new Client( [
	'concurrency' => 10,
] );
$client->enqueue( $requests );

while ( $client->await_next_event() ) {
	$request = $client->get_request();
	echo "Request " . $request->id . ": " . $client->get_event() . " ";
	switch ( $client->get_event() ) {
		case Client::EVENT_BODY_CHUNK_AVAILABLE:
			echo $request->response->received_bytes . "/". $request->response->total_bytes ." bytes received";
			file_put_contents( 'downloads/' . $request->id, $client->get_response_body_chunk(), FILE_APPEND);
			break;
		case Client::EVENT_REDIRECT:
		case Client::EVENT_GOT_HEADERS:
		case Client::EVENT_FINISHED:
			break;
		case Client::EVENT_FAILED:
			echo "– ❌ Failed request to " . $request->url . ": " . $request->error;
			break;
	}
	echo "\n";
}

HTTP Proxy example

// Encode the current request details in a Request object
$requests = [
	new Request(
		$target_url,
		[
			'method' => $_SERVER['REQUEST_METHOD'],
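			// array_merge() is used instead of the spread operator because
			// unpacking string-keyed arrays requires PHP 8.1+.
			'headers' => array_merge(
				getallheaders(),
				[
					// Ensure we won't receive an unsupported content encoding
					// just because the client browser supports it.
					'Accept-Encoding' => 'gzip, deflate',
					'Host' => parse_url($target_url, PHP_URL_HOST),
				]
			),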
			// Naively assume only POST requests have body
			'body_stream' => $_SERVER['REQUEST_METHOD'] === 'POST' ? fopen('php://input', 'r') : null,
		]
	),
];

$client = new Client();
$client->enqueue( $requests );

$headers_sent = false;
while ( $client->await_next_event() ) {
	// Pass the response headers and body to the client.
	// Consult the previous example for the details.
}
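
When passing response headers through, a proxy generally should not forward connection-level (hop-by-hop) headers verbatim. The sketch below shows one way to filter them. `filter_forwardable_headers()` is a hypothetical helper, not part of this PR, and it additionally assumes the body was already de-chunked and decompressed by the client, which would make the original Content-Encoding and Content-Length values stale.

```php
<?php
/**
 * Removes headers a proxy should not forward verbatim.
 * Hypothetical helper for illustration; not part of AsyncHttp\Client.
 */
function filter_forwardable_headers( array $headers ): array {
	$blocked = [
		// Commonly cited hop-by-hop headers.
		'connection',
		'keep-alive',
		'proxy-authenticate',
		'proxy-authorization',
		'te',
		'trailer',
		'transfer-encoding',
		'upgrade',
		// Assumption: the body was already decoded, so these no longer apply.
		'content-encoding',
		'content-length',
	];

	$result = [];
	foreach ( $headers as $name => $value ) {
		// Header names are case-insensitive, so compare in lowercase.
		if ( ! in_array( strtolower( (string) $name ), $blocked, true ) ) {
			$result[ $name ] = $value;
		}
	}
	return $result;
}

$forwardable = filter_forwardable_headers( [
	'Content-Type'      => 'text/html',
	'Transfer-Encoding' => 'chunked',
	'Content-Encoding'  => 'gzip',
] );
foreach ( $forwardable as $name => $value ) {
	echo $name . ': ' . $value . "\n";
}
```

In a real proxy script, each remaining header would be sent with `header()` before the first body chunk is echoed.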

Future work

  • Unit tests.
  • Abundant inline documentation with examples and explanation of technical decisions.
  • Standard way of piping HTTP responses into ZIP processor, XML processor, HTML tag processor etc.
  • Find a useful way of treating HTTP error codes such as 404 or 501. Currently these requests are marked as "finished", not "failed", because the connection was successfully created and the server replied with a valid HTTP response. Perhaps that is fine as is: this could be a lower-level library, and treating error statuses as failures could belong to a higher-level client.
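
As a rough sketch of how a higher-level client could layer error handling on top of "finished" requests, the function below classifies a status code by its range. `classify_http_status()` is a hypothetical helper; how the status code is read from a response object is deliberately left out, since that API is not shown in this description.

```php
<?php
/**
 * Maps an HTTP status code to a coarse category.
 * Hypothetical helper for illustration; not part of AsyncHttp\Client.
 */
function classify_http_status( int $status_code ): string {
	if ( $status_code >= 200 && $status_code < 300 ) {
		return 'success';
	}
	if ( $status_code >= 300 && $status_code < 400 ) {
		return 'redirect';
	}
	if ( $status_code >= 400 && $status_code < 500 ) {
		return 'client_error';
	}
	if ( $status_code >= 500 && $status_code < 600 ) {
		return 'server_error';
	}
	// Informational (1xx) and out-of-range codes.
	return 'other';
}

foreach ( [ 200, 404, 501 ] as $code ) {
	echo $code . ' => ' . classify_http_status( $code ) . "\n";
}
```

A wrapper could then treat 'client_error' and 'server_error' as failures while the low-level client keeps reporting them as finished.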

cc @dmsnell @MayPaw @reimic

@adamziel force-pushed the http-client-api-refactir branch from ef41a68 to 2cda531 on July 14, 2024 14:39
@adamziel changed the title from "Refactor AsyncHttp\Client API" to "Rewrite AsyncHttp\Client for cleaner API and Transfer-Encoding support" on Jul 15, 2024
@adamziel changed the base branch from support-transfer-encoding to trunk on July 15, 2024 08:33
@adamziel marked this pull request as ready for review on July 15, 2024 08:34
@adamziel self-assigned this on Jul 15, 2024
@adamziel merged commit 9a26c5e into trunk on Jul 15, 2024
21 checks passed
@adamziel deleted the http-client-api-refactir branch on July 15, 2024 12:53
adamziel added a commit to adamziel/site-transfer-protocol that referenced this pull request Jul 15, 2024
The code got much simpler plus we can easily expand to stream-rewriting remote
pages or even zip archives (with the ZIP Processor).

See WordPress/blueprints-library#113