Skip to content

A httplib2 fork that returns streams, not strings, useful for very large GET body.

Notifications You must be signed in to change notification settings

madlag/streaming_httplib2

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

26 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Httplib2

--------------------------------------------------------------------
This is a modified version of the original httplib2 library
to support streaming of large http responses, instead of loading them 
into memory as in the original library.
See CHANGELOG for more information.

The package can be installed throught pip:

 pip install streaming_httplib2

A distributed cache class is available in dcache.py
You can use it to create multi-process or even multi-machines 
web cache (using distributed file systems like GlusterFS).

--------------------------------------------------------------------
Introduction

A comprehensive HTTP client library, httplib2.py supports many 
features left out of other HTTP libraries.

HTTP and HTTPS
    HTTPS support is only available if the socket module was 
    compiled with SSL support. 
Keep-Alive
    Supports HTTP 1.1 Keep-Alive, keeping the socket open and 
    performing multiple requests over the same connection if 
    possible. 
Authentication
    The following three types of HTTP Authentication are 
    supported. These can be used over both HTTP and HTTPS.

        * Digest
        * Basic
        * WSSE

Caching
    The module can optionally operate with a private cache that 
    understands the Cache-Control: header and uses both the ETag 
    and Last-Modified cache validators. 
All Methods
    The module can handle any HTTP request method, not just GET 
    and POST.
Redirects
    Automatically follows 3XX redirects on GETs.
Compression
    Handles both 'deflate' and 'gzip' types of compression.
Lost update support
    Automatically adds back ETags into PUT requests to resources 
    we have already cached. This implements Section 3.2 of 
    Detecting the Lost Update Problem Using Unreserved Checkout.
Unit Tested
    A large and growing set of unit tests. 


For more information on this module, see:

    http://bitworking.org/projects/httplib2/


--------------------------------------------------------------------
Installation

The httplib2 module is shipped as a distutils package.  To install
the library, unpack the distribution archive, and issue the following
command:

    $ python setup.py install


--------------------------------------------------------------------
Usage
A simple retrieval:

  import httplib2
  h = httplib2.Http(".cache")
  (resp_headers, content) = h.request("http://example.org/", "GET")

The 'content' is the content retrieved from the URL. The content 
is already decompressed or unzipped if necessary.

To PUT some content to a server that uses SSL and Basic authentication:

  import httplib2
  h = httplib2.Http(".cache")
  h.add_credentials('name', 'password')
  (resp, content) = h.request("https://example.org/chapter/2", 
                            "PUT", body="This is text", 
                            headers={'content-type':'text/plain'} )

Use the Cache-Control: header to control how the caching operates.

  import httplib2
  h = httplib2.Http(".cache")
  (resp, content) = h.request("http://bitworking.org/", "GET")
  ...
  (resp, content) = h.request("http://bitworking.org/", "GET", 
                            headers={'cache-control':'no-cache'})

The first request will be cached and since this is a request 
to bitworking.org it will be set to be cached for two hours, 
because that is how I have my server configured. Any subsequent 
GET to that URI will return the value from the on-disk cache 
and no request will be made to the server. You can use the 
Cache-Control: header to change the caches behavior and in 
this example the second request adds the Cache-Control: 
header with a value of 'no-cache' which tells the library 
that the cached copy must not be used when handling this request. 


--------------------------------------------------------------------
Httplib2 Software License

Copyright (c) 2006 by Joe Gregorio

Permission is hereby granted, free of charge, to any person 
obtaining a copy of this software and associated documentation 
files (the "Software"), to deal in the Software without restriction, 
including without limitation the rights to use, copy, modify, merge, 
publish, distribute, sublicense, and/or sell copies of the Software, 
and to permit persons to whom the Software is furnished to do so, 
subject to the following conditions:

The above copyright notice and this permission notice shall be 
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES 
OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS 
BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN 
ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 
SOFTWARE.

About

A httplib2 fork that returns streams, not strings, useful for very large GET body.

Resources

Stars

Watchers

Forks

Packages

No packages published