- Restrict pycurl version to <7.43.0.1 (see #354)
- Fix #346: spider does not process initial_urls
- Fix #344: raise GrabInvalidUrl for pycurl error #3
- Fix bug: task generator works incorrectly
- Fix bug: pypi package misses http api html file
- Fix bug: dictionary changed size during iteration in stat logging
- Fix bug: multiple errors in urllib3 transport and threaded network service
- Fix short names of errors in stat logging
- Improve error handling in urllib3 transport
- Fix #299: multi-added errors
- Fix bug: pypi package misses http api html file
- Fix #285: pyquery extension parses html incorrectly
- Fix #267: normalize handling of too many redirect error
- Fix #268: fix processing of utf cookies
- Fix #241: form_fields() fails on some HTML forms
- Fix normalize_unicode issue in debug post method
- Fix #323: urllib3 transport fails with UnicodeError on some invalid URLs
- Fix #31: support for multivalue form inputs
- Fix #328, fix #67: remove hard link between document and grab
- Fix #284: option headers affects content of common_headers
- Fix #293: processing non-latin chars in Location header
- Fix #324: refactor response header processing
- Refactor Spider into a set of async services
- Add certifi dependency into grab[full] setup target
- Fix #315: use psycopg2-binary package for postgres cache
- Related to #206: do not use connection_reuse=False for proxy connections in spider
- Remove cache timeout option
- Remove structured extension
- Fix "error:None" in spider rps logging
- Fix race condition bug in task generator
- Add original_exc attribute to GrabNetworkError (and subclasses) that points to original exception
- Remove IOError from the ancestors of GrabNetworkError
- Add default values to --spider-transport and --grab-transport options of crawl script
- Add --spider-transport and --grab-transport options to crawl script
- Add SOCKS5 proxy support in urllib3 transport
- Fix #237: urllib3 transport fails without pycurl installed
- Fix bug: incorrect spider request logging when cache is enabled
- Fix bug: crawl script fails while trying to process a lock key
- Fix bug: urllib3 transport fails while trying to throw GrabConnectionError exception
- Fix bug: Spider add_task method fails while trying to log invalid URL error
- Remove obsoleted hammer_mode and hammer_timeout config options
- Add pylint to default test set
- Fix #229: using deprecated response object inside Grab
- Remove spider project template and start_project script
- Fix bug in deprecated grab.choose_form method
- Add default project templates files to the distribution, by @rushter
- Fix #222: debug_post option fails with big post data
- Fix #148: pycurl ignores sigint signal
- Start running Grab tests in OSX environment on travis CI
- Use defusedxml library to parse HTML and XML, by @kevinlondon
- Put selection, lxml and pycurl libs back to required dependencies in setup.py
- Update installation documentation
- Add API documentation for a few grab modules, by @rushter
- Start running Grab tests in Windows environment on appveyor CI
- New spider transport based on threads that allows using Spider with any Grab network backend, e.g. urllib3
- Add remove_from_post option to grab.doc.submit method
- Add random option to grab.change_proxy method
- Support for deprecated attributes Spider.items and Spider.counters
- If a Spider handler raises a ResponseNotValid exception, the task goes back to the task queue until task.task_try_count reaches spider.task_try_limit
- Refactor management of internal threads, fix random test failures related to cache sub-module
- Disable default logging to files when running a spider with the run crawl command
- Multiple improvements in urllib3 transport
- Set default spider network & try limits to 3 (was 10)
- Fix various bugs in urllib3 transport
- Fix various other bugs
- Remove grab.use_next_proxy method
- Remove grab.dump method
- Remove deprecated Spider methods and attributes
- Fix setup.py
- Everything :) I will probably extend the changelog further back into the history later