This repository has been archived by the owner on Jun 10, 2024. It is now read-only.
v0.3.4
Global
- New message queue support: beanstalkd by @tiancheng91
- New global argument `--logging-config` to specify a custom logging config (to disable werkzeug logs, for instance). You can get a sample config from `pyspider/logging.conf`.
- Project `group` info is now added to the task package.
- Change docker base image to cmfatih/phantomjs, so you can now use phantomjs with the same docker image.
- Auto restart phantomjs if it crashes; only enabled in `all` mode by default.
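
As an illustration of the kind of file `--logging-config` might point at, here is a minimal sketch that silences werkzeug's request logs. It assumes the file uses the Python standard library's `fileConfig` INI format; the section names, levels, and format string below are illustrative and are not copied from `pyspider/logging.conf`. The helper `load_sample_config` is hypothetical and only exists to show the config being loaded.

```python
import logging
import logging.config
import os
import tempfile

# Hypothetical config contents (assumption): not the shipped pyspider sample.
SAMPLE_CONFIG = """
[loggers]
keys=root,werkzeug

[handlers]
keys=console

[formatters]
keys=plain

[logger_root]
level=INFO
handlers=console

[logger_werkzeug]
level=ERROR
handlers=console
qualname=werkzeug
propagate=0

[handler_console]
class=StreamHandler
level=DEBUG
formatter=plain
args=(sys.stderr,)

[formatter_plain]
format=%(asctime)s %(name)s %(levelname)s %(message)s
"""


def load_sample_config():
    """Write the sample config to a temp file, load it, and return the werkzeug logger."""
    with tempfile.NamedTemporaryFile("w", suffix=".conf", delete=False) as fh:
        fh.write(SAMPLE_CONFIG)
        path = fh.name
    try:
        # Keep loggers created before this call enabled.
        logging.config.fileConfig(path, disable_existing_loggers=False)
    finally:
        os.remove(path)
    return logging.getLogger("werkzeug")


werkzeug_logger = load_sample_config()
# INFO-level request lines are now dropped; errors still reach the console.
werkzeug_logger.info("GET / 200")
werkzeug_logger.error("something went wrong")
```

Raising the `werkzeug` logger to `ERROR` and turning off propagation keeps per-request lines out of the output while leaving the root logger at `INFO`.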
WebUI
- Show next `exetime` of a task in task page.
- Show fetch time and process time in tasks page.
- Show average fetch time and process time over 5 minutes in dashboard page.
- Show message queue status in dashboard page.
- `limit` and `offset` parameter support in result dump.
- Fix frontend bug when crawling pages with dataurl.
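
To illustrate how `limit` and `offset` could be used to page through a result dump, here is a small sketch that builds the request URL. The host, port, and `/results/dump/<project>.<format>` path are assumptions about a local WebUI and are not confirmed by these notes; `build_dump_url` is a hypothetical helper, not part of pyspider.

```python
from urllib.parse import urlencode


def build_dump_url(base_url, project, fmt="json", limit=None, offset=None):
    """Build a result-dump URL with optional limit/offset query parameters.

    The URL layout is an assumption made for illustration only.
    """
    params = {}
    if offset is not None:
        params["offset"] = offset
    if limit is not None:
        params["limit"] = limit
    url = f"{base_url.rstrip('/')}/results/dump/{project}.{fmt}"
    if params:
        url += "?" + urlencode(params)
    return url


# Third page of 50 results for a hypothetical project named "demo".
print(build_dump_url("http://localhost:5000", "demo", limit=50, offset=100))
```

Stepping `offset` by `limit` on each request lets a client fetch a large result set in fixed-size pages instead of one big download.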
Other
- Fix support for phantomjs 2.0.
- Fix scheduler not being informed of project updates, and use the md5sum of the script as an additional signal.
- Scheduler: periodic counter report in log.
- Fetcher: fix for legacy version of pycurl
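
The idea of using the script's md5sum as an additional change signal can be sketched as follows. This is not the scheduler's actual code: the dictionary keys `updatetime` and `script` and the helper names are hypothetical, chosen only to show a content hash catching changes that a timestamp comparison would miss.

```python
import hashlib


def script_md5(script):
    """Return the hex MD5 digest of a project script (change detection, not security)."""
    return hashlib.md5(script.encode("utf-8")).hexdigest()


def project_changed(old, new):
    """Report a change if the timestamp advanced OR the script content differs.

    `old` and `new` are hypothetical project records with `updatetime`
    and `script` keys.
    """
    newer = new.get("updatetime", 0) > old.get("updatetime", 0)
    different = script_md5(new["script"]) != script_md5(old["script"])
    return newer or different


old = {"updatetime": 100, "script": "print('v1')"}
same_time_new_script = {"updatetime": 100, "script": "print('v2')"}
unchanged = {"updatetime": 100, "script": "print('v1')"}

print(project_changed(old, same_time_new_script))  # True: hash differs
print(project_changed(old, unchanged))             # False: nothing changed
```

Comparing hashes means an edited script is picked up even when the update timestamp is unchanged or was missed.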