random proxy support #9

Closed
foolcage opened this issue Mar 21, 2018 · 0 comments
@foolcage
Owner

Currently implemented:
Scraping proxies
Checking proxy speed
Generating a list of usable proxies
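
The three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual code: the names `measure_latency` and `build_usable_proxies`, the test URL, and the latency threshold are all hypothetical, and the scraping step is assumed to have already produced a list of proxy URLs.

```python
import http.client
import time
import urllib.request
from typing import Callable, Iterable, List, Optional


def measure_latency(proxy_url: str, test_url: str = "http://example.com",
                    timeout: float = 5.0) -> Optional[float]:
    """Return the seconds taken to fetch test_url through proxy_url, or None on failure."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    opener = urllib.request.build_opener(handler)
    started = time.monotonic()
    try:
        with opener.open(test_url, timeout=timeout) as response:
            response.read(1024)  # read a little so the proxy actually relays data
    except (OSError, http.client.HTTPException):
        # URLError and socket timeouts are OSError subclasses; malformed
        # responses from a bad proxy surface as HTTPException.
        return None
    return time.monotonic() - started


def build_usable_proxies(candidates: Iterable[str],
                         measure: Callable[[str], Optional[float]] = measure_latency,
                         max_latency: float = 3.0) -> List[str]:
    """Keep the proxies that respond within max_latency, fastest first."""
    timed = []
    for proxy_url in candidates:
        latency = measure(proxy_url)
        if latency is not None and latency <= max_latency:
            timed.append((latency, proxy_url))
    return [proxy_url for _, proxy_url in sorted(timed)]
```

Passing the measuring function as a parameter keeps the filtering logic testable without network access; a real run would call `build_usable_proxies(scraped_proxies)` with the default `measure_latency`.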

To use it, just add the @random_proxy decorator in front of the function that yields requests in any spider that needs a proxy.

    # Uncomment the decorator below if a proxy is needed
    # @random_proxy
    def yield_request(self, item, start_date=None, end_date=None):
        data_path = get_kdata_path(item, source='163')

        if start_date:
            start = start_date.strftime('%Y%m%d')
        else:
            start = item['listDate'].replace('-', '')

        if end_date:
            end = end_date.strftime('%Y%m%d')
        else:
            end = datetime.today().strftime('%Y%m%d')

        if not os.path.exists(data_path) or start_date or end_date:
            if item['exchange'] == 'sh':
                exchange_flag = 0
            else:
                exchange_flag = 1
            url = self.get_k_data_url(exchange_flag, item['code'], start, end)
            yield Request(url=url, meta={'path': data_path, 'item': item},
                          callback=self.download_day_k_data)
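
For readers wondering how such a decorator might work, here is a minimal sketch. It is an assumption about the general approach, not the project's real `random_proxy`: the `USABLE_PROXIES` list and `DemoSpider` are hypothetical, and `Request` is a small stand-in so the example runs without Scrapy installed. With a real `scrapy.Request`, setting `meta['proxy']` is what Scrapy's built-in `HttpProxyMiddleware` reads to route a request through a proxy.

```python
import functools
import random

# Hypothetical list; in practice this would come from the generated proxy list.
USABLE_PROXIES = ["http://10.0.0.1:8080", "http://10.0.0.2:3128"]


class Request:
    """Minimal stand-in for scrapy.Request so the sketch runs without Scrapy."""

    def __init__(self, url, meta=None, callback=None):
        self.url = url
        self.meta = dict(meta or {})
        self.callback = callback


def random_proxy(func):
    """Attach a randomly chosen proxy to every request the wrapped generator yields."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        for request in func(*args, **kwargs):
            if USABLE_PROXIES:  # leave the request untouched if no proxy is available
                request.meta["proxy"] = random.choice(USABLE_PROXIES)
            yield request

    return wrapper


class DemoSpider:
    @random_proxy
    def yield_request(self, item):
        yield Request(url="http://example.com/" + item["code"], meta={"item": item})
```

Because the wrapper only touches the yielded requests, the decorated function's own logic (paths, dates, URLs) stays unchanged, and removing the decorator turns the proxy off again.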

In testing, scraping was stable and the speed was fairly good.

You can scrape some proxies yourself or manually add proxies you have purchased, as long as they match the following format:
proxy_contract
