
add assertion error description for easier debugging #389

Merged
merged 1 commit into scrapinghub:master on Nov 29, 2019

Conversation

a-shkarupin (Contributor)

If the SPIDER_FEED_PARTITIONS setting is set to a value that doesn't match the number of partitions of the spider feed topic in Kafka, the following error is reported on the crawler side:

2019-11-29 11:03:22 [kafka.producer.kafka] INFO: Closing the Kafka producer with 0 secs timeout.
2019-11-29 11:03:22 [kafka.producer.kafka] INFO: Proceeding to force close the producer since pending requests could not be completed within timeout 0.
2019-11-29 11:03:22 [kafka.producer.sender] DEBUG: Beginning shutdown of Kafka producer I/O thread, sending remaining records.
2019-11-29 11:03:22 [kafka.conn] INFO: <BrokerConnection node_id=bootstrap-0 host=localhost:9092 [IPv4 ('127.0.0.1', 9092)]>: Closing connection.
2019-11-29 11:03:22 [kafka.producer.sender] DEBUG: Shutdown of Kafka producer I/O thread has completed.
2019-11-29 11:03:22 [kafka.producer.kafka] DEBUG: The Kafka producer has closed.
2019-11-29 11:03:22 [kafka.producer.kafka] INFO: Closing the Kafka producer with 0 secs timeout.
2019-11-29 11:03:22 [kafka.producer.kafka] INFO: Proceeding to force close the producer since pending requests could not be completed within timeout 0.
2019-11-29 11:03:22 [kafka.producer.sender] DEBUG: Beginning shutdown of Kafka producer I/O thread, sending remaining records.
2019-11-29 11:03:22 [kafka.conn] INFO: <BrokerConnection node_id=bootstrap-5 host=localhost:9092 [IPv4 ('127.0.0.1', 9092)]>: Closing connection.
2019-11-29 11:03:22 [kafka.producer.sender] DEBUG: Shutdown of Kafka producer I/O thread has completed.
2019-11-29 11:03:22 [kafka.producer.kafka] DEBUG: The Kafka producer has closed.
Unhandled error in Deferred:
2019-11-29 11:03:22 [twisted] CRITICAL: Unhandled error in Deferred:

Traceback (most recent call last):
File "/home/a/venvs/frontera/lib/python3.6/site-packages/scrapy/crawler.py", line 184, in crawl
return self._crawl(crawler, *args, **kwargs)
File "/home/a/venvs/frontera/lib/python3.6/site-packages/scrapy/crawler.py", line 188, in _crawl
d = crawler.crawl(*args, **kwargs)
File "/home/a/venvs/frontera/lib/python3.6/site-packages/twisted/internet/defer.py", line 1613, in unwindGenerator
return _cancellableInlineCallbacks(gen)
File "/home/a/venvs/frontera/lib/python3.6/site-packages/twisted/internet/defer.py", line 1529, in _cancellableInlineCallbacks
_inlineCallbacks(None, g, status)
--- ---
File "/home/a/venvs/frontera/lib/python3.6/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
result = g.send(result)
File "/home/a/venvs/frontera/lib/python3.6/site-packages/scrapy/crawler.py", line 88, in crawl
yield self.engine.open_spider(self.spider, start_requests)
builtins.AssertionError:

2019-11-29 11:03:22 [twisted] CRITICAL:
Traceback (most recent call last):
File "/home/a/venvs/frontera/lib/python3.6/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
result = g.send(result)
File "/home/a/venvs/frontera/lib/python3.6/site-packages/scrapy/crawler.py", line 88, in crawl
yield self.engine.open_spider(self.spider, start_requests)
AssertionError

It was hard to understand what the cause was. With the suggested patch, the assertion description is included in the error, making the cause easier to understand:

2019-11-29 11:04:21 [twisted] CRITICAL:
Traceback (most recent call last):
File "/home/a/venvs/frontera/lib/python3.6/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
result = g.send(result)
File "/home/a/venvs/frontera/lib/python3.6/site-packages/scrapy/crawler.py", line 88, in crawl
yield self.engine.open_spider(self.spider, start_requests)
AssertionError: Number of kafka partitions doesn't match config for spider feed
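The fix boils down to giving the `assert` statement a message, since a bare `assert` raises `AssertionError` with no explanation. A minimal sketch of the idea (the function and parameter names below are illustrative, not Frontera's actual internals):

```python
# Sketch: turning a bare assert into one with a descriptive message.
# `configured_partitions` stands in for the SPIDER_FEED_PARTITIONS setting;
# `topic_partitions` for the partition ids reported by the Kafka broker.

def check_partitions(configured_partitions, topic_partitions):
    # Before the patch, a check like
    #     assert configured_partitions == len(topic_partitions)
    # produced the unhelpful "builtins.AssertionError:" seen above.
    # Adding a second operand to `assert` makes it the exception message.
    assert configured_partitions == len(topic_partitions), \
        "Number of kafka partitions doesn't match config for spider feed"

check_partitions(4, {0, 1, 2, 3})  # counts match: passes silently
```

When the counts disagree, the raised `AssertionError` now carries the message, which is exactly what surfaces at the bottom of the Twisted traceback instead of a blank line.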

@sibiryakov (Member)

thank you!

@sibiryakov sibiryakov merged commit 84f9e10 into scrapinghub:master Nov 29, 2019