Hi guys,

Do you have any other ideas for cutting down on duplicates without relying on the HTTP caching headers (`Last-Modified`/`If-Modified-Since`/`ETag`/`If-None-Match`)? I'm trying to use the `web.py` source, but some HTTP responses don't include those headers.

I was thinking of computing a SHA checksum of a page's content, storing it as the `saved_state`, and comparing against it later to see whether there are any new items. However, this would only work if you are scraping a single page.
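
To make the checksum idea concrete, here is a rough sketch of what I have in mind. It is not taken from `web.py`; the function names `content_digest` and `has_changed` are made up for illustration, and I'm assuming `saved_state` can be a plain dict. Keying the stored digest by URL is one possible way around the single-page limitation:

```python
import hashlib


def content_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a raw page body."""
    return hashlib.sha256(body).hexdigest()


def has_changed(url: str, body: bytes, saved_state: dict) -> bool:
    """Report whether a page differs from the last time it was seen.

    saved_state is assumed to be a dict mapping URL -> digest (hypothetical
    layout). It is updated in place when the page is new or has changed.
    """
    digest = content_digest(body)
    if saved_state.get(url) == digest:
        return False  # same content as last run, nothing new
    saved_state[url] = digest
    return True
```

Calling `has_changed(url, body, state)` twice with the same body returns `True` the first time and `False` the second. One caveat: any byte that varies between requests (timestamps, session tokens, rotating ads) changes the digest, so the page would always look new unless that content is stripped before hashing.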