Revisit check-url behavior and provide User-Agent a custom default value #229
Looks good but I think custom UA should apply to the crawler even when using --mobileDevice. I don't see a use case for specifying it only for our python check.
Is that a crawler limitation?
No, it's not a crawler limitation; it is something I even had to "force" somehow with specific code.
Mostly the apparent device size (resolution). That's mainly used to crawl mobile versions of websites, or to get the appropriate media queries when taking screenshots.
I have adapted the code to always use the UA, even when a mobile device is set. I updated the first comment.
LGTM; please split the long line.
I split the line; I'll let you merge if everything is OK.
Fix #228
Fix #227
Fix #230
Changes:
- The user agent is built from --userAgent (always has a value due to default) + optional --userAgentSuffix + optional --adminEmail; these last two values are automatically prefixed by a space, so there is no need to provide one (if someone does, there will be two spaces, which is not an issue)
- Initial behavior, changed after review: except when a --mobileDevice is passed, then UA is only used for Python requests
- Initial behavior, changed after review: if both --mobileDevice and --userAgent are passed, a warning is displayed
- The UA is used even when --mobileDevice is passed
- The URL check performs a GET instead of a HEAD
Note: this won't be fully functional until webrecorder/browsertrix-crawler#419 is fixed. Once it is fixed and used by us, no change is needed except adapting the test case, which has been made a little too permissive because of this bug.
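
To make the two changes above concrete, here is a minimal sketch of how the user agent could be composed and attached to a GET request. It is not the code from this PR: the function names, the placeholder default value, and the use of the standard-library `urllib` are all assumptions made for illustration.

```python
import urllib.request

# Placeholder only: the real default value is not stated in this conversation.
HYPOTHETICAL_DEFAULT_UA = "Mozilla/5.0 (placeholder default)"


def build_user_agent(user_agent=HYPOTHETICAL_DEFAULT_UA, suffix=None, admin_email=None):
    """Join the base UA with the optional suffix and admin email.

    Each optional part is prefixed by a single space. A value that already
    starts with a space simply yields two spaces, which is harmless.
    """
    parts = [user_agent]
    if suffix:
        parts.append(suffix)
    if admin_email:
        parts.append(admin_email)
    return " ".join(parts)


def make_check_request(url, user_agent):
    """Build (but do not send) a GET request carrying the composed UA."""
    return urllib.request.Request(
        url,
        method="GET",  # a GET rather than a HEAD, as described in the list above
        headers={"User-Agent": user_agent},
    )


if __name__ == "__main__":
    ua = build_user_agent("ExampleUA/1.0", "example-suffix", "admin@example.org")
    req = make_check_request("https://example.org/", ua)
    print(req.get_method(), ua)
```

Passing the returned request to `urllib.request.urlopen` would actually send it; the sketch stops short of that so it can be run without network access.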