fix: update ChromeDriver options on restricted environments and add ChromeDriver options as function parameter #3043
Hey, I added some minor language comments to comply with our writing guidelines. Thanks!
Thanks!
@danielbichuetti This is looking good to me! Although I think we should have someone from the core-engineering team for the final approval.
Looks good, thank you!
Related Issues
Proposed Changes:
- Add a `webdriver_options` parameter, so users can set the options their specific scenario demands
- `--disable-dev-shm-usage`
- `--no-sandbox`
- `--disable-gpu`
  There is no need to enable GPU acceleration on a text crawler. GPU is expensive in the cloud.
- `--disable-dev-shm-usage`
  In container environments there is usually no access to shared memory, or it is set to the default size of 64 MB. With shared memory usage disabled, Chrome writes to a temporary directory instead. Using shared memory may improve performance only under heavy testing workloads, which is not our use case.
- `--single-process`
  As Haystack doesn't support multiple tabs, there is no point in letting Chrome spawn multiple processes. It would just increase the memory footprint.
How did you test it?
Notes for the reviewer
These changes essentially make the Haystack Crawler node able to run in multiple environments without any extra effort from the user. Users with more specific needs can cover them with the `webdriver_options` parameter.
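As a rough illustration of the idea, a user-supplied list of options could replace a set of container-friendly defaults, roughly as sketched below. The names `DEFAULT_WEBDRIVER_OPTIONS`, `resolve_webdriver_options`, and `build_chrome_options` are hypothetical and are not taken from this PR's diff; the Crawler's actual signature and defaults may differ. Only the flags discussed above are used.

```python
from typing import List, Optional

# Hypothetical defaults, built only from the flags discussed in this PR.
# The defaults in the merged code may differ.
DEFAULT_WEBDRIVER_OPTIONS: List[str] = [
    "--disable-gpu",            # no GPU acceleration needed for a text crawler
    "--disable-dev-shm-usage",  # write to a temporary directory instead of /dev/shm
    "--single-process",         # avoid spawning extra Chrome processes
]


def resolve_webdriver_options(webdriver_options: Optional[List[str]] = None) -> List[str]:
    """Return the user's options if given, otherwise a copy of the defaults."""
    if webdriver_options is None:
        return list(DEFAULT_WEBDRIVER_OPTIONS)
    return list(webdriver_options)


def build_chrome_options(webdriver_options: Optional[List[str]] = None):
    """Apply the resolved arguments to a Selenium Chrome Options object."""
    # Imported lazily so the resolver above works without Selenium installed.
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for argument in resolve_webdriver_options(webdriver_options):
        options.add_argument(argument)
    return options
```

In this sketch a user in a sandbox-restricted environment might call `build_chrome_options(["--no-sandbox", "--disable-dev-shm-usage"])`. Passing a list replaces the defaults entirely rather than merging with them, which is an assumption of the sketch, not a statement about the PR.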
Checklist