feat: worker ready injection interface #105

Draft: wants to merge 1 commit into master


petrpatek
Contributor

This feature is necessary for worker fingerprint injection. I started experimenting with worker injection while bypassing the Kasada protection, and I have tried two injection methods. The first used Playwright's worker events, which resulted in the fingerprint being injected too late. The second evaluated the injection script at the beginning of the worker script. This worked well when the worker is not a phantom worker, i.e., not loading self. For more testing and experimentation, though, this interface would be extremely helpful.
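The second method (evaluating the injection script at the beginning of the worker script) can be illustrated as a plain string transformation. This is only a minimal sketch, not the code from this PR: `prependInjection` is a hypothetical helper name, and it assumes the worker source is already available as text.

```typescript
// Hypothetical helper: place the injection code in front of a worker's source.
// The injection is wrapped in a block with try/catch so that its let/const
// declarations stay local and an error inside it does not prevent the
// original worker code from running.
// Caveat: a leading "use strict" directive in workerSource is no longer the
// first statement after this transformation, so it loses its effect.
function prependInjection(workerSource: string, injection: string): string {
    // The newline before the closing brace keeps a trailing `//` comment in
    // the injection from swallowing the rest of the wrapper.
    return `{try{${injection}\n}catch(e){}}\n${workerSource}`;
}

// Usage: the placeholder comment stands in for real fingerprint overrides.
const patchedSource = prependInjection(
    "postMessage(navigator.userAgent);",
    "/* fingerprint overrides would go here */",
);
```

Because the injected code runs before the first statement of the worker script, it avoids the timing problem seen with the worker-event approach, at the cost of needing access to the script text.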

petrpatek changed the title from "Kicked off navigator injection" to "feat: worker ready injection interface" on Nov 23, 2022
@barjin
Collaborator

barjin commented Nov 23, 2022

Make sure to check #64 :) There are several ways of injecting into workers that I and the others in the thread could think of, but all of them have their caveats:

  • Routing the network requests (POC) - works for classic Workers, but not for ServiceWorkers and SharedWorkers (this might be doable now, but needs more research). Also, .route() disables caching, which might hurt performance.
  • Constructor overrides with base64 (comment) - apparently, worker constructors accept base64-encoded strings as the input code. We could support this by overriding the Shared/Service/Worker constructor and prepending our code to the base64 string. However, this is probably not doable with classic "path" inputs (but could be an interesting backdoor?)
  • @piercefreeman's smart proxy (comment) - this would probably clash with Crawlee's proxy management, but maybe worth a try?
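The constructor-override idea from the second bullet could look roughly like the sketch below. It is an assumption-laden illustration, not the approach from the linked comment: `patchWorkerConstructor` and `prependToBase64DataUrl` are hypothetical names, only base64 `data:` URLs are rewritten, and plain path inputs pass through untouched (the caveat mentioned above). It also assumes `atob`/`btoa` are available and that the worker source is Latin-1 only.

```typescript
// Structural stand-in for a Worker-like constructor, so the sketch does not
// depend on DOM typings.
type WorkerCtor = new (...args: any[]) => object;

const BASE64_MARKER = ";base64,";

// Prepend `injection` to the payload of a base64 data: URL.
// Any other input (for example a plain path) is returned unchanged.
function prependToBase64DataUrl(url: string, injection: string): string {
    const markerAt = url.indexOf(BASE64_MARKER);
    if (!url.startsWith("data:") || markerAt === -1) return url;
    const payloadStart = markerAt + BASE64_MARKER.length;
    const header = url.slice(0, payloadStart);
    const source = atob(url.slice(payloadStart));
    return header + btoa(`${injection}\n${source}`);
}

// Replace scope.Worker with a Proxy that rewrites string inputs before
// delegating to the original constructor.
function patchWorkerConstructor(scope: { Worker: WorkerCtor }, injection: string): void {
    scope.Worker = new Proxy(scope.Worker, {
        construct(target, args, newTarget) {
            const [url, options] = args;
            const patched =
                typeof url === "string" ? prependToBase64DataUrl(url, injection) : url;
            return Reflect.construct(target, [patched, options], newTarget);
        },
    });
}
```

In a page, `scope` would be the global object and the same wrapper could be applied to `SharedWorker`; whether that holds up against a given protection would need testing.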

Either way, I'd be super happy to finally close this one, so if you come up with something, please lmk :)

@petrpatek
Contributor Author

Yes, I checked it. These are great generic ideas. But this way, we can move one step closer to experimenting on a per-website or per-protection basis. As I understand it, all of these methods need a JS script that overrides the navigator to be injected into the worker.
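To make the idea of a worker-ready script concrete, here is a minimal sketch of a builder that returns such a script as a string. Everything in it is an assumption for illustration: `buildNavigatorInjection` and `NavigatorOverrides` are hypothetical names rather than the interface added by this PR, and redefining getters on the navigator's prototype is just one possible override strategy.

```typescript
// Hypothetical shape of the values to spoof; a real fingerprint has more fields.
interface NavigatorOverrides {
    userAgent?: string;
    platform?: string;
    hardwareConcurrency?: number;
    languages?: string[];
}

// Build a self-contained script (as text) that redefines the given properties
// on the prototype of `self.navigator`. It only relies on `self`, so the same
// text can be evaluated in a window or in a worker scope.
function buildNavigatorInjection(overrides: NavigatorOverrides): string {
    return `(() => {
    const overrides = ${JSON.stringify(overrides)};
    const proto = Object.getPrototypeOf(self.navigator);
    for (const [key, value] of Object.entries(overrides)) {
        Object.defineProperty(proto, key, { get: () => value, configurable: true });
    }
})();`;
}

// Usage: the resulting string is what user code would hand to whichever
// injection method it is experimenting with.
const injectionScript = buildNavigatorInjection({
    userAgent: "Mozilla/5.0 (example)",
    hardwareConcurrency: 8,
});
```

Keeping the script as a plain string with no outside dependencies is what lets user code choose the injection method separately for each website or protection.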

@barjin
Collaborator

barjin commented Nov 24, 2022

Oh, so the goal w/ this PR is to have an interface for worker-ready injectable script and handle the injection in the user code for now, do I get this right?

Either way, it works for me, we can release this as a prerelease so we can test it better :)

@Failton

Failton commented Jan 29, 2023

Any progress?
