Cloudflare Anti-Bot HTTP Proxy

A basic HTTP proxy in Python, implementing VeNoMouS' cloudscraper module.

Installation

Clone this repo, cd into the repo folder and then use setup.py to install:

sudo python setup.py install

Optionally, install python-prctl. This will alow the proxy to change its process name as displayed in ps, netstat and other such software to make it easier to see what the proxy is doing:

sudo pip install python-prctl

Note: python-prctl requires gcc, libc development headers and libcap development headers to be installed in the operating system. See the python-prctl docs for details.

If you're installing this on a router and are using Entware you may need to install some Python modules with opkg, for example:

opkg install python-requests
opkg install node_legacy

Compatibility

The proxy should run on most linux distros and has been tested on:

Ubuntu Bionic
Armbian (Debian) Stretch
ASUSWRT-Merlin / Entware

The proxy won't run on Windows due to differences in handling network interfaces and sockets.

Usage

usage: cfscrape-http-proxy [-h] [-c FILE] [--debug] [-l INTERFACE] [-p PORT]
                           [-e INTERFACE] [-D] [-P FILE]

CloudFlare Anti-Bot HTTP Proxy

optional arguments:
  -h, --help            show this help message and exit
  -c FILE, --config_file FILE
                        external configuration file
  --debug               turn on debug messaging
  -l INTERFACE, --listen_if INTERFACE
                        interface to listen to (use 'any' or '0' to listen to all interfaces) (default 'any')
  -p PORT, --listen_port PORT
                        port to listen on (default 8080)
  -e INTERFACE, --exit_if INTERFACE
                        interface for outgoing connections (default <default>)
  -D, --daemonize       run the proxy as a daemon
  -P FILE, --pidfile FILE
                        PID file for daemon (default '/tmp/cfscrape-http-proxy.pid')
  -R, --no_redirect     turn off redirects, prevent adding missing elements to URLs in the browser

Command line parameters override default settings and any settings from a configuration file. An external configuration file specified on the command line will override a configuration file installed in the system config path (typically /etc/, or possibly /opt/etc/ by default).

Once the proxy is running it can be used by accessing it via HTTP with the page you wish to proxy included as the url GET variable:

http://<proxy_server_IP>:<proxy_port>/?url=<target_page>

Example: http://192.168.0.10:8080/?url=www.news.com.au/content-feeds/latest-news-world/

Configuration

The config file (called cfscrape-http-proxy.conf by default, but can be changed via the command line) uses the same names for paramaters as the command line. As an example, a configuration file for use on a router with distinct LAN (br0) and WAN (eth0) interfaces:

[Defaults]
listen_if=br0
port=8080
exit_if=eth0
daemoinze=True
pidfile=/tmp/cfscrape-http-proxy.pid
no_redirect=False

The proxy will search several paths for a configuration file if none is specified on the command line. If you're running Entware it should be at /opt/etc/cfscrape-http-proxy.conf, otherwise it should probably be at /etc/cfscrape-http-proxy.conf.

The default exit interface (shown as <default> in the usage above), used if nothing is specified in the config file or on the command line, will vary from one system to another. The proxy determines the default by looking at the routing table.

The no_redirect flag will make the proxy faster by not sending HTTP 301 responses to add ?url=http://<domain>/ to the beginning of requests that don't include it (i.e. relative URLs in links on a web page), but it may make browsing web pages through the proxy buggier. It's a trade between speed and robustness, you can experiment to see which you prefer.

Notes

Installing with pip (sudo pip install .) ends up putting the config and init files inside Python's site-packages path rather than the root path. This is apparently because it's installing from a bdist, but also because I don't really know how to work setuptools properly. :)

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
init		init
.gitignore		.gitignore
README.md		README.md
cfscrape-http-proxy.conf		cfscrape-http-proxy.conf
cfscrape-http-proxy.py		cfscrape-http-proxy.py
optional-requirements.txt		optional-requirements.txt
requirements.txt		requirements.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Cloudflare Anti-Bot HTTP Proxy

Installation

Compatibility

Usage

Configuration

Notes

About

Releases

Packages

Languages

moonbuggy/cfscrape-http-proxy

Folders and files

Latest commit

History

Repository files navigation

Cloudflare Anti-Bot HTTP Proxy

Installation

Compatibility

Usage

Configuration

Notes

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages