From 91de6c972e8c4ccd70af13836522f7fde219c1d8 Mon Sep 17 00:00:00 2001 From: Steven Allen Date: Fri, 12 Jun 2020 12:13:29 -0700 Subject: [PATCH] doc(prod): start documenting production stuff Hopefully, we can get the community to start pooling knowledge of how to run go-ipfs in production. --- docs/production/reverse-proxy.md | 93 ++++++++++++++++++++++++++++++++ 1 file changed, 93 insertions(+) create mode 100644 docs/production/reverse-proxy.md diff --git a/docs/production/reverse-proxy.md b/docs/production/reverse-proxy.md new file mode 100644 index 00000000000..dfb5359fa50 --- /dev/null +++ b/docs/production/reverse-proxy.md @@ -0,0 +1,93 @@ +# IPFS & Reverse HTTP Proxies + +When run in production environments, go-ipfs should generally be run behind a +reverse HTTP proxy (usually NGINX). You may need a reverse proxy to: + +* Load balance requests across multiple go-ipfs daemons. +* Cache responses. +* Buffer requests, only releasing them to go-ipfs when complete. This can help + protect go-ipfs from the + [slowloris](https://en.wikipedia.org/wiki/Slowloris_(computer_security) + attack. +* Block content. +* Rate limit and timeout requests. +* Apply QoS rules (e.g., prioritize traffic for certain important IPFS resources). +* Expose a limited subset of the HTTP API. + +This document contains a collection of tips, tricks, and pitfalls when running a +go-ipfs node behind a reverse HTTP proxy. + +## Peering + +Go-ipfs gateways behind a single load balancing reverse proxy should use the +[peering](../config.md#peering) subsystem to peer with each other. That way, as +long as one go-ipfs daemon has the content being requested, the others will be +able to serve it. + +# Garbage Collection + +Gateways rarely store content permanently. However, running garbage collection +can slow down a go-ipfs node significantly. If you've noticed this issue in +production, consider "garbage collecting" by resetting the go-ipfs repo whenever +you run out of space, instead of garbage collecting. + +1. Initialize your gateways repo to some known-good state (possibly pre-seeding + it with some content, a config, etc.). +2. When you start running low on space, for each load-balanced go-ipfs node: + 1. Use the nginx API to set one of the upstream go-ipfs node's to "down". + 2. Wait a minute to let go-ipfs finish processing any in-progress requests + (or the short-lived ones, at least). + 3. Take the go-ipfs node down. + 4. Rollback the go-ipfs repo to the seed state. + 5. Restart the go-ipfs daemon. + 6. Update the nginx config, removing the "down" status from the node. + +This will effectively "garbage collect" without actually running the garbage +collector. + +# Buffering Requests & Responses + +In general, requests to the gateway should be buffered by the reverse proxy for +the best performance. This is usually enabled by default (`proxy_request_buffering`). + +## API + +The go-ipfs HTTP API (`/api/...`) starts sending a response before it's done +reading the request. This allows it to, e.g., send back progress updates while +adding a file to go-ipfs. + +However, these progress updates won't work if the HTTP reverse proxy is +configured to buffer requests. While requests to the go-ipfs _gateway_ should +usually be buffered for better performance, requests to the go-ipfs API should +generally not be buffered. + +In NGINX, you can turn off buffering for the API with: + +```nginx +server { + ... + location /api { + ... + proxy_request_buffering off; + proxy_buffering off; + proxy_http_version 1.1; + } +} +``` + +See: https://github.com/ipfs/go-ipfs/issues/6402#issuecomment-643025868 + +# Content Blocking + +TODO: + +* Filtering requests +* Checking the X-IPFS-Path header in responses to filter again after resolving. + +# Subdomain Gateway + +TODO: Reverse proxies and the subdomain gateway. + +# Load balancing + +TODO: discuss load balancing based on the CID versus the source IP.