hashicorp · kaitlincart · Feb 21, 2019 · Feb 19, 2019 · Feb 19, 2019 · Feb 19, 2019
diff --git a/website/source/api/index.html.md b/website/source/api/index.html.md
@@ -77,6 +77,47 @@ to the supplied maximum `wait` time to spread out the wake up time of any
 concurrent requests. This adds up to `wait / 16` additional time to the maximum
 duration.
 
+### Implementation Details
+
+While the mechanism is relatively simple to work with, there are a few subtleties
+that a robust client must observe to avoid misbehaving in edge cases.
+ * **Reset the index if it goes backwards**. While indexes in general are 
+   monotonically increasing, there are several real-world scenarios in 
+   which they can go backwards for a given query. Implementations must check 
+   to see if a returned index is lower than the previous value, 
+   and if it is, should reset index to `0` - effectively restarting their blocking loop. 
+   Failure to do so may cause the client to miss future updates for an unbounded 
+   time, or to use an invalid index value that causes no blocking and increases 
+   load on the servers. Cases where this can occur include:
+   * A Raft snapshot with an older version of the data is restored on the servers
+   * KV list operations where an item with the highest index is removed
+   * A Consul upgrade changes the way watches work to optimise them with more 
+   granular indexes.
+ * **Sanity check that the index is greater than zero**. After the initial request (or a
+   reset as above) the `X-Consul-Index` returned _should_ always be greater than zero. It
+   is a bug in Consul if it is not; however, this has happened a few times and can
+   still be triggered on some older Consul versions. It's especially bad because it
+   causes blocking clients that are not aware of the problem to enter a busy loop, using 
+   excessive client CPU and causing high load on servers. It is _always_ safe to use an 
+   index of `1` to wait for updates when the data being requested doesn't exist
+   yet, so clients _should_ sanity check that their index is at least 1 after 
+   each blocking response is handled to be sure they actually block on the next 
+   request.
+ * **Rate limit**. The blocking query mechanism is reasonably efficient when updates 
+ are relatively rare (order of tens of seconds to minutes between updates). In cases 
+ where a result gets updated very frequently, however - possibly during an outage or incident 
+ with a badly behaved client - blocking query loops degrade into busy loops that 
+ consume excessive client CPU and cause high server load. While it's possible to just add a sleep 
+ to every iteration of the loop, this is **not** recommended since it causes update 
+ delivery to be delayed in the happy case, and it can exacerbate the problem since 
+ it increases the chance that the index has changed on the next request. Clients 
+ _should_ instead rate limit the loop so that in the happy case they proceed without 
+ waiting, but when values start to churn quickly they degrade into polling at a 
+ reasonable rate (say every 15 seconds). Ideally this is done with an algorithm that 
+ allows a couple of quick successive deliveries before it starts to limit rate - a 
+ [token bucket](https://en.wikipedia.org/wiki/Token_bucket) with a burst of 2 is a simple
+ way to achieve this.
+
 ### Hash-based Blocking Queries
 
 A limited number of agent endpoints also support blocking however because the