Inspect HEAD/LIST/GET requests within Rasterio
Source Code: https://github.com/developmentseed/tilebench
Inspect HEAD/GET requests within Rasterio.
Note: In GDAL 3.2, logging capabilities for /vsicurl, /vsis3 and the like were added (ref: OSGeo/gdal#2742).
You can install tilebench using pip:
$ python -m pip install -U pip
$ python -m pip install -U tilebench
or install from source:
git clone https://github.com/developmentseed/tilebench.git
cd tilebench
python -m pip install -U pip
python -m pip install -e .
from tilebench import profile
import rasterio
@profile()
def info(src_path: str):
    with rasterio.open(src_path) as src_dst:
        return src_dst.meta
meta = info("https://noaa-eri-pds.s3.amazonaws.com/2022_Hurricane_Ian/20221002a_RGB/20221002aC0795145w325100n.tif")
> 2023-10-18T23:00:11.184745+0200 | TILEBENCH | {"HEAD": {"count": 1}, "GET": {"count": 1, "bytes": 32768, "ranges": ["0-32767"]}, "Timing": 0.7379939556121826}
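The line printed by the decorator can be post-processed with the standard library. The sketch below assumes every line follows the `timestamp | TILEBENCH | JSON` layout shown in the example output above; `parse_tilebench_line` is a hypothetical helper, not part of tilebench.

```python
import json

# Log line copied from the example output above (without the leading ">").
line = (
    '2023-10-18T23:00:11.184745+0200 | TILEBENCH | '
    '{"HEAD": {"count": 1}, "GET": {"count": 1, "bytes": 32768, '
    '"ranges": ["0-32767"]}, "Timing": 0.7379939556121826}'
)


def parse_tilebench_line(line: str) -> dict:
    """Hypothetical helper: split 'timestamp | TILEBENCH | json' and decode the JSON."""
    # Split on the first two "|" only, so the JSON payload stays in one piece.
    timestamp, tag, payload = (part.strip() for part in line.split("|", 2))
    if tag != "TILEBENCH":
        raise ValueError(f"not a tilebench log line: {tag!r}")
    stats = json.loads(payload)
    stats["timestamp"] = timestamp
    return stats


stats = parse_tilebench_line(line)
print(stats["HEAD"]["count"], stats["GET"]["bytes"])
```

Keeping the timestamp next to the decoded statistics makes it easier to compare several runs afterwards.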
from tilebench import profile
from rio_tiler.io import Reader
@profile()
def _read_tile(src_path: str, x: int, y: int, z: int, tilesize: int = 256):
    with Reader(src_path) as cog:
        return cog.tile(x, y, z, tilesize=tilesize)

img = _read_tile(
    "https://noaa-eri-pds.s3.amazonaws.com/2022_Hurricane_Ian/20221002a_RGB/20221002aC0795145w325100n.tif",
    9114,
    13216,
    15,
)
> 2023-10-18T23:01:00.572263+0200 | TILEBENCH | {"HEAD": {"count": 1}, "GET": {"count": 2, "bytes": 409600, "ranges": ["0-32767", "32768-409599"]}, "Timing": 1.0749869346618652}
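The `bytes` value can be checked against the `ranges` list. The sketch below assumes each range is an inclusive `start-end` byte interval, as in an HTTP Range header; `range_size` is a hypothetical helper.

```python
def range_size(byte_range: str) -> int:
    """Hypothetical helper: number of bytes in an inclusive 'start-end' range."""
    start, end = (int(value) for value in byte_range.split("-"))
    return end - start + 1


# Ranges copied from the example output above.
ranges = ["0-32767", "32768-409599"]
sizes = [range_size(r) for r in ranges]
total = sum(sizes)
print(sizes, total)
```

Under that assumption the first request fetches 32768 bytes and the second 376832 bytes, which add up to the reported 409600.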
$ tilebench --help
Usage: tilebench [OPTIONS] COMMAND [ARGS]...
Command line interface for the tilebench Python package.
Options:
--help Show this message and exit.
Commands:
get-zooms Get Mercator Zoom levels.
profile Profile COGReader Mercator Tile read.
random Get random tile.
viz WEB UI to visualize VSI statistics for a web mercator tile request
$ tilebench get-zooms https://noaa-eri-pds.s3.amazonaws.com/2022_Hurricane_Ian/20221002a_RGB/20221002aC0795145w325100n.tif | jq
{
"minzoom": 14,
"maxzoom": 19
}
$ tilebench random https://noaa-eri-pds.s3.amazonaws.com/2022_Hurricane_Ian/20221002a_RGB/20221002aC0795145w325100n.tif --zoom 15
15-9114-13215
$ tilebench profile https://noaa-eri-pds.s3.amazonaws.com/2022_Hurricane_Ian/20221002a_RGB/20221002aC0795145w325100n.tif --tile 15-9114-13215 --config GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR | jq
{
"HEAD": {
"count": 1
},
"GET": {
"count": 2,
"bytes": 409600,
"ranges": [
"0-32767",
"32768-409599"
]
},
"Timing": 0.9715230464935303
}
$ tilebench profile https://noaa-eri-pds.s3.amazonaws.com/2022_Hurricane_Ian/20221002a_RGB/20221002aC0795145w325100n.tif --tile 15-9114-13215 --config GDAL_DISABLE_READDIR_ON_OPEN=FALSE | jq
{
"HEAD": {
"count": 8
},
"GET": {
"count": 3,
"bytes": 409600,
"ranges": [
"0-32767",
"32768-409599"
]
},
"Timing": 2.1837549209594727
}
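To make the effect of `GDAL_DISABLE_READDIR_ON_OPEN` easier to read, the two results above can be compared directly. This is a minimal sketch using values copied from those outputs; the variable names are illustrative, and the timings come from single runs, so the ratio is only indicative.

```python
# Values copied from the two `tilebench profile` outputs above.
empty_dir = {"HEAD": {"count": 1}, "GET": {"count": 2}, "Timing": 0.9715230464935303}
readdir = {"HEAD": {"count": 8}, "GET": {"count": 3}, "Timing": 2.1837549209594727}

# Additional HEAD requests issued when directory listing is allowed.
extra_head = readdir["HEAD"]["count"] - empty_dir["HEAD"]["count"]

# Ratio of wall-clock timings (single measurements, so treat as a rough guide).
slowdown = readdir["Timing"] / empty_dir["Timing"]

print(f"extra HEAD requests: {extra_head}")
print(f"timing ratio: {slowdown:.1f}x")
```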
Warning: This is highly experimental and should not be used in production (#6)
In addition to the viz CLI, we added a Starlette middleware to easily integrate VSI statistics into your web services.
from fastapi import FastAPI
from tilebench.middleware import VSIStatsMiddleware
app = FastAPI()
app.add_middleware(VSIStatsMiddleware)
The middleware adds a vsi-stats entry to the response headers, in the form of:
vsi-stats: list;count=1, head;count=1, get;count=2;size=196608, ranges; values=0-65535|65536-196607
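A client can turn this header back into structured data. The sketch below is based only on the example value shown above (comma-separated entries, `;`-separated parameters, `|`-separated ranges); the exact header grammar is an assumption and `parse_vsi_stats` is a hypothetical helper, not part of tilebench.

```python
def parse_vsi_stats(value: str) -> dict:
    """Hypothetical helper: parse a vsi-stats header value into a dict."""
    stats = {}
    for entry in value.split(","):
        # Each entry looks like "name;key=value;key=value".
        name, *params = (part.strip() for part in entry.split(";"))
        fields = dict(param.split("=", 1) for param in params)
        if name == "ranges":
            # Ranges are joined with "|" in the example header.
            values = fields.get("values", "")
            stats[name] = values.split("|") if values else []
        else:
            stats[name] = {key: int(val) for key, val in fields.items()}
    return stats


# Header value copied from the example above.
header = "list;count=1, head;count=1, get;count=2;size=196608, ranges; values=0-65535|65536-196607"
stats = parse_vsi_stats(header)
print(stats)
```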
Some paths can be excluded from the middleware with the exclude_paths argument:
app.add_middleware(VSIStatsMiddleware, exclude_paths=["/foo", "/bar"])
- CPL_TIMESTAMP: Add timestamps to GDAL log messages
- GDAL_DISABLE_READDIR_ON_OPEN: Enable or disable listing of files in the directory (e.g. external overviews)
- GDAL_INGESTED_BYTES_AT_OPEN: Control how many bytes GDAL will ingest when opening a dataset (useful when a file has a big header)
- CPL_VSIL_CURL_ALLOWED_EXTENSIONS: Limit valid external files
- GDAL_CACHEMAX: Cache size
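- GDAL_HTTP_MERGE_CONSECUTIVE_RANGES: Merge consecutive range requests into a single request
- VSI_CACHE: Enable the per-file read cache
- VSI_CACHE_SIZE: Size (in bytes) of the per-file read cache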
See the full list at https://gdal.org/user/configoptions.html
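When comparing several of these options, it can help to generate the CLI invocation from a dictionary. The sketch below only builds the command line and does not run it. It assumes `--config` can be repeated with one NAME=VALUE pair per flag (the examples in this document only show a single pair); `build_profile_command` is a hypothetical helper and the option values are illustrative.

```python
import shlex

# Illustrative option set; the option names come from the list above.
gdal_config = {
    "GDAL_DISABLE_READDIR_ON_OPEN": "EMPTY_DIR",
    "GDAL_HTTP_MERGE_CONSECUTIVE_RANGES": "YES",
}


def build_profile_command(src_path: str, tile: str, config: dict) -> list:
    """Hypothetical helper: build the argument list for `tilebench profile`."""
    cmd = ["tilebench", "profile", src_path, "--tile", tile]
    for name, value in config.items():
        # Assumption: --config may be passed several times.
        cmd += ["--config", f"{name}={value}"]
    return cmd


cmd = build_profile_command(
    "https://noaa-eri-pds.s3.amazonaws.com/2022_Hurricane_Ian/20221002a_RGB/20221002aC0795145w325100n.tif",
    "15-9114-13215",
    gdal_config,
)
# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(cmd))
```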
$ tilebench viz https://noaa-eri-pds.s3.amazonaws.com/2022_Hurricane_Ian/20221002a_RGB/20221002aC0795145w325100n.tif --config GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR
Blue lines represent the Mercator grid for a specific zoom level, and red lines represent the bounds of the file's internal tiles.
Click on a Mercator tile to see how many requests GDAL/Rasterio makes to read it.
A ready-to-use Docker image is available on the GitHub container registry.
docker run \
    --volume "$PWD":/data \
    --platform linux/amd64 \
    --rm -it -p 8080:8080 ghcr.io/developmentseed/tilebench:latest \
    tilebench viz --host 0.0.0.0 https://noaa-eri-pds.s3.us-east-1.amazonaws.com/2020_Nashville_Tornado/20200307a_RGB/20200307aC0865700w360900n.tif
See CONTRIBUTING.md
See LICENSE
See contributors for a listing of individual contributors.
See CHANGES.md.