Enable retries by default #532

Merged
merged 3 commits into from
Jun 9, 2023
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -15,6 +15,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
- Support for fetching and merging a selection of queryables [#511](https://github.com/stac-utils/pystac-client/pull/511)
- Better error messages for the CLI [#531](https://github.com/stac-utils/pystac-client/pull/531)
- `Modifiable` to our public API [#534](https://github.com/stac-utils/pystac-client/pull/534)
- `max_retries` parameter to `StacApiIO` [#532](https://github.com/stac-utils/pystac-client/pull/532)

### Changed

153 changes: 108 additions & 45 deletions docs/usage.rst
@@ -117,6 +117,114 @@ there are no ``"conformsTo"`` uris set at all. But they can be explicitly set:
Note, updating ``"conformsTo"`` does not change what the server supports, it just
changes PySTAC client's understanding of what the server supports.

Configuring retry behavior
--------------------------

By default, **pystac-client** retries requests that fail DNS lookup or time out.
To change this behavior, e.g. to also retry on some ``50x`` responses, mount a custom adapter on the ``StacApiIO`` session:

.. code-block:: python

    from requests.adapters import HTTPAdapter
    from urllib3 import Retry

    from pystac_client import Client
    from pystac_client.stac_api_io import StacApiIO

    retry = Retry(total=5, backoff_factor=1, status_forcelist=[502, 503, 504])
    stac_api_io = StacApiIO()
    stac_api_io.session.mount("https://", HTTPAdapter(max_retries=retry))
    client = Client.open(
        "https://planetarycomputer.microsoft.com/api/stac/v1", stac_io=stac_api_io
    )
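
To get a feel for how long the ``Retry`` settings in the example above may wait between attempts, the sketch below reproduces the exponential backoff formula described in urllib3's ``Retry`` documentation: no sleep before the first retry, then ``backoff_factor * 2 ** (n - 1)`` seconds, capped at a maximum. ``backoff_schedule`` and its ``backoff_max`` default are names and values assumed here for illustration; they are not part of pystac-client or urllib3, and any jitter is ignored.

```python
from typing import List


def backoff_schedule(
    total: int, backoff_factor: float, backoff_max: float = 120.0
) -> List[float]:
    """Approximate the sleep (in seconds) before each of ``total`` retries.

    Hypothetical helper: it mirrors the formula in urllib3's Retry docs
    and does not call urllib3 itself.
    """
    delays: List[float] = []
    for attempt in range(1, total + 1):
        if attempt == 1:
            # The first retry is issued without waiting.
            delays.append(0.0)
        else:
            # Later retries back off exponentially, up to the cap.
            delays.append(float(min(backoff_max, backoff_factor * 2 ** (attempt - 1))))
    return delays


# Same settings as the example above: Retry(total=5, backoff_factor=1, ...)
print(backoff_schedule(total=5, backoff_factor=1))
```

Under these assumptions, a request that exhausts all five retries spends about 30 seconds sleeping, in addition to the time taken by the requests themselves.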

Automatically modifying results
-------------------------------

Some systems, like the `Microsoft Planetary Computer <http://planetarycomputer.microsoft.com/>`__,
have public STAC metadata but require some `authentication <https://planetarycomputer.microsoft.com/docs/concepts/sas/>`__
to access the actual assets.

``pystac-client`` provides a ``modifier`` keyword that can automatically
modify the STAC objects returned by the STAC API.

.. code-block:: python

    >>> from pystac_client import Client
    >>> import planetary_computer, requests
    >>> catalog = Client.open(
    ...    'https://planetarycomputer.microsoft.com/api/stac/v1',
    ...    modifier=planetary_computer.sign_inplace,
    ... )
    >>> item = next(catalog.get_collection("sentinel-2-l2a").get_all_items())
    >>> requests.head(item.assets["B02"].href).status_code
    200

Without the modifier, we would have received a 404 error because the asset
is in a private storage container.

``pystac-client`` expects that the ``modifier`` callable modifies the result
object in-place and returns no result. A warning is emitted if your
``modifier`` returns a non-None result that is not the same object as the
input.
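
The in-place requirement can be illustrated without any STAC API at all. In this minimal sketch, plain dictionaries stand in for STAC items, and ``sign_in_place``, ``sign_copy`` and the ``token=abc`` query string are made-up names for illustration only. It shows why a modifier that works on a copy leaves the caller's object untouched.

```python
import copy
from typing import Any, Dict


def sign_in_place(item: Dict[str, Any]) -> None:
    """Mutate the caller's object and return nothing (the expected pattern)."""
    for asset in item["assets"].values():
        asset["href"] += "?token=abc"


def sign_copy(item: Dict[str, Any]) -> Dict[str, Any]:
    """Modify a deep copy; the caller's object is left unchanged."""
    signed = copy.deepcopy(item)
    sign_in_place(signed)
    return signed


item = {"assets": {"B02": {"href": "https://example.com/B02.tif"}}}

# The return value is discarded, as a caller expecting in-place edits would do.
sign_copy(item)
print(item["assets"]["B02"]["href"])  # still the unsigned href

sign_in_place(item)
print(item["assets"]["B02"]["href"])  # now carries the query string
```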

Here's an example of creating your own modifier.
Because :py:class:`~pystac_client.Modifiable` is a union, the modifier function must handle several types of input object, and it must modify the input object itself rather than a copy.
Simplifying this interface is an area for future improvement.

.. code-block:: python

    import urllib.parse

    import pystac

    from pystac_client import Client, Modifiable


    def modifier(modifiable: Modifiable) -> None:
        if isinstance(modifiable, dict):
            if modifiable["type"] == "FeatureCollection":
                new_features = list()
                for item_dict in modifiable["features"]:
                    modifier(item_dict)
                    new_features.append(item_dict)
                modifiable["features"] = new_features
            else:
                stac_object = pystac.read_dict(modifiable)
                modifier(stac_object)
                modifiable.update(stac_object.to_dict())
        else:
            for key, asset in modifiable.assets.items():
                url = urllib.parse.urlparse(asset.href)
                if not url.query:
                    asset.href = urllib.parse.urlunparse(url._replace(query="foo=bar"))
                    modifiable.assets[key] = asset


    client = Client.open(
        "https://planetarycomputer.microsoft.com/api/stac/v1", modifier=modifier
    )
    item_search = client.search(collections=["landsat-c2-l2"], max_items=1)
    item = next(item_search.items())
    asset = item.assets["red"]
    assert urllib.parse.urlparse(asset.href).query == "foo=bar"
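
The URL handling in the final branch of the modifier can be tried on its own with only the standard library. ``with_default_query`` is a hypothetical helper name and the ``example.com`` URLs are placeholders; ``foo=bar`` is the same dummy query used above, where a real modifier would attach whatever token or signature the service requires.

```python
import urllib.parse


def with_default_query(href: str, query: str = "foo=bar") -> str:
    """Return ``href`` with ``query`` attached, unless it already has a query."""
    url = urllib.parse.urlparse(href)
    if url.query:
        # Leave hrefs that already carry a query string untouched.
        return href
    return urllib.parse.urlunparse(url._replace(query=query))


print(with_default_query("https://example.com/red.tif"))
print(with_default_query("https://example.com/red.tif?sig=123"))
```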


Using custom certificates
-------------------------

If you need to use custom certificates in your ``pystac-client`` requests, you can
customize the :class:`StacApiIO<pystac_client.stac_api_io.StacApiIO>` instance before
creating your :class:`Client<pystac_client.Client>`.

.. code-block:: python

    >>> from pystac_client.stac_api_io import StacApiIO
    >>> from pystac_client.client import Client
    >>> stac_api_io = StacApiIO()
    >>> stac_api_io.session.verify = "/path/to/certfile"
    >>> client = Client.open("https://planetarycomputer.microsoft.com/api/stac/v1", stac_io=stac_api_io)

CollectionClient
++++++++++++++++

@@ -307,51 +415,6 @@ descending sort and a ``+`` prefix or no prefix means an ascending sort.
]
... )

Automatically modifying results
-------------------------------

Some systems, like the `Microsoft Planetary Computer <http://planetarycomputer.microsoft.com/>`__,
have public STAC metadata but require some `authentication <https://planetarycomputer.microsoft.com/docs/concepts/sas/>`__
to access the actual assets.

``pystac-client`` provides a ``modifier`` keyword that can automatically
modify the STAC objects returned by the STAC API.

.. code-block:: python

    >>> from pystac_client import Client
    >>> import planetary_computer, requests
    >>> catalog = Client.open(
    ...    'https://planetarycomputer.microsoft.com/api/stac/v1',
    ...    modifier=planetary_computer.sign_inplace,
    ... )
    >>> item = next(catalog.get_collection("sentinel-2-l2a").get_all_items())
    >>> requests.head(item.assets["B02"].href).status_code
    200

Without the modifier, we would have received a 404 error because the asset
is in a private storage container.

``pystac-client`` expects that the ``modifier`` callable modifies the result
object in-place and returns no result. A warning is emitted if your
``modifier`` returns a non-None result that is not the same object as the
input.

Using custom certificates
-------------------------

If you need to use custom certificates in your ``pystac-client`` requests, you can
customize the :class:`StacApiIO<pystac_client.stac_api_io.StacApiIO>` instance before
creating your :class:`Client<pystac_client.Client>`.

.. code-block:: python

    >>> from pystac_client.stac_api_io import StacApiIO
    >>> from pystac_client.client import Client
    >>> stac_api_io = StacApiIO()
    >>> stac_api_io.session.verify = "/path/to/certfile"
    >>> client = Client.open("https://planetarycomputer.microsoft.com/api/stac/v1", stac_io=stac_api_io)

Loading data
++++++++++++

7 changes: 7 additions & 0 deletions pystac_client/stac_api_io.py
@@ -25,6 +25,7 @@
)
from pystac.stac_io import DefaultStacIO
from requests import Request, Session
from requests.adapters import HTTPAdapter
from typing_extensions import TypeAlias

import pystac_client
@@ -49,6 +50,7 @@ def __init__(
        parameters: Optional[Dict[str, Any]] = None,
        request_modifier: Optional[Callable[[Request], Union[Request, None]]] = None,
        timeout: Timeout = None,
        max_retries: Optional[int] = 5,
    ):
        """Initialize class for API IO

@@ -69,6 +71,8 @@ def __init__(
            timeout: Optional float or (float, float) tuple following the semantics
                defined by `Requests
                <https://requests.readthedocs.io/en/latest/api/#main-interface>`__.
            max_retries: The number of times to retry requests. Set to ``None`` to
                disable retries.

        Return:
            StacApiIO : StacApiIO instance
@@ -87,6 +91,9 @@ def __init__(
)

        self.session = Session()
        if max_retries:
            self.session.mount("http://", HTTPAdapter(max_retries=max_retries))
            self.session.mount("https://", HTTPAdapter(max_retries=max_retries))
        self.timeout = timeout
        self.update(
            headers=headers, parameters=parameters, request_modifier=request_modifier