Support not publicly trusted certificates in built-in component catalog connectors (#2912)

Add support for SSL server authenticity validation using certificates issued under a private public key infrastructure (PKI) whose root and, optionally, intermediate certificate authorities are not publicly trusted.
ptitzler authored Sep 12, 2022
1 parent 1da381a commit c77c2f7
Showing 6 changed files with 75 additions and 0 deletions.
3 changes: 3 additions & 0 deletions docs/source/user_guide/pipeline-components.md
@@ -397,6 +397,7 @@ The URL component catalog connector provides access to components that are store
- You can specify one or more URL resources.
- The specified URLs must be retrievable using an HTTP `GET` request. `http`, `https`, and `file` [URI schemes](https://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml) are supported.
- If the resources are secured, provide credentials, such as a user id and password or API key.
- In secured environments where SSL server authenticity can only be validated using certificates issued under a private public key infrastructure (PKI), whose root and, optionally, intermediate certificate authorities (CAs) are not publicly trusted, you must define the environment variable `TRUSTED_CA_BUNDLE_PATH` in the environment where JupyterLab/Elyra is running. The variable value must identify an existing [Privacy-Enhanced Mail (PEM) file](https://en.wikipedia.org/wiki/Privacy-Enhanced_Mail).
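
One way to prepare the file that `TRUSTED_CA_BUNDLE_PATH` points to is to concatenate the private root and intermediate CA certificates into a single PEM bundle. The following is a minimal sketch under that assumption; `build_ca_bundle` and the file names in the usage note are hypothetical and not part of Elyra.

```python
from pathlib import Path
from typing import Iterable

PEM_MARKER = "-----BEGIN CERTIFICATE-----"


def build_ca_bundle(cert_files: Iterable[Path], bundle_file: Path) -> Path:
    """Concatenate PEM-encoded CA certificates into one bundle file.

    Hypothetical helper for illustration; it only checks that each input
    contains a PEM certificate marker, not that the certificate is valid.
    """
    parts = []
    for cert_file in cert_files:
        text = Path(cert_file).read_text(encoding="ascii").strip()
        if PEM_MARKER not in text:
            raise ValueError(f"{cert_file} does not look like a PEM certificate")
        parts.append(text)
    # Write the certificates back to back, separated by a newline
    bundle_file.write_text("\n".join(parts) + "\n", encoding="ascii")
    return bundle_file
```

For example, `build_ca_bundle([Path("root-ca.pem"), Path("intermediate-ca.pem")], Path("ca-bundle.pem"))` would produce a bundle whose path can be assigned to `TRUSTED_CA_BUNDLE_PATH` before JupyterLab is started. Because the value is passed to the `verify` parameter of `requests`, the bundle is used in place of the default CA bundle, so it should contain every CA the connectors need to trust.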

Examples (GUI):
- HTTPS URL
@@ -428,6 +429,7 @@ Examples (CLI):
The [Apache Airflow package catalog connector](https://github.com/elyra-ai/elyra/tree/main/elyra/pipeline/airflow/package_catalog_connector) provides access to operators that are stored in Apache Airflow [built distributions](https://packaging.python.org/en/latest/glossary/#term-built-distribution):
- Only the [wheel distribution format](https://packaging.python.org/en/latest/glossary/#term-Wheel) is supported.
- The specified URL must be retrievable using an HTTP `GET` request. `http`, `https`, and `file` [URI schemes](https://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml) are supported.
- In secured environments where SSL server authenticity can only be validated using certificates issued under a private public key infrastructure (PKI), whose root and, optionally, intermediate certificate authorities (CAs) are not publicly trusted, you must define the environment variable `TRUSTED_CA_BUNDLE_PATH` in the environment where JupyterLab/Elyra is running. The variable value must identify an existing [Privacy-Enhanced Mail (PEM) file](https://en.wikipedia.org/wiki/Privacy-Enhanced_Mail).

Examples:
- [Apache Airflow](https://pypi.org/project/apache-airflow/) (v1.10.15):
@@ -443,6 +445,7 @@ Examples:
The [Apache Airflow provider package catalog connector](https://github.com/elyra-ai/elyra/tree/main/elyra/pipeline/airflow/provider_package_catalog_connector) provides access to operators that are stored in [Apache Airflow provider packages](https://airflow.apache.org/docs/apache-airflow-providers/):
- Only the [wheel distribution format](https://packaging.python.org/en/latest/glossary/#term-Wheel) is supported.
- The specified URL must be retrievable using an HTTP `GET` request. `http`, `https`, and `file` [URI schemes](https://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml) are supported.
- In secured environments where SSL server authenticity can only be validated using certificates issued under a private public key infrastructure (PKI), whose root and, optionally, intermediate certificate authorities (CAs) are not publicly trusted, you must define the environment variable `TRUSTED_CA_BUNDLE_PATH` in the environment where JupyterLab/Elyra is running. The variable value must identify an existing [Privacy-Enhanced Mail (PEM) file](https://en.wikipedia.org/wiki/Privacy-Enhanced_Mail).

Examples:
- [apache-airflow-providers-http](https://airflow.apache.org/docs/apache-airflow-providers-http/stable/index.html) (v2.0.2):
@@ -33,6 +33,7 @@
from elyra.pipeline.catalog_connector import ComponentCatalogConnector
from elyra.pipeline.catalog_connector import EntryData
from elyra.util.url import FileTransportAdapter
from elyra.util.url import get_verify_parm


class AirflowPackageCatalogConnector(ComponentCatalogConnector):
@@ -111,6 +112,7 @@ def get_catalog_entries(self, catalog_metadata: Dict[str, Any]) -> List[Dict[str
timeout=AirflowPackageCatalogConnector.REQUEST_TIMEOUT,
allow_redirects=True,
auth=auth,
verify=get_verify_parm(),
)
except Exception as ex:
self.log.error(
@@ -35,6 +35,7 @@
from elyra.pipeline.catalog_connector import ComponentCatalogConnector
from elyra.pipeline.catalog_connector import EntryData
from elyra.util.url import FileTransportAdapter
from elyra.util.url import get_verify_parm


class AirflowProviderPackageCatalogConnector(ComponentCatalogConnector):
@@ -116,6 +117,7 @@ def get_catalog_entries(self, catalog_metadata: Dict[str, Any]) -> List[Dict[str
timeout=AirflowProviderPackageCatalogConnector.REQUEST_TIMEOUT,
allow_redirects=True,
auth=auth,
verify=get_verify_parm(),
)
except Exception as ex:
self.log.error(
2 changes: 2 additions & 0 deletions elyra/pipeline/catalog_connector.py
@@ -42,6 +42,7 @@
from elyra.pipeline.component import ComponentParameter
from elyra.pipeline.runtime_type import RuntimeProcessorType
from elyra.util.url import FileTransportAdapter
from elyra.util.url import get_verify_parm


class EntryData(object):
@@ -664,6 +665,7 @@ def get_entry_data(
timeout=UrlComponentCatalogConnector.REQUEST_TIMEOUT,
allow_redirects=True,
auth=auth,
verify=get_verify_parm(),
)
except Exception as e:
self.log.error(
51 changes: 51 additions & 0 deletions elyra/tests/util/test_url.py
@@ -13,11 +13,14 @@
# See the License for the specific language governing permissions and
# limitations under the License.
#
import os
from pathlib import Path

import pytest
from requests import session

from elyra.util.url import FileTransportAdapter
from elyra.util.url import get_verify_parm


def test_valid_file_url():
@@ -78,3 +81,51 @@ def test_invalid_file_url():
res = unsupported_method(url)
assert res.status_code == 405, url
assert res.reason == "Method not allowed"


@pytest.fixture
def setup_env_vars():
    # runs before test (save the value of environment variable
    # TRUSTED_CA_BUNDLE_PATH to avoid any contamination by tests
    # that modify it)
    current_TRUSTED_CA_BUNDLE_PATH_value = os.environ.get("TRUSTED_CA_BUNDLE_PATH")
    yield
    # runs after test (restore the value of environment variable
    # TRUSTED_CA_BUNDLE_PATH, if it was defined)
    if current_TRUSTED_CA_BUNDLE_PATH_value is not None:
        os.environ["TRUSTED_CA_BUNDLE_PATH"] = current_TRUSTED_CA_BUNDLE_PATH_value


@pytest.mark.usefixtures("setup_env_vars")
def test_valid_get_verify_parm():
    """
    Verify that method get_verify_parm works as expected:
     - env variable TRUSTED_CA_BUNDLE_PATH is defined
     - env variable TRUSTED_CA_BUNDLE_PATH is not defined, but a default is specified
     - env variable TRUSTED_CA_BUNDLE_PATH is not defined and no default is specified
    """
    test_TRUSTED_CA_BUNDLE_PATH_value = "/path/to/cert/bundle"
    os.environ["TRUSTED_CA_BUNDLE_PATH"] = test_TRUSTED_CA_BUNDLE_PATH_value
    assert get_verify_parm() == test_TRUSTED_CA_BUNDLE_PATH_value
    del os.environ["TRUSTED_CA_BUNDLE_PATH"]
    # set explicit default
    assert get_verify_parm(False) is False
    # set explicit default
    assert get_verify_parm(True) is True
    # use implicit default
    assert get_verify_parm() is True


@pytest.mark.usefixtures("setup_env_vars")
def test_invalid_get_verify_parm():
    """
    Verify that method get_verify_parm works as expected if environment
    variable TRUSTED_CA_BUNDLE_PATH contains an invalid value
    """
    test_TRUSTED_CA_BUNDLE_PATH_value = ""
    os.environ["TRUSTED_CA_BUNDLE_PATH"] = test_TRUSTED_CA_BUNDLE_PATH_value
    assert get_verify_parm() is True

    test_TRUSTED_CA_BUNDLE_PATH_value = "   "
    os.environ["TRUSTED_CA_BUNDLE_PATH"] = test_TRUSTED_CA_BUNDLE_PATH_value
    assert get_verify_parm() is True
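
The `setup_env_vars` fixture above restores `TRUSTED_CA_BUNDLE_PATH` only when the variable was defined before the test; if it was unset, a value assigned by a test stays in place afterwards. A minimal sketch of an alternative, assuming the standard library's `unittest.mock.patch.dict` is acceptable in the test suite, is shown below. `run_with_isolated_env` is a hypothetical name used only for this illustration.

```python
import os
from unittest.mock import patch

ENV_VAR = "TRUSTED_CA_BUNDLE_PATH"


def run_with_isolated_env() -> list:
    """Set, delete, and re-set ENV_VAR inside patch.dict; return observed values."""
    observed = []
    with patch.dict(os.environ, {ENV_VAR: "/path/to/cert/bundle"}):
        observed.append(os.environ.get(ENV_VAR))
        del os.environ[ENV_VAR]
        observed.append(os.environ.get(ENV_VAR))
        os.environ[ENV_VAR] = "   "
        observed.append(os.environ.get(ENV_VAR))
    # On exit, patch.dict restores os.environ to its state before the block,
    # including removing ENV_VAR if it was not defined originally.
    return observed
```

Inside the `with` block the test can modify or delete the variable freely; the original environment is reinstated whether or not the variable existed beforehand.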
15 changes: 15 additions & 0 deletions elyra/util/url.py
@@ -13,7 +13,9 @@
# See the License for the specific language governing permissions and
# limitations under the License.
#
import os
from pathlib import Path
from typing import Union
from urllib.request import url2pathname

from requests import Response
@@ -63,3 +65,16 @@ def send(self, req, **kwargs):

def close(self):
pass


def get_verify_parm(default: bool = True) -> Union[bool, str]:
    """
    Returns a value for the 'verify' parameter of the requests.request
    method (https://requests.readthedocs.io/en/latest/api/). The value
    is determined as follows: if environment variable TRUSTED_CA_BUNDLE_PATH
    is defined, use its value, otherwise return the default value.
    """
    if len(os.environ.get("TRUSTED_CA_BUNDLE_PATH", "").strip()) > 0:
        return os.environ.get("TRUSTED_CA_BUNDLE_PATH")

    return default
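
The three possible outcomes of the helper can be sketched as follows. The function body repeats the logic shown above so that the example runs on its own; `describe_verify` is a hypothetical helper, added only to spell out how `requests` is documented to interpret each kind of value.

```python
import os
from typing import Union


def get_verify_parm(default: bool = True) -> Union[bool, str]:
    # Same logic as the helper above, repeated so this sketch is standalone
    if len(os.environ.get("TRUSTED_CA_BUNDLE_PATH", "").strip()) > 0:
        return os.environ.get("TRUSTED_CA_BUNDLE_PATH")
    return default


def describe_verify(verify: Union[bool, str]) -> str:
    # Hypothetical helper: describe what each 'verify' value means to requests
    if verify is True:
        return "validate against the default CA bundle"
    if verify is False:
        return "skip certificate validation"
    return f"validate against the CA bundle at {verify}"
```

Note that an empty or whitespace-only value falls back to the default, and that a non-blank value is returned exactly as stored in the environment: the helper neither strips it nor checks that the file exists.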
