Support not publicly trusted certificates in built-in component catalog connectors #2912

Merged
3 changes: 3 additions & 0 deletions docs/source/user_guide/pipeline-components.md
@@ -397,6 +397,7 @@ The URL component catalog connector provides access to components that are store
- You can specify one or more URL resources.
- The specified URLs must be retrievable using an HTTP `GET` request. `http`, `https`, and `file` [URI schemes](https://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml) are supported.
- If the resources are secured, provide credentials, such as a user id and password or API key.
- In secured environments where SSL server authenticity can only be validated using certificates issued by a private public key infrastructure (PKI), with root and optionally intermediate certificate authorities (CAs) that are not publicly trusted, you must define the environment variable `TRUSTED_CA_BUNDLE_PATH` in the environment where JupyterLab/Elyra is running. The variable value must identify an existing [Privacy-Enhanced Mail (PEM) file](https://en.wikipedia.org/wiki/Privacy-Enhanced_Mail).

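Because the variable must point to an existing PEM file, it can help to check the file before starting JupyterLab/Elyra. The following is a minimal sketch, not part of this change: the helper name `ca_bundle_is_loadable` is hypothetical, and it only uses Python's standard `ssl` module to test whether a file can be loaded as a set of trusted CA certificates.

```python
import os
import ssl


def ca_bundle_is_loadable(path: str) -> bool:
    """Hypothetical helper: return True if `path` can be loaded as a CA bundle."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    try:
        # Raises FileNotFoundError for a missing file and ssl.SSLError
        # for a file that contains no usable certificates.
        context.load_verify_locations(cafile=path)
    except OSError:
        # ssl.SSLError and FileNotFoundError are both subclasses of OSError.
        return False
    return True


# Check the value of TRUSTED_CA_BUNDLE_PATH, if one is defined.
bundle_path = os.environ.get("TRUSTED_CA_BUNDLE_PATH", "").strip()
if bundle_path:
    print(bundle_path, "loadable:", ca_bundle_is_loadable(bundle_path))
```

A `False` result suggests that the path is wrong or that the file is not PEM-encoded.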
Examples (GUI):
- HTTPS URL
@@ -428,6 +429,7 @@ Examples (CLI):
The [Apache Airflow package catalog connector](https://github.com/elyra-ai/elyra/tree/main/elyra/pipeline/airflow/package_catalog_connector) provides access to operators that are stored in Apache Airflow [built distributions](https://packaging.python.org/en/latest/glossary/#term-built-distribution):
- Only the [wheel distribution format](https://packaging.python.org/en/latest/glossary/#term-Wheel) is supported.
- The specified URL must be retrievable using an HTTP `GET` request. `http`, `https`, and `file` [URI schemes](https://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml) are supported.
- In secured environments where SSL server authenticity can only be validated using certificates issued by a private public key infrastructure (PKI), with root and optionally intermediate certificate authorities (CAs) that are not publicly trusted, you must define the environment variable `TRUSTED_CA_BUNDLE_PATH` in the environment where JupyterLab/Elyra is running. The variable value must identify an existing [Privacy-Enhanced Mail (PEM) file](https://en.wikipedia.org/wiki/Privacy-Enhanced_Mail).

Examples:
- [Apache Airflow](https://pypi.org/project/apache-airflow/) (v1.10.15):
@@ -443,6 +445,7 @@ Examples:
The [Apache Airflow provider package catalog connector](https://github.com/elyra-ai/elyra/tree/main/elyra/pipeline/airflow/provider_package_catalog_connector) provides access to operators that are stored in [Apache Airflow provider packages](https://airflow.apache.org/docs/apache-airflow-providers/):
- Only the [wheel distribution format](https://packaging.python.org/en/latest/glossary/#term-Wheel) is supported.
- The specified URL must be retrievable using an HTTP `GET` request. `http`, `https`, and `file` [URI schemes](https://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml) are supported.
- In secured environments where SSL server authenticity can only be validated using certificates issued by a private public key infrastructure (PKI), with root and optionally intermediate certificate authorities (CAs) that are not publicly trusted, you must define the environment variable `TRUSTED_CA_BUNDLE_PATH` in the environment where JupyterLab/Elyra is running. The variable value must identify an existing [Privacy-Enhanced Mail (PEM) file](https://en.wikipedia.org/wiki/Privacy-Enhanced_Mail).

Examples:
- [apache-airflow-providers-http](https://airflow.apache.org/docs/apache-airflow-providers-http/stable/index.html) (v2.0.2):
@@ -33,6 +33,7 @@
from elyra.pipeline.catalog_connector import ComponentCatalogConnector
from elyra.pipeline.catalog_connector import EntryData
from elyra.util.url import FileTransportAdapter
from elyra.util.url import get_verify_parm


class AirflowPackageCatalogConnector(ComponentCatalogConnector):
@@ -111,6 +112,7 @@ def get_catalog_entries(self, catalog_metadata: Dict[str, Any]) -> List[Dict[str
timeout=AirflowPackageCatalogConnector.REQUEST_TIMEOUT,
allow_redirects=True,
auth=auth,
verify=get_verify_parm(),
)
except Exception as ex:
self.log.error(
@@ -35,6 +35,7 @@
from elyra.pipeline.catalog_connector import ComponentCatalogConnector
from elyra.pipeline.catalog_connector import EntryData
from elyra.util.url import FileTransportAdapter
from elyra.util.url import get_verify_parm


class AirflowProviderPackageCatalogConnector(ComponentCatalogConnector):
@@ -116,6 +117,7 @@ def get_catalog_entries(self, catalog_metadata: Dict[str, Any]) -> List[Dict[str
timeout=AirflowProviderPackageCatalogConnector.REQUEST_TIMEOUT,
allow_redirects=True,
auth=auth,
verify=get_verify_parm(),
)
except Exception as ex:
self.log.error(
2 changes: 2 additions & 0 deletions elyra/pipeline/catalog_connector.py
@@ -42,6 +42,7 @@
from elyra.pipeline.component import ComponentParameter
from elyra.pipeline.runtime_type import RuntimeProcessorType
from elyra.util.url import FileTransportAdapter
from elyra.util.url import get_verify_parm


class EntryData(object):
@@ -664,6 +665,7 @@ def get_entry_data(
timeout=UrlComponentCatalogConnector.REQUEST_TIMEOUT,
allow_redirects=True,
auth=auth,
verify=get_verify_parm(),
)
except Exception as e:
self.log.error(
51 changes: 51 additions & 0 deletions elyra/tests/util/test_url.py
@@ -13,11 +13,14 @@
# See the License for the specific language governing permissions and
# limitations under the License.
#
import os
from pathlib import Path

import pytest
from requests import session

from elyra.util.url import FileTransportAdapter
from elyra.util.url import get_verify_parm


def test_valid_file_url():
@@ -78,3 +81,51 @@ def test_invalid_file_url():
res = unsupported_method(url)
assert res.status_code == 405, url
assert res.reason == "Method not allowed"


@pytest.fixture
def setup_env_vars():
    # runs before test (save the value of environment variable
    # TRUSTED_CA_BUNDLE_PATH to avoid any contamination by tests
    # that modify it)
    current_TRUSTED_CA_BUNDLE_PATH_value = os.environ.get("TRUSTED_CA_BUNDLE_PATH")
    yield
    # runs after test (restore the value of environment variable
    # TRUSTED_CA_BUNDLE_PATH, if it was defined; otherwise remove
    # any value that a test left behind)
    if current_TRUSTED_CA_BUNDLE_PATH_value is not None:
        os.environ["TRUSTED_CA_BUNDLE_PATH"] = current_TRUSTED_CA_BUNDLE_PATH_value
    else:
        os.environ.pop("TRUSTED_CA_BUNDLE_PATH", None)


@pytest.mark.usefixtures("setup_env_vars")
def test_valid_get_verify_parm():
    """
    Verify that method get_verify_parm works as expected:
     - env variable TRUSTED_CA_BUNDLE_PATH is defined
     - env variable TRUSTED_CA_BUNDLE_PATH is not defined, but a default is specified
     - env variable TRUSTED_CA_BUNDLE_PATH is not defined and no default is specified
    """
    test_TRUSTED_CA_BUNDLE_PATH_value = "/path/to/cert/bundle"
    os.environ["TRUSTED_CA_BUNDLE_PATH"] = test_TRUSTED_CA_BUNDLE_PATH_value
    assert get_verify_parm() == test_TRUSTED_CA_BUNDLE_PATH_value
    del os.environ["TRUSTED_CA_BUNDLE_PATH"]
    # set explicit default
    assert get_verify_parm(False) is False
    # set explicit default
    assert get_verify_parm(True) is True
    # use implicit default
    assert get_verify_parm() is True


@pytest.mark.usefixtures("setup_env_vars")
def test_invalid_get_verify_parm():
    """
    Verify that method get_verify_parm works as expected if environment
    variable TRUSTED_CA_BUNDLE_PATH contains an invalid value
    """
    test_TRUSTED_CA_BUNDLE_PATH_value = ""
    os.environ["TRUSTED_CA_BUNDLE_PATH"] = test_TRUSTED_CA_BUNDLE_PATH_value
    assert get_verify_parm() is True

    test_TRUSTED_CA_BUNDLE_PATH_value = " "
    os.environ["TRUSTED_CA_BUNDLE_PATH"] = test_TRUSTED_CA_BUNDLE_PATH_value
    assert get_verify_parm() is True
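
The save-and-restore pattern that these tests rely on can also be written as a context manager. The sketch below is illustrative only and is not part of this change: the name `preserved_env_var` and the variable `DEMO_CA_BUNDLE_PATH` are hypothetical. It restores the previous value on exit and removes the variable if it was not defined beforehand.

```python
import contextlib
import os
from typing import Iterator


@contextlib.contextmanager
def preserved_env_var(name: str) -> Iterator[None]:
    """Hypothetical helper: undo any change to `name` made inside the block."""
    saved = os.environ.get(name)
    try:
        yield
    finally:
        if saved is None:
            # The variable was not defined before: remove whatever was set.
            os.environ.pop(name, None)
        else:
            # The variable was defined before: restore its previous value.
            os.environ[name] = saved


# Usage with a made-up variable name that starts out undefined.
os.environ.pop("DEMO_CA_BUNDLE_PATH", None)
with preserved_env_var("DEMO_CA_BUNDLE_PATH"):
    os.environ["DEMO_CA_BUNDLE_PATH"] = "/path/to/cert/bundle"
leaked = "DEMO_CA_BUNDLE_PATH" in os.environ
```

Handling the undefined case explicitly keeps a value set by one test from being visible to the next.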
15 changes: 15 additions & 0 deletions elyra/util/url.py
@@ -13,7 +13,9 @@
# See the License for the specific language governing permissions and
# limitations under the License.
#
import os
from pathlib import Path
from typing import Union
from urllib.request import url2pathname

from requests import Response
@@ -63,3 +65,16 @@ def send(self, req, **kwargs):

def close(self):
pass


def get_verify_parm(default: bool = True) -> Union[bool, str]:
    """
    Returns a value for the 'verify' parameter of the requests.request
    method (https://requests.readthedocs.io/en/latest/api/). The value
    is determined as follows: if environment variable TRUSTED_CA_BUNDLE_PATH
    is defined and not blank, use its value, otherwise return the default value.
    """
    if len(os.environ.get("TRUSTED_CA_BUNDLE_PATH", "").strip()) > 0:
        return os.environ.get("TRUSTED_CA_BUNDLE_PATH")

    return default
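
The fallback behavior of `get_verify_parm` can be illustrated with a standalone sketch. The function `resolve_verify` below is a hypothetical stand-in, not the merged code: it reads the variable once and, unlike `get_verify_parm`, returns the value with surrounding whitespace stripped. The example uses `unittest.mock.patch.dict` from the standard library so that the real environment is left untouched.

```python
import os
from typing import Union
from unittest import mock


def resolve_verify(default: bool = True) -> Union[bool, str]:
    """Hypothetical stand-in for get_verify_parm that does a single lookup."""
    bundle_path = os.environ.get("TRUSTED_CA_BUNDLE_PATH", "").strip()
    return bundle_path if bundle_path else default


# Variable is set: the bundle path is returned.
with mock.patch.dict(os.environ, {"TRUSTED_CA_BUNDLE_PATH": "/path/to/cert/bundle"}):
    with_bundle = resolve_verify()

# Variable is blank: the default (True) is returned.
with mock.patch.dict(os.environ, {"TRUSTED_CA_BUNDLE_PATH": "   "}):
    blank = resolve_verify()

# Variable is absent: an explicit default is passed through.
with mock.patch.dict(os.environ, {}, clear=True):
    absent = resolve_verify(False)
```

In the connectors, the resulting value is passed as the `verify` argument of the `requests` call. There, `True` means verification against the default trusted CAs, `False` disables verification, and a string is treated as the path to a CA bundle.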