Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: hostname port issue #4816

Merged
merged 4 commits into from
Feb 2, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 12 additions & 6 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -90,12 +90,18 @@ it will be removed; but as it won't be user-visible this isn't considered a brea

### Emissary-ingress and Ambassador Edge Stack

- Change: emissary-ingress can be slow to start when there is a lot of mappings. Kubernetes
recommends using a startupProbe over tweaking the initialDelaySeconds of the livenessProbe and
readinessProbe. The Helm chart was updated to allow setting a startupProbe on the emissary-ingress
deployment. ([#4649])

[#4649]: https://github.com/emissary-ingress/emissary/pull/4649
- Bugfix: When wanting to expose traffic to clients on ports other than 80/443, users will set a
port in the Host.hostname (eg.`Host.hostname=example.com:8500`. The config generated allowed
matching on the :authority header. This worked in v1.Y series due to the way emissary was
generating Envoy configuration under a single wild-card virtual_host and matching on
:authority.

In v2.Y/v3.Y+, the way emissary generates Envoy configuration changed to address
memory pressure and improve route lookup speed in Envoy. However, when including a port in the
hostname, an incorrect configuration was generated with an sni match including the port. This has
been fixed and the correct envoy configuration is being generated. ([fix: hostname port issue])

[fix: hostname port issue]: https://github.com/emissary-ingress/emissary/pull/4816

## [3.4.0] January 03, 2023
[3.4.0]: https://github.com/emissary-ingress/emissary/compare/v3.3.0...v3.4.0
Expand Down
4 changes: 3 additions & 1 deletion charts/emissary-ingress/CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,9 @@ numbering uses [semantic versioning](http://semver.org).
## v8.5.0 - TBD

- Upgrade Emissary to v3.5.0 [CHANGELOG](https://github.com/emissary-ingress/emissary/blob/master/CHANGELOG.md)
- Adds opt-in settings to configure a startupProbe for the Emissary-ingress pod
- Adds support for configuring a startupProbe on the emissary-ingress deployments. This is useful when a larger number of resources
are used and the initial startup needs additional time. See for details https://github.com/emissary-ingress/emissary/pull/4649.


## v8.4.0 - 2022-01-03

Expand Down
23 changes: 15 additions & 8 deletions docs/releaseNotes.yml
Original file line number Diff line number Diff line change
Expand Up @@ -36,16 +36,23 @@ items:
prevVersion: 3.4.0
date: 'TBD'
notes:
- title: Add support for startupProbe on the emissary-ingress deployments
type: change
- title: Fix envoy config generation when including port in Host.hostname
Alice-Lilith marked this conversation as resolved.
Show resolved Hide resolved
type: bugfix
body: >-
emissary-ingress can be slow to start when there is a lot of mappings. Kubernetes
recommends using a startupProbe over tweaking the initialDelaySeconds of the livenessProbe
and readinessProbe. The Helm chart was updated to allow setting a startupProbe on the
emissary-ingress deployment.
When wanting to expose traffic to clients on ports other than 80/443, users will set
a port in the Host.hostname (eg.<code>Host.hostname=example.com:8500</code>. The config
generated allowed matching on the :authority header. This worked in v1.Y series due to the
way emissary was generating Envoy configuration under a single wild-card virtual_host and matching
on :authority.


In v2.Y/v3.Y+, the way emissary generates Envoy configuration changed to address memory pressure and improve
route lookup speed in Envoy. However, when including a port in the hostname, an incorrect configuration was
generated with an sni match including the port. This has been fixed and the correct envoy configuration is
being generated.
github:
- title: "#4649"
link: https://github.com/emissary-ingress/emissary/pull/4649
- title: "fix: hostname port issue"
link: https://github.com/emissary-ingress/emissary/pull/4816

- version: 3.4.0
prevVersion: 3.3.0
Expand Down
200 changes: 123 additions & 77 deletions python/ambassador/envoy/v3/v3listener.py

Large diffs are not rendered by default.

73 changes: 37 additions & 36 deletions python/ambassador/ir/irhost.py
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,7 @@ class IRHost(IRResource):
}

hostname: str
sni: str
secure_action: str
insecure_action: str
insecure_addl_port: Optional[int]
Expand Down Expand Up @@ -64,6 +65,8 @@ def setup(self, ir: "IR", aconf: Config) -> bool:
if not self.get("hostname", None):
self.hostname = "*"

self.sni = self.hostname.rsplit(":", 1)[0]

tls_ss: Optional[SavedSecret] = None
pkey_ss: Optional[SavedSecret] = None

Expand Down Expand Up @@ -204,7 +207,7 @@ def setup(self, ir: "IR", aconf: Config) -> bool:
name=ctx_name,
namespace=self.namespace,
location=self.location,
hosts=[self.hostname or self.name],
hosts=[self.hostname],
secret=tls_full_name,
)

Expand Down Expand Up @@ -256,7 +259,7 @@ def setup(self, ir: "IR", aconf: Config) -> bool:
name=ctx_name,
namespace=self.namespace,
location=self.location,
hosts=[self.hostname or self.name],
hosts=[self.hostname],
secret=tls_full_name,
)

Expand Down Expand Up @@ -391,17 +394,17 @@ def save_context(self, ir: "IR", ctx_name: str, tls_ss: SavedSecret, tls_name: s
# TLS config is good, let's make sure the hosts line up too.
context_hosts = ctx.get("hosts")

# XXX WTF? self.name is not OK as a hostname! Leaving this for the moment, but it's
# almost certainly getting shredded before 2.0 GAs.
host_hosts = [self.name]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the ramifications of getting rid of name? Basically why can we do it now and it wasn't done before?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as we discussed offline, I dug into this deeper and although we don't document this anywhere there is a potential that a user could ignore our docs and intpret the TLSContext.hosts as needing to match the name of a Host.name.

Although, it is unlikely and it is not our intended behavior,I will add it back since it is not crucial to the fix for this PR. I will capture it along with the other deprecated TLS features that I outlined and we can plan for its removal in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is the overall issue on this topic:
#4826

I make a mention to this and that it is deprecated. I will have a fixup commit here shortly that does remove some dead code where we check self.hostname or self.name because we default self.hostname to "*" if user doesn't provide it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FTR I'm the one who wrote that comment. self.name is not guaranteed to be something that's actually sane for a DNS match, IIRC, so I'm in favor of it going away. However, it used to be important that host_hosts never be empty; might be worth a test.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Newer folks, if you see a comment that starts with XXX, it's almost certainly something I wrote. 🙂)


if self.hostname:
host_hosts.append(self.hostname)
# Technically a user can set the TLSContext.hosts to include a Host.name thus allowing it to pass
# the validation and saved as the context for this Host. Although, this is not intended
# or documented behavior it, removing it could break users so we need to mark it as deprecated.
# @deprecated - validating TLSContext.hosts against Host.name will be removed, use hostname for matching instead.
host_hosts = [self.hostname, self.name]

if context_hosts:
is_valid_hosts = False

# XXX Should we be doing a glob check here?
# we exact match here, which requires users being explicit about whether a TLSContext can attach
# to a Host. Note: this does not do hostname glob matching.
for host_tc in context_hosts:
if host_tc in host_hosts:
is_valid_hosts = True
Expand All @@ -413,8 +416,7 @@ def save_context(self, ir: "IR", ctx_name: str, tls_ss: SavedSecret, tls_name: s
)
# XXX Shouldn't we return false here?
else:
# XXX WTF? self.name is not OK as a hostname!
ctx["hosts"] = [self.hostname or self.name]
ctx["hosts"] = [self.hostname]

self.logger.debug(f"Host {self.name}, final ctx {ctx.name}: {ctx.as_json()}")

Expand All @@ -435,6 +437,17 @@ def matches_httpgroup(self, group: "IRHTTPMappingGroup") -> bool:
mappingSelector.
"""

groupName = group.get("name") or "None"

# The synthetic Mappings for diagnostics, readiness, and liveness probes always match all Hosts.
# They can all still be disabled if desired via the Ambassador Module resource
if groupName in [
"GROUP: internal_readiness_probe_mapping",
"GROUP: internal_liveness_probe_mapping",
"GROUP: internal_diagnostics_probe_mapping",
]:
return True

has_hostname = False
host_match = False
sel_match = False
Expand Down Expand Up @@ -474,34 +487,22 @@ def matches_httpgroup(self, group: "IRHTTPMappingGroup") -> bool:
sel_match,
)

groupName = group.get("name") or "None"

# The synthetic Mappings for diagnostics, readiness, and liveness probes always match all Hosts.
# They can all still be disabled if desired via the Ambassador Module resource
if groupName in [
"GROUP: internal_readiness_probe_mapping",
"GROUP: internal_liveness_probe_mapping",
"GROUP: internal_diagnostics_probe_mapping",
]:
return True

if disable_strict_selectors():
# User opted-in to existing deprecated behavior (not-recommmended)
return host_match or sel_match

if mapsel:
if has_hostname:
# If both mappingSelector and hostname are present, then they must both be a match
return host_match and sel_match
else:
# If the Mapping does not provide a hostname, then only the mappingSelector must match
return sel_match
elif mapsel and has_hostname:
# If both mappingSelector and hostname are present, then they must both be a match
return host_match and sel_match
elif mapsel:
# If the Mapping does not provide a hostname, then only the mappingSelector must match
return sel_match

elif has_hostname:
# If there is no mappingSelector then it must only match the provided hostname
return host_match
else:
if has_hostname:
# If there is no mappingSelector then it must only match the provided hostname
return host_match
else:
# If there is no mappingSelector or hostname then it cannot match anything
return False
# If there is no mappingSelector or hostname then it cannot match anything
return False

def __str__(self) -> str:
request_policy = self.get("requestPolicy", {})
Expand Down
9 changes: 9 additions & 0 deletions python/tests/unit/test_hostglob_matches.py
Original file line number Diff line number Diff line change
Expand Up @@ -53,6 +53,15 @@ def test_hostglob_matches():
("*.example.com", "a.b.example.*", True),
("*.example.baz.com", "a.b.example.*", True),
("*.foo.bar", "baz.zing.*", True),
# The Host.hostname is overloaded in that it determines the SNI for TLS and the
# virtual host name for :authority header matching for HTTP. These are valid
# scenarios that users try when using non-standard ports so we make sure they work.
("*.local:8500", "quote.local", False),
("*.local:8500", "quote.local:8500", True),
("*", "quote.local:8500", True),
("quote.*", "quote.local:8500", True),
("quote.*", "*.local:8500", True),
("quote.com:8500", "quote.com:8500", True),
]:
assert hostglob_matches(v1, v2) == wanted_result, f"1. {v1} ~ {v2} != {wanted_result}"
assert hostglob_matches(v2, v1) == wanted_result, f"2. {v2} ~ {v1} != {wanted_result}"
Expand Down
99 changes: 98 additions & 1 deletion python/tests/unit/test_listener.py
Original file line number Diff line number Diff line change
@@ -1,15 +1,23 @@
import os
from dataclasses import dataclass
from typing import List, Optional

import pytest
import yaml

from ambassador import IR
from ambassador.compile import Compile
from ambassador.config import Config
from ambassador.envoy import EnvoyConfig
from ambassador.fetch import ResourceFetcher
from ambassador.utils import EmptySecretHandler
from tests.utils import Compile, default_http3_listener_manifest, econf_compile, logger
from tests.utils import (
Compile,
default_http3_listener_manifest,
econf_compile,
logger,
skip_edgestack,
)


def _ensure_alt_svc_header_injected(listener, expectedAltSvc):
Expand Down Expand Up @@ -316,3 +324,92 @@ def test_http3_non_matching_ports(self):
assert len(udp_filter_chains) == 1

_verify_no_added_response_headers(udp_listener)

@skip_edgestack()
@pytest.mark.compilertest
def test_listener_filterchain_vhost_generation(self):
"""Ensure that the Listener FilterChain and correct vhosts are generated based on the
provided Listener, Host and Mappings to ensure mutliple scenarios are covered such as
clients sending hostname:port and addressing h2/h3 connection re-use on parent domains
"""

def _cleanse_secrets(listeners: dict):
"""
For these tests the full path of the secret is not what is being tested. This will
mutate the listeners and remove secret paths and replace with static values so that when
diffing against expected results they don't cause variance in tests.
"""

for listIndex, listener in enumerate(listeners):
for fcIndex, fc in enumerate(listener["filter_chains"]):
certs = (
fc.get("transport_socket", {})
.get("typed_config", {})
.get("common_tls_context", {})
.get("tls_certificates", {})
)
for i, cert in enumerate(certs):
if "filename" in cert.get("certificate_chain", {}):
cert["certificate_chain"]["filename"] = "test.crt"
if "filename" in cert.get("private_key", {}):
cert["private_key"]["filename"] = "test.key"
certs[i] = cert
listeners[listIndex]["filter_chains"][fcIndex] = fc

testdata_dir = os.path.join(
os.path.dirname(os.path.abspath(__file__)), "testdata", "listeners"
)

@dataclass
class TestCase:
# name should match the file name in testdata/listeners (excluding the _out.yaml and _in.yaml suffixes)
name: str
# description allows developer to provide my information on use-case without having to study the in/out files
description: str

testcases = [
TestCase(
name="host_missing_tls",
description="""
When applying the quick-start of 8080 and 8443 with a Host that doesn't have tls configured it will generate basically
the same FilterChain/Vhost/Routes between the two listeners. The HTTPS Listener will not generate a
Filter Chain checking for matches on TLS and/or SNI. A fallback cert is not used because a Host is configured
although arguable incorrectly. Ideally, the HTTPS listener (8443) should not attach anything or use the fallback
cert as-if no Host was provided but since this was existing behavior it has been left that way for now.
""",
),
TestCase(
name="no_host",
description="""
If no Host is provided the http (8080) listener will create the normal "shared http" filter chain
and the https (8443) listener will generate two filter chains; a "shared http" filter chain to catch non-tls traffic
and a filter chain matching on tls and using the fall-back cert with NO sni matching.
""",
),
TestCase(
name="prefix_wildcard_and_hostname_with_port",
description="""
Properly handle Host with prefix wildcards (*.local) and hosts with portnames (*.local:8500). In
this scenario both host will be coalesced into the same FilterChain due to matching SNI but
will get their own virtual hosts due to needing to match on a :authority header. Mappings will
associate to the most specific vhost based on the Host.hostname and Mapping.hostname fields
""",
),
]
for case in testcases:

applied_yaml = open(os.path.join(testdata_dir, f"{case.name}_in.yaml"), "r").read()
expected = yaml.safe_load(
open(os.path.join(testdata_dir, f"{case.name}_out.yaml"), "r")
)

econf = econf_compile(applied_yaml)
assert econf

expListeners = expected.get("listeners", {})
assert expListeners != {}

listeners = econf.get("static_resources", {}).get("listeners", [])
_cleanse_secrets(listeners)

assert expListeners == listeners
44 changes: 44 additions & 0 deletions python/tests/unit/testdata/listeners/host_missing_tls_in.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
---
apiVersion: getambassador.io/v3alpha1
kind: Listener
metadata:
name: listener-8080
namespace: ambassador
spec:
port: 8080
protocol: HTTP
securityModel: XFP
hostBinding:
namespace:
from: ALL
---
apiVersion: getambassador.io/v3alpha1
kind: Listener
metadata:
name: listener-8443
namespace: ambassador
spec:
port: 8443
protocol: HTTPS
securityModel: XFP
hostBinding:
namespace:
from: ALL
---
apiVersion: getambassador.io/v3alpha1
kind: Host
metadata:
name: minimal-host
namespace: default
spec:
hostname: '*.local:8500'
---
apiVersion: getambassador.io/v3alpha1
kind: Mapping
metadata:
name: quote-backend
namespace: default
spec:
hostname: "quote.local:8500"
prefix: /backend/
service: quote
Loading