fix(*): localhost exposed application shouldn't be reachable #4654
Signed-off-by: Łukasz Dziedziak <lukidzi@gmail.com>
…ix/localhost-expose
Codecov Report
@@ Coverage Diff @@
## master #4654 +/- ##
========================================
Coverage 46.43% 46.44%
========================================
Files 687 688 +1
Lines 46710 46847 +137
========================================
+ Hits 21692 21758 +66
- Misses 23097 23161 +64
- Partials 1921 1928 +7
I need to add some tests but the basic logic should be ready.
@lukidzi would it be possible to break this into logical commits? IMO it's hard to review a PR this big otherwise
@@ -129,7 +129,7 @@ test/e2e-kubernetes: $(E2E_DEPS_TARGETS)
$(MAKE) test/e2e/k8s/start/cluster/kuma-1
$(MAKE) test/e2e/k8s/wait/kuma-1
$(MAKE) test/e2e/k8s/load/images/kuma-1
$(E2E_ENV_VARS) $(GINKGO_TEST_E2E) $(KUBE_E2E_PKG_LIST) || (ret=$$?; $(MAKE) test/e2e/k8s/stop/cluster/kuma-1 && exit $$ret)
$(E2E_ENV_VARS) $(GINKGO_TEST_E2E) $(KUBE_E2E_PKG_LIST) || (ret=$$?; $(MAKE) test/e2e/k8s/stop/cluster/kuma-1 && exit $$ret)
Does this change need to be here?
@@ -129,7 +129,7 @@ test/e2e-kubernetes: $(E2E_DEPS_TARGETS)
$(MAKE) test/e2e/k8s/start/cluster/kuma-1
$(MAKE) test/e2e/k8s/wait/kuma-1
$(MAKE) test/e2e/k8s/load/images/kuma-1
$(E2E_ENV_VARS) $(GINKGO_TEST_E2E) $(KUBE_E2E_PKG_LIST) || (ret=$$?; $(MAKE) test/e2e/k8s/stop/cluster/kuma-1 && exit $$ret)
$(E2E_ENV_VARS) $(GINKGO_TEST_E2E) $(KUBE_E2E_PKG_LIST) || (ret=$$?; $(MAKE) test/e2e/k8s/stop/cluster/kuma-1 && exit $$ret)
revert?
@@ -27,6 +27,8 @@ var _ config.Config = &Defaults{}

type Defaults struct {
SkipMeshCreation bool `yaml:"skipMeshCreation" envconfig:"kuma_defaults_skip_mesh_creation"`
// If true, instead of providing inbound clusters with address of localhost, generates cluster with ORIGINAL_DST
Also mention the security threat of disabling it, and the fact that this will be removed.
@@ -97,6 +97,8 @@ const (
// Available values are: [trace][debug][info][warning|warn][error][critical][off]
KumaEnvoyLogLevel = "kuma.io/envoy-log-level"

// KumaMetricsPrometheusAggregateAddress allows to specify which address for specific app should request for metrics
KumaMetricsPrometheusAggregateAddress = "prometheus.metrics.kuma.io/aggregate-%s-address"
why would anyone need to set this to anything other than Pod IP?
var _ ClusterConfigurer = &CleanupIntervalConfigurer{}

func (config *CleanupIntervalConfigurer) Configure(c *envoy_cluster.Cluster) error {
c.CleanupInterval = &durationpb.Duration{Seconds: config.Interval}
shouldn't it just be added to PassThroughClusterConfigurer?
clusterBuilder = envoy_clusters.NewClusterBuilder(proxy.APIVersion).
Configure(envoy_clusters.ProvidedEndpointCluster(localClusterName, false, core_xds.Endpoint{Target: endpoint.WorkloadIP, Port: endpoint.WorkloadPort})).
Configure(envoy_clusters.Timeout(defaults_mesh.DefaultInboundTimeout(), protocol))
if endpoint.WorkloadIP != core_mesh.IPv4Loopback.String() {
I could use some explanation of what is going on here. Why are we setting upstream bind config on this condition?
})
E2EAfterAll(func() {
Expect(env.KubeZone1.TriggerDeleteNamespace(namespace)).To(Succeed())
Expect(env.KubeZone1.DeleteMesh(mesh))
how did it work... we should only be able to delete mesh from global cp
Expect(env.KubeZone1.TriggerDeleteNamespace(namespace)).To(Succeed())
Expect(env.KubeZone1.DeleteMesh(mesh))
Expect(env.UniZone1.DeleteMeshApps(mesh)).To(Succeed())
Expect(env.Global.DeleteMeshApps(mesh)).To(Succeed())
global does not have mesh apps
Expect(env.KubeZone2.TriggerDeleteNamespace(namespace)).To(Succeed())
Expect(env.KubeZone2.DeleteMesh(mesh))
Expect(env.UniZone2.DeleteMeshApps(mesh)).To(Succeed())
Expect(env.Global.DeleteMeshApps(mesh)).To(Succeed())
same here
Expect(err).To(HaveOccurred())
})

It("should check communication k8s to universal", func() {
I'd consider splitting this into Describe("k8s to universal communication") and many Its. Same for the other Its and the second file with e2e tests.
@@ -8,7 +8,7 @@ import (
)

var _ = Describe("MultiValueTagSet", func() {

EnableInboundPassthrough = true
I'd rather see this in a BeforeSuite / BeforeAll / It section. Right now, the scope of this change is not really clear to me.
Great job! I think I covered just half of the changes, I'll need another round
@@ -38,6 +38,9 @@ const (
TCPPortReserved = 49151 // IANA Reserved
)

// We should remove it in the future version
Maybe it should be formatted as a TODO?
and a linked issue to get rid of this
iface.WorkloadIP = "127.0.0.1"
switch EnableInboundPassthrough {
case true:
if n.TransparentProxying != nil &&
Can we use IsUsingInboundTransparentProxy here?
@@ -431,6 +444,18 @@ func (d *Dataplane) GetIdentifyingService() string {
return ServiceUnknown
}

func (d *Dataplane) IsUsingInboundTransparentProxy() bool {
return d.GetNetworking() != nil &&
probably just
return d.GetNetworking().GetTransparentProxying().GetRedirectPortInbound() != 0
is enough
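To illustrate why the chained call can be enough, here is a minimal sketch. The struct types below are hypothetical stand-ins written for this example, not Kuma's generated code, and the port value is arbitrary. The sketch relies on the usual pattern of Go protobuf getters, which check for a nil receiver and return the zero value.

```go
package main

import "fmt"

// Hypothetical stand-ins for the generated protobuf types; only the fields
// needed for this illustration are included.
type TransparentProxying struct {
	RedirectPortInbound uint32
}

type Networking struct {
	TransparentProxying *TransparentProxying
}

type Dataplane struct {
	Networking *Networking
}

// Each getter tolerates a nil receiver and returns the zero value,
// mirroring the pattern used by generated protobuf getters.
func (d *Dataplane) GetNetworking() *Networking {
	if d == nil {
		return nil
	}
	return d.Networking
}

func (n *Networking) GetTransparentProxying() *TransparentProxying {
	if n == nil {
		return nil
	}
	return n.TransparentProxying
}

func (t *TransparentProxying) GetRedirectPortInbound() uint32 {
	if t == nil {
		return 0
	}
	return t.RedirectPortInbound
}

// IsUsingInboundTransparentProxy needs no explicit nil checks because every
// getter in the chain already handles a nil receiver.
func (d *Dataplane) IsUsingInboundTransparentProxy() bool {
	return d.GetNetworking().GetTransparentProxying().GetRedirectPortInbound() != 0
}

func main() {
	var missing *Dataplane
	empty := &Dataplane{}
	redirected := &Dataplane{Networking: &Networking{
		TransparentProxying: &TransparentProxying{RedirectPortInbound: 15006}, // arbitrary example port
	}}
	fmt.Println(missing.IsUsingInboundTransparentProxy())
	fmt.Println(empty.IsUsingInboundTransparentProxy())
	fmt.Println(redirected.IsUsingInboundTransparentProxy())
}
```

Only the last dataplane in `main` reports true; the nil and empty ones fall through to the zero value without panicking.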
@@ -73,6 +73,9 @@ message PrometheusAggregateMetricsConfig {
// If false then the application won't be scrapped. If nil, then it is treated
// as true and kuma-dp scrapes metrics from the service.
google.protobuf.BoolValue enabled = 4;

// Address on which a service listen for incoming requests.
// Address on which a service listen for incoming requests.
// Address on which a service listens for incoming requests.
@@ -27,6 +27,8 @@ var _ config.Config = &Defaults{}

type Defaults struct {
SkipMeshCreation bool `yaml:"skipMeshCreation" envconfig:"kuma_defaults_skip_mesh_creation"`
// If true, instead of providing inbound clusters with address of localhost, generates cluster with ORIGINAL_DST
EnableInboundPassthrough bool `yaml:"enableInboundPassthrough" envconfig:"KUMA_DEFAULTS_ENABLE_INBOUND_PASSTHROUGH"`
I can imagine users just deleting enableInboundPassthrough if they want to disable this behavior. So I'd rather call it DisableInboundPassthrough; in that case, deleting it in the future won't be a breaking change if you don't have this field in the config. Also, DisableInboundPassthrough gives the user a hint that this is the less desired behavior. WDYT?
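A small sketch of the naming argument, under stated assumptions: the two structs are hypothetical (not Kuma's real Defaults type), and JSON from the standard library is swapped in for YAML only to keep the example self-contained. It shows that with an "enable" flag whose default is true, omitting the key keeps the behavior on, while with a "disable" flag the zero value already means the desired default.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical config shapes used only for this comparison.
type enableStyle struct {
	EnableInboundPassthrough bool `json:"enableInboundPassthrough"`
}

type disableStyle struct {
	DisableInboundPassthrough bool `json:"disableInboundPassthrough"`
}

// passthroughEnabledEnableStyle applies a default of true before decoding,
// so a config that omits the key still has passthrough enabled.
func passthroughEnabledEnableStyle(raw []byte) (bool, error) {
	cfg := enableStyle{EnableInboundPassthrough: true}
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return false, err
	}
	return cfg.EnableInboundPassthrough, nil
}

// passthroughEnabledDisableStyle needs no default: the zero value (false)
// already means passthrough stays enabled.
func passthroughEnabledDisableStyle(raw []byte) (bool, error) {
	var cfg disableStyle
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return false, err
	}
	return !cfg.DisableInboundPassthrough, nil
}

func main() {
	omitted := []byte(`{}`)
	a, _ := passthroughEnabledEnableStyle(omitted)
	b, _ := passthroughEnabledDisableStyle(omitted)
	fmt.Println(a, b)

	c, _ := passthroughEnabledEnableStyle([]byte(`{"enableInboundPassthrough": false}`))
	d, _ := passthroughEnabledDisableStyle([]byte(`{"disableInboundPassthrough": true}`))
	fmt.Println(c, d)
}
```

In both styles an empty config leaves passthrough on, but only the "disable" style makes that obvious from the field name and survives the field being removed later.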
@@ -125,6 +126,9 @@ func buildRuntime(appCtx context.Context, cfg kuma_cp.Config) (core_runtime.Runt
))
}

// The setting should be removed, and there is no easy way to set it without breaking most of the code
mesh_proto.EnableInboundPassthrough = builder.Config().Defaults.EnableInboundPassthrough
can we be sure there are no concurrent reads/writes to this variable?
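One way to make the question moot, sketched here as an assumption rather than what the PR does: keep the flag in a `sync/atomic` value (`atomic.Bool`, available since Go 1.19) behind accessor functions. The names `SetInboundPassthrough` and `InboundPassthroughEnabled` are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// inboundPassthrough is a hypothetical replacement for a plain package-level
// bool; atomic.Bool makes concurrent loads and stores race-free.
var inboundPassthrough atomic.Bool

// SetInboundPassthrough would be called once during startup.
func SetInboundPassthrough(enabled bool) {
	inboundPassthrough.Store(enabled)
}

// InboundPassthroughEnabled can be called from any goroutine.
func InboundPassthroughEnabled() bool {
	return inboundPassthrough.Load()
}

func main() {
	SetInboundPassthrough(true)

	// Several goroutines read the flag concurrently without a data race.
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = InboundPassthroughEnabled()
		}()
	}
	wg.Wait()

	fmt.Println(InboundPassthroughEnabled())
}
```

If the variable is written exactly once before any goroutine that reads it is started, a plain bool is also safe; the atomic version only removes the need to prove that ordering.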
} else {
listenerPort = endpoint.DataplanePort
inboundListenerName = envoy_names.GetInboundListenerName(endpoint.DataplaneIP, endpoint.DataplanePort)
localClusterName = envoy_names.GetInboundClusterName(endpoint.WorkloadPort)
Should this be:
localClusterName = envoy_names.GetInboundClusterName(endpoint.DataplanePort)
because otherwise localClusterName is the same regardless of EnableInboundPassthrough?
localClusterName = envoy_names.GetInboundClusterName(endpoint.WorkloadPort)
clusterBuilder = envoy_clusters.NewClusterBuilder(proxy.APIVersion).
Configure(envoy_clusters.ProvidedEndpointCluster(localClusterName, false, core_xds.Endpoint{Target: endpoint.WorkloadIP, Port: endpoint.WorkloadPort})).
Configure(envoy_clusters.Timeout(defaults_mesh.DefaultInboundTimeout(), protocol))
Why are the timeouts different depending on EnableInboundPassthrough?
@@ -30,11 +42,18 @@ func (g ProbeProxyGenerator) Generate(ctx xds_context.Context, proxy *model.Prox
virtualHostBuilder := envoy_routes.NewVirtualHostBuilder(proxy.APIVersion).
Configure(envoy_routes.CommonVirtualHost("probe"))

portSet := map[uint32]bool{}
inbounds := map[uint32]ProtocolInboundInfo{}
I'd call it inboundsByPort to make it more obvious what the key is.
I've redesigned the solution and created a new PR that should be easier to review. This solution might break the current matching of rules, which is why I've implemented it a bit differently. I didn't want to push the changes here because the number of commits would make this PR unreadable. Here is the new PR: #4750
Summary
When a user deploys Kuma, the behavior of communication with services changes.
Without Kuma, when the application running within the pod binds to the Pod IP or the wildcard address (0.0.0.0), other pods are able to reach the application. After Kuma is deployed, that is no longer the case. Instead, applications that bind to localhost or the wildcard address are exposed outside. That's a big security threat. This PR introduces a different way of configuring the inbound listener by default.

Previously, the inbound listener had a static cluster that pointed to the address localhost:PORT. Now it works a bit differently, which might be a breaking change for applications that were listening on localhost.

- dataplane.networking.inbound[].serviceAddress is not defined and transparent proxy is enabled: the cluster is ORIGINAL_DST, so Envoy routes traffic to the application depending on the real destination address. Flow:
  POD1 (curl 10.2.0.5:8080) -> kuma-dp (POD1) / or no kuma-dp -> kuma-dp (POD2) -> listener (inbound:8080) -> cluster (inbound:8080), which is original dst and is going to request 10.2.0.5:8080 but with source IP 127.0.0.6 -> iptables (MESH_OUTPUT) rule 1 -> application listening on 0.0.0.0 or 10.2.0.5.
- dataplane.networking.inbound[].serviceAddress is not defined and transparent proxy is disabled: the cluster points to dataplaneIP, because we are not able to get the IP address and we don't want to expose localhost. In this case, the cluster uses upstream_bind_config to make a faster jump to the application in the first rule of iptables (MESH_OUTPUT). This case is tricky, and without this we could break health checking on universal.
- dataplane.networking.inbound[].serviceAddress is set: we just use it and create a static cluster, with the same flow as above to return fast from iptables.

What can I do to make the upgrade smooth?
- If the application listens on localhost and should be exposed outside: change it to listen on 0.0.0.0.
- If the application should keep listening on localhost: set dataplane.networking.inbound[].serviceAddress to 127.0.0.1.
- Disable the new behavior in the kuma-cp configuration or with the kuma-cp env variable KUMA_DEFAULTS_ENABLE_INBOUND_PASSTHROUGH. Not recommended; we are going to remove this flag in the future.

Full changelog
- upstream_bind_address for inbound clusters that don't have a localhost address
- localhost -> inbound

Issues resolved
Fix #4630
Documentation
Testing
Backwards compatibility
- Update UPGRADE.md with any steps users will need to take when upgrading.
- [ ] Add backport-to-stable label if the code follows our backporting policy
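As an illustration of the three inbound cases described in the Summary, here is a rough sketch of the decision. The type and function names are hypothetical and this is not the PR's actual generator code. It also assumes that an upstream bind config is attached whenever the target is not 127.0.0.1, which is a reading of the changelog entry rather than something the description states outright.

```go
package main

import "fmt"

const loopback = "127.0.0.1"

// inboundCluster is a hypothetical, simplified view of the generated cluster.
type inboundCluster struct {
	Kind         string // "ORIGINAL_DST" or "STATIC"
	Address      string // target address; empty for ORIGINAL_DST
	UpstreamBind bool   // whether an upstream bind config is attached (assumption)
}

// chooseInboundCluster mirrors the three cases from the Summary:
//  1. serviceAddress set: static cluster pointing at it
//  2. no serviceAddress, transparent proxy enabled: ORIGINAL_DST cluster
//  3. no serviceAddress, transparent proxy disabled: static cluster on dataplaneIP
func chooseInboundCluster(serviceAddress, dataplaneIP string, transparentProxy bool) inboundCluster {
	switch {
	case serviceAddress != "":
		return inboundCluster{Kind: "STATIC", Address: serviceAddress, UpstreamBind: serviceAddress != loopback}
	case transparentProxy:
		return inboundCluster{Kind: "ORIGINAL_DST", UpstreamBind: true}
	default:
		return inboundCluster{Kind: "STATIC", Address: dataplaneIP, UpstreamBind: dataplaneIP != loopback}
	}
}

func main() {
	fmt.Printf("%+v\n", chooseInboundCluster("", "10.2.0.5", true))
	fmt.Printf("%+v\n", chooseInboundCluster("", "10.2.0.5", false))
	fmt.Printf("%+v\n", chooseInboundCluster(loopback, "10.2.0.5", true))
}
```

The last call corresponds to the upgrade path where serviceAddress is set to 127.0.0.1 so that an application listening on localhost keeps working.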