
upstream: Null-deref on TCP health checker if setsockopt fails #6793

Merged: 13 commits into envoyproxy:master on May 10, 2019

Conversation

andrewjjenkins (Contributor)

Description: Fixes a nullptr dereference that can happen if a TCP Health Checker connection gets a LocalClose. This can happen if the connection's attempt to set socket options like TCP_KEEPIDLE fails.
Risk Level: Low. client_ is nulled and held temporarily, but client_ could already have been null before this change.
Testing: Additional unit test added. I have also tested the ossfuzz repro and some other paths described in the details below.
Docs Changes: None
Release Notes: N/A (I'm new here - do we typically relnote fuzz bugs?)

Fixes no-longer-embargoed ossfuzz 11100

This is a re-attempt at the closed #6422.

If a TcpHealthChecker client is closed but timeout_timer_ is not disabled, then when timeout_timer_ later fires, client_ is dereferenced even though it is now nullptr, causing a crash.

The particular ossfuzz test case exercises one such path:

  • Have a TCP health checker with a config like: upstream_connection_options.tcp_keepalive.keepalive_time: {}
  • This causes Envoy to ask to set TCP_KEEPIDLE to 0, which gets EINVAL
setsockopt(53, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
setsockopt(53, SOL_TCP, TCP_KEEPIDLE, [0], 4) = -1 EINVAL (Invalid argument)
  • A LocalClose is triggered but timeout_timer_ isn't disabled.
  • Later, timeout_timer_ fires and the client_ nullptr is dereferenced.

There is a possibility of other paths, but some of them, like file descriptor exhaustion, cause a different, intentional Envoy crash. Limiting /proc/sys/net/ip_local_port_range so that time-wait sockets exhaust the ephemeral port range causes connect() to fail, which triggers a RemoteClose that is correctly handled as a failure. So the general case I'm solving here is "any call to setsockopt() fails, causing a LocalClose".
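
To make the failure mode concrete, here is a minimal standalone C++ sketch of the hazard. It is a toy, not the actual Envoy classes: Connection, Session, and on_timeout_ are hypothetical stand-ins for the client connection, the health check session, and timeout_timer_.

    // Toy illustration only; not Envoy source.
    #include <functional>
    #include <iostream>
    #include <memory>

    struct Connection {
      void close() { std::cout << "connection closed\n"; }
    };

    struct Session {
      std::unique_ptr<Connection> client_ = std::make_unique<Connection>();
      // The timeout callback assumes client_ is still valid when it fires.
      std::function<void()> on_timeout_ = [this] { client_->close(); };

      // Early teardown (e.g. a LocalClose after a failed setsockopt()) resets
      // client_ but forgets to disable the pending timeout.
      void onLocalClose() { client_.reset(); }
    };

    int main() {
      Session s;
      s.onLocalClose(); // connection setup failed; client_ is now nullptr
      s.on_timeout_();  // the timeout fires later: null dereference, crash
    }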

Andrew Jenkins added 2 commits May 2, 2019 18:24
Fixes no-longer-embargoed ossfuzz 11100:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=11100&q=envoy&colspec=ID%20Type%20Component%20Status%20Proj%20Reported%20Owner%20Summary

If a TCP health checker connection receives a LocalClose from an
external source, it doesn't halt the timeout_timer_.  When the timeout
timer later fires, this triggers a null pointer dereference in
TcpHealthCheckerImpl::TcpActiveHealthCheckSession::onTimeout().

The particular case that triggers this is that the configuration asks
for a TCP socket keepalive idle timeout of "0", which is invalid.  This
happens with health checker configs that look like:

    upstream_connection_options.tcp_keepalive.keepalive_time: {}

This eventually causes a call to setsockopt() that looks like the
following (according to strace):

    setsockopt(53, SOL_TCP, TCP_KEEPIDLE, [0], 4) = -1 EINVAL (Invalid argument)

This causes ClientConnectionImpl::ClientConnectionImpl() to schedule an
immediate_error_event_ = ConnectionEvent::LocalClose.

This eventually causes invocation of
TcpHealthCheckerImpl::TcpActiveHealthCheckSession::onEvent(ConnectionEvent::LocalClose).
This code assumes that all LocalCloses are triggered by this module
itself, e.g. due to onTimeout() or successfully completing a health
check without reusing the connection.  We have another case: the
LocalClose is triggered by a local failure while creating the
connection (in our case, setsockopt()).

onEvent() needs to differentiate between "I am handling a LocalClose
triggered by local TcpHealthCheckerImpl logic" and "I am handling a
LocalClose that was triggered by some other layer of Envoy".  This
commit changes local logic to std::move(client_) to a temporary
UniquePtr before triggering the local close.  Now, onEvent can detect
externally-triggered closes and handle those as failures.

Signed-off-by: Andrew Jenkins <andrew@volunteers.acasi.info>
Ossfuzz issue 11100 happens when a connection attempt triggers a
Network::ConnectionEvent::LocalClose.  The code previously assumed that
LocalCloses were all due to TcpHealthCheckerImpl-triggered closes, like
timeouts or destruction.  However, LocalCloses can also come from other
layers of Envoy.  Add a unit test for this path, and verify that the
TcpHealthCheckerImpl now treats this path as a health check failure
(analogous to the call to connect() returning ECONNREFUSED).

Signed-off-by: Andrew Jenkins <andrew@volunteers.acasi.info>
andrewjjenkins (Contributor, Author) commented May 2, 2019

@mattklein123 - This is the re-opening of #6422. Where we left off, you had recommended disabling the timeout timer in onEvent(). Heads up: I took a different approach here, but I'm open to whichever direction you think is best.

Repeating my comment about why:

I explored disabling the timeout timer in onEvent() and there are two downsides:

  1. The onEvent(Connection::LocalClose) path is also invoked via TcpHealthCheckerImpl::TcpActiveHealthCheckSession::~TcpActiveHealthCheckSession() and on healthy closes (like when a health check completes but reuse_connection_ is false), so I end up needing to do a good deal of unit test surgery to EXPECT_CALL() for things only invoked via the destructor or on the happy path (along with the double-disable on every happy-path health check).

  2. This approach treats a LocalClose as an unhandled halt: no change to health state, and the health check is never re-attempted (because we never called handleFailure or handleSuccess and therefore never started interval_timer_). If a system call to setsockopt() fails, I think it makes more sense to treat it via handleFailure instead of never probing again.

So I've written code that instead treats LocalCloses that aren't from ourselves as failures, causing us to safely retry after the interval. Of course, there's no guarantee that anything will change and the retry will succeed.
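
Roughly, the approach described above looks like the following toy sketch (hypothetical stand-in types and a simplified handleFailure(); not the actual Envoy code): when the session itself initiates a close, it first moves client_ into a temporary, so onEvent() can treat any close that arrives while client_ is still set as externally triggered and report it as a failure.

    // Toy sketch of the "move client_ before a self-initiated close" idea;
    // not Envoy source.
    #include <iostream>
    #include <memory>
    #include <utility>

    enum class ConnectionEvent { LocalClose, RemoteClose };

    struct Connection {
      void close() { std::cout << "closing connection\n"; }
    };

    struct Session {
      std::unique_ptr<Connection> client_ = std::make_unique<Connection>();

      void handleFailure() { std::cout << "health check failure, retry after interval\n"; }

      // Self-initiated close: move client_ out first so that onEvent() can
      // tell this close apart from an externally triggered one.
      void initiateLocalClose() {
        std::unique_ptr<Connection> client = std::move(client_);
        client->close();
        onEvent(ConnectionEvent::LocalClose); // stand-in for the close callback
      }

      void onEvent(ConnectionEvent event) {
        (void)event;
        if (client_ != nullptr) {
          // client_ is still set: this close came from another layer (e.g. a
          // failed setsockopt()), so report it as a failed health check.
          client_.reset();
          handleFailure();
        }
        // Otherwise we initiated the close ourselves; nothing more to do here.
      }
    };

    int main() {
      Session externally_closed;
      externally_closed.onEvent(ConnectionEvent::LocalClose); // reported as a failure

      Session self_closed;
      self_closed.initiateLocalClose(); // onEvent() sees client_ == nullptr, no failure path
    }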

andrewjjenkins (Contributor, Author)

Along with the unit test, I was reproducing the issue with this change to google_com_proxy.v2.yaml:

@@ -50,3 +50,12 @@ static_resources:
                 port_value: 443
     tls_context:
       sni: www.google.com
+    health_checks:
+      timeout: 1s
+      interval: 3s
+      unhealthy_threshold: 3
+      healthy_threshold: 3
+      tcp_health_check: {}
+    upstream_connection_options:
+      tcp_keepalive:
+        keepalive_time: {}

repro.yaml.txt

}
} else {
host_->setActiveHealthFailureType(Host::ActiveHealthFailureType::UNHEALTHY);
}
}

void TcpHealthCheckerImpl::TcpActiveHealthCheckSession::onEvent(Network::ConnectionEvent event) {
if (event == Network::ConnectionEvent::RemoteClose) {
// If !client_, then we are already handling a failure/teardown
andrewjjenkins (Contributor, Author)

The key is here: I'm using !client_ as a signal that this is an internally-generated RemoteClose or LocalClose. I need some way to distinguish, so that I know whether I need to call handleFailure().

(Other ways to distinguish internally-generated from externally-generated closes would work too.)

mattklein123 (Member)

The other way to do this would be to do what the other health checkers are doing which is effectively the expect_reset boolean. I'm fine either way.

andrewjjenkins (Contributor, Author)

I updated to follow the expect_reset_ boolean pattern for consistency with the other health checkers, thanks for the suggestion.
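
For reference, a compact toy sketch of the boolean-flag pattern (hypothetical stand-in types and a simplified handleFailure(); not the actual Envoy health checker code): the session sets the flag before any close it triggers itself, and onEvent() treats a close that arrives without the flag set as a failure.

    // Toy sketch of the expect_reset_-style flag; not Envoy source.
    #include <iostream>

    enum class ConnectionEvent { LocalClose, RemoteClose };

    struct Session {
      // True when we are about to close the connection ourselves and the
      // health check status has already been reported.
      bool expect_reset_{false};

      void handleFailure() { std::cout << "health check failure\n"; }

      void onTimeout() {
        handleFailure();       // status reported here (the timeout is a failure)
        expect_reset_ = true;  // the close we trigger next is our own
        onEvent(ConnectionEvent::LocalClose); // stand-in for closing the connection
      }

      void onEvent(ConnectionEvent event) {
        (void)event;
        if (expect_reset_) {
          // Status already reported; only state cleanup would be needed here.
          return;
        }
        // Close came from another layer (failed setsockopt(), remote close, ...):
        // report it as a failure so the check retries after the interval.
        handleFailure();
      }
    };

    int main() {
      Session unexpected;
      unexpected.onEvent(ConnectionEvent::LocalClose); // failure reported

      Session expected;
      expected.onTimeout(); // failure reported once by onTimeout(), close then ignored
    }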

Signed-off-by: Andrew Jenkins <andrew@volunteers.acasi.info>
mattklein123 self-assigned this May 2, 2019
mattklein123 (Member) left a comment

Thanks, this approach LGTM at a high level. I left a couple of comments. Can you merge master? It will need a bit of merge work due to my recent health check fix. Thank you!

/wait

source/common/upstream/health_checker_impl.cc (outdated review comment, resolved)
…heckers

Signed-off-by: Andrew Jenkins <andrew@volunteers.acasi.info>
Signed-off-by: Andrew Jenkins <andrew@volunteers.acasi.info>
andrewjjenkins (Contributor, Author)

Hey @mattklein123 - good idea with expect_reset_. That makes it a lot cleaner. I bet it will be easier to review now by looking at the files changed instead of commit-by-commit. Thanks!

Merge from master done as well.

Once you're happy I can rebase it all down to one commit on master if that helps.

mattklein123 (Member) left a comment

Thanks, this looks like a great approach. One comment. You don't need to squash your PR; we do that during merge. Thank you!

/wait

source/common/upstream/health_checker_impl.cc (outdated review comment, resolved)
source/common/upstream/health_checker_impl.cc (outdated review comment, resolved)
Andrew Jenkins added 2 commits May 7, 2019 19:25
Signed-off-by: Andrew Jenkins <andrew@volunteers.acasi.info>
Signed-off-by: Andrew Jenkins <andrew@volunteers.acasi.info>
timeout_timer_->callback_();
EXPECT_EQ(cluster_->prioritySet().getMockHostSet(0)->hosts_[0]->getActiveHealthFailureType(),
Host::ActiveHealthFailureType::TIMEOUT);
EXPECT_EQ(Host::Health::Unhealthy, cluster_->prioritySet().getMockHostSet(0)->hosts_[0]->health());
andrewjjenkins (Contributor, Author)

Here's where I check that two timeouts in a row cause us to mark this host unhealthy.

Signed-off-by: Andrew Jenkins <andrew@volunteers.acasi.info>
Signed-off-by: Andrew Jenkins <andrew@volunteers.acasi.info>
Signed-off-by: Andrew Jenkins <andrew@volunteers.acasi.info>
mattklein123 (Member) left a comment

Awesome, thanks. 1 small thing and 1 request for v1 -> v2 conversion if you are up for it.

/wait

@@ -252,6 +252,9 @@ class TcpHealthCheckerImpl : public HealthCheckerImplBase {
TcpHealthCheckerImpl& parent_;
Network::ClientConnectionPtr client_;
std::shared_ptr<TcpSessionCallbacks> session_callbacks_;
// If true, stream reset was initiated by us, not e.g. remote reset.
// In this case healthcheck status already reported, only state cleanup required.
bool expect_reset_{};
mattklein123 (Member)

nit: can you make this expect_close_ and update the comment above to talk about close vs. reset? There is no reset happening for TCP.

andrewjjenkins (Contributor, Author)

Done
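
For reference, after the rename the flag and its comment presumably read along these lines (a sketch; the exact wording in the merged change may differ):

    // Sketch only; the comment wording below is assumed, not quoted from the
    // merged change.
    // If true, the close was initiated by us (e.g. timeout, teardown, or a
    // completed check without connection reuse), not e.g. a remote close.
    // In that case the health check status has already been reported and only
    // state cleanup is required.
    bool expect_close_{};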

dispatcher_, runtime_, random_,
HealthCheckEventLoggerPtr(event_logger_)));
health_checker_.reset(
new TcpHealthCheckerImpl(*cluster_, parseHealthCheckFromV1Json(json.str()), dispatcher_,
mattklein123 (Member)

As long as you are in here would you mind updating these tests to use v2 YAML instead of JSON as @derekargueta has been doing elsewhere? This would really help our v1 deprecation efforts.

andrewjjenkins (Contributor, Author)

Done; these were the last users of parseHealthCheckFromV1Json, so I removed that too.

Andrew Jenkins and others added 2 commits May 10, 2019 19:30
…y protocol stream resets)

Signed-off-by: Andrew Jenkins <andrewj@f5.com>

Signed-off-by: Andrew Jenkins <andrew@volunteers.acasi.info>
…mV1Json

Signed-off-by: Andrew Jenkins <andrew@volunteers.acasi.info>
healthy_threshold: 2
tcp_health_check:
  send:
    text: "01"
andrewjjenkins (Contributor, Author)

If I'm reading the proto file correctly, this is the right way to send binary data encoded as a hex string (and the unit tests that use it still pass), but I wouldn't mind a double-check here.

// Hex encoded payload. E.g., "000000FF".
string text = 1 [(validate.rules).string.min_bytes = 1];

Signed-off-by: Andrew Jenkins <andrew@volunteers.acasi.info>
mattklein123 (Member) left a comment

Awesome work. Thank you!

mattklein123 merged commit 5a92867 into envoyproxy:master on May 10, 2019
mpuncel added a commit to mpuncel/envoy that referenced this pull request May 10, 2019
* master: (88 commits)
  upstream: Null-deref on TCP health checker if setsockopt fails  (envoyproxy#6793)
  ci: switch macOS CI to azure pipelines (envoyproxy#6889)
  os syscalls lib: break apart syscalls used for hot restart (envoyproxy#6880)
  Kafka codec: precompute request size before serialization, so we do n… (envoyproxy#6862)
  upstream: move static and strict_dns clusters to dedicated files (envoyproxy#6886)
  Rollforward of api: Add total_issued_requests to Upstream Locality and Endpoint Stats. (envoyproxy#6692) (envoyproxy#6784)
  fix explicit constructor in copy-initialization (envoyproxy#6884)
  stats: use tag iterator rather than constructing the tag-array and searching that. (envoyproxy#6853)
  common: use unscoped build target in generate_version_linkstamp (envoyproxy#6877)
  Addendum to envoyproxy#6778 (envoyproxy#6882)
  ci: add minimum Linux build for Azure Pipelines (envoyproxy#6881)
  grpc: utilities for inter-converting grpc::ByteBuffer and Buffer::Instance. (envoyproxy#6732)
  upstream: allow excluding hosts from lb calculations until initial health check (envoyproxy#6794)
  stats: prevent unused counters from leaking across hot restart (envoyproxy#6850)
  network filters: add `injectDataToFilterChain(data, end_stream)` method to network filter callbacks (envoyproxy#6750)
  delete things that snuck back in (envoyproxy#6873)
  config: scoped rds (2b): support delta APIs in ConfigProvider framework (envoyproxy#6781)
  string == string! (envoyproxy#6868)
  config: add mssing imports to delta_subscription_state (envoyproxy#6869)
  protobuf: add missing default case to enum (envoyproxy#6870)
  ...

Signed-off-by: Michael Puncel <mpuncel@squareup.com>