-
Notifications
You must be signed in to change notification settings - Fork 296
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes #36979 - Change cdn_ssl_version setting to cdn_min_tls_version #10824
Conversation
cd4050b
to
3b49686
Compare
3b49686
to
6417c9d
Compare
bfc17ac
to
6583cca
Compare
I disabled the TLSv1.3 unit test because CI infra doesn't know about TLSv1.3. I set TLSv1.2 as the default minimum version (the previous feature had no default) |
7f2f8b9
to
0a9dd1e
Compare
Dunno about impact on upgrades (i.e. does db migration make sense?), and apart of the typo it sounds great! |
That's a good question. I tested this by setting a value for the original setting then applying my patch and restarting the server, and found that it didn't cause any immediate issue. I also found the earlier commit 283c63d which removes a setting without any migration to remove the previous value from the DB. Maybe @ehelms or @parthaa can say if there is any reason a migration should be included with this one. |
0a9dd1e
to
7a0058d
Compare
I pushed an update which fixes the typos. |
What were the issues? Are there are other reasons not to defer to 1.3 as the default? |
fail("Invalid SSL version specified. Check the 'CDN SSL Version' setting") | ||
@cdn_min_tls_version = Setting[:cdn_min_tls_version] | ||
if @cdn_min_tls_version && !SUPPORTED_TLS_VERSIONS.include?(@cdn_min_tls_version) | ||
fail("Invalid TLS min_version specified. Check the 'CDN Minimum Allowed TLS Version' setting") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How will users see this failure? Does it propagate to the UI and API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other tricky part being that an organization admin will run into this issue and then have to contact a Satellite admin in order to modify this. I realize that's how the current functionality already works, but my point being we should make sure this is as actionable by a user encountering it as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fail()
should cause an ISE. I'm not sure if one would need to look at production.log to see the exact message.
That said, I think the only way this condition could be met is if the user has a value already set, and then that value later becomes unavailable (an upgrade to ruby or to the operating system removes the currently configured TLS version).
As long as we test on all supported platforms, I think our testing would catch that and appropriate intervention would be made before an end user would ever face that issue.
I kept and updated the check from the previous implementation (which seems also unlikely to be reached in real world usage) just in case.
If you still have concerns about it, maybe we can create a new issue to follow up? Let me know what you think, thanks.
7a0058d
to
b0b8102
Compare
How many users actually want this feature? It was introduced for compatibility. Since we moved to EL8 the SSLv3 option doesn't do anything (since OpenSSL 1.1.0+ is compiled without it) and I don't think people asked about this. I'd argue simplifying the code base is more important than the edge case where someone wants so specific control over the connections. And if they do, why only this one |
Well, there was a downstream BZ so someone is definitely trying to use it. I'd defer to @pmoravec about whether this is potentially useful or important since he has deep knowledge of user deployments and different scenarios they are facing, and to @ianballou or @parthaa about whether there's a concrete benefit for the health of the katello code base to remove it. |
https://bugzilla.redhat.com/show_bug.cgi?id=2216445 is from @pmoravec and the problem there is that the current default is less secure than if we dropped it (and can cause problems). I'll let him answer if allowing TLS 1.0+ (which will include TLS 1.3) but not giving explicit control is (as I suggested) solving the issue. |
The new setting would give users the option of reducing the min_version if they absolutely need to for compatibility with other infrastructure (syncing from the CDN via an older proxy, syncing from an older katello server using ISS) but by default it would enforce the more secure TLS1.3 standard. Given the existence of upgrade paths that involve building a new server on a newer OS and migrating content via ISS, which we've done in the past and could do again in the future, I think it's better to allow admins to quickly and easily modify the allowed min_tls version in the UI, rather than the more error-prone and higher risk process of modifying the system wide crypto policy. Given that the current version of Also as you already pointed out, Net::http has ssl_version (which we are setting nil unless the user changes that default) and min_version, corresponding to ssl_version (deprecated) and min_version in OpenSSL::SSL::SSLContext. So what really happens today would depend on the issue you discovered and fixed with ruby/openssl#710 , which would suggest we are allowing the insecure TLS 1.0 protocol for CDN syncs (...unless Net::http sets any defaults of its own instead of falling back to the ruby openssl library default?) Anyway, to summarize my concerns about removing it:
|
I don't have enough data from sufficiently many customers to claim how (or if) they use the feature in general. But I have seen a few of them where some weird proxy allowed just some specific kind of [authentication|SSLversion|http-only] that made communication impossible if not met. So I would honour for more configurable / fine-tuned option rather than dropping the config and allow just TLS1.0+. |
My main concerns are that this is just such a tiny aspect of all HTTP connections. Even communication to the CDN. Most of that happens via Pulp, so having such a setting is misleading to say the least. I suspect that most users who implement policies around this want to follow some policy. For example, that TLS < 1.2 shouldn't be used at all. Or that they connect to a proxy that doesn't support some TLS version. But having that only for a single connection doesn't make sense to me. Tomer was always on a quest to simplify the application by removing settings. That makes the application easier to use and I think this also falls in the same category.
This patch would only partially solve it, not for the whole platform. I don't think a partial fix is worth all the complexity. Both in maintenance and testing, but also for users consuming the product.
I don't think you really win a lot with this. TLSv1.2 is still considered secure and EL7 hosts don't have TLSv1.3 support, so you'd likely break a lot of deployments out of the box. Like where there's an EL7 HTTP proxy.
But what do you win with it? System wide crypto policies are probably better for compliance. For example, NIST SP 800-52 Rev. 2 mandates TLS 1.2 as the minimum version with support for TLS 1.3. So anyone who wants compliance with that should already set it via a system wide crypto policy.
It was there to allow for a downgrade in security. It was introduced in 280a77f and https://projects.theforeman.org/issues/8441 tells us that some proxies lacked TLS 1.2 support. I don't think that's relevant anymore. With my platform hat on I strongly feel we should drop the whole setting and that we should solve it project/product wide. That also means we need to verify Pulp uses the system wide policy too. From a UX and QE perspective it would also be easier. |
I know that you are aware of this, but to emphasize for anyone else reading along, if we don't set the
This is why I previously had TLSv1.2 as the default value, not TLSv1.3, although I can see the argument for setting TLSv1.3 anyway and accepting that users who can't use TLSv1.3 will need to intervene by lowering it to TLSv1.2. Either of these choices is preferable to the current status quo, and either of these choices is also preferable to dropping this feature immediately without going through design and planning, RFC, communicating the change, etc. |
N.B. I split this portion of my response into a separate comment, to emphasize that I see these details as being less urgent than the primary points I made above.
So it was introduced in #4823 and some proxies at the time lacked TLS1.2 support. While it is true that this exact circumstance is no longer exactly the case right now, I think it's very likely that a similar circumstance will arise again in the future as the world gradually (and unevenly) migrates to future OSes and TLS protocol versions. I think a partial fix is better than no fix at all. Previously this was configured via yaml and the installer which would have been a fine way to orchestrate katello and pulp for a consistent configuration, but that was removed and it was moved to the DB and controlled via settings API with #8552 , theforeman/puppet-katello#320 , and theforeman/foreman-installer#477 . I agree that it's important to follow up with an investigation of how Pulp handles TLS, and that it would be beneficial for users to be able to configure Pulp and Katello to use the same protocol. |
And I disagree with this. Remember that this is from the time when we didn't have unified proxy code. Today if you set up a proxy, you go through Foreman's HTTPProxy. That has code that patches every connection adapter. Imagine this scenario. You set the minimum TLS version for the CDN to be 1.3, that works. Then later you set a global proxy that doesn't understand TLS 1.3. Suddenly the CDN connection breaks, but all other connections work. How do you expect the user and anyone supporting the user to figure out why that one aspect doesn't work? My thought is that this current feature helps exactly nobody and introduces a maintenance and usability burden. It's making a tiny aspect of the application special and that's more likely to cause problems than solve any. I'm not a Katello maintainer, but with my platform hat on I'd say the broken setting needs to be removed without replacement. |
I am going to attempt to try and summarize what I think I understand the requirements to be to ensure I properly understand them before delving into the implementation details:
For #3, I avoided what I think is a key point, and requirement up for debate:
For #1, I think this is a best practice that we should be designing for across the entirety of our application. I believe the design hinges on #3 and the point that is up for debate. I have no love for over doing it with settings. I am wary of a solution that would give users no way to adjust the configuration for their environment (even if we think they are making poor security choices). Another way to put it is, if we opted not to provide a method to lower the TLS version, what would we tell a user who comes looking for help? |
Thank you for the summary.
Let's clarify what system-wide crypto policies specify. It's the minimum TLS version, the maximum and acceptable ciphers. This PR doesn't touch ciphers, so I'll ignore those for now. I'm going through it: comparing today, the current PR and my proposal where we drop the setting. Today: The CDN connection does not respect minimum TLS version, but worse, it doesn't respect the maximum TLS version. It limits to TLS 1.0 exactly. Current PR: The default value ( My suggestion: This would be equal to the default value of option 2.
I'm not sure if this is an option. RHEL 8 always has a system-wide setting for OpenSSL. The minimum TLS version is 1.2, maximum is 1.3. For context, Debian 11 (and I assume Debian 12) also has TLS 1.2 as the default minimum TLS version. So I I'd say 1) and 2) are the same case.
This is indeed the crux of the debate.
If we look at crypto-policies then there's this for RHEL 9 https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html/security_hardening/using-the-system-wide-cryptographic-policies_security-hardening#excluding-an-application-from-following-the-system-wide-crypto-policies_using-the-system-wide-cryptographic-policies (and I realize quoting multiple sources may be confusing):
This would be what the current PR is implementing. But I'm questioning if it's adequate. If we implement this, what's the QE plan to verify it really does what it's supposed to? |
I thought the default min value was hardcoded for the current PR?
Perhaps the code should change the setting default to
This does indeed sound like a support annoyance, but isn't the setting for those few very special setups where the user should know they're doing something funky with their CDN connection anyway? This is definitely one of those weird corner-case features, but since it deals with import/export there are probably users doing some interesting and strange things related to disconnected environments.
I think the test would be to run through the original reproducer steps and verify that they now work:
Besides turning this into a full-blown feature where we let you set the min and max TLS version for all connections via the HTTP Proxy model (or something like that), it sounds like the other options are to accept this PR as-is or remove the setting entirely. Removing the setting entirely means saying "WONTFIX" to the disconnected users who have interesting TLS requirements, but we do gain some maintainability. I'm thinking removing the setting could cause complaints. |
On mobile, so limited quoting capabilities. But on that last point: as I understand it today the setting can enforce TLS 1.0 (no less, no more, thus insecure) or the SSLv3 setting (which I think is SSLv3 or TLSv1, but can't check now). OpenSSL 1.1.0 included in EL8 does NOT have SSLv3 compiled in. That means essentially it's TLSv1 or nothing else. In other words, it's broken |
Thanks for the thoughtful comments, everyone. So that we are all working from the same understanding, I want to clarify some very important points:
1.a. If the value This happens because we are currently using (deprecated) 1.b. The second breakage (what @ekohl discovered in
2.a. (Ewoud's proposal) drop 2.b. (This PR) drop My evaluation is that 2.b is the better approach for now, because it fixes both issues right now, while still allowing us to pursue either design in the long term (i.e. once the Ruby bug is fixed). My increasing concern is that we are debating the design merits of 2.a vs. 2.b as long term ideals, when we should instead move forward with the approach that fixes both flaws today and have a separate conversation about potentially deprecating the feature in the future (ideally, not in a PR review). What follows are in-line responses to various comments, which I've re-ordered for clarity. First, @ekohl
This could be the (at least partial) source of our disconnect? @ianballou got it right here:
That's correct. What I'm proposing is that we have to hardcode this for now to fix 1.b, and after sufficient time we could then either remove In response to @ehelms and @ianballou , who suggested
Neither 2.a nor 2.b would achieve this today, because if we don't explicitly set [1] https://github.com/ruby/net-http/blob/edc99a54b2c2888759068e2627f2f26c5f505352/lib/net/http.rb#L1635-L1643 Until we have a Ruby version with Ewoud's fix, we must explicitly set In response to these comments by @ehelms
This is a good question to ask about the future, and IMO strong arguments have been presented in favor of both approaches. With that said, we don't need to decide that right now, and it would be better to have that discussion on Discourse. Thinking about the below comments from @ianballou leads me to the same conclusion.
And this is one reason why I think it's better to go through RFC/Discourse first if we were to go that route eventually, rather than deprecating a feature (even a broken one) in a way that could be perceived as a shortcut to closing a bug.
And comments like these are the other reason I think we should take that aspect of the conversation to Discourse/RFC rather than making a hasty decision to deprecate a feature during a PR review. Deprecating a feature is a complex decision and should be carefully considered |
(2c from somebody who is not a typical developer) In a long term, relying only on system-wide crypto policy might seem a better approach to me, since that policy covers all connections (not only from katello) that are required for getting some content. I have no experience with such policies but I assume it supersets any setting we (ideally) allow by this feature - but system-wide, not "katello speaks to CDN" scenario only. If the policies allow what this feature is supposed to allow, how much sense it has to replicate the same functionality, for fine-tuned user scenario? (not a rhetorical question) Could there be user scenarios "we need lower SSL/TLS version just for this and that, but not globally / system-wide"? If not, we have an argument for dropping the feature. If so, we shall evaluate such corner cases if they deserve this feature. That is for a long term. For a short term, until the ruby PR from @ekohl ++ gets into the ruby version that katello uses, we can treat this PR as a "hotfix". So (at least) for a short term, fixing this feature seems to me as a better option than dropping it - we can re-evaluate once the ruby fix gets to katello. |
Just jumping on this: the current setting is harmful and there is no logical migration path from one to the other. It's not like you want to store the value that the user has today and transform it something else so we can drop it without deprecation IMHO. Having a discussion about a replacement is a good RFC topic though. |
Never mind this, Ewoud answered it above |
Now that I understand the situation better, I get that we have an immediate need to bump up the minimum TLS version so that TLSv1 isn't used even for users who do not set the setting in question. The current PR sets TLS to be a secure setting by default. It also lets users do insecure things for whatever unique environment they have. By that logic, I'm happy with the general strategy. Please let me know if I'm missing something. I can do an actual functionality test on the code once there's a consensus. |
The really important point to me is that the harm exists even if the setting is removed. I am worried about something like a MITM attack that exploits the (always allowed) insecure protocol. Maybe that needs to be chained with other exploits to compromise the supply chain. I am not comfortable with that risk.
The current value of We aren't forced to do this, we still have (in the future) the option of dropping
👍 from me. I want to emphasize again that I am hearing some good arguments in favor of dropping
This is how I see it too. Sincerely, thank you to everyone for the time, effort, and patience involved in figuring this out. |
The whole of the project is vulnerable to that. Why is the CDN connection special? To perform a successful MITM attack you would also need to spoof the DNS resolution and the certificate. If you can do that, providing it over TLS 1.2 or 1.3 isn't a big challenge. Preventing MITM can best be done by certificate pinning and verifying the certificate. AFAIK certificate pinning is what many mobile applications do and it turns out we already do this in Katello. cdn.redhat.com doesn't use a globally signed certificate so we bundle it in Katello (https://github.com/Katello/katello/blob/master/ca/redhat-uep.pem). I think the verification happens, but not entirely sure what this code does: katello/app/lib/katello/resources/cdn.rb Line 32 in cc1b7cc
I would have expected it to use some constant, but value 9 is the bitwise OR of 1 and 8. Reverse engineering this, we know these are the various values:
Since 8 undefined, I think this is equivalent to We should change this magic So 'm not worried about MITM attacks because we apply better protections already. But perhaps I'm wrong and with TLS 1.0 you can MITM even with certificate verification.
If we do this, the applications raises a warning on every start up, which is a bad user experience.
How are you going to convert the current value? I assumed you can't really differentiate between the current value and the new, but now that you mention it, perhaps you can. |
I forgot about this until @evgeni reminded me of it: ruby/net-http#55 means that HTTPS proxies are currently not supported. That means the only way I can think of why this was needed is Deep Packet Inspection. In the past many of these middleware boxes broke old TLS support. If you can't support TLS 1.2 these days, large parts of the internet are no longer reachable. So I wouldn't be really concerned with supporting them. I'd also say it's questionable if they were ever supported. |
That's not my reading of https://bugs.ruby-lang.org/issues/16482 ... "Note that this issue is not about the connection that is routed through the proxy but about the connection to the proxy itself." |
That's exactly my point. From the Foreman perspective, if you use a proxy today you MUST have: Foreman -> HTTP -> Proxy -> HTTP(S) -> target You can't have: Foreman -> HTTPS -> Proxy -> HTTP(S) -> target Foreman can't enforce a minimal TLS version on the Proxy -> target connection. So |
I tend to agree it makes less sense to have a dedicated SSL/TLS-version setting just for one particular service or call flow, esp. if it has some further limitations mentioned above. If there is a system-wide config option for the same, that would make much more sense (and be the argument to just remove this setting). |
OK, and what if you don't have an HTTP proxy?
There is, but Ruby doesn't respect it currently, so unless we are explicit in specifying it ourselves, TLSv1.0 (an insecure, out of date protocol) is always used as the min_tls_version. I think we've all been in agreement that this setting could be removed in the long term when the Ruby bug is fixed and the system wide crypto policy can actually be respected. |
HTTPS proxies aside, this setting still enables syncing Foreman from some anciently configured Foreman that requires old TLS right? (ISS) |
There hasn't been much movement for a while, and understandably so given the conversation. However, in following the debate above on "keep the cdn_ssl_version setting" vs "remove the setting", I'd like to confirm: Removing "cdn_ssl_version" will stop users from being able to sync an older (TLS v1) upstream Foreman with a newer downstream Foreman. If this is true, I'd like more outside opinions about if we need to continue supporting this feature. If it's false and either that feature is already broken or maybe I'm just misunderstanding the debate above, then I would be more in support of removing the setting entirely and raising the min tls version to the lowest secure version. |
Closing this, as we addressed it with #10926 Thanks all! |
This uses the min_version attribute from Net::HTTP instead of ssl_version, which in turn uses OpenSSL::SSL::SSLContext#min_version and not the older OpenSSL::SSL::SSLContent#ssl_version. As a result the there is no longer a blanket SSLv23 setting and the data format for specifying TLS versions has changed.
It also sets a default value of the new setting for TLSv1.2, which is reasonable because older versions of the standard are deprecated. Users requiring a deprecated protocol version due to their network infrastructure can lower this value as needed.
Considerations taken when implementing this change?
min_version
andssl_version
in OpenSSL, sincessl_version
is deprecated? I opted to introduce a new setting entirely since the functionality has changed and we can't directly map the old allowed values onto the new allowed values.test/models/setting_test.rb
as it seems thattest/lib/resources/cdn_test.rb
provides adequate coverage.What are the testing steps for this pull request?
CDN min TLS version
value and re-try inter-server sync. Now it should succeed once older TLS has been permitted.Open Questions