.NET Core 2.1 SocketsHttpHandler does not use Negotiate / SPNego #27415

Closed
CJHarmath opened this issue Sep 17, 2018 · 37 comments · Fixed by dotnet/corefx#33426
Labels: bug, tenet-compatibility (Incompatibility with previous versions or .NET Framework)

@CJHarmath

Overview

While testing PowerShell Core 6.1 I ran into an issue with not being able to authenticate to a Kerberized REST API running on Linux unless I disable SocketsHttpHandler.
PowerShell/PowerShell#7801

It seems that when the server responds with both Negotiate and NTLM, SocketsHttpHandler picks NTLM, which in my case results in a 401 because the service in question really expects Negotiate / SPNego and does not work with NTLM.

As requested by @karelz in https://github.com/dotnet/corefx/issues/30166,
I've reproduced it on the daily builds without PowerShell Core involved and got the same results, so I'm submitting a new issue for this.

Expected result

When the server offers multiple auth schemes such as Negotiate and NTLM, the client should pick the strongest one, which in this case is Negotiate.

Dotnet Info

C:\dev\test\httpclient-spnego\test2>dotnet --info
.NET Core SDK (reflecting any global.json):
 Version:   2.1.403-servicing-009270
 Commit:    def6c5f48d

Runtime Environment:
 OS Name:     Windows
 OS Version:  10.0.15063
 OS Platform: Windows
 RID:         win10-x64
 Base Path:   C:\Program Files\dotnet\sdk\2.1.403-servicing-009270\

Host (useful for support):
  Version: 2.1.5-servicing-26911-03
  Commit:  efdba896f7

.NET Core SDKs installed:
  2.1.201-preview-007614 [C:\Program Files\dotnet\sdk]
  2.1.202 [C:\Program Files\dotnet\sdk]
  2.1.400 [C:\Program Files\dotnet\sdk]
  2.1.402 [C:\Program Files\dotnet\sdk]
  2.1.403-servicing-009270 [C:\Program Files\dotnet\sdk]

.NET Core runtimes installed:
  Microsoft.AspNetCore.All 2.1.2 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.All 2.1.4 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.All 2.1.5-rtm-31008 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.App 2.1.2 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 2.1.4 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 2.1.5-rtm-31008 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 2.0.7 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.0.9 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.1.2 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.1.4 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.1.5-servicing-26911-03 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]

To install additional .NET Core runtimes or SDKs:
  https://aka.ms/dotnet-download

Example repro

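// Placeholder URI: the host name is the redacted one from the capture below; the scheme is assumed.
var uri = new Uri("http://mykerberossite.lab.local/");
var handler = new HttpClientHandler
{
    UseDefaultCredentials = true,   // authenticate as the logged-on user
    AllowAutoRedirect = true,
};
using (var client = new HttpClient(handler))
{
    var res = client.SendAsync(new HttpRequestMessage(HttpMethod.Get, uri)).GetAwaiter().GetResult();
    System.Console.WriteLine(res);
}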

Result: 401

HTTP traffic from packet capture

GET / HTTP/1.1
Host: mykerberossite.lab.local

HTTP/1.1 401 Unauthorized
Date: Mon, 17 Sep 2018 21:31:42 GMT
Server: Apache-Coyote/1.1
WWW-Authenticate: Negotiate
WWW-Authenticate: NTLM
Content-Length: 0

GET / HTTP/1.1
Authorization: Negotiate ****
Host: mykerberossite.lab.local

HTTP/1.1 401 Unauthorized
Date: Mon, 17 Sep 2018 21:31:42 GMT
Server: Apache-Coyote/1.1
WWW-Authenticate: NTLM
Content-Length: 0

Workaround is to disable SocketsHttpHandler

AppContext.SetSwitch("System.Net.Http.UseSocketsHttpHandler", false);
var handler = new HttpClientHandler
{
    UseDefaultCredentials = true,
    AllowAutoRedirect = true,
};
using (var client = new HttpClient(handler))
{
    var res = client.SendAsync(new HttpRequestMessage(HttpMethod.Get, uri)).GetAwaiter().GetResult();
    System.Console.WriteLine(res);
}

Result: 200

HTTP Traffic

GET / HTTP/1.1
Connection: Keep-Alive
Host: mykerberossite.lab.local

HTTP/1.1 401 Unauthorized
Date: Mon, 17 Sep 2018 21:30:27 GMT
Server: Apache-Coyote/1.1
WWW-Authenticate: Negotiate
WWW-Authenticate: NTLM
Content-Length: 0
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive

GET / HTTP/1.1
Connection: Keep-Alive
Host: mykerberossite.lab.local
Authorization: Negotiate ***

HTTP/1.1 200 OK
Date: Mon, 17 Sep 2018 21:30:27 GMT
Server: Apache-Coyote/1.1
WWW-Authenticate: Negotiate ***
Cache-Control: no-cache
Expires: -1
Content-Type: text/plain;charset=UTF-8
Content-Length: 103
Keep-Alive: timeout=5, max=99
Connection: Keep-Alive

@davidsh
Contributor

davidsh commented Sep 22, 2018

It seems that when the server responds with both Negotiate and NTLM, SocketsHttpHandler picks NTLM, which in my case results in a 401 because the service in question really expects Negotiate / SPNego and does not work with NTLM.

Your attached output from the wire trace doesn't show that. The client-side HTTP is picking Negotiate.

GET / HTTP/1.1
Authorization: Negotiate ****
Host: mykerberossite.lab.local

However, the server is still not authenticating and returning back a 401.

@CJHarmath
Author

CJHarmath commented Sep 22, 2018

Your attached output from the wire trace doesn't show that. The client-side HTTP is picking Negotiate.

Hmm, could be a copy paste error. Will re-test on Monday and update.

@CJHarmath
Author

Hey, sorry for the slow response...

So I've checked again and it's indeed using NTLM with that Negotiate response (NTLM_NEGOTIATE).
It just wasn't obvious from my *-ed out HTTP traffic, as that didn't show the size of the blob.
An NTLM token is much smaller than a Kerberos one, so you can simply eyeball it.

Wireshark can actually dissect it and show that it's NTLMSSP_NEGOTIATE, as shown below.

image

And here is the snip when I disable SocketsHttpHandler

image
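
For anyone without Wireshark at hand, the same check can be approximated in code. The sketch below is not part of the original report: `NegotiateTokenInspector` and `ContainsNtlm` are hypothetical names. It relies on the fact that every NTLM message begins with the 8-byte signature "NTLMSSP\0", and it scans the whole decoded token so that both a raw NTLM token and one wrapped inside SPNEGO are detected.

```csharp
using System;

static class NegotiateTokenInspector
{
    // Every NTLM message starts with the ASCII bytes "NTLMSSP" followed by a zero byte.
    private static readonly byte[] NtlmSignature =
        { (byte)'N', (byte)'T', (byte)'L', (byte)'M', (byte)'S', (byte)'S', (byte)'P', 0 };

    // Returns true if the value of an "Authorization: Negotiate <base64>" header
    // carries an NTLM message, either raw or embedded in a larger token.
    public static bool ContainsNtlm(string authorizationHeaderValue)
    {
        const string prefix = "Negotiate ";
        if (authorizationHeaderValue == null ||
            !authorizationHeaderValue.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
        {
            return false;
        }

        byte[] token;
        try
        {
            token = Convert.FromBase64String(authorizationHeaderValue.Substring(prefix.Length).Trim());
        }
        catch (FormatException)
        {
            return false; // not valid base64, so nothing to inspect
        }

        // Naive byte search for the signature anywhere in the token.
        for (int i = 0; i + NtlmSignature.Length <= token.Length; i++)
        {
            int j = 0;
            while (j < NtlmSignature.Length && token[i + j] == NtlmSignature[j])
            {
                j++;
            }
            if (j == NtlmSignature.Length)
            {
                return true;
            }
        }
        return false;
    }
}
```

A Kerberos ticket that happens to contain those eight bytes would be misclassified, so treat this as a debugging aid rather than a reliable classifier.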

@davidsh
Contributor

davidsh commented Sep 27, 2018

So I've checked again and it's indeed using NTLM with that Negotiate response ( NTLM_NEGOTIATE).

So, what you're really saying is that the HTTP stack correctly responded with Negotiate scheme. But Negotiate ended up "negotiating" NTLM instead of Kerberos.

That happens a lot when the requirements for a valid Kerberos infrastructure don't exist. For example, if the client machine is not joined to the Windows Active Directory (or Linux Kerberos) domain of the server, or timestamps aren't matching etc.

In your example, the server is a Linux machine using Kerberos. I'm assuming that the Linux machine is also the Kerberos ticketing server, or is there a separate machine for that?

Usually, this kind of problem is a configuration problem with the machines and not a problem with the client-side HTTP stack since it is picking Negotiate scheme correctly.

cc: @wfurt @geoffkizer @karelz

@CJHarmath
Author

TL;DR: Unlike the rest of the Windows web clients (browsers, .NET Framework, etc.), SocketsHttpHandler does not canonicalize the given host name when requesting the SPN, which breaks Kerberos if the URL uses a CNAME and the SPN is registered only for the host's DNS A record.

Details
While writing a pretty long reply linking to MSDN docs on how Kerberos is implemented on Windows (SPNego's selection of auth mechanisms, etc.) and preparing the example traces, I found the reason it's not working.

phew.

This is what happens, at a very high level, when it works as expected (after purging the krb tickets and flushing the DNS cache, which actually helped me nail it down: always purge and always flush!):

  • DNS query of HOST
  • HTTP GET
  • HTTP Response with 401 Negotiate header
  • DNS lookup of the host to get the A record
  • KRB5 TGS_REQ to AD DC for HTTP/DNS_A
  • KRB5 TGS_REP with referral to the Linux KDC realm
  • DNS lookup for SRV _kerberos....
  • KRB5 TGS_REQ to Linux KDC for HTTP/DNS_A
  • KRB5 TGS_REP with the service ticket
  • HTTP GET with Negotiate with that ticket
  • HTTP Response with 200

And this is what's happening with SocketsHttpHandler

  • DNS query of HOST
  • HTTP GET
  • HTTP Response with 401 Negotiate header
  • KRB5 TGS_REQ to AD DC for HTTP/HOSTNAME_AS_IS ( which can be a CNAME )
  • KRB5 ERROR - principal unknown
  • HTTP GET NTLMSSP_NEGOTIATE
  • HTTP RESPONSE 401

So the reason SocketsHttpHandler fails is that it requests the SPN for the CNAME instead of canonicalizing the name with a forward lookup.
While Windows SSPI does not canonicalize, applications do: IE, Firefox, Chrome, the .NET Framework web clients, etc. all canonicalize.
I.e., coming from .NET Framework, this is a breaking change.

So SocketsHttpHandler should fall in line and stay consistent.
Perhaps allow a flag to disable canonicalization, but by default do whatever IE, Chrome, and the .NET Framework web clients do.

Linux
On Linux this is configured via krb5.conf so SocketsHttpHandler should honor that setting.
rdns = true|false

@davidsh
Contributor

davidsh commented Sep 27, 2018

TL;DR: Unlike the rest of the Windows web clients (browsers, .NET Framework, etc.), SocketsHttpHandler does not canonicalize the given host name when requesting the SPN, which breaks Kerberos if the URL uses a CNAME and the SPN is registered only for the host's DNS A record.

Thanks for the additional details.

SocketsHttpHandler on Windows uses the Windows SSPI libraries for doing Negotiate and NTLM protocols. I do not think it is something that is directly controllable by SocketsHttpHandler. It would need to be investigated further.

We have tested Linux clients against a Windows server/ActiveDirectory domain. We've demonstrated that Negotiate scheme will use Kerberos if all the machines are configured properly. See: #26418.

But we have not extensively tested a Windows client using Negotiate (Kerberos) into a Linux server environment.

@mattpwhite
Contributor

In Linux, canonicalization is controlled by configuration knobs in krb5.conf. Applications are explicitly not supposed to do any kind of canonicalization there and rely on the GSSAPI implementation and its configuration.

In Windows, SSPI does not canonicalize, and AFAIK, there are no global knobs to turn. But many applications, including all of the major browsers, WinHTTP (and thus legacy .NET HTTP clients), and most (but not all) OS components do forward canonicalize CNAMEs prior to calling into SSPI. People/things now generally expect that behavior, witness the outcry when Chrome accidentally stopped pre-canonicalizing recently: https://bugs.chromium.org/p/chromium/issues/detail?id=872665

@CJHarmath
Author

But we have not extensively tested a Windows client using Negotiate (Kerberos) into a Linux server environment.

My guess is that this will reproduce on Windows -> Windows as well (I haven't tested it yet).
I.e., set up IIS with Negotiate:Kerberos as the only Windows auth provider, configure the Restrict NTLM GPO, register an SPN for the A record, and use a CNAME to access the site.
If that is configured correctly, SocketsHttpHandler will break: it will request the wrong SPN and then cannot fall back to NTLM, while the legacy .NET client will work as is.

@davidsh
Contributor

davidsh commented Sep 27, 2018

But many applications, including all of the major browsers, WinHTTP (and thus legacy .NET HTTP clients), and most (but not all) OS components do forward canonicalize CNAMEs prior to calling into SSPI.

So, this scenario does work using WinHttpHandler (WinHTTP) on the client-side? So, did you turn off SocketsHttpHandler (via AppContext switch for example) and demonstrate that the scenario works?

See: https://github.com/dotnet/core/blob/master/release-notes/2.1/2.1.0.md

Networking Performance
You can use one of the following mechanisms to configure a process to use the older HttpClientHandler:
From code, use the AppContext class:
AppContext.SetSwitch("System.Net.Http.UseSocketsHttpHandler", false);
The AppContext switch can also be set by config file.
The same can be achieved via the environment variable DOTNET_SYSTEM_NET_HTTP_USESOCKETSHTTPHANDLER. To opt out, set the value to either false or 0.

If this works with WinHttpHandler, then it should be possible to fix it for SocketsHttpHandler. WinHttpHandler uses native WinHTTP which uses the same Windows SSPI libraries as SocketsHttpHandler for doing Negotiate and NTLM.

@davidsh
Contributor

davidsh commented Sep 27, 2018

@stephentoub

@CJHarmath
Author

So, this scenario does work using WinHttpHandler (WinHTTP) on the client-side? So, did you turn off SocketsHttpHandler (via AppContext switch for example) and demonstrate that the scenario works?

Yep, see my very first post, which has the workaround below to make this work:
AppContext.SetSwitch("System.Net.Http.UseSocketsHttpHandler", false);

@geoffkizer
Contributor

I think we just need to use the canonical host name here (as returned by Dns.GetHostEntry) instead of the hostname in the uri. Correct?

https://github.com/dotnet/corefx/blob/2c8be59ed4405dbdd91777f79763cb2f1384729e/src/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/AuthenticationHelper.NtAuth.cs#L80

@mattpwhite
Contributor

Almost. That would handle CNAMEs and partially qualified names of As (that become fully qualified when the OS resolver appends one of the configured search suffixes). But...

  • If you pass an IP to GetHostEntry(), it will reverse it with a PTR lookup. I believe this behavior would be unexpected. Would probably want to be conditional on a failing IPAddress.TryParse.
  • GetHostEntry() will potentially use legacy and broadcast based name resolution protocols on Windows. These are not useful for canonicalization and slow down negative responses. A potential optimization would be to pass the relevant flags to the underlying Win32 APIs to not consider LLMNR, NetBIOS.
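
The suggestion and the first caveat above can be sketched roughly as follows. This is a minimal illustration, not the corefx implementation: `SpnSketch` and `BuildHttpSpn` are hypothetical names, and it assumes that `Dns.GetHostEntry` returns the canonical (A record) name in `HostName`, as proposed in this thread. It does not address the LLMNR/NetBIOS concern, since the public `Dns` API exposes no flags for that.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

static class SpnSketch
{
    // Hypothetical helper: builds the "HTTP/<host>" SPN for Negotiate,
    // forward-canonicalizing the host name first.
    public static string BuildHttpSpn(string uriHost)
    {
        // IP literals are used as-is: GetHostEntry would reverse-resolve them
        // with a PTR lookup, which is not wanted here.
        if (IPAddress.TryParse(uriHost, out _))
        {
            return "HTTP/" + uriHost;
        }

        string canonical = uriHost;
        try
        {
            // Forward lookup; assumed to follow CNAMEs and expand short names.
            IPHostEntry entry = Dns.GetHostEntry(uriHost);
            if (!string.IsNullOrEmpty(entry.HostName))
            {
                canonical = entry.HostName;
            }
        }
        catch (SocketException)
        {
            // Resolution failed: fall back to the name exactly as given in the URI.
        }

        return "HTTP/" + canonical;
    }
}
```

With this shape, a request to a CNAME such as a load-balancer alias would ask the KDC for the SPN of the underlying host, while `http://10.0.0.5/...` would keep the IP address untouched.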

@davidsh
Contributor

davidsh commented Oct 19, 2018

@mattpwhite Thank you for the added details regarding CNAMEs and the issues with LLMNR, NETBIOS, etc.

Since you observe the correct behavior in .NET Framework, we will look at that implementation to see where it differs from .NET Core SocketsHttpHandler. That will give us more insight into the correct implementation.

@davidsh davidsh self-assigned this Oct 22, 2018
@davidsh
Contributor

davidsh commented Oct 22, 2018

I've done some research into why this works on .NET Framework.

.NET Framework does make sure to do canonicalization when it is computing the proper SPN to use:

AuthenticationState.GetComputeSpn
https://github.com/Microsoft/referencesource/blob/master/System/net/System/Net/_AuthenticationState.cs#L114

https://github.com/Microsoft/referencesource/blob/master/System/net/System/Net/_AuthenticationState.cs#L145

which then calls internal method Dns.TryInternalResolve
https://github.com/Microsoft/referencesource/blob/master/System/net/System/Net/DNS.cs#L553

So, we would need to use similar DNS resolution logic in SocketsHttpHandler.

@davidsh
Contributor

davidsh commented Oct 22, 2018

I was able to research this problem with a Windows-Windows setup in our separate Enterprise Testing environment.

Given an IIS server called "corefx-net-iis" on a domain called "corefx-net.contoso.com", we are able to get Negotiate to use Kerberos using any of the following URIs.

// Use A record of server
string server = "http://corefx-net-iis/test/NegotiateTest.ashx";
string server = "http://corefx-net-iis.corefx-net.contoso.com/test/NegotiateTest.ashx";

// Use CNAME of server
string server = "http://iis-server/test/NegotiateTest.ashx";
string server = "http://iis-server.corefx-net.contoso.com/test/NegotiateTest.ashx";

"iis-server.corefx-net.contoso.com" is a CNAME.

But for .NET Core 2.1.5, Negotiate will only use Kerberos when using the original FQDN of the server (A record):

string server = "http://corefx-net-iis/test/NegotiateTest.ashx";
string server = "http://corefx-net-iis.corefx-net.contoso.com/test/NegotiateTest.ashx";

Any of the DNS names using the CNAME results in Negotiate using NTLM.

@CJHarmath
Author

Thanks @davidsh for your research and the effort to reproduce this issue!
(Hopefully your setup could later be used to run Kerberos integration tests!)

If NTLM is disabled for security reasons (which can be the case in sensitive environments), then calls that use a CNAME on .NET Core 2.1.5 will fail to authenticate, so that makes a good test for the fix.

Here is how to disable NTLM:
https://docs.microsoft.com/en-us/windows/security/threat-protection/security-policy-settings/network-security-restrict-ntlm-ntlm-authentication-in-this-domain#security-considerations

Or, if NTLM is not supported by the target server running on Linux in a trusted Kerberos realm, authentication will again fail (my original use case, which is much more involved to set up a lab for).

@davidsh
Contributor

davidsh commented Oct 24, 2018

Linux
On Linux this is configured via krb5.conf so SocketsHttpHandler should honor that setting.
rdns = true|false

Just to be clear on that particular Linux kerberos setting, it only affects whether REVERSE DNS is done on ip addresses. It doesn't change how FORWARD normalization is done with respect to traversing CNAME records.

Service principal canonicalization
MIT Kerberos clients currently always do forward resolution (looking up the IPv4 and possibly IPv6
addresses using getaddrinfo()) of the hostname part of a host-based service principal to canonicalize the
hostname. They obtain the “canonical” name of the host when doing so. By default, MIT Kerberos clients
will also then do reverse DNS resolution (looking up the hostname associated with the IPv4 or IPv6
address using getnameinfo()) of the hostname. Using the krb5.conf setting:

[libdefaults]
    rdns = false
will disable reverse DNS lookup on clients. The default setting is “true”.

Operating system bugs may prevent a setting of rdns = false from disabling reverse DNS lookup. Some
versions of GNU libc have a bug in getaddrinfo() that cause them to look up PTR records even when not
required. MIT Kerberos releases krb5-1.10.2 and newer have a workaround for this problem, as does the
krb5-1.9.x series as of release krb5-1.9.4.

Reverse DNS mismatches
Sometimes, an enterprise will have control over its forward DNS but not its reverse DNS. The reverse
DNS is sometimes under the control of the Internet service provider of the enterprise, and the enterprise
may not have much influence in setting up reverse DNS records for its address space. If there are
difficulties with getting forward and reverse DNS to match, it is best to set rdns = false on client
machines.

.NET Framework has never done any reverse DNS lookup checks. And in general, it won't do any normalization of the SPN name (from the hostname in the Uri specified in the http request) if the hostname is actually an IP address, i.e. "http://10.0.0.5/NegotiateEndpoint"

I am working on a fix for this issue. But the PR will likely only match existing .NET Framework behavior and won't do any reverse DNS checks nor any specific Linux kerberos config file lookups.

@CJHarmath
Author

Just to be clear on that particular Linux kerberos setting, it only affects whether REVERSE DNS is done on ip addresses. It doesn't change how FORWARD normalization is done with respect to traversing CNAME records.

Thanks for pointing this out! The correct setting is actually dns_canonicalize_hostname, which is a "relatively" recent addition (khmm, 5 years old, but that's recent in Kerberos terms).
krb5/krb5@60edb32
Its default value is true.

Indicate whether name lookups will be used to canonicalize hostnames for use in service principal names. Setting this flag to false can improve security by reducing reliance on DNS, but means that short hostnames will not be canonicalized to fully-qualified hostnames. The default value is true.

https://web.mit.edu/kerberos/krb5-1.12/doc/admin/conf_files/krb5_conf.html#libdefaults

So a correct implementation would need to honor that setting on Linux; if it does not, that should at least be documented somewhere so the behavior is clear and avoids confusion (probably saving a few hours of debugging).

Just to be clear, I am already happy with your proposed fix of matching the existing .NET Framework behavior, but for the sake of completeness I wanted to mention that Linux krb5.conf knob.

Thanks!

@mattpwhite
Contributor

I am working on a fix for this issue. But the PR will likely only match existing .NET Framework behavior and won't do any reverse DNS checks nor any specific Linux kerberos config file lookups.

Well, under no circumstances should .NET be trying to parse a krb5.conf. On non-Windows systems, my understanding is that no name canonicalization should need to be performed before calling into a GSSAPI/Kerberos implementation because that library already takes care of this (or not, depending on how someone chose to configure it for their environment). My understanding is also that the typical configuration is forward canonicalization on, reverse off. Forward canonicalization was just implicitly enabled by MIT for some time, though as @csharmath points out, it's now a knob.

The Windows case is different because SSPI does not do canonicalization on behalf of applications. There is no way for a developer/administrator to express how they would like all applications to behave, so the most reasonable thing to do is to just do what IE did way back when and the other browsers subsequently emulated - forward canonicalize, no reverse. FWIW, browsers did eventually add knobs to customize this behavior on Windows (https://www.chromium.org/developers/design-documents/http-authentication, https://blogs.technet.microsoft.com/askds/2009/06/22/internet-explorer-behaviors-with-kerberos-authentication/). The reason these are application knobs on Windows is that the application actually controls it; disabling CNAME resolution wouldn't work if SSPI did it for you in the way that MIT does in a default configuration.

@CJHarmath
Author

Well, under no circumstances should .NET be trying to parse a krb5.conf.
..
no name canonicalization should need to be performed before calling into a GSSAPI/Kerberos implementation because that library already takes care of this

Makes sense, thanks for this!

@davidsh
Contributor

davidsh commented Oct 24, 2018

so the most reasonable thing to do is to just do what IE did way back when and the other browsers subsequently emulated - forward canonicalize, no reverse

This is what the current .NET Framework behavior is. And this is what the fix for .NET Core will be also.

@davidsh
Contributor

davidsh commented Oct 25, 2018

fyi. I'll be OOF for about a week+. So, I'll be submitting the PR for this fix as soon as I get back.

davidsh referenced this issue in dotnet/corefx Nov 12, 2018
SocketsHttpHandler was not normalizing the DNS name prior to using it for the SPN
(Service Principal Name). So, when using URIs that involve a CNAME, it was using
the CNAME directly and not evaluating it to the normalized FQDN A record of the host.

This change fixes the behavior to match .NET Framework so that CNAMEs are resolved
properly. We can use the standard Dns.GetHostEntryAsync() API to resolve the name.

From a performance perspective, this additional DNS API call is limited to just
the SPN calculation for NT Auth. Calling this API doesn't impact the performance on the
wire since the OS will cache DNS calls.  Wireshark confirms that no additional DNS
protocol packets will be sent.

.NET Framework actually caches the normalized DNS resolution on the ServicePoint object
when it opens up a connection. Thus, it doesn't have to call Dns.GetHostEntryAsync()
for the SPN calculation. While a future PR could further optimize SocketsHttpHandler to
also cache this DNS host name, it isn't clear it would result in measurable performance gain.

I tested this change in a separate Enterprise testing environment I set up. I created
a CNAME for a Windows IIS server in a Windows domain-joined environment and demonstrated that
the Negotiate protocol results in a Kerberos authentication (and doesn't fall back to NTLM).

Fixes #32328
@CJHarmath
Author

thanks for the fix @davidsh !
So this will ship with 3.0 ?

@davidsh
Contributor

davidsh commented Nov 13, 2018

So this will ship with 3.0 ?

Yes, the fix is in the master branch for 3.0.

@CJHarmath
Author

Thank you for your fix @davidsh !

Just wondering whether it would be possible to backport this change to 2.2 or one of its servicing releases, so that the fix becomes available for 2.2 and for PowerShell Core 6.2 as well?

@karelz
Member

karelz commented Nov 14, 2018

@csharmath we do not port changes to servicing branches unless there is a very good reason - i.e. impact on larger set of customers, without reasonable workaround. Is that the case here?

@CJHarmath
Author

It's hard for me to tell the impact of this, but it surely affects all shops using Negotiate with CNAMEs.
Plus, this is a breaking change for Negotiate auth introduced with SocketsHttpHandler; working around it means disabling the handler and losing out on its perf gains.

The workaround is to either use the DNS A record or disable SocketsHttpHandler (preferable in cases where CNAMEs can change).

AppContext.SetSwitch("System.Net.Http.UseSocketsHttpHandler", false);

or set the env var

$env:DOTNET_SYSTEM_NET_HTTP_USESOCKETSHTTPHANDLER=0

Once set, these settings are easily forgotten: dev teams may not undo them after moving to 3.0 and would then miss out on the new perf improvements (unless the settings are ignored in 3.0).
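
One way to keep a lingering opt-out visible is to check for it at startup and log a warning. The following is only a sketch: `HandlerOptOutCheck` and `IsSocketsHttpHandlerDisabled` are hypothetical names, the switch and environment variable names are the ones quoted from the 2.1 release notes earlier in this thread, and giving the AppContext switch priority over the environment variable is an assumption.

```csharp
using System;

static class HandlerOptOutCheck
{
    private const string SwitchName = "System.Net.Http.UseSocketsHttpHandler";
    private const string EnvVarName = "DOTNET_SYSTEM_NET_HTTP_USESOCKETSHTTPHANDLER";

    // Returns true if either documented opt-out mechanism is active in this process.
    public static bool IsSocketsHttpHandlerDisabled()
    {
        // Assumption: an explicitly set AppContext switch wins over the env var.
        if (AppContext.TryGetSwitch(SwitchName, out bool enabled))
        {
            return !enabled;
        }

        // Per the release notes, "false" or "0" opts out.
        string env = Environment.GetEnvironmentVariable(EnvVarName);
        return env != null &&
               (env == "0" || env.Equals("false", StringComparison.OrdinalIgnoreCase));
    }
}
```

Calling `HandlerOptOutCheck.IsSocketsHttpHandlerDisabled()` once during application startup and writing the result to the log would make the workaround hard to miss when the runtime is upgraded later.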

@karelz
Member

karelz commented Nov 14, 2018

@csharmath correct, disabling SocketsHttpHandler is not something we recommend. However, porting every fix into servicing would basically turn the servicing branch into a new master (incl. instability, lower quality, higher chance of other regressions, etc.)
If we hear from more customers, we can consider porting it back.

Does it have specific impact on your environment(s)? Is the first workaround reasonable / acceptable in the meantime for you?

@CJHarmath
Author

To be clear, I wasn't proposing to port all fixes, but I would consider anything security related worth evaluating for backporting.

The impact of this issue could be a downgrade from Kerberos to NTLM.
If NTLM is configured and allowed, then it's very unlikely that most of your customers will notice this unless they explicitly audit NTLM ( which we do) as it's still authenticating, but not with the recommended and desired mechanism.
If the webserver is running on Linux or in an environment where NTLM is explicitly disabled, then it breaks auth so those customers will notice it immediately.

If a company has already invested in setting up and using Kerberos, but for whatever reason has not completely disabled NTLM, then this change weakens their security, potentially without them even realizing it.

If you put your security hat on, this is a less than optimal situation.
So my angle on this is security, and I would recommend reviewing your position based on that.

Let me paste from MSDN:

NTLM and NTLMv2 authentication is vulnerable to a variety of malicious attacks, including SMB replay, man-in-the-middle attacks, and brute force attacks. Reducing and eliminating NTLM authentication from your environment forces the Windows operating system to use more secure protocols, such as the Kerberos version 5 protocol, or different authentication mechanisms, such as smart cards.

https://docs.microsoft.com/en-us/windows/security/threat-protection/security-policy-settings/network-security-restrict-ntlm-ntlm-authentication-in-this-domain#security-considerations

@davidsh
Contributor

davidsh commented Nov 14, 2018

There is another workaround/mitigation that can be considered.

This problem only occurs if a CNAME is used in the URI and that CNAME is not registered as an SPN in Kerberos. So, the workaround is to register this additional CNAME SPN in Windows Active Directory / Kerberos environment.

@NigelWhatling

@csharmath correct, disabling SocketsHttpHandler is not something we recommend....
If we hear from more customers, we can consider porting it back.

I will be forced to deploy the current project I am working on with SocketsHttpHandler disabled. I'm delivering a Web API that needs to pass through delegated auth to a SAP OData service.

jlennox referenced this issue in jlennox/corefx Dec 16, 2018
@NigelWhatling

@csharmath correct, disabling SocketsHttpHandler is not something we recommend....
If we hear from more customers, we can consider porting it back.

I will be forced to deploy the current project I am working on with SocketsHttpHandler disabled. I'm delivering a Web API that needs to pass through delegated auth to a SAP OData service.

Hit this issue in another project for another client. Forced to disable SocketsHttpHandler. :/

@karelz
Member

karelz commented Jan 3, 2020

So practically, 1-2 affected projects per year so far.

Did you consider updating to .NET Core 3.0 or 3.1 (if you prefer LTS versions)? It is fixed there.

@NigelWhatling

Fair enough. Small numbers. For me, that represents the last two projects I've worked on.

I'm not making the call on version for this one. And the workaround is easy enough once you finally figure out (remember) where the problem lies. Just seems weird to leave the bug there.

@karelz
Member

karelz commented Jan 3, 2020

Isn't every bug weird to have? That is their definition: they are bugs, unexpected behaviors. Fixing is usually prioritized based on widespread impact.

You can use existence of this bug as reason to upgrade to newer version - take it up with decision makers. If they don't care ... the bug is likely not such high priority for them, or get info from them why they cannot upgrade.

@NigelWhatling

I guess it's partly frustration on my part. Both projects have burnt many hours over days going back and forth with infrastructure people to try to get the magic soup of SPNs, etc. right for Kerberos to work in their environment. That certainly isn't my specialty, and in these cases I don't have the level of access to tinker with it directly myself.

Bugs like this one are fun because they don't (to me, at least) immediately point to the code. Authentication just doesn't work. So you go back and beg people to triple-check SPNs and firewalls and who-knows-what-else to figure out what part of the environment is not configured right. Until someone finally ran across this issue in GitHub, it never occurred to me that there might be a new network handler in play (by default) that just didn't do the same thing that the old one did.

Anyway, the bug has been dealt with. It just requires upgrading. Or a workaround if that is not feasible.

@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the 3.0 milestone Jan 31, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 15, 2020