Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Getting Environment Variables is few times slower on Linux #866

Closed
adamsitnik opened this issue Oct 30, 2019 · 14 comments
Closed

Getting Environment Variables is few times slower on Linux #866

adamsitnik opened this issue Oct 30, 2019 · 14 comments
Labels
area-PAL-coreclr tenet-performance Performance related issue
Milestone

Comments

@adamsitnik
Copy link
Member

Slower Lin/Win Win Median (ns) Lin Median (ns) Modality
System.Tests.Perf_Environment.ExpandEnvironmentVariables 4.06 155.25 630.15
System.Tests.Perf_Environment.GetEnvironmentVariable 3.94 139.77 550.89
System.Tests.Perf_Environment.GetEnvironmentVariables 1.56 15955.73 24816.14 several?

How to run the benchmarks:

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f netcoreapp5.0 --filter 'Perf_Environment'

Recommended profilers are PerfCollect and VTune.

@adamsitnik
Copy link
Member Author

I've taken a brief look at Environment.GetEnvironmentVariable and it looks like most of the time is spent for doing text operations: getting length of the key (which is known up-front), converting Unicode to Utf8 and then searching for the key in the array:

image

image

@Ulisse67
Copy link
Contributor

Ulisse67 commented Oct 30, 2019

I think, as a quick fix, that the check by length could be scrapped and just strcmp() could be used, if it doesn't use strlen() internally :-)

Oh oh, I got now to profile the code and I see why strncmp is needed (because of = in env entries).

@adamsitnik
Copy link
Member Author

dotnet/corefx#27632 did not solve the problem

image

@Ulisse67
Copy link
Contributor

Ulisse67 commented Nov 5, 2019

I've tried to implement strncmp as an explicit loop:

int search_pointers(const char* name, const char** palEnvironment)
{
	if (name[0] == '\0')
		return -1;

	for (int i = 0; palEnvironment[i] != nullptr; ++i) {
		char ch;
		const char* pch = name;
		const char* p = palEnvironment[i];

		while ((ch = *pch++) && ch == *p++)
			;

		if (ch == '\0') return i;
	}

	return -1;
}

and the profiler shows a good improvement (based on 1E8 iterations over a list of 6 entries looking for the last entry):

image

@Ulisse67
Copy link
Contributor

Ulisse67 commented Nov 6, 2019

I've been able to shave off some percent:

image

by rewriting the inner tests of the compare loop:

int search_pointers3(const char* name, const char** palEnvironment)
{
	if (name[0] == '\0')
		return -1;

	for (int i = 0; palEnvironment[i] != nullptr; ++i) {
		const char* pch = name;
		const char* p = palEnvironment[i];

		for (;;) {
			if (*pch == '\0') return i;

			if (*pch++ != *p++) break;
		}
	}

	return -1;
}

@adamsitnik
Copy link
Member Author

adamsitnik commented Nov 6, 2019

@Ulisse67 it would be great if you could run our benchmarks and confirm the improvements

How to build CoreCLR in Release:

./build.sh -release
./build-test.sh release generatelayoutonly

Once you run those two commands you should get an executable called corerun which is a simplified .NET Core exe runner. You need to use it to run the benchmarks.

Example:

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f netcoreapp5.0 --filter 'Perf_Environment' --corerun /home/adam/projects/coreclr/bin/tests/Linux.x64.Release/Tests/Core_Root/corerun

@jkotas
Copy link
Member

jkotas commented Nov 6, 2019

It may be also worth looking at inlining check for the first character only that filters most of the negatives, and still call strncmp for the rest.

@Ulisse67
Copy link
Contributor

Ulisse67 commented Nov 7, 2019

It may be also worth looking at inlining check for the first character only that filters most of the negatives, and still call strncmp for the rest.

It makes a difference, indeed:

image

I'm going to build and run the official benchmarks.

@Ulisse67
Copy link
Contributor

I had finally the time to build CoreCLR on my machine (WSL1 within Win10) and have run the performance tests for a version with strncmp + lookahead(1):

performance-strncmp-lookahead

And then for a version using pointers:

performance-pointers-static

static char* match_prefix(const char* name, char** list)
{
	if(*name == '\0')
		return nullptr;
	
	for (int i = 0; list[i] != nullptr; ++i)
	{
		const char* pch = name;
		char* p = list[i];

		for (;;) {
			if (*pch == '\0') {
				if(*p == '=')
					return p+1;
					
				if(*p == '\0') // no = sign -> empty value
					return p;
				
				break;
			}
			
			if (*pch++ != *p++) break;
		}
	}
	
	return nullptr;
}
	
char* EnvironGetenv(const char* name, BOOL copyValue)
{
	CPalThread * pthrCurrent = InternalGetCurrentThread();
	InternalEnterCriticalSection(pthrCurrent, &gcsEnvironment);
	
	char* retValue = match_prefix(name, palEnvironment);
	
	if ((retValue != nullptr) && copyValue)
	{
		retValue = strdup(retValue);
	}

	InternalLeaveCriticalSection(pthrCurrent, &gcsEnvironment);
	return retValue;
}

@adamsitnik
Copy link
Member Author

It looks like you got 10% improvement, nice!

@Ulisse67
Copy link
Contributor

Do I have to make a PR?

@adamsitnik
Copy link
Member Author

Do I have to make a PR?

Yes, please!

We have merged CoreFX and CoreCLR repositories into https://github.com/dotnet/runtime, so please send your PR to the new repo

@danmoseley
Copy link
Member

Can environment variable names or values in Linux ever be non ASCII (ie., UTF-8)? If so will this "raw pointer" approach still be OK? (eg., will it see = that is somehow a second or subsequent byte of a codepoint)

@Ulisse67
Copy link
Contributor

I believe the equal sign has one 8-bit code even in UTF-8; there are two other equal signs (above= and under=) which use 2 bytes each, but I don't think is legal for specifyfing an environment variable.

Even using UTF-8 for keys would not invalidate the pointer (or strncmp()) approach, because UTF-8 is defined to be backwards compatible with zero-terminated strings, and thus all related idioms keep working.

@danmoseley danmoseley transferred this issue from dotnet/corefx Dec 14, 2019
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label Dec 14, 2019
@jeffschwMSFT jeffschwMSFT added this to the Future milestone Jan 8, 2020
@jeffschwMSFT jeffschwMSFT removed the untriaged New issue has not been triaged by the area owner label Jan 8, 2020
@jkotas jkotas closed this as completed in 8e91564 Feb 17, 2020
@adamsitnik adamsitnik added the tenet-performance Performance related issue label Aug 19, 2020
@adamsitnik adamsitnik modified the milestones: Future, 5.0.0 Aug 19, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 11, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-PAL-coreclr tenet-performance Performance related issue
Projects
None yet
Development

No branches or pull requests

6 participants