MemoryDiagnoser with Parallel.ForEach #1142

Closed
alimozdemir opened this issue May 1, 2019 · 1 comment

Comments

@alimozdemir

Hey everyone, I have run into an issue, and I don't know whether it is a bug or whether I am doing something wrong. I am trying to benchmark parallel file reading with Parallel.ForEach and File.ReadLines("path"), but the memory part of the benchmark result (Allocated, GCs) looks very weird. I have prepared a demo project.

        [Benchmark]
        public void ParallelTest()
        {
            var counter = new ConcurrentDictionary<string, float> ();

            Parallel.ForEach(File.ReadLines("data.csv"), (line) => {
                ProcessLineParallel(counter, line);
            });

            if (counter.Count != 1000)
                throw new Exception("Not expected count of persons.");
        }

        [Benchmark]
        public void NormalTest()
        {
            var counter = new Dictionary<string, float> ();
            var allLines = File.ReadAllLines("data.csv");
            foreach (var line in allLines)
            {
                ProcessLine(counter, line);
            }

            if (counter.Count != 1000)
                throw new Exception("Not expected count of persons.");
        }

https://github.com/lyzerk/parallelFileReadingBenchmark/blob/adc7276ff40310489e8a01fb2773981d7b42f9f5/FileReaderTester.cs#L13-L38

ProcessLine and ProcessLineParallel are almost the same. The only differences are the use of ConcurrentDictionary and a couple of lines.

And the result is:

BenchmarkDotNet=v0.11.5, OS=macOS Mojave 10.14.4 (18E226) [Darwin 18.5.0]
Intel Core i5-7360U CPU 2.30GHz (Kaby Lake), 1 CPU, 4 logical and 2 physical cores
.NET Core SDK=3.0.100-preview3-010431
  [Host]     : .NET Core 3.0.0-preview3-27503-5 (CoreCLR 4.6.27422.72, CoreFX 4.7.19.12807), 64bit RyuJIT
  DefaultJob : .NET Core 3.0.0-preview3-27503-5 (CoreCLR 4.6.27422.72, CoreFX 4.7.19.12807), 64bit RyuJIT

| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|
| ParallelTest | 1.116 ms | 0.0093 ms | 0.0087 ms | 378.9063 | 146.4844 | - | 107.03 KB |
| NormalTest | 1.028 ms | 0.0121 ms | 0.0113 ms | 287.1094 | 93.7500 | - | 563.03 KB |

The weird thing is the Allocated column. I haven't done any memory optimization, so why did it decrease so much? You could say this looks OK, but the results from my real project (with a bigger file) show the difference more clearly.

BenchmarkDotNet=v0.11.5, OS=macOS Mojave 10.14.4 (18E226) [Darwin 18.5.0]
Intel Core i5-7360U CPU 2.30GHz (Kaby Lake), 1 CPU, 4 logical and 2 physical cores
.NET Core SDK=3.0.100-preview3-010431
  [Host]     : .NET Core 3.0.0-preview3-27503-5 (CoreCLR 4.6.27422.72, CoreFX 4.7.19.12807), 64bit RyuJIT
  DefaultJob : .NET Core 3.0.0-preview3-27503-5 (CoreCLR 4.6.27422.72, CoreFX 4.7.19.12807), 64bit RyuJIT

| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|
| ReadAndSumSliceParallel | 3.070 s | 0.0609 s | 0.1257 s | 171000.0000 | 53000.0000 | 9000.0000 | 3.95 MB |
| ReadAndSumSliceSync | 4.881 s | 0.0358 s | 0.0335 s | 83000.0000 | 31000.0000 | 4000.0000 | 480.59 MB |

~4 MB against ~481 MB. Am I doing something wrong, is this a bug, or is this what we should expect?

@adamsitnik
Member

Hello @lyzerk

To make a long story short, .NET Core exposes only the System.GC.GetAllocatedBytesForCurrentThread() method and does not implement AppDomain.CurrentDomain.MonitoringTotalAllocatedMemorySize as the Full Framework does. Allocations made on other threads, such as the Parallel.ForEach worker threads, are therefore not counted, so we currently have no way to measure this properly for multithreaded benchmarks on .NET Core.
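To illustrate the per-thread limitation, here is a minimal sketch (not BenchmarkDotNet's actual measurement code). The class name `PerThreadAllocationDemo` and the method `Measure` are made up for this example; it assumes a runtime where `GC.GetAllocatedBytesForCurrentThread()` is publicly available. A worker thread allocates about 10 MB, and the calling thread's counter barely moves:

```csharp
using System;
using System.Threading;

// Hypothetical demo class; not part of BenchmarkDotNet.
public static class PerThreadAllocationDemo
{
    // Returns the allocation delta as seen by the calling thread
    // and as seen by the worker thread that does the real work.
    public static (long Caller, long Worker) Measure()
    {
        long workerBytes = 0;

        var worker = new Thread(() =>
        {
            long start = GC.GetAllocatedBytesForCurrentThread();
            var buffers = new byte[10][];
            for (int i = 0; i < buffers.Length; i++)
                buffers[i] = new byte[1024 * 1024]; // 1 MB each
            GC.KeepAlive(buffers);
            workerBytes = GC.GetAllocatedBytesForCurrentThread() - start;
        });

        long before = GC.GetAllocatedBytesForCurrentThread();
        worker.Start();
        worker.Join(); // after Join, workerBytes is safe to read
        long callerBytes = GC.GetAllocatedBytesForCurrentThread() - before;

        return (callerBytes, workerBytes);
    }

    public static void Main()
    {
        var (caller, worker) = Measure();
        Console.WriteLine($"Calling thread saw: {caller} bytes");
        Console.WriteLine($"Worker thread saw:  {worker} bytes");
    }
}
```

The calling thread only sees whatever it allocated itself (the thread object and the delegate), which is the same effect that makes the Allocated column look too small for `Parallel.ForEach`.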

A workaround is to run the benchmark for the Full Framework, but that requires Windows, and I can see that you are running on macOS, so it won't help ;/

For now, the only thing I can recommend is to compare the number of GC collections in Gen 0/1/2 between the two multithreaded benchmarks.

We already have an issue for that: #723, so I am going to close this one as a duplicate.
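The collection counts are process-wide, which is why they stay meaningful when the work runs on several threads. As a rough sketch of reading them by hand (the class `GcCountDemo` and the helper `CountCollections` are hypothetical names for this example, and this is not how BenchmarkDotNet computes its Gen 0/1/2 columns), `GC.CollectionCount` can be sampled before and after the work:

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical demo class; not part of BenchmarkDotNet.
public static class GcCountDemo
{
    // Runs the action and returns how many Gen 0, Gen 1 and Gen 2
    // collections occurred in the whole process while it ran.
    public static int[] CountCollections(Action action)
    {
        int gen0 = GC.CollectionCount(0);
        int gen1 = GC.CollectionCount(1);
        int gen2 = GC.CollectionCount(2);

        action();

        return new[]
        {
            GC.CollectionCount(0) - gen0,
            GC.CollectionCount(1) - gen1,
            GC.CollectionCount(2) - gen2,
        };
    }

    public static void Main()
    {
        // Allocate on thread-pool threads, as Parallel.ForEach would.
        int[] counts = CountCollections(() =>
            Parallel.For(0, 10000, i =>
            {
                var buffer = new byte[64 * 1024];
                GC.KeepAlive(buffer);
            }));

        Console.WriteLine($"Gen 0: {counts[0]}, Gen 1: {counts[1]}, Gen 2: {counts[2]}");
    }
}
```

The counts depend on the machine and GC settings, so they are only useful for comparing two runs under the same conditions.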
