Add Benchmark CLI for format comparison and perf tracking #329

Merged Oct 16, 2023 (15 commits)

Conversation

nirosys
Contributor

@nirosys nirosys commented Sep 6, 2023

Issue #, if available:

Description of changes:

Changes to the Build

Google Test

This PR changes the googletest usage a little. This was primarily to support google-benchmark's googletest dependency, but resulted in further adjustments to supply users with the ability to:

  • Bring their own googletest version without colliding with ion-c's version. Import googletest in whatever way is needed so that the gtest_main target is available, and the ion-c build will disable its own import.
  • Disable the ion-c unit tests. Turn the IONC_BUILD_TEST flag off to stop the ion-c build from building the ion-c unit tests.

Build Type: Profiling

This PR's primary purpose is to add a new CLI for benchmarking, to support performance baselining, analysis, and quantifying improvements. Generating flame graphs and using profilers effectively requires builds that are optimized but still carry enough debug information, so a new build type was added to do just that. Rather than building with -DCMAKE_BUILD_TYPE=Release, a user can now use -DCMAKE_BUILD_TYPE=Profiling, and cmake will produce a build that has both debug information and optimization passes.

Addition of IonCBench

The main addition in this PR is the ion-c benchmark CLI. To build this tool, set the CMake variable IONC_BENCHMARKING_ENABLED to ON. This is done by default when using the Profiling build type. The tool is intended to provide a feature set similar to the other benchmark CLIs for ion-java and ion-python.

Currently, the tool allows the user to provide their own data and perform both full deserialization and full serialization of that data using ion-c (text or binary), MsgPack (MsgPack-C), JSON (yyjson and json-c), and CBOR (libcbor).

This PR is the first iteration of the ion-bench tool. It provides functionality for measuring CPU time, as well as other CPU metrics such as instruction counts (as long as it is built with libpfm). The tool currently supports a single benchmark run per invocation. Each benchmark run requires an input dataset, a supported implementation, and the benchmark to run. Currently the tool supports two benchmarks:

  • deserialize_all - Where all data is read, and materialized, from the input dataset.
  • serialize_all - Where a flattened representation of the data is read from the input dataset, and re-written using the same format. The timing information is only for the write of the dataset.

In both benchmarks timing does not include the IO to get the input dataset into memory.
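
As a rough illustration of that last point, the sketch below loads a dataset into memory first and starts the clock only around the in-memory pass. It is not IonCBench's code: load_dataset, consume_all, TimedResult, and time_deserialize are hypothetical names, consume_all is only a stand-in for a real format reader, and std::chrono wall-clock time is used here, whereas the tool itself reports timings through google-benchmark.

```cpp
#include <chrono>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Load the whole dataset into memory. This runs before the clock starts,
// so file IO is never part of the measurement.
std::vector<std::uint8_t> load_dataset(const std::string &path) {
    std::ifstream in(path, std::ios::binary);
    return std::vector<std::uint8_t>(std::istreambuf_iterator<char>(in),
                                     std::istreambuf_iterator<char>());
}

// Hypothetical stand-in for a format reader: it "materializes" the data by
// touching every byte and folding it into a checksum.
std::uint64_t consume_all(const std::vector<std::uint8_t> &buf) {
    std::uint64_t sum = 0;
    for (std::uint8_t b : buf) sum += b;
    return sum;
}

struct TimedResult {
    std::uint64_t checksum;            // result of the pass, kept so the work is used
    std::chrono::nanoseconds elapsed;  // time spent in the in-memory pass only
};

// Time only the pass over the already-loaded buffer.
TimedResult time_deserialize(const std::vector<std::uint8_t> &buf) {
    auto start = std::chrono::steady_clock::now();
    std::uint64_t checksum = consume_all(buf);
    auto stop = std::chrono::steady_clock::now();
    return {checksum,
            std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)};
}
```

A caller would run load_dataset once per dataset and then pass the buffer to time_deserialize, so repeated measurements never pay for the read from disk.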

The expectation is that more benchmarks and more format implementations will be added, in order to compare runtime, memory usage, and data size between Ion and other data formats.

Usage

The tool's --help option lists all of the arguments that can be used:

# tools/ion-bench/src/IonCBench --help
Usage: tools/ion-bench/src/IonCBench
  --help                    Display this help and exit.
  -L, --list-libs           List available libraries to benchmark.
  -B, --list-bench          List available benchmarks.
  -n, --name=<string>       Name to use for the run in reporting
  -b, --benchmark=<string>  Benchmark to run. (read or write)
  -d, --dataset=FILE        Add a dataset to run benchmark with.
  -l, --library=<string>    Library to use (use -L to see a list of supported libraries)
  --no-stats                Do not generate benchmarks stats. (Used primarily for profiling)
  -p, --pretty-print        Pretty print text output

By default, output will be presented in a tabular format.

# tools/ion-bench/src/IonCBench -b deserialize_all -d ../../tools/ion-bench/data/service_log_legacy/service_log_legacy.10n -l ion-c-binary -n "Ion Binary"
Benchmark: deserialize_all
2023-10-12T21:44:24+00:00
Running tools/ion-bench/src/IonCBench
Run on (6 X 2592.01 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x6)
  L1 Instruction 32 KiB (x6)
  L2 Unified 256 KiB (x6)
  L3 Unified 12288 KiB (x1)
Load Average: 0.11, 0.35, 0.26
---------------------------------------------------------------------------------
Benchmark                       Time             CPU   Iterations UserCounters...
---------------------------------------------------------------------------------
service_log_legacy.10n 1071590601 ns   1071241019 ns            1 Bps=20.7825M/s bools=0 nulls=0 nums=2.73713M objs=880.411k strs=69.818k
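
One way to read the Bps counter, assuming it is a google-benchmark rate counter (the raw byte count divided by the measured CPU time): multiplying it back by the CPU time gives the amount of data processed. The helper name implied_bytes is hypothetical, and the numbers in the usage note are taken from the run above.

```cpp
// Assumption: a rate counter is reported as count / elapsed seconds,
// so the underlying count is rate * elapsed seconds.
double implied_bytes(double bytes_per_second, double cpu_time_ns) {
    const double seconds = cpu_time_ns / 1e9;  // the table reports nanoseconds
    return bytes_per_second * seconds;
}
```

For the run above, implied_bytes(20.7825e6, 1071241019.0) comes out at roughly 22.26 million bytes, which under the stated assumption would be the size of the input dataset.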

Since the tool leans very heavily on google-benchmark, all google-benchmark arguments are also supported:

# tools/ion-bench/src/IonCBench -b deserialize_all -d ../../tools/ion-bench/data/service_log_legacy/service_log_legacy.10n -l ion-c-binary -n "Ion Binary" --benchmark_format=json
Benchmark: deserialize_all
{
  "context": {
    "date": "2023-10-12T21:51:55+00:00",
    "host_name": "663ce7a9e233",
    "executable": "tools/ion-bench/src/IonCBench",
    "num_cpus": 6,
    "mhz_per_cpu": 2592,
    "cpu_scaling_enabled": false,
    "caches": [
      {
        "type": "Data",
        "level": 1,
        "size": 32768,
        "num_sharing": 1
      },
      {
        "type": "Instruction",
        "level": 1,
        "size": 32768,
        "num_sharing": 1
      },
      {
        "type": "Unified",
        "level": 2,
        "size": 262144,
        "num_sharing": 1
      },
      {
        "type": "Unified",
        "level": 3,
        "size": 12582912,
        "num_sharing": 6
      }
    ],
    "load_avg": [0.198242,0.136719,0.174316],
    "library_build_type": "release"
  },
  "benchmarks": [
    {
      "name": "service_log_legacy.10n",
      "family_index": 0,
      "per_family_instance_index": 0,
      "run_name": "service_log_legacy.10n",
      "run_type": "iteration",
      "repetitions": 1,
      "repetition_index": 0,
      "threads": 1,
      "iterations": 1,
      "real_time": 1.0773163559999831e+09,
      "cpu_time": 1.0769988449999998e+09,
      "time_unit": "ns",
      "Bps": 2.0671366643851880e+07,
      "bools": 0.0000000000000000e+00,
      "nulls": 0.0000000000000000e+00,
      "nums": 2.7371320000000000e+06,
      "objs": 8.8041100000000000e+05,
      "strs": 6.9818000000000000e+04
    }
  ]
}

Changes Since Original Post

  • Updated GHA workflows to pull the last 50 commits with tags so version.h can be generated with the current implementation. (Thinking about following up with a change to the version.h generation so we don't have to do this, and risk losing the version if too many commits get added)
  • Updated Build and Test workflows to be more act friendly. This meant re-working the matrices so that they can be simple values rather than objects. Matrix values are only matched on the top level, so running in act would mean providing the full object each run, which can be error prone. I standardized the builds on two keys, image and toolchain. image defines the container, or VM Image (runs-on), that the job uses, and toolchain defines whether to use gcc, or clang (currently).
  • Fixed a memleak that was reported by the ubuntu/clang build that could occur when closing or resetting a reader.
  • Fixed an uninitialized data issue in one of the ion_decimal unit tests.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@nirosys
Contributor Author

nirosys commented Sep 7, 2023

Build checks failed due to a couple of issues. One is that gtest isn't linking, which doesn't reproduce on my system. The other is cmake acting like it hasn't known what C++17 is since 3.8. Continuing to dig.

@nirosys
Contributor Author

nirosys commented Sep 28, 2023

The unit test crash that is occurring with the amazonlinux:1 gcc72 build isn't reliably reproducible locally. The most recent commit should address the memleak that was identified by the ubuntu clang build, but I'm not sure if it is related. Will continue digging if not.

@nirosys
Contributor Author

nirosys commented Sep 29, 2023

Ok. I've reproduced the crash that GHA is seeing, and it's pretty awesome. /s

All of my attempts to reproduce the issue failed. I combed through the package versions, and made sure the docker digest SHA reported by GitHub matched the image I was using locally. Everything lined up, but for some reason the issue would not reproduce. That changed when I piped the output of act into tee, so I could grep through the output to make sure there weren't any pointers in the haystack of warnings ion-c produces. As soon as the job ran with this new pipe, it crashed with the same error GHA produces.

I was also able to reproduce the issue by simply redirecting the output of the unit tests into a file, which made it possible to easily run it under gdb. I had known from the output of the GHA crash that the issue was triggering in the test_ion_decimal.cpp tests, specifically WriteAllValues, but I could see no reason for an issue there. Debugging without the redirect showed the values within the function to be sane, and no error occurred during the free.

With the redirect, I'm guessing some things have shifted on the stack, and the uninitialized ion_decimal defined within the WriteAllValues test ends up lining up with data that makes the ION_DECIMAL's type field contain the value for ION_DECIMAL_TYPE_NUMBER. This triggers the only code path that tries to free the decimal's value.num_value buffer, which also holds random stack data and cannot be free'd.
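
The failure mode described above can be sketched with a small tagged union. The types below are hypothetical stand-ins, not the real ION_DECIMAL definition or ion-c's free routine: Decimal, DecimalType, and decimal_free only mirror the shape of the problem, where a type tag decides whether a pointer field gets freed.

```cpp
#include <cstdlib>

// Hypothetical stand-in for a decimal whose type tag selects the active field.
enum DecimalType {
    DECIMAL_TYPE_UNKNOWN = 0,
    DECIMAL_TYPE_QUAD = 1,
    DECIMAL_TYPE_NUMBER = 2
};

struct Decimal {
    DecimalType type;
    union {
        double quad_value;  // inline storage, nothing to free
        char *num_value;    // heap buffer, owned only when type is NUMBER
    } value;
};

// Frees the heap buffer only on the NUMBER path. Returns true if it freed.
// With an uninitialized Decimal, both the tag and the pointer are whatever
// happened to be on the stack, so this branch may run on a garbage pointer.
bool decimal_free(Decimal *d) {
    bool freed = false;
    if (d->type == DECIMAL_TYPE_NUMBER) {
        std::free(d->value.num_value);
        d->value.num_value = nullptr;
        freed = true;
    }
    d->type = DECIMAL_TYPE_UNKNOWN;
    return freed;
}
```

Declaring the value as `Decimal d{};` zero-initializes the tag, so decimal_free takes the safe path. A plain `Decimal d;` leaves both fields indeterminate, which is why the crash can depend on unrelated changes to the stack layout such as an output redirect.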

@nirosys
Contributor Author

nirosys commented Oct 11, 2023

Putting this into review. There may be an issue with ion-test-drivers, I'm still looking into that, but this PR should be good to start moving forward.

@nirosys nirosys marked this pull request as ready for review October 11, 2023 16:34
@nirosys nirosys requested a review from tgregg October 11, 2023 18:18
tgregg
tgregg previously approved these changes Oct 11, 2023
Contributor

@tgregg tgregg left a comment

Nice! Could you add some example output from the tool to the PR description?

test/CMakeLists.txt (outdated, resolved)
@@ -1,7 +1,21 @@
[submodule "ion-tests"]
path = ion-tests
url = https://github.com/amazon-ion/ion-tests.git
[submodule "googletest"]
path = test/googletest
[submodule "tools/ion-bench/deps/libcbor"]
Contributor

Is there something we can do to make it easy for people to avoid pulling down these dependencies unless they plan to build the benchmark CLI?

Contributor Author

Yea, not sure what we can do here. I might be able to wire it into the cmake config. I'll dig into that, and see what options we have.

tools/ion-bench/src/memory.c (outdated, resolved)
Comment on lines +101 to +107
ion_reader_get_annotation_count(_reader, &count);
ION_STRING *syms = new ION_STRING[count];
ion_reader_get_annotations(_reader, syms, count, &count);
for (int i=0; i < count; i++) {
    annot.push_back(std::string((char *)syms[i].value, syms[i].length));
}
delete[] syms;
Contributor

Should we use ion_reader_get_an_annotation in the loop and avoid copying into a temporary array?

Contributor Author

Good call. I'll follow up with a PR to change that.
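
The shape of the suggested change might look like the sketch below. It deliberately does not call ion-c: FakeReader, StringView, get_annotation_count, and get_an_annotation are hypothetical stand-ins that only model a reader offering per-index access, which is the role the reviewer proposes for ion_reader_get_an_annotation. The real function's signature and error handling are not shown in this thread and are not reproduced here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a pointer-plus-length string handed out by a reader.
struct StringView {
    const char *value;
    std::size_t length;
};

// Hypothetical stand-in for a reader positioned on an annotated value.
struct FakeReader {
    std::vector<std::string> annotations;
};

std::size_t get_annotation_count(const FakeReader &r) {
    return r.annotations.size();
}

// Per-index accessor: returns one annotation without filling a caller array.
StringView get_an_annotation(const FakeReader &r, std::size_t idx) {
    const std::string &s = r.annotations[idx];
    return {s.data(), s.size()};
}

// Copy each annotation straight into the result, with no temporary
// new[]/delete[] array in between.
std::vector<std::string> collect_annotations(const FakeReader &r) {
    std::vector<std::string> annot;
    const std::size_t count = get_annotation_count(r);
    annot.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        StringView sym = get_an_annotation(r, i);
        annot.emplace_back(sym.value, sym.length);
    }
    return annot;
}
```

Besides dropping one allocation per annotated value, this form has no raw new[] to leak if a later step exits early.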

Comment on lines +229 to +231
default:
    printf("Attempt to write unknown type: %d\n", val.tpe);
    break;
Contributor

Are we missing support for null.null, blob, and clob?

Contributor Author

Agh, yea. I'll include those in the follow-up PR.

break;
}
case YYJSON_TYPE_BOOL:
stats.num_bools++;
Contributor

Should we retrieve the value of the boolean?

Contributor Author

Yea, I need to wire up a dependency on the bool so it doesn't get optimized out. For yyjson specifically getting a bool is a memory access, and a couple bit operations, so there aren't any side effects to keep the call around. I'll address it in the follow-up.
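
A minimal sketch of what "wiring up a dependency" could mean, under the assumption that folding each value into a result and storing it through a volatile is enough to keep the reads alive. Stats, g_sink, and count_bools are hypothetical names, and std::vector<bool> stands in for the parsed yyjson values. google-benchmark also provides benchmark::DoNotOptimize for this purpose, which is not used here so the example stays dependency-free.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Stats {
    std::size_t num_bools = 0;
    std::uint64_t bool_checksum = 0;  // depends on every value that was read
};

// A store to a volatile object is an observable side effect, so the compiler
// cannot discard the computation that feeds it.
volatile std::uint64_t g_sink = 0;

Stats count_bools(const std::vector<bool> &values) {
    Stats stats;
    for (bool v : values) {
        stats.num_bools++;
        stats.bool_checksum += v ? 1 : 0;  // actually consume the value
    }
    g_sink = stats.bool_checksum;
    return stats;
}
```

Counting alone never needs the boolean's value, so an optimizer is free to skip the read entirely. Once the checksum depends on each value and escapes through the volatile store, the read has to happen.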

tools/ion-bench/src/main.cpp (outdated, resolved)
@nirosys
Contributor Author

nirosys commented Oct 12, 2023

Thank you! I'm going to push up a new commit with the commented code and typos fixed, and output added. Then I'll get this merged. Then PR the regression workflow, and follow up with the code changes discussed above. Unless anyone has an argument against that.

@nirosys nirosys merged commit cf747d3 into amazon-ion:master Oct 16, 2023
11 of 13 checks passed