proposal: testing/cmp: add new package #45200
Comments
I wonder if the package path I generally agree with this proposal; the Also, have you thought about size? In terms of API and packages (there's also cmpopts), but also when weighing the size of Go in its source and binary forms. A more powerful |
As a big fan of the cmp package, this seems like a good idea to me. The API is stable and well tested. One thought: with generics only a year or so away, I wonder if we should hold on until we can make the signatures generic:
Although the implementation would still be using reflection under the hood, using a type parameter The current behaviour would still be available by using I'd also support folding the cmpopts package into the root |
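As a rough illustration of the type-parameter idea, here is a minimal sketch assuming a hypothetical wrapper named equalT built on reflect.DeepEqual. The name and body are assumptions for illustration only and are not go-cmp's API. It requires Go 1.18 or later.

```go
package main

import (
	"fmt"
	"reflect"
)

// equalT is a hypothetical generic wrapper, not part of go-cmp or the
// standard library. The type parameter forces both operands to share one
// static type; the comparison itself still uses reflection.
func equalT[T any](x, y T) bool {
	return reflect.DeepEqual(x, y)
}

func main() {
	fmt.Println(equalT([]int{1, 2}, []int{1, 2})) // true
	fmt.Println(equalT[any](int32(1), int64(1)))  // false: dynamic types differ

	// equalT(int32(1), int64(1)) would not compile, because T cannot be
	// inferred as both int32 and int64.
}
```

Instantiating with any recovers the untyped behaviour, at the cost of losing the compile-time check.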
When |
Nope. Thanks for raising up the thought. There's some files and functionality that can be removed I believe:
|
Interesting thought. I should also note that generics would help with:
Yea, that's been one of my longer term goals. The main reason I haven't done so yet is because adding them in causes the godoc page to be less readable, which is another motivating reason for #44447. |
I'd be happy for github.com/pkg/diff to morph into a useful diff package for std. I'd rather it be in golang.org/x, so that it can be used by all and sundry. (I'm frustrated by various gems locked up in std internal.) I cannot dedicate much time to it in the near future, but would be happy to be involved in API design (and contribute the existing code as a starting point). For that, it's probably time for someone (not me) to file a proposal where we can start discussing what we want out of it. |
@josharian I've been doing various diff investigations off and on for a few years (sic) and had some time last week to hook a bunch of stuff together. I have rather a lot of thoughts, and an implementation of the O(N)-space Myers algorithm, among others. I completely agree that we should have a standard diff package separate from testing/cmp. I need to take care of a few higher priority things but I intend to write more next month. I've retitled #23113 to make that clear. |
@dsnet My main concern with adding testing/cmp is that it is a very large amount of code and API, especially compared to package testing itself. Just the sheer number of internal packages you listed in #45200 (comment) scares me. Is it possible to take this opportunity to simplify at all? |
@rsc works for me. I've been thinking about diffs on and off too. :) I look forward to chatting about it when you are ready. |
@rsc, my main point from #45200 (comment) was that most of those internal packages can be eliminated since the needed functionality has either been added to the standard library already (e.g., |
@rsc. If it's any consolation, the logic that's concerned with the semantics of whether two values are equal is around 900 LOC. The remaining complexity comes from the logic to pretty print the difference, which accounts for around 1800 LOC. Most of that complexity is because the reporter is a relatively large set of heuristics for how to present the difference in a way that is easiest for humans to interpret. The reporter logic was fine-tuned over the past years based on user feedback and has been fairly stable for the past year or so. |
I'd also like to mention another possibility: the cmp package has excellent functionality for deep-printing I wonder if it might be a good idea to consider including some deep-printing functionality in the standard library. It would probably be best if it was testing-oriented: like cmp, such a package would probably Aside: like @dsnet, I also think that cmp would most happily live at |
@rogpeppe I believe sanity-io/litter can do that, if I'm not mistaken.
|
I brought this up on the mailing list at one point: https://groups.google.com/g/golang-nuts/c/Tn0QeDv6fU8/m/ukcrSF6BCwAJ Deep printing was seen as requiring possibly too much customization, leading to a complex API. And a general mechanism to traverse data structures could live outside the standard library. Edit: related #28141 |
The current version of github.com/pkg/diff.Diff has quadratic behaviour. This means that when we attempt a diff between relatively modest sized files, it's easy to find yourself out of memory. Therefore, when we see a diff between two large files (large defined in terms of number of lines), tersely report that as if the two were binary files, i.e. do not try to render the diff. When github.com/pkg/diff or similar supports a linear algorithm: golang/go#45200 (comment) we can revert this change.
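A minimal sketch of the guard described above, using only the standard library. The threshold value and the names maxDiffLines, countLines, tooLargeToDiff, and summarize are hypothetical; the original change does not state the limit or the exact output format it used.

```go
package main

import (
	"bytes"
	"fmt"
)

// maxDiffLines is a hypothetical cutoff chosen for illustration.
const maxDiffLines = 5000

// countLines returns the number of lines in b, counting a final line
// that lacks a trailing newline.
func countLines(b []byte) int {
	n := bytes.Count(b, []byte("\n"))
	if len(b) > 0 && !bytes.HasSuffix(b, []byte("\n")) {
		n++
	}
	return n
}

// tooLargeToDiff reports whether either input exceeds the line limit,
// in which case a quadratic diff could exhaust memory.
func tooLargeToDiff(a, b []byte, limit int) bool {
	return countLines(a) > limit || countLines(b) > limit
}

// summarize returns a terse, binary-style report and true when the inputs
// are too large to diff. Otherwise it returns "", false and the caller
// should render a full diff.
func summarize(nameA, nameB string, a, b []byte) (string, bool) {
	if !tooLargeToDiff(a, b, maxDiffLines) {
		return "", false
	}
	if bytes.Equal(a, b) {
		return fmt.Sprintf("Files %s and %s are identical", nameA, nameB), true
	}
	return fmt.Sprintf("Files %s and %s differ", nameA, nameB), true
}

func main() {
	big := bytes.Repeat([]byte("line\n"), maxDiffLines+1)
	msg, skipped := summarize("a.txt", "b.txt", big, []byte("other\n"))
	fmt.Println(msg, skipped)
}
```

Checking the line count before calling the diff routine keeps the memory cost bounded regardless of which diff algorithm sits behind it.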
This proposal has been added to the active column of the proposals project |
I use cmp in many of my projects, and would love to see its capabilities become part of the stdlib. But the current package API is unnecessarily complex and esoteric; IMO it would need to be substantially simplified (read: reduced) before it could be realistically considered for inclusion. |
I'm interested in what aspects of the API you see as possibilities for simplification. Personally I've seen it as an example of a nice composable API and can't see how it could be simplified much without removing a lot of its usefulness. As I said above, generics could make the API more obvious. Another example,
|
@leighmcculloch, what version of |
FWIW, as it currently stands, the Option interface is opaque. So any 3rd party implementations must be implemented based on the Options that are already there. Personally, I think this is a bit unfortunate, but I'm not sure how to fix it. In any case, one obvious answer to "the API surface is too big" would be to make it possible/easy to factor out the lesser used options into Personally I'm not convinced |
Oh, also, FWIW: When I used go-cmp in the past, but needed more than |
In addition to the other issues, I'd note: I frequently want to compare slices such that I consider a nil slice to be equivalent to a non-nil slice of len 0, and reflect.DeepEqual has bitten me on that several times, and worse, done so such that a naive printf of the slices (using Writing a function that compares two things and tells me whether or how they differ has usually been a great return on time spent in improving test quality and outputs, but I've written that function for []int far more often than I should. |
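The nil-versus-empty pitfall can be reproduced with the standard library alone. The sketch below assumes a hypothetical helper named equalIntSlices, which stands in for the kind of hand-written []int comparison mentioned above. In go-cmp, the cmpopts.EquateEmpty option covers the same case.

```go
package main

import (
	"fmt"
	"reflect"
)

// equalIntSlices is a hypothetical helper that treats a nil slice and an
// empty non-nil slice as equal, unlike reflect.DeepEqual.
func equalIntSlices(a, b []int) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func main() {
	var nilSlice []int
	emptySlice := []int{}

	fmt.Println(reflect.DeepEqual(nilSlice, emptySlice)) // false
	fmt.Printf("%v %v\n", nilSlice, emptySlice)          // [] []
	fmt.Printf("%#v %#v\n", nilSlice, emptySlice)        // []int(nil) []int{}
	fmt.Println(equalIntSlices(nilSlice, emptySlice))    // true
}
```

The %v output is identical for both slices, which is why a failing DeepEqual check can be hard to diagnose from a naive printf; %#v exposes the difference.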
@Merovius. Yea, while @seebs, it's not clear to me whether your comment is in support or opposition towards the proposal. Since |
@rsc any update on this since #45200 (comment)? Asking in order to avoid duplication of effort with respect to pkg/diff#26. Thanks |
This proposal issue has been lingering, in large part because we don't really know how to move forward on it. go-cmp is widely used but also very complex. I agree with the comments above that if we were to move forward with something in the standard library, we'd want a significantly trimmed down version. I have read the go-cmp package docs a few times over the past few years and each time, I'm left feeling like I don't fully understand what the package does. Obviously the basic functionality is easy to understand, but all the complications are not. And then you also have to read the cmpopts package. At first glance it's not even obvious that cmpopts can be written in terms of cmp. I initially thought that cmpopts was using some internal tricks to return cmp.Options that cmp knew about but weren't exposed in cmp's API itself. Eventually I figured out that cmpopts is entirely layered on top of cmp.Transformer, cmp.FilterValues, and cmp.Comparer, but it's weird that you have to reach for a different package to get the "simple" helpers. Of course, if it was all one package that's even more API to digest. I believe I met with Joe or Damien or both at one point long ago to discuss potential ways to make cmp fit better into the standard library, but my memory is that there were interface reasons that make it essentially impossible to change any details of cmp without breaking existing uses. I forget the exact details, and maybe I am misremembering. There may well be some reduced form of google/go-cmp that should become testing/cmp. I don't think an unmodified google/go-cmp is that form. And I don't see any clear path forward as far as what exactly to remove. It seems like maybe we should move this proposal toward a decline, which would give clarity to the issue and perhaps open a space for other proposals of simpler APIs that might better fit the standard library. Of course, google/go-cmp will remain for anyone who wants it, same as always. |
Based on the discussion above, this proposal seems like a likely decline. |
No change in consensus, so declined. |
TL;DR, I propose adding github.com/google/go-cmp/cmp to the standard library as testing/cmp.
Determining equality of two values is one of the most common operations needed for a unit test where the test wants to know whether some computed value matches some pre-determined expected value.
Using the Go language and standard library alone, there are two primary options:
- the == operator from the language itself, and
- the reflect.DeepEqual function.
For simple cases, these work fine, but are insufficient for more complex cases:
- The == operator only works on comparable types, which means that it doesn't work for Go slices and maps, two common kinds that tests want to compare.
- The reflect.DeepEqual function is a "recursive relaxation of Go's == operator", but has several shortcomings:
For many users, the standard library is insufficient, so they turn to third-party packages to perform equality comparisons. According to the module proxy, the most common comparison module used is github.com/google/go-cmp with 7264 module dependents (and 25th most imported module). I propose including cmp in the standard library itself.
Why include cmp in the standard library?
The most widely used comparison function in Go is currently reflect.DeepEqual (with ~34k usages of reflect.DeepEqual compared to ~6k usages of cmp.Equal). Inclusion of cmp in the standard library would provide better visibility to it and allow more tests to be written that would have been better off using cmp.Equal rather than reflect.DeepEqual.
A problem with reflect.DeepEqual is that it hampers module authors from making changes to their types that otherwise should be safe to perform. Since reflect.DeepEqual blindly compares unexported fields, it causes a test to have an implicit dependency on the value of unexported fields in types that come from a module's dependency. When the author of one of those types changes an unexported field, it surprisingly breaks many brittle tests. One example of this was when Go 1.9 added monotonic timestamp support and the change to the internal representation of time.Time broke hundreds of tests at Google. Similar problems occurred with adding or modifying unexported fields in protocol buffer messages. reflect.DeepEqual is a comparison function that looks like it works well, but may cause the test to be brittle. Furthermore, reflect.DeepEqual does not tell you why two values are different, making it even more challenging for users to diagnose such brittle tests.
As a contributor to the standard library, there are a number of times that I would have liked to use cmp when writing tests in the standard library itself.
How is cmp.Equal similar to or different from reflect.DeepEqual?
cmp.Equal is designed to be more expressive than reflect.DeepEqual. It accepts any number of user-specified options that may affect how the comparison is performed. While reflect.DeepEqual is more performant, cmp.Equal is more flexible. Also, cmp.Diff provides a human-readable report for why two values differ, rather than simply providing a boolean result for whether they differ.
One significant difference from reflect.DeepEqual is that cmp.Equal panics by default when trying to compare unexported fields unless the user explicitly permits it with a cmp.Exporter option. This design decision was made to avoid the problems observed with using reflect.DeepEqual at scale mentioned earlier.
Without any options specified, cmp.Equal is identical to reflect.DeepEqual except:
- cmp.Equal panics when trying to compare unexported fields
- cmp.Equal uses a type's Equal method if it has one
- cmp.Equal has an arguably more correct comparison for graphs. reflect.DeepEqual's cycle detection primarily aims to avoid infinite recursion, but does not necessarily verify whether the two graphs have the same topology, while cmp.Equal checks that the graph topologies are the same.
(Fun fact) Package reflect is the 15th most imported Go package, but ~77% of the imports (for test files) only do so to use DeepEqual. In 2011, Rob Pike explained that "[reflection] is a powerful tool that should be used with care and avoided unless strictly necessary." I find it ironic that most usages of reflect are to access a function that is arguably not even about Go reflection (i.e., it doesn't provide functionality that mirrors the Go language itself).
How does cmp compare to other comparison packages?
The only other comparison module (within the top 250 most widely used modules) is github.com/go-test/deep with 845 dependents (compared to 7264 dependents for cmp). Package deep is not as flexible as cmp and relies on globals to configure how comparisons are performed.
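The time.Time brittleness mentioned in the proposal can be reproduced with the standard library alone. The sketch below assumes a hypothetical helper named compareTimes. It builds two values for the same instant, one carrying a monotonic clock reading and one without, and compares them with both Time.Equal and reflect.DeepEqual.

```go
package main

import (
	"fmt"
	"reflect"
	"time"
)

// compareTimes is a hypothetical helper. It builds two values for the same
// instant, one with and one without a monotonic clock reading, and
// compares them both ways.
func compareTimes() (equalMethod, deepEqual bool) {
	withMono := time.Now()           // carries a monotonic clock reading
	withoutMono := withMono.Round(0) // Round(0) strips the monotonic reading
	return withMono.Equal(withoutMono), reflect.DeepEqual(withMono, withoutMono)
}

func main() {
	equalMethod, deepEqual := compareTimes()
	fmt.Println("Time.Equal:", equalMethod)      // true
	fmt.Println("reflect.DeepEqual:", deepEqual) // false
}
```

The two values differ only in unexported fields of time.Time, so a test that relies on reflect.DeepEqual fails even though both values denote the same instant.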