Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C20 concepts are structural: What why and how to change it? #9809

Open
guevara opened this issue Sep 19, 2023 · 0 comments
Open

C20 concepts are structural: What why and how to change it? #9809

guevara opened this issue Sep 19, 2023 · 0 comments

Comments

@guevara
Copy link
Owner

guevara commented Sep 19, 2023

C++20 concepts are structural: What, why, and how to change it?



https://ift.tt/wQMcCL2






C++20 concepts are structural: What, why, and how to change it?

C++20 added concepts as a language feature. They’re often compared to Haskell’s type classes, Rust’s traits or Swift’s protocols.

Yet there is one feature that sets them apart: types model C++ concepts automatically. In Haskell, you need an instance, in Rust, you need an impl, and in Swift, you need an extension. But in C++? In C++, concepts are just fancy boolean predicates that check for well-formed syntax: every type that makes the syntax well-formed passes the predicate and thus models the concepts.

This was the correct choice, but is sometimes not what you want. Let’s explore it further.

Nominal vs. structural concepts

To co-opt terms from type systems, C++20 concepts use structural typing: a type models the concept if it has the same structure as the one required by the concept, i.e. it the has required expressions. On the contrast, type classes, traits and protocols all use nominal typing: a type models the concept only if the user has written a declaration to indicate it.

For example, consider a C++ concept that checks for operator== and operator!=:

template <typename T> concept equality_comparable = requires (T obj) {  { obj == obj } -> std::same_as<bool>;  { obj != obj } -> std::same_as<bool>; }; 

This is how you write a type that models equality_comparable with C++20’s structural concepts:

// Define your type, struct vec2 {  float x, y;   // define the required operators,  friend bool operator==(vec2 lhs, vec2 rhs)  {  return lhs.x == rhs.x && lhs.y == rhs.y;  }   // operator!= not needed in C++20 due to operator rewrite rules! };  // ... and that's it! static_assert(equality_comparable<vec2>);

In contrast, this is how you would write a type that models equality_comparable in a hypothetical C++20 with nominal concepts:

// Define your type struct vec2 {  }; // as before  // ... and tell the compiler that it should be `equality_comparable`. // Most languages also support a way to define the operation here. concept equality_comparable for vec2; 

Nominal is better…

In my opinion, nominal concepts are superior to structural concepts:

  1. Structural concepts do not allow for semantic differences between concepts, because that is not part of the “structure”.

    Consider the standard library concept std::relation; it is true for predicate types R that describe a binary relation between types T and U:

    template <typename F, typename ... Args> concept predicate  = /* F can be invoked with Args returning bool */;  template <typename R, typename T, typename U> concept relation = predicate<R, T, T> && predicate<R, U, U>  && predicate<R, T, U> && predicate<R, U, T>; 

    Binary relations are broad mathematical terms, so often you want a relation with specific properties. For example, std::ranges::sort takes a function that controls the sort, which must be a special relation: a strict weak order. Luckily, there is a standard library concept std::strict_weak_order:

    template <typename R, typename T, typename U> concept strict_weak_order = relation<R, T, U>; 

    However, it is just std::relation! Whether you use requires std::strict_weak_order<R, foo, bar> or requires std::relation<R, foo, bar> makes as much difference as calling your template parameters RandomAccessIterator. It’s just a fancy comment; the compiler doesn’t care.

    Semantic differences that cannot be expressed in the C++ type system can’t be expressed with structural concepts either. With nominal concepts, a function object would need to explicitly opt-in to strict_weak_order, which allows differentiating between the two.

  2. With structural concepts, names of functions are really important (ironic, I know). If you write code that interacts with the standard library (or other libraries using concepts) in any way, you need to follow the same naming convention. Names like size or begin or iterator are essentially globally reserved and must mean the thing the standard library concepts intend.

    class TShirt { public:  enum Size  {  small,  medium,  large  };   // The size of the T-Shirt.  Size size() const;   // The text on the front of the T-Shirt.  const std::string& front() const;  // The text on the back of the T-Shirt.  const std::string& back() const; }; 

    The TShirt class above might be mistaken for some sequence container like std::vector as it passes the syntactic checks of corresponding concepts. However, with nominal concepts it would need to explicitly opt-in; no type will model a nominal concept if the author did not intend it.

  3. On the flip side, if we have something that conceptually models a concept, but uses different names for the required methods, it does not work – as the name is what matters.

    Suppose vec2 from above didn’t overload operator== but instead provided a function bool is_equal():

    struct vec2 {  float x, y;   bool is_equal(vec2 rhs) const  {  return x == rhs.x && y == rhs.y;  } }; 

    Even though the type is equality comparable, it is not equality_comparable – names matter. With nominal concepts, the declaration that opts-in to a concept usually also provides a way to specify the actual implementation of the required functions. That way, you can easily adapt existing types to other interfaces:

    // Dear compiler, vec2 models equality_comparable and here's how: concept equality_comparable for vec2 {  bool operator==(vec2 lhs, vec2 rhs)  {  return lhs.is_equal(rhs);  } } 

    One can imagine that the names introduced there are scoped to the concept: They don’t add members to the type itself and are instead only available in generic code that wants equality_comparable types.

…but structural is what C++ needs

So if I believe that nominal concepts are better, why did I say in the introduction that structural concepts were the correct choice for C++? Because structural concepts have one big advantage: they’re convenient when faced with code written before concepts!

Just imagine if every function conceptified in C++20 requires you to explicitly opt-in to the concepts: you can’t use std::ranges::sort() until you’ve written dummy declarations for your containers, your iterators, your types, … It would be a migration nightmare! It’s much easier if the concept is modeled automatically.

Another advantage is library interoperability: if you have three libraries A, B, and C, where A has a concept, B has a type that models the concept, and C uses the two, C can just pass the type of B to functions expecting A’s concept without B having to depend on A or C. You can write types that adhere to concepts without pulling in the library that actually defines them, which is convenient when you want to avoid a big dependency yet still allow your code to seamlessly work with it.

This argument is sometimes phrased slightly differently: C can pass B’s type to A’s concept without B ever knowing about the existence of A’s concept. However, this does not happen: you don’t write a type that just so happens to model a concept that you’re unaware of. What does happen is that you write a type that models an existing concept, which just hasn’t been formalized in code yet.

Finally, sometimes a naming convention is just so universally accepted that nobody would ever dare and deviate from it – think operators. If you’re copy assignment doesn’t do a copy, or your move constructor doesn’t move, your type is bad. It thus makes total sense to have concepts like std::copyable be modeled automatically.

Note that all three advantages don’t apply to “new” languages, i.e. where concepts are part of it from the start:

  • A new language has no legacy code, so there is no migration cost to annotating every concept your type models.
  • A new language can provide a standard package manager, which makes it less necessary to avoid dependencies to model concepts.
  • Instead of having operator overloading and concepts that check for their existence, you can flip it on its head: Define a concept that provides the operator overloads; type that opt-in to the concept get the corresponding overloaded operator.

As such, the decision of Haskell, Rust, and Swift makes perfect sense.

However, when you invent completely novel concepts for a library or actually need to distinguish between different concepts based on semantics – and don’t just want “fancy comments”, you might want nominal concepts in C++.

So what do you do?

Nominal concepts in C++20

The problem of differentiating between concepts with identical interface but different semantics dates back to C++98 – iterators. An input iterator and a forward iterator have (almost?) the same interface, yet are not interchangeable: once you advance an input iterator it is gone and you’ll never get the old value back; with a forward iterator, you can copy it and retain the old value.

template <typename InputIterator> void handle_input(InputIterator begin, InputIterator end) {     auto a = *begin;   auto copy = begin;  ++begin;  auto b = *begin;      auto c = *copy;  assert(c == a); // ups, c is actually the same value as b! } 

Because copying an input iterator doesn’t really make sense as all input iterators share the same state, in C++20 the conceptified std::input_iterator does not require the existence of a copy constructor, and many input iterators of the standard library are move-only types.

This is one of my two favorite features of the new iterator model, besides sentinels.

So how can code distinguish between an input iterator and a forward iterator? Simple: we add some syntax that distinguishes them.

In the case of iterators, every iterator has an associated iterator_category typedef that explicitly states whether something is an input iterator (std::input_iterator_tag) or a forward iterator iterator (std::forward_iterator_tag). In fact, there are iterator categories for all iterator categories as C++98 wasn’t really great for detecting the interface of a type and doing overloading based on that…

However, the basic idea to distinguish semantic properties using tag types was kept for the new C++20 iterator concepts. The required typedef is now called iterator_concept for reasons, but it also looks for iterator_tag.

Technique #1: add extra syntax like a dummy typedef that distinguishes between otherwise identical concepts.

// concept definition ===// template <typename T> concept my_concept  = requires { typename T::my_concept_tag; }  && ;  //=== concept modelling ===// struct my_type_modelling_the_concept {  using my_concept_tag = void; // Doesn't matter. }; 

Another case is the distinction between std::range and std::view. A std::view is a std::range (something with begin/end) that is also moveable, but where move and copy operations (if provided) happen in constant time. So crucially, std::vector<T> is not a std::view: it has begin/end, is moveable (and even copyable) but copy operations are certainly not in O(1)! As such, std::vector<T> is not a std::view – which is again impossible to detect by a compiler because it has the same syntax.

So to model a std::view a type has to opt-in by specializing the variable template std::enable_view to set it to true:

namespace my_namespace {  class MyViewtype  {  public:  iterator begin() const;  iterator end() const;  }; }  namespace std {  // Tell the compiler that your view is a view.  template <>  constexpr bool enable_view<my_namespace::MyViewType> = true; } 

If you compare this with the equality_comparable nominal concept example from above, you’ll note that it basically looks the same! We formally fulfill the syntactic requirements for our type, and then write some extra declaration to indicate that we’d like to model the concept. It’s just purely implemented in the library, instead of the core language.

However, specialization of std things is annoying (close the current namespace, open namespace std, write a template<>, …), so there is also an easier way to opt-in: you simply inherit from std::view_base.

namespace my_namespace {  // Tell the compiler that your view is a view.  class MyViewtype : public std::view_base  {  public:  iterator begin() const;  iterator end() const;  }; } 

This is not inheritance with virtual functions or CRTP (although there is also a CRTP base class for views) or anything like that: std::view_base is simply an empty type. Its only there to be able to provide a syntactic requirement that can be checked by the non-specialized version of std::enable_view:

namespace std {  struct view_base  {};   // By default, a type is a view iff it inherits from view_base.  template <typename T>  constexpr bool enable_view = std::is_base_of_v<view_base, T>; } 

Technique #2: enable a concept by specializing a variable template and/or inheriting from a tag type

//=== concept definition ===// struct my_concept_base {};  template <typename T> constexpr bool enable_my_concept  = std::is_base_of_v<my_concept_base, T>;  template <typename T> concept my_concept = enable_my_concept<T>  && requires (T obj) {  };  //=== concept modelling ===// struct my_type_modelling_the_concept : my_concept_base {   }; 

The extra layer of indirection added by the variable template is only necessary, if some types want to model my_concept but can’t inherit from my_concept_base (non-class types, preexisting types). If you’re adding a completely new concept that is only ever modeled by classes, you can just use std::is_base_of_v directly.


I really like the “enable a concept by inheriting from a tag type” idiom (EACBIFATT?): it provides nominal concepts with minimal syntactic overhead to opt-in. We can also extend the base class to inject default implementations for optional functionality, which can be “overriden” by simple name hiding.

The technique also has the “advantage” of adding your library’s namespace to the list of namespaces found via ADL of your users types!

For lexy, where rules inherit from lexy::dsl::rule_base, this means it finds the DSL operator overloads automatically, so that worked out nicely. For all other libraries, it is probably not what you want.

Now you might wonder: if users need to explicitly inherit something anyways, why not use that alone to constrain the function? After all, it worked for iterators since C++98.

However, consider the case where a type claims to model a concept, but actually doesn’t. With the additional syntax checks, you’ll get an error message when trying to call the function. Without concepts, it is somewhere in the internals when the code tries to use the type.

Whether or not that is worth it, is up to you. For example, lexy, which supports C++17, can only use concepts by hiding them behind ugly macros. As such, I didn’t bother to properly conceptify my concepts and only use the existence of base classes.

Reverse nominal concepts

On the flip side, sometimes you don’t want to explicitly opt-in to a concept, but to opt-out.

For example, a std::sized_range is a std::range with a size() function that returns the size in constant time. Again, this cannot be verified by the compiler, so there needs an additional nominal check. We can again throw EACBIFATT on it, but this would be annoying: most size() functions are O(1).

So instead the logic is reversed: by default types model the concept if they fulfill the syntactic requirements, unless you’ve opted-out by specializing disable_sized_range.

namespace std {  // MyLinkedList has O(n) size.  template <typename T>  constexpr bool disable_sized_range<MyLinkedList<T>> = true; } 

Technique #3: explicitly disable a concept by specializing a variable template

template <typename T> constexpr bool disable_my_concept = false;  template <typename T> concept my_concept = !disable_my_concept<T>  && requires (T obj) {  }; 

Note that we could again provide the tag type to inherit, but inheriting something to opt-out seems weird.

Conclusion

C++20 concepts are automatically modeled based on syntax; it doesn’t care about semantics.

As such, if you want to distinguish between identical syntax with different semantics, you need to introduce some syntax to distinguish it. A nice way is to check for the existence of a base class: types can easily opt-in by inheriting from it. You can also add typedefs or variable specializations. The same approach can also be used to opt-out of a concept.

C++ enthusiast. I write libraries.







via foonathan::​blog()

September 19, 2023 at 07:12PM
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant