C++20 Language Lawyer – How to Handle std::variant::operator< Unexpected Call to Implicit Bool Conversion Across Standards

c++, c++20, language-lawyer

I'm seeing some unexpected behavior when using std::variant's operator< in the situation where the alternative type has an implicit conversion operator to bool and its less-than operator is not a member function (in C++20 with the MSVC 19.38 compiler).

#include <variant>

struct Foo {
    int x;
    int y;

#ifndef DROP_CAST_OP
    constexpr operator bool() const { return x || y; }
#endif

#ifdef USE_SPACESHIP
    constexpr auto operator<=>(const Foo&) const noexcept = default;
#else
    friend constexpr bool operator<(const Foo& a, const Foo& b) noexcept
    {
        return a.x < b.x || (a.x == b.x && a.y < b.y);
    }
#endif
};

using TestVariant = std::variant<Foo, int>;

constexpr Foo fooA { 0, 1 };
constexpr Foo fooB { 1, 0 };
constexpr std::variant<Foo, int> varA = fooA;
constexpr std::variant<Foo, int> varB = fooB;

static_assert(fooA < fooB);
static_assert(varA < varB);

https://godbolt.org/z/1zfq5dq1r

Note that the assertion starts to pass when any one of the following conditions is met:

  • using C++17 instead of C++20
  • using the three-way comparison operator instead of a free-function less-than operator
  • not defining the implicit conversion operator to bool
  • marking the bool conversion operator as explicit

All compilers have the same behavior.

Best Answer

Heh, I knew exactly what this code would be when I read the title. I can't find a great dupe target so I'll try to make this the canonical answer.

C++17

In C++17, std::variant (like a bunch of other class templates in the standard library, std::pair, std::tuple, and std::optional among them) defines < by deferring to the underlying types' <. The only operation invoked on the underlying type T was <.

Specifically, what operator< would do on two objects of type variant<T, U> (assuming < was defined for both T and U) is first compare the indices and if those were the same, compare the values. Something like this:

bool operator<(variant<T, U> const& lhs, variant<T, U> const& rhs) {
    if (lhs.index() != rhs.index()) {
        return lhs.index() < rhs.index();
    }
   
    // not this specifically, but this conceptually
    return std::get<lhs.index()>(lhs) < std::get<rhs.index()>(rhs);
}
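
As a rough, compilable version of the conceptual code above, the same "index first, then value" logic can be written with std::visit. The name variant_less is a hypothetical helper, not a standard function, and the sketch assumes neither variant is valueless_by_exception. Foo is the type from the question, with both the implicit bool conversion and the free operator<.

```cpp
#include <type_traits>
#include <variant>

// Foo as in the question: implicit bool conversion plus a free operator<.
struct Foo {
    int x;
    int y;
    constexpr operator bool() const { return x || y; }
    friend constexpr bool operator<(const Foo& a, const Foo& b) noexcept {
        return a.x < b.x || (a.x == b.x && a.y < b.y);
    }
};

// Hypothetical helper (not part of the standard library).
// Assumes neither variant is valueless_by_exception.
template <class... Ts>
constexpr bool variant_less(const std::variant<Ts...>& lhs,
                            const std::variant<Ts...>& rhs) {
    if (lhs.index() != rhs.index()) {
        return lhs.index() < rhs.index();
    }
    return std::visit(
        [](const auto& a, const auto& b) -> bool {
            if constexpr (std::is_same_v<decltype(a), decltype(b)>) {
                return a < b;  // only the alternative's own < is invoked
            } else {
                return false;  // not reached: both indices are equal here
            }
        },
        lhs, rhs);
}

constexpr std::variant<Foo, int> varA = Foo{0, 1};
constexpr std::variant<Foo, int> varB = Foo{1, 0};
static_assert(variant_less(varA, varB));
static_assert(!variant_less(varB, varA));
```

Because this sketch only ever writes a < b on the alternatives, the user-provided operator< is selected and the implicit bool conversion never gets involved.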

C++20

C++20 introduced <=>, which is generally a much better way of dealing with ordering and came with a lot of conveniences to make writing comparisons (equality and ordering) easier. But it also came with the problem that no code before C++20 had <=> available. So we can't wholesale just change std::variant's comparison to use <=> because no existing code uses <=>.

Instead, the library preferentially uses <=> but falls back to < if <=> isn't available. It does so with an exposition-only object called synth-three-way, specified in [expos.only.entity]:

  constexpr auto synth-three-way =                 // exposition only
    []<class T, class U>(const T& t, const U& u)
      requires requires {
        { t < u } -> boolean-testable;
        { u < t } -> boolean-testable;
      }
    {
      if constexpr (three_way_comparable_with<T, U>) {
        return t <=> u;
      } else {
        if (t < u) return weak_ordering::less;
        if (u < t) return weak_ordering::greater;
        return weak_ordering::equivalent;
      }
    };

It's pretty straightforward: if <=> is available, we really want to use <=>. But if <=> isn't available, we fall back to what we had to do in C++17 and use <.

And this has the behavior you want.

Except when... it doesn't.

Let's look back at your type:

struct Foo {
    int x;
    int y;

#ifndef DROP_CAST_OP
    constexpr operator bool() const { return x || y; }
#endif

#ifdef USE_SPACESHIP
    constexpr auto operator<=>(const Foo&) const noexcept = default;
#else
    friend constexpr bool operator<(const Foo& a, const Foo& b) noexcept
    {
        return a.x < b.x || (a.x == b.x && a.y < b.y);
    }
#endif
};

We can go through the various behaviors. I'm assuming here that we always provide exactly one of < or <=>:

standard   provide operator bool   which ordering   what happens
--------   ---------------------   --------------   ------------------------------------------------------
c++17      no                      <                compares with <
c++17      yes                     <                compares with <
c++20      no                      <                compares with <
c++20      yes (implicit)          <                compares the result of conversion to bool (see below)
c++20      yes (explicit)          <                compares with <
c++20      no                      <=>              compares with <=>
c++20      yes (implicit)          <=>              compares with <=>
c++20      yes (explicit)          <=>              compares with <=>

Keep in mind, the rule is: if <=> works, use <=>, otherwise fall back to <. However, we don't have a mechanism in the language to check how <=> works.

When you provide a <=> to compare the Foos, then <=> exists and is viable and is the best option, so it's unsurprising that it is used.

When you provide a < to compare the Foos, that doesn't in and of itself mean that <=> isn't viable. When you provide an implicit conversion to bool, then f1 <=> f2 is still viable - it evaluates as (bool)f1 <=> (bool)f2 because the builtin candidates are available. This isn't specific to bool - any builtin type (like int or char const*) or other type for which ADL can find a candidate would lead to the same behavior. So according to the language, comparing two Foos with <=> works just fine - so that's the mechanism that we prefer in the library. It's just that in this specific case, it gives surprising behavior, since you probably preferred the explicit < over the implicit <=> by way of the implicit bool conversion.

That's why marking the conversion operator explicit fixes the problem - the builtin operator<=>(bool, bool) is no longer a viable candidate, so there is no viable way to invoke <=> on two Foos. Hence the library falls back to using <.

Note that this isn't even a new problem. If Foo had provided an implicit conversion to bool, but neither an operator< nor an operator<=>, even in C++17 the variant comparison would still work by way of the implicit conversion to bool, because evaluating t < u would be a valid expression by way of that conversion. The only novel thing here is that because of the prioritization of <=>, even providing an < doesn't ensure that the library uses the comparison operator that you wrote.
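
A small illustration of that older behavior, using a made-up type Flag that has an implicit conversion to bool and no comparison operators at all. This sketch should compile as either C++17 or C++20, and in both cases the comparison ends up operating on the converted bool values.

```cpp
#include <variant>

// Made-up type for illustration: implicit bool conversion, no comparisons.
struct Flag {
    int x;
    constexpr operator bool() const { return x != 0; }
};

// Both expressions compare the converted bools, not the stored ints.
static_assert(Flag{0} < Flag{5});     // false < true
static_assert(!(Flag{1} < Flag{2}));  // true < true

// The variant comparison compiles too, with the same bool-based result.
using V = std::variant<Flag>;
static_assert(V{Flag{0}} < V{Flag{5}});
static_assert(!(V{Flag{1}} < V{Flag{2}}));
```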


This is an issue that keeps coming up, because people write types that have explicit comparison operators (via <) but also provide an implicit conversion function to a type that has a builtin <=>. Any library mechanism that detects the presence of <=> will give a false positive here, and the only solution is either to provide an explicit <=> yourself or make the conversion function explicit instead of implicit.

If we had a language mechanism to figure out what specifically t <=> u invoked (and there is one proposed in P2825), then we could add additional validation that we only select <=> if t <=> u and t < u are both viable and invoke the same kind of thing (i.e. that they both invoke the same operator<=> or, if the latter invokes a function named operator<, that both functions take the same parameter types). But until that happens, be careful with implicit conversion functions in the presence of <=>.
