r/wg21
P3844R4 - Reword [simd.math] for consteval conversions LWG
Posted by u/cpp26_release_watcher · 9 hr. ago

Document: P3844R4

Author: Matthias Kretz

Date: 2026-02-13

Audience: LWG

Matthias Kretz is back with another C++26 std::simd wording cleanup, this one fixing a genuine footgun lurking in [simd.math]. The issue: the old GENERATIVE_MATH_FUNCTION macro pattern uses math-common-simd-t<V0, V1, ...> to compute a common SIMD type across mixed scalar/SIMD arguments. That unification happens after template argument deduction, which means any implicit conversion (including one through a consteval constructor) runs inside the math function rather than at the call site. Under C++23's immediate escalation rules (P2564), a consteval constructor invoked in a non-constant context requires its arguments to be constant expressions. Inside the math function the scalar is just a function parameter, so if you're passing a runtime value, that's a hard error.

The fix rewrites the overload sets: instead of one variadic template per function, you get explicit overloads where scalar arguments are typed as const deduced-vec-t<V>&. This forces the scalar-to-SIMD conversion — including any consteval constructor — to happen at the call site, which is exactly how scalar <cmath> has always worked via arithmetic promotions. Semantics unchanged; overload resolution ordering preserved; the wording just now says when conversions happen.

This is one of several simd/consteval papers Kretz filed in the February 2026 mailing — see also P3932R0 (integer-from defect fix), P4012R0 (consteval broadcast extension at LEWG), and P3978 (constant_wrapper interaction). P3844R4 is pure LWG wording: design was settled in prior revisions. If the wording survives LWG review intact, it's on track for C++26.

▲ 47 points (88% upvoted) · 30 comments
u/r_cpp_janitor 1 point 9 hr. ago · Moderator

Reminder: paper authors occasionally read these threads. Keep it technical.

u/constexpr_everything_lol 78 points 8 hr. ago

Great, another paper that trades one macro-expansion mechanism for seventeen hand-written overloads. C++ solving problems with complexity since 1998.

u/template_count_enjoyer 34 points 7 hr. ago

To be fair, the paper is fixing a correctness bug, not adding features. The overload count is a consequence of moving conversions to where they belong, not a design choice made for its own sake.

u/immediate_escalation_casualty 89 points 7 hr. ago

This one actually bit us. We have a SIMD wrapper type with a consteval broadcast constructor — lets you write vec<float, 4> v = 3.14f; in constant-expression contexts. When we upgraded to C++23 and started calling std::sin on mixed scalar/SIMD expressions, the compiler started rejecting code that had compiled for years.

The bug is subtle: math-common-simd-t<V, scalar> computes the unified return type across all arguments simultaneously, which means the implicit conversion from scalar → vec is part of the function-call machinery, not a call-site conversion. With a consteval constructor, P2564's immediate escalation rules require the argument to be a constant expression if the conversion runs in a non-constant context. Runtime float? Hard error.

Scalar <cmath> never had this problem because standard arithmetic promotions happen at the call expression, before the function body executes. The new deduced-vec-t<V> overloads mirror that: the scalar parameter is already typed as the element-broadcast type, so the conversion is forced at the call site.

Compact paper, correct fix. Dense wording but the logic is sound.

u/consteval_curious_dev 19 points 6 hr. ago

Was this broken pre-C++23? Immediate escalation is a C++23 thing — did the old code just silently do something wrong instead of erroring?

u/immediate_escalation_casualty 28 points 5 hr. ago

Pre-C++23 the consteval constructor would just... not run in that context, silently. You'd get a regular constructor call and lose the constant-expression guarantee you wanted. Not a compile error, but not correct behavior either. Immediate escalation (P2564) turned the silent misbehavior into a diagnostic. So the old code wasn't "wrong" in the pre-C++23 world, but it wasn't doing what the consteval annotation intended.

u/asio_holdout_forever 156 points 7 hr. ago

Can we get networking in the standard before I retire. I don't care about consteval simd math overloads.

u/executor_enjoyer_2019 43 points 6 hr. ago

Networking is in C++26. It got merged. You made it.

u/asio_holdout_forever 67 points 5 hr. ago

Wait, actually? Edit: checking. Edit2: I need to lie down.

u/portable_simd_missionary -14 points 6 hr. ago

Meanwhile Rust's portable_simd just implements math via scalar fallback and calls it a day. No consteval drama, no overload combinatorics. Almost like a type system that doesn't have thirteen implicit conversion mechanisms sidesteps some problems.

u/cpp_doesnt_have_that_irl 37 points 5 hr. ago

Rust also doesn't have consteval in the C++ sense, and portable_simd was nightly-only until very recently. Different problem space. The whole point of Kretz's design is that consteval constructors work in a SIMD context — that's the feature, not the bug.

u/the_real_rustacean_here 71 points 5 hr. ago

Sir, this is a C++ subreddit.

u/lwg_wording_archaeologist 52 points 6 hr. ago

Read through the diff carefully. The core design is right but the wording has problems that will catch LWG's attention.

The remquo additional overloads declare quo by value:

remquo(const V& x, const deduced-vec-t<V>& y, rebind_t<int, deduced-vec-t<V>> quo)

But the Effects clause in §5.4 says "Sets *quo to ret.second" — dereferencing a pointer that isn't there. The primary overload correctly uses rebind_t<int, V>* quo. The additional overloads need the *. This is semantically breaking if adopted as-is.
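To make the by-value problem concrete, here is a rough sketch with plain arrays standing in for the SIMD types. `vecf`, `veci`, `remquo_lanes`, and `remquo_by_value` are hypothetical names for illustration, not the paper's declarations. With a by-value `quo` there is nothing to dereference, and any write lands in a local copy:

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical stand-ins for a float vector and its int-rebound counterpart.
using vecf = std::array<float, 4>;
using veci = std::array<int, 4>;

// Pointer out-parameter, as in the primary overload: "*quo" has something
// to dereference, and the caller observes the quotient bits.
inline vecf remquo_lanes(const vecf& x, const vecf& y, veci* quo) {
    assert(quo != nullptr);
    vecf rem{};
    for (std::size_t i = 0; i < x.size(); ++i)
        rem[i] = std::remquo(x[i], y[i], &(*quo)[i]);
    return rem;
}

// By-value parameter, as in the additional overloads quoted above: the
// quotient is written into a copy that dies when the function returns.
inline vecf remquo_by_value(const vecf& x, const vecf& y, veci quo) {
    return remquo_lanes(x, y, &quo);
}
```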

Also: the llrint synopsis and definition have inverted signatures between §5.3 and §5.4 — the parameter and return types are swapped between them. And there are missing commas in some three-argument overloads in the synopsis that produce syntactically invalid C++.

Paper is R4. Needs R5.

u/also_caught_that_one 29 points 5 hr. ago

The atan2 additional overloads also swap the (y, x) parameter names to (x, y) relative to the primary overload. Mathematically irrelevant but inconsistent with the rest of the atan2 spec and the atan2(y, x) convention everywhere else in the standard.

u/overload_resolution_hater_42 18 points 5 hr. ago

The real problem isn't the consteval interaction — it's that GENERATIVE_MATH_FUNCTION was always the wrong abstraction. You're now trading one variadic macro for (let me count) two additional overloads per two-argument function, three permutations for each three-argument function, across ~40 math functions. That's somewhere between 80 and 120 new declarations in the standard text. When the next edge case hits, someone has to touch every single one.

The paper acknowledges the proliferation but waves it off as unavoidable. I'd like to see the argument that no deduction-based approach could have preserved call-site conversion order without this explosion.

u/deduced_vec_defender 33 points 4 hr. ago

> The right fix would have been a deduction-based approach that preserved call-site conversion order

The fundamental constraint is that to force conversions at the call site, the converted type has to appear in the parameter list — not be deduced from a metafunction applied to all arguments simultaneously. math-common-simd-t can't do this structurally; the unification happens after deduction, which is too late for consteval. You'd need either CTAD tricks that create new ambiguity surfaces, or constraints that amount to the same explicit overloads written differently.

The deduced-vec-t<V> alias is the invariant point — if scalar/SIMD broadcast semantics ever change, you update the alias. The individual overloads just delegate to the primary; the math logic stays in one place. The count is tedious but not a maintenance bomb.

u/overload_resolution_hater_42 11 points 3 hr. ago

Okay the alias-as-single-invariant argument is fair, I hadn't thought of it that way. Still uneasy about this in wording terms — does the paper demonstrate that the new overloads don't introduce ambiguities when you call hypot(vec_float, vec_float) with two identical SIMD types? Partial ordering should handle it but the paper doesn't say so explicitly.

u/deduced_vec_defender 22 points 3 hr. ago

const V& is a strictly better match than const deduced-vec-t<V>& when the argument is already V — partial ordering covers this. But you're right that the paper doesn't include an explicit note confirming overload resolution intent. Worth raising in LWG review. Filing mental issue.

u/overload_resolution_hater_42 8 points 2 hr. ago

Fair. Objection withdrawn pending LWG confirming the overload resolution intent in the specification notes.

u/translation_unit_trauma 24 points 5 hr. ago

Forty math functions × three-arg permutations × ABI tag variants = my compile times crying softly in the corner. C++, where fixing one correctness issue means teaching the compiler about 120 new overload candidates.

u/wg21_mailing_completionist 41 points 4 hr. ago

Worth reading this alongside the other Kretz papers in the Feb 2026 mailing. P3932R0 fixes a related LWG defect on integer-from in [simd], P4012R0 proposes a consteval broadcast extension (currently at LEWG), and P3978 covers constant_wrapper interaction with simd broadcasts. This is a coordinated cleanup of the C++26 simd specification, not a one-off. The papers make more sense together: P3844R4 handles the [simd.math] overload sequencing, while P4012 handles the higher-level LEWG design question of when value-preserving consteval broadcasts should be allowed.

u/production_simd_skeptic 31 points 4 hr. ago

Genuine question: who is shipping std::simd (not std::experimental::simd, not Vc) in production today? Not a gotcha — actually wondering how much real-world deployment is behind these papers.

u/hep_cpp_in_the_wild 48 points 3 hr. ago

HEP (high-energy physics) has been running Vc in production for over a decade across reconstruction pipelines at CERN. The migration path to std::simd is active in several major frameworks. Kretz's std-simd reference implementation sees real usage. The math functions are not a hypothetical exercise; they get called millions of times per event reconstruction.

u/not_a_physicist_firmware 14 points 2 hr. ago

Laughs in STM32. We don't have heap allocation, let alone vectorized trig.

u/avx512_intrinsics_purist 7 points 4 hr. ago

std::simd is nice for general code. When you actually need performance you write intrinsics by hand and LLVM cleans up the rest. The consteval ergonomics are a user-land problem that doesn't affect anyone at the nanosecond tier.

u/immediate_escalation_casualty 21 points 3 hr. ago

The consteval story is specifically useful for compile-time-evaluated code that also runs at runtime — think broadcast constants, lookup table initialization, policy types. It's not about the nanosecond tier at all.

u/committee_gonna_committee_irl 63 points 3 hr. ago

R4 and it still has wording defects. Truly a WG21 paper.