Document: P3844R4
Author: Matthias Kretz
Date: 2026-02-13
Audience: LWG
Matthias Kretz is back with another C++26 std::simd wording cleanup — this one fixing a genuine footgun lurking in [simd.math]. The issue: the old GENERATIVE_MATH_FUNCTION macro pattern uses math-common-simd-t<V0, V1, ...> to compute a common SIMD type across mixed scalar/SIMD arguments. That unification happens inside the function's deduction machinery, which means any implicit conversion — including a consteval constructor — runs after the math function is called rather than at the call site. Under C++23's immediate escalation rules (P2564), a consteval constructor invoked in a non-constant context requires its arguments to be constant expressions. If you're passing a runtime value, that's a hard error.
The fix rewrites the overload sets: instead of one variadic template per function, you get explicit overloads where scalar arguments are typed as const deduced-vec-t<V>&. This forces the scalar-to-SIMD conversion — including any consteval constructor — to happen at the call site, which is exactly how scalar <cmath> has always worked via arithmetic promotions. Semantics unchanged; overload resolution ordering preserved; the wording just now says when conversions happen.
This is one of several simd/consteval papers Kretz filed in the February 2026 mailing — see also P3932R0 (integer-from defect fix), P4012R0 (consteval broadcast extension at LEWG), and P3978 (constant_wrapper interaction). P3844R4 is pure LWG wording: design was settled in prior revisions. If the wording survives LWG review intact, it's on track for C++26.
Reminder: paper authors occasionally read these threads. Keep it technical.
Great, another paper that trades one macro-expansion mechanism for seventeen hand-written overloads. C++ solving problems with complexity since 1998.
To be fair, the paper is fixing a correctness bug, not adding features. The overload count is a consequence of moving conversions to where they belong, not a design choice made for its own sake.
This one actually bit us. We have a SIMD wrapper type with a consteval broadcast constructor, which lets you write vec<float, 4> v = 3.14f; in constant-expression contexts. When we upgraded to C++23 and started calling std::sin on mixed scalar/SIMD expressions, the compiler started rejecting code that had compiled for years.
The bug is subtle: math-common-simd-t<V, scalar> computes the unified return type across all arguments simultaneously, which means the implicit conversion from scalar → vec is part of the function-call machinery, not a call-site conversion. With a consteval constructor, P2564's immediate escalation rules require the argument to be a constant expression if the conversion runs in a non-constant context. Runtime float? Hard error.
Scalar <cmath> never had this problem because standard arithmetic promotions happen at the call expression, before the function body executes. The new deduced-vec-t<V> overloads mirror that: the scalar parameter is already typed as the element-broadcast type, so the conversion is forced at the call site.
Compact paper, correct fix. Dense wording but the logic is sound.
Was this broken pre-C++23? Immediate escalation is a C++23 thing — did the old code just silently do something wrong instead of erroring?
Pre-C++23 the consteval constructor would just... not run in that context, silently. You'd get a regular constructor call and lose the constant-expression guarantee you wanted. Not a compile error, but not correct behavior either. Immediate escalation (P2564) turned the silent misbehavior into a diagnostic. So the old code wasn't "wrong" in the pre-C++23 world, but it wasn't doing what the consteval annotation intended.
Can we get networking in the standard before I retire. I don't care about consteval simd math overloads.
Networking is in C++26. It got merged. You made it.
Wait, actually? Edit: checking. Edit2: I need to lie down.
Meanwhile Rust's portable_simd just implements math via scalar fallback and calls it a day. No consteval drama, no overload combinatorics. Almost like a type system that doesn't have thirteen implicit conversion mechanisms sidesteps some problems.
Rust also doesn't have consteval in the C++ sense, and portable_simd was nightly-only until very recently. Different problem space. The whole point of Kretz's design is that consteval constructors work in a SIMD context: that's the feature, not the bug.
Sir, this is a C++ subreddit.
Read through the diff carefully. The core design is right but the wording has problems that will catch LWG's attention.
The remquo additional overloads declare quo by value. But the Effects clause in §5.4 says "Sets *quo to ret.second", dereferencing a pointer that isn't there. The primary overload correctly uses rebind_t<int, V>* quo. The additional overloads need the *. This is semantically breaking if adopted as-is.
Also: the llrint synopsis and definition have inverted signatures between §5.3 and §5.4: the parameter and return types are swapped between them. And there are missing commas in some three-argument overloads in the synopsis that produce syntactically invalid C++.
Paper is R4. Needs R5.
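To illustrate why the pointer matters in the remquo point above, here is a small sketch on a toy four-lane type. toy::vec4f, toy::vec4i and this toy::remquo are hypothetical and only mirror the shape of a rebind_t<int, V>* quo parameter; they are not taken from the paper.

```cpp
#include <cassert>
#include <cmath>

namespace toy {

// Hypothetical four-lane value types; stand-ins, not std::simd.
struct vec4f { float lane[4]; };
struct vec4i { int lane[4]; };

// The quotient bits are reported through the pointer, which is what a
// "Sets *quo to ..." Effects clause presupposes. If quo were taken by value,
// there would be nothing to dereference and the caller could not observe
// the result.
inline vec4f remquo(const vec4f& x, const vec4f& y, vec4i* quo) {
    vec4f rem{};
    for (int i = 0; i < 4; ++i) {
        // std::remquo writes the low bits of the rounded quotient to its
        // third argument and returns the IEEE remainder.
        rem.lane[i] = std::remquo(x.lane[i], y.lane[i], &quo->lane[i]);
    }
    return rem;
}

// Convenience wrapper for the usage example: 10 / 3 in every lane.
inline vec4f demo(vec4i* quo) {
    vec4f x{{10.0f, 10.0f, 10.0f, 10.0f}};
    vec4f y{{3.0f, 3.0f, 3.0f, 3.0f}};
    return remquo(x, y, quo);
}

}  // namespace toy
```

A caller passes the address of a quotient vector, for example toy::vec4i q{}; auto r = toy::demo(&q);, and reads both the remainder lanes and q afterwards.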
The atan2 additional overloads also swap the (y, x) parameter names to (x, y) relative to the primary overload. Mathematically irrelevant but inconsistent with the rest of the atan2 spec and the atan2(y, x) convention everywhere else in the standard.
The real problem isn't the consteval interaction; it's that GENERATIVE_MATH_FUNCTION was always the wrong abstraction. You're now trading one variadic macro for (let me count) two additional overloads per two-argument function, three permutations for each three-argument function, across ~40 math functions. That's somewhere between 80 and 120 new declarations in the standard text. When the next edge case hits, someone has to touch every single one.
The paper acknowledges the proliferation but waves it off as unavoidable. I'd like to see the argument that no deduction-based approach could have preserved call-site conversion order without this explosion.
The fundamental constraint is that to force conversions at the call site, the converted type has to appear in the parameter list, not be deduced from a metafunction applied to all arguments simultaneously. math-common-simd-t can't do this structurally; the unification happens after deduction, which is too late for consteval. You'd need either CTAD tricks that create new ambiguity surfaces, or constraints that amount to the same explicit overloads written differently.
The deduced-vec-t<V> alias is the invariant point: if scalar/SIMD broadcast semantics ever change, you update the alias. The individual overloads just delegate to the primary; the math logic stays in one place. The count is tedious but not a maintenance bomb.
Okay the alias-as-single-invariant argument is fair, I hadn't thought of it that way. Still uneasy about this in wording terms: does the paper demonstrate that the new overloads don't introduce ambiguities when you call hypot(vec_float, vec_float) with two identical SIMD types? Partial ordering should handle it but the paper doesn't say so explicitly.
const V& is a strictly better match than const deduced-vec-t<V>& when the argument is already V; partial ordering covers this. But you're right that the paper doesn't include an explicit note confirming overload resolution intent. Worth raising in LWG review. Filing mental issue.
Fair. Objection withdrawn pending LWG confirming the overload resolution intent in the specification notes.
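The ambiguity question above can be probed with a toy overload set. This is a sketch under assumptions: toy::vec, toy::identity, toy::deduced_vec_t and toy::which are all hypothetical, the broadcast constructor is plain constexpr to keep the example independent of consteval, and the alias is modeled as a non-deduced context. It shows how a compiler orders this particular set, not what the paper's wording guarantees.

```cpp
#include <cassert>

namespace toy {

template <class T>
struct identity { using type = T; };

// Hypothetical fixed-size vector with an implicit broadcast constructor.
template <class T, int N>
struct vec {
    T lane[N];
    constexpr vec() : lane{} {}
    constexpr vec(T x) : lane{} {
        for (int i = 0; i < N; ++i) lane[i] = x;
    }
};

// Single point of change for how scalar positions are typed.
template <class V>
using deduced_vec_t = typename identity<V>::type;

enum class picked { primary, scalar_rhs, scalar_lhs };

// Primary: both arguments participate in deduction.
template <class T, int N>
picked which(const vec<T, N>&, const vec<T, N>&) { return picked::primary; }

// Additional overloads: one position is non-deduced and accepts anything
// implicitly convertible to the vector type deduced from the other position.
template <class T, int N>
picked which(const vec<T, N>&, const deduced_vec_t<vec<T, N>>&) {
    return picked::scalar_rhs;
}

template <class T, int N>
picked which(const deduced_vec_t<vec<T, N>>&, const vec<T, N>&) {
    return picked::scalar_lhs;
}

}  // namespace toy
```

With two identical vector arguments all three candidates are viable, and partial ordering prefers the primary because both of its parameters are deduced. With a scalar in one position, deduction fails for the primary and for the mirrored overload, leaving exactly one candidate.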
Forty math functions × three-arg permutations × ABI tag variants = my compile times crying softly in the corner. C++ where fixing one correctness issue means teaching the compiler about 120 new overload candidates.
Worth reading this alongside the other Kretz papers in the Feb 2026 mailing. P3932R0 fixes a related LWG defect on integer-from in [simd], P4012R0 proposes a consteval broadcast extension (currently at LEWG), and P3978 covers constant_wrapper interaction with simd broadcasts. This is a coordinated cleanup of the C++26 simd specification, not a one-off. The papers make more sense together: P3844R4 handles the [simd.math] overload sequencing, while P4012 handles the higher-level LEWG design question of when value-preserving consteval broadcasts should be allowed.
Genuine question: who is shipping std::simd (not std::experimental::simd, not Vc) in production today? Not a gotcha, actually wondering how much real-world deployment is behind these papers.
HEP (high-energy physics) has been running Vc in production for over a decade across reconstruction pipelines at CERN. The migration path to std::simd is active in several major frameworks. Kretz's std-simd reference implementation sees real usage. The math functions are not a hypothetical exercise; they get called millions of times per event reconstruction.
Laughs in STM32. We don't have heap allocation, let alone vectorized trig.
std::simd is nice for general code. When you actually need performance you write intrinsics by hand and LLVM cleans up the rest. The consteval ergonomics are a user-land problem that doesn't affect anyone at the nanosecond tier.
The consteval story is specifically useful for compile-time-evaluated code that also runs at runtime: think broadcast constants, lookup table initialization, policy types. It's not about the nanosecond tier at all.
[deleted]
what did they say
Something about ISPC being the real answer. You know the one.
R4 and it still has wording defects. Truly a WG21 paper.