Document: P3941R2
Author: Dietmar Kühl
Date: 2026-02-23
Audience: SG1, LEWG, LWG
If you've been using std::execution coroutines — specifically std::execution::task — and wondering how affine_on actually guarantees a coroutine resumes on the correct scheduler after every co_await, this paper is the answer. Or rather, it's the answer to five US National Body ballot comments (US 232–236) pointing out that the current C++26 specification doesn't deliver the guarantee it advertises.
The headline change: affine_on loses its explicit scheduler parameter. Instead of affine_on(sndr, sch) you write affine_on(sndr), and the scheduler is obtained from the receiver's environment at connect-time. That's the right time — when you're building a work graph the receiver isn't known yet, so the old signature was pushing the scheduler in too early. Small fix, correct semantics.
The more interesting changes are downstream. A new get_start_scheduler query is introduced, cleanly separating "what scheduler should new work go to" from "what scheduler started this operation" — two questions that get_scheduler currently conflates. A hard infallibility requirement is added: schedulers used with affine_on must have no set_error completions and no set_stopped completions when given an unstoppable token, and this is checked statically. And change_coroutine_scheduler — the mechanism for switching execution contexts mid-coroutine — is removed entirely.
That last one is the spicy bit. The paper argues the removal is correct: objects created before a scheduler change have their destructors run on the wrong scheduler, which is unsound. The recommended replacement is co_await on(sch, nested-task). The correctness argument holds. The ergonomic cost is real. Worth reading if you have coroutines touching multiple schedulers.
Reminder: paper authors occasionally read these threads. Keep it technical and constructive.
P2300 has more follow-on papers than most languages have standard library entries. We are never going to reach inbox zero on sender/receiver.
The P2300 cinematic universe. P2300 is Iron Man. P3941 is, I don't know, the fourth Ant-Man. Necessary for canon. Watched by nine people.
Tokio figured this out. spawn_on plus task-local storage. Zero papers, zero NB ballot comments, ships in a library version. Asking for a friend: why does this require a wording paper?
Tokio's executor model gives you 'this task runs on a specific worker thread.' That's a weaker property than scheduler affinity in a typed execution framework: you're not statically guaranteeing which kind of execution agent you resume on, just which instance of one thread pool. Different problem space. But I understand the exhaustion.
different language, different memory model, different committee politics, I remain unmoved
Great, another paper addressing NB ballot comments on a coroutine scheduling primitive that no major compiler has shipped an implementation of yet. The abstraction is being standardized before the implementation exists and nobody at the WG21 level sees a problem with that.
The infallibility requirement is doing a lot of work here, and the consequence is worth spelling out.
The paper requires that schedulers used with affine_on be infallible: no set_error completions, and no set_stopped completions when given an unstoppable token. Of the four standard schedulers, three can satisfy this: inline_scheduler, task_scheduler, and run_loop::scheduler. The one that cannot is parallel_scheduler.
Here's the irony: parallel_scheduler is the scheduler you'd actually reach for when CPU affinity matters. NUMA-aware computation, pinning work to a specific core family for cache locality, game engine systems that must stay on their designated thread: all of these reach for a parallel thread pool. The schedulers that can satisfy infallibility are either doing no scheduling at all (inline_scheduler, which runs inline on whatever thread calls start) or are single-threaded event loops (run_loop::scheduler). You get a statically-correct affinity guarantee in exactly the scenarios where you either don't need it or have already achieved it by other means.
The correctness argument for the constraint is sound: an affine_on that might fail to reschedule provides no guarantee at all, and silent wrong-scheduler execution is worse than a compile error. But the practical consequence is that the feature is only available for the trivial cases, and production users with bounded thread pools have no specified path until someone writes the wrapper the paper mentions but doesn't define.
This is what 'ships correct but incomplete' looks like.
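For anyone who wants to see what a purely static infallibility check amounts to, here is a minimal C++17 sketch. Nothing in it is taken from the paper's wording or from std::execution: the tag types, completion_signatures, toy_infallible_scheduler_v, and both toy schedulers are hypothetical stand-ins, and the unstoppable-token condition on set_stopped is simplified to "no set_stopped at all".

```cpp
#include <type_traits>

// Hypothetical stand-ins for the completion tags; not the std::execution types.
struct set_value_t {};
struct set_error_t {};
struct set_stopped_t {};

// A completion signature is modelled as a function type Tag(Args...).
template <class... Sigs>
struct completion_signatures {};

// Extract the tag from a signature such as set_error_t(int).
template <class Sig>
struct tag_of;
template <class Tag, class... Args>
struct tag_of<Tag(Args...)> { using type = Tag; };

// True only when every advertised completion is a set_value completion.
template <class Sigs>
inline constexpr bool only_value_completions_v = false;
template <class... Sigs>
inline constexpr bool only_value_completions_v<completion_signatures<Sigs...>> =
    (... && std::is_same_v<typename tag_of<Sigs>::type, set_value_t>);

// Toy version of the check: inspect what the scheduler's type advertises.
template <class Sch>
inline constexpr bool toy_infallible_scheduler_v =
    only_value_completions_v<typename Sch::schedule_completions>;

// A single-threaded-loop-like scheduler: scheduling only ever succeeds.
struct toy_loop_scheduler {
    using schedule_completions = completion_signatures<set_value_t()>;
};

// A bounded-pool-like scheduler: scheduling may fail or be stopped.
struct toy_pool_scheduler {
    using schedule_completions =
        completion_signatures<set_value_t(), set_error_t(int), set_stopped_t()>;
};

static_assert(toy_infallible_scheduler_v<toy_loop_scheduler>);
static_assert(!toy_infallible_scheduler_v<toy_pool_scheduler>);
```

The point of the sketch is only that the rejection happens at compile time from the advertised signatures; how the real constraint is spelled is up to the paper's wording.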
Wait — so the primary use case for scheduler affinity (keeping parallel compute on the right CPU topology) is precisely what's excluded by the infallibility requirement? I must be misreading the paper.
You're reading it correctly. The constraint follows from a real correctness requirement: if start() of the rescheduling operation can fail, control could return on an arbitrary execution agent and the affinity guarantee collapses entirely. I've hit this exact failure mode with an earlier stdexec build where a bounded thread pool was under heavy load: silent scheduler changes, no diagnostics, genuinely nasty bug to track down.
The paper's position is: ship a correct feature that covers the cases we can guarantee, note that fallible schedulers can be handled later with an explicit opt-out parameter. That's defensible. My concern is whether LEWG accepts that trajectory or asks for the opt-out now before C++26 closes.
I'd push back on framing the exclusion as a limitation rather than a feature. A scheduler that can fail under resource pressure cannot satisfy scheduler affinity — period. 'Best-effort affinity' is a different, weaker contract, and conflating the two in a single API name would be worse than what's here.
What I'd have liked to see is an explicit best_effort_affine_on or a template parameter marking the weaker contract. But collapsing both behaviors into one name and adding a runtime fallback would mean users can't reason about which guarantee they actually have. The hard exclusion is honest. Grumpy about parallel_scheduler, fine. But the alternative, a silently weakened guarantee, would be worse.
A separate name for the weaker guarantee is reasonable and I don't entirely disagree. But the paper's own future directions section says:
So they know the constraint is tight. They're taking the safe path now, expecting it to loosen. Fine as a strategy, unless 'later' means C++29 and parallel_scheduler users have no specified path for an entire standard cycle. These NB ballot comments are against the C++26 working draft. That's a timeline issue, not a design purity issue.
Okay, that's the one thing I'll concede. The design is correct. The timeline is the problem. If the infallibility constraint ships in C++26 and the relaxation mechanism doesn't arrive until C++29, then parallel_scheduler users have three years without a specified path forward. That's a gap in the committee's work plan, not in the paper's logic. Worth tracking.
The change_coroutine_scheduler removal is getting less attention than it deserves. The correctness argument is real: if you change the scheduler mid-coroutine, destructors of objects created before the change execute on the new scheduler, which is wrong. But the recommended replacement isn't equivalent for all use cases.
The nested version scopes the scheduler change properly. But it forces you to restructure flat sequential code into nested lambdas, which interacts with local variable capture, error propagation, and coroutine stack depth. A flat coroutine doing a dozen operations across two or three schedulers becomes a tree of lambdas. Not wrong. But the migration path for existing code that used change_coroutine_scheduler is non-trivial, and the paper doesn't provide a guide.
I originally read this as change_coroutine_scheduler being fixed rather than removed, which changed my whole take. Ignore what I said in the other thread.
Edit: Also misread the infallibility requirement as applying to the coroutine rather than the scheduler. This is what skimming at midnight produces. My analysis was wrong in two independent ways simultaneously.
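On the destructor concern behind the change_coroutine_scheduler removal, a toy model may make the scoping argument concrete. This is a plain C++17 simulation with no coroutines and no std::execution: current_context, context_bound_resource, scoped_context, flat_switch, and nested_switch are all hypothetical names, and an integer stands in for "the scheduler we are currently running on". It only illustrates object lifetimes, not the actual mechanics of task or on.

```cpp
#include <cassert>

// Hypothetical stand-in for "the scheduler this code is currently running on".
inline int current_context = 0;

// Counts destructors that ran on the creating context versus a different one.
struct destruction_log {
    int matched = 0;
    int mismatched = 0;
};

// An object that expects to be destroyed on the context that created it.
struct context_bound_resource {
    int created_on;
    destruction_log& log;
    explicit context_bound_resource(destruction_log& l)
        : created_on(current_context), log(l) {}
    ~context_bound_resource() {
        if (created_on == current_context) ++log.matched;
        else ++log.mismatched;
    }
};

// Scoped switch: the previous context is restored when the scope ends.
struct scoped_context {
    int saved;
    explicit scoped_context(int next) : saved(current_context) { current_context = next; }
    ~scoped_context() { current_context = saved; }
};

// Flat style: switch in the middle of a scope and never switch back.
inline destruction_log flat_switch() {
    destruction_log log;
    current_context = 1;
    {
        context_bound_resource r(log);  // created on context 1
        current_context = 2;            // mid-scope switch
    }                                   // r destroyed while on context 2
    current_context = 0;
    return log;
}

// Nested style: the switch lives in its own scope, like a nested task.
inline destruction_log nested_switch() {
    destruction_log log;
    current_context = 1;
    {
        context_bound_resource outer(log);      // created on context 1
        {
            scoped_context guard(2);            // switch for the nested scope only
            context_bound_resource inner(log);  // created on context 2
        }                                       // inner destroyed on 2, then 1 is restored
    }                                           // outer destroyed on context 1
    current_context = 0;
    return log;
}

// Sanity checks for the sketch itself.
inline void run_sketch_checks() {
    assert(flat_switch().mismatched == 1);
    assert(nested_switch().mismatched == 0);
}
```

In the flat version the object outlives the switch and is torn down on a context it never saw; in the nested version every object dies on the context it was created on, at the cost of the extra scope the comment above complains about.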
affine_on is genuinely one of the worst names in the proposal. 'Affine' in type theory means a resource that can be used at most once. 'Affine' in geometry means a linear transform without the origin constraint. Here it means 'has affinity to a scheduler.' The Venn diagram of people who know what scheduler affinity means and people who will be confused by the word 'affine' is precisely the LEWG attendee list.
The paper lists it as an open issue:
So everyone knows. There is no replacement proposal. WG21 naming discussions are their own sub-genre of infinite regress. See also: mdspan, stop_source, every ranges:: adapter.
One angle nobody's hit yet: the infallibility check is purely static. It verifies completion signatures at compile time. A scheduler that wraps a fallible thread pool but lies about its completion signatures (advertising no set_error in its type while calling set_error at runtime) passes the check and violates the contract silently.
This isn't unique to this paper; the entire sender/receiver model trusts that types accurately describe their behavior. But for affine_on specifically, the infallibility guarantee is load-bearing in a way that other sender properties aren't. If it fails you don't get a compile error or a runtime exception; you get silent execution on the wrong scheduler. In embedded contexts where every execution-context transition gets audited, I'd want a debug-mode runtime assertion path, not just a type-level check. The paper doesn't mention this failure mode.
requires infallible_scheduler<std::decay_t<Sch>>. I love it when correctness requirements become template constraints that generate 47 lines of error output explaining that your scheduler does not meet the infallibility criterion because its schedule sender's completion signatures include... honestly I stopped reading at line 23. The constraint is right. The diagnostic is going to be spectacular.
Master C++ concurrency in 30 days! Threads, coroutines, std::execution. Complete course for working engineers. Early access pricing ends Friday.
report, flag, move on
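Back on the point that the infallibility check is purely static: a small C++17 sketch of a type that passes a static check while misbehaving at runtime, plus the kind of debug-only audit wrapper that was asked for. Everything here is hypothetical (lying_scheduler, advertises_error, schedule_and_start, auditing_receiver, audit); it does not use or mirror the real std::execution interfaces, and the "queue full" flag merely simulates resource pressure.

```cpp
#include <cassert>

// Minimal receiver that records which completion it got.
struct counting_receiver {
    int values = 0;
    int errors = 0;
    void set_value() { ++values; }
    void set_error(int) { ++errors; }
};

// Hypothetical scheduler whose type claims it never errors,
// but whose runtime path reports an error when its queue is full.
struct lying_scheduler {
    static constexpr bool advertises_error = false;  // what the type promises
    bool queue_full = false;                         // simulated resource pressure

    template <class Receiver>
    void schedule_and_start(Receiver& r) const {
        if (queue_full) r.set_error(-1);  // contradicts the advertised signatures
        else r.set_value();
    }
};

// Toy static check: looks only at the advertised property.
template <class Sch>
inline constexpr bool passes_static_check_v = !Sch::advertises_error;

// Debug-style wrapper: forwards completions and counts contract violations.
template <class Receiver>
struct auditing_receiver {
    Receiver& inner;
    int violations = 0;
    void set_value() { inner.set_value(); }
    void set_error(int e) {
        ++violations;  // a real debug build might assert or log here
        inner.set_error(e);
    }
};

// Runs one scheduling attempt under audit and reports the violation count.
inline int audit(const lying_scheduler& sch) {
    counting_receiver r;
    auditing_receiver<counting_receiver> a{r};
    sch.schedule_and_start(a);
    return a.violations;
}

// The type-level check is satisfied regardless of runtime behavior.
static_assert(passes_static_check_v<lying_scheduler>);
```

Calling audit(lying_scheduler{true}) reports one violation even though the static check passed, which is the gap a runtime assertion path would cover.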
Coming back to this after thinking about it more: the get_start_scheduler addition is the quietest change in the paper but probably the most load-bearing one for future compatibility. Every scheduler-aware algorithm that needs to answer 'where did this operation originate' will depend on this query. Currently get_scheduler serves two conflicting purposes ('post new work here' and 'this operation was started here'), and the conflation has caused problems across multiple P2300 follow-on papers.
Adding a distinct query now, before more algorithms take a dependency on the ambiguous behavior, is the right call even if it looks like a footnote in this revision. Future-Dietmar will thank current-Dietmar.