Document: P3876R1
Authors: Jan Schultke, Peter Bindels
Date: 2026-02-22
Audience: SG16 (Unicode and text)
std::to_chars and std::from_chars have existed since C++17, and they are genuinely great — fast, locale-independent, no allocation, no exceptions. There is exactly one problem: they only accept char*. Want to serialize a number into a std::u8string? Want to parse one from a wchar_t buffer on Windows? You either reinterpret_cast your way toward UB or bounce through a temporary char buffer and copy. Neither is great.
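For readers who have not hit this yet, here is a minimal sketch of the "bounce through a temporary char buffer" workaround as it looks today. The function name to_wstring_via_char is made up for illustration, and the sketch assumes the ordinary literal encoding is ASCII-compatible (true on mainstream platforms, not on EBCDIC).

```cpp
#include <cassert>
#include <charconv>
#include <string>
#include <system_error>

// Hypothetical helper: format into a temporary char buffer, then widen.
// Assumes an ASCII-compatible char encoding, so each char value is also
// the correct wchar_t value for the same character.
inline std::wstring to_wstring_via_char(int value) {
    char tmp[16];  // enough for a 32-bit int in base 10, including the sign
    const char* first = tmp;
    const std::to_chars_result r = std::to_chars(tmp, tmp + sizeof tmp, value);
    if (r.ec != std::errc{}) {
        return {};  // buffer too small; cannot happen for 32-bit int
    }
    // The iterator-range constructor converts each char to wchar_t.
    return std::wstring(first, static_cast<const char*>(r.ptr));
}
```

The extra buffer and the per-character copy are exactly the overhead the proposed overloads would remove.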
P3876R1 proposes adding function template overloads for all five character types: char (the existing overloads, unchanged), wchar_t, char8_t, char16_t, and char32_t. The key insight that makes this clean: every character to_chars ever emits (digits 0–9, letters a–z, the minus sign, a decimal point) lives in the Basic Latin / ASCII block (U+0000–U+007F). Unicode encodings guarantee that code units for non-ASCII code points are always ≥ 0x80, so from_chars can treat them as non-matching input without any encoding-specific logic whatsoever.
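To make the "no encoding-specific logic" point concrete, here is a rough sketch of a decimal parser that works on any code unit type by comparing numeric values only. Everything in it (parse_result, parse_decimal) is hypothetical and not from the paper, and it assumes a Unicode encoding for the input, so it would not be right for wchar_t on an EBCDIC system.

```cpp
#include <cstdint>

// Hypothetical result type for this sketch only.
template <class CharT>
struct parse_result {
    const CharT* ptr;     // first code unit that was not consumed
    std::uint32_t value;  // parsed value (0 on failure)
    bool ok;              // false on empty match or overflow
};

// Parses an unsigned decimal number from [first, last).
// Any code unit outside U+0030..U+0039 ends the match, which includes
// every code unit >= 0x80 belonging to a non-ASCII code point.
template <class CharT>
constexpr parse_result<CharT> parse_decimal(const CharT* first, const CharT* last) {
    std::uint64_t acc = 0;
    const CharT* p = first;
    for (; p != last; ++p) {
        const auto unit = static_cast<std::uint32_t>(*p);
        if (unit < 0x30 || unit > 0x39) {
            break;  // not an ASCII digit
        }
        acc = acc * 10 + (unit - 0x30);
        if (acc > 0xFFFFFFFFu) {
            return {first, 0, false};  // does not fit in 32 bits
        }
    }
    return {p, static_cast<std::uint32_t>(acc), p != first};
}
```

The same template body serves UTF-8, UTF-16 and UTF-32 input because it never needs to decode a multi-unit sequence.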
The design has one awkward corner: to_chars_result cannot be turned into a class template without breaking ABI and aggregate-initialization syntax for all existing code. The paper's solution is to add four new named result structs (u8to_chars_result, u16to_chars_result, u32to_chars_result, wto_chars_result) plus alias templates (to_chars_result_t<T>) so generic code stays clean. The alias maps char to the existing to_chars_result — no type identity breakage for old code.
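A rough sketch of how such an alias template could be wired up, assuming a specialization-based trait. The namespace sketch, the trait name to_chars_result_for, and the struct definitions are stand-ins for illustration; the paper puts its names in std and may specify the alias differently.

```cpp
#include <charconv>
#include <system_error>
#include <type_traits>

namespace sketch {  // hypothetical namespace, not the paper's wording

// Stand-ins for two of the proposed structs, mirroring std::to_chars_result.
struct u16to_chars_result { char16_t* ptr; std::errc ec; };
struct wto_chars_result   { wchar_t*  ptr; std::errc ec; };

// Trait that selects the concrete struct for each character type.
template <class CharT> struct to_chars_result_for;
template <> struct to_chars_result_for<char>     { using type = std::to_chars_result; };
template <> struct to_chars_result_for<char16_t> { using type = u16to_chars_result; };
template <> struct to_chars_result_for<wchar_t>  { using type = wto_chars_result; };

// Generic code names the result through the alias.
template <class CharT>
using to_chars_result_t = typename to_chars_result_for<CharT>::type;

// The char case is the existing C++17 struct itself, not a look-alike.
static_assert(std::is_same_v<to_chars_result_t<char>, std::to_chars_result>);

}  // namespace sketch
```

Because the char mapping is the existing type, code that already stores a std::to_chars_result keeps compiling and linking unchanged.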
SG16 has tracked this gap as an open issue (issue #38). This revision also coordinates with P3652R1 (constexpr floating-point charconv) and supersedes LWG4522.
Reminder: paper authors sometimes read these threads. Keep discussion technical and constructive.
If you have ever written Windows code interfacing with LPCWSTR APIs and tried to serialize a number into that buffer without an extra allocation, you understand exactly why this paper exists.
The reinterpret_cast<wchar_t*>(char_buf) version is what I see in codebases that gave up. This paper makes it unnecessary.
That is every Windows API ever. You are not alone.
I love that the argument for using templates here is not "templates are elegant" but "the alternative is literally 110 function declarations that grow to 300 with <stdfloat>." Sometimes the committee gets there.
Wait, why do we have char8_t again? I thought char with UTF-8 source was already fine.
char does not guarantee UTF-8. char8_t does: it is explicitly the type for UTF-8 encoded data. Without it you cannot distinguish a byte array from a UTF-8 string at the type level. P2238 covers the full motivation if you want the deep dive.
skill issue
The result-type design in section 3.5 is worth slowing down on. The paper adds eight new named types: u8to_chars_result, u16to_chars_result, u32to_chars_result, wto_chars_result, and four matching from_chars variants. That looks like an API surface explosion on first read.
The steelman: you cannot make to_chars_result a class template without breaking existing code. The C++17 API is out in the wild with aggregate initialization, to_chars_result{ptr, ec}, and name-mangled into compiled object files. Making it an alias for basic_to_chars_result<char> would change the type identity in every TU that has already compiled a to_chars call. That is an ABI break with no opt-out.
The insight that makes the paper's approach actually work: to_chars_result_t<char> maps to the existing to_chars_result. Not a new type. Not a structural alias for a new type. The actual same concrete struct. This means two things. Generic code that calls auto r = std::to_chars<CharT>(buf, buf + N, value) and stores the result as to_chars_result_t<CharT> compiles cleanly for all five character types. Existing code that uses to_chars_result directly sees no change whatsoever.
The alias template is doing exactly what a type parameter on the original struct would have done if the C++17 designers had thought ahead. It just lives in a different place in the standard.
The verdict: this is the right call given the constraints. The lesson for future API designers: template your result types from day one, even if you ship only one instantiation. The standard is paying a naming tax here because charconv did not.
This line alone in the design section probably saved us from a very bad day at some future mailing.
Good breakdown. The alias identity question was the first thing I checked when I saw the new result types: if to_chars_result_t<char> were NOT the same type as the existing to_chars_result, you would get silent breakage in any generic helper that already stores results in a concrete to_chars_result variable. The paper threads this correctly.
I still think they could have done:
and made the template primary. Why not go that route instead of adding eight new named types?
Section 3.5.1 addresses this directly. The problem with the base-class approach, or the using to_chars_result = basic_to_chars_result<char> approach, is that it changes the type identity of to_chars_result. With a derived struct, the two are structurally identical but different types: std::is_same_v<to_chars_result, basic_to_chars_result<char>> fails. With the alias, to_chars_result no longer names the class that existing object files were compiled against. Either way, a TU that compiled against the original struct and a TU that compiled against the new definition have different mangled names for the return type of to_chars. That is an ODR violation waiting to happen at link time, and a hard ABI break for any precompiled library.
The paper is explicit: even the base-class inheritance route "technically breaks the API" in subtle ways, such as aggregate status, reflection behavior, and type identity in overload resolution. You are not adding a supertype; you are changing what to_chars_result is.
Wait, if the old to_chars_result and basic_to_chars_result<char> end up as different types, that means to_chars_result on one side and basic_to_chars_result<char> on the other never unify in template argument deduction either. So you cannot even write a single generic function that accepts either. That is worse than I thought.
Exactly. Which is why the paper's solution is the only option that does not break anything: keep the existing struct definitions untouched, add four new named structs for the other character types, and provide to_chars_result_t<T> as an alias template that maps each T to the correct struct. Every type is a distinct, well-named concrete struct. Generic code uses the alias and works. Old code uses the concrete name and works. Nobody's ODR gets violated.
It is more API surface than you would want. It is also the only correct answer given the constraints.
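For anyone who wants to poke at the rejected base-class route, here is a small sketch of the type-identity problem discussed above. The namespace alt and both struct definitions are hypothetical reconstructions for illustration, not text from the paper.

```cpp
#include <system_error>
#include <type_traits>

namespace alt {  // hypothetical names, not from the paper

template <class CharT>
struct basic_to_chars_result { CharT* ptr; std::errc ec; };

// Rejected route: keep the old name as a thin derived struct.
struct to_chars_result : basic_to_chars_result<char> {};

// Same layout in practice, but two distinct types.
static_assert(!std::is_same_v<to_chars_result, basic_to_chars_result<char>>);
static_assert(std::is_base_of_v<basic_to_chars_result<char>, to_chars_result>);

// A generic helper that returns the template specialization cannot be
// stored implicitly into a variable of the old concrete name.
static_assert(!std::is_convertible_v<basic_to_chars_result<char>, to_chars_result>);

}  // namespace alt
```

Under these assumptions the old name and the template specialization never become interchangeable, which is the breakage the alias-template design avoids.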
[removed by moderator]
what did they say
Rule 3.
Rust-related, I assume.
Solid paper. My one concern: are the function template instantiations going to meaningfully add to compile times in large TUs that include <charconv>? Five character types across 22+ arithmetic types is a lot of potential instantiation surface.
Templates are only instantiated on use. If your TU never calls to_chars<char8_t>(...), the compiler sees the template definition and moves on. The instantiation surface only materializes if you actually exercise the new overloads. The header overhead from adding template declarations is negligible.
I am aware of how templates work, thank you.
Worth reading section 3.4 if you track std::format. The paper explicitly supersedes LWG4522 (which Schultke also filed) about whether std::format(wformat_string<...>) transcodes through char or calls the wchar_t overload of to_chars directly. Currently the standard wording is arguably wrong (LWG4522 is the proposed fix). Once to_chars<wchar_t> exists, you call it directly and the transcoding question goes away.
The coordination risk: if LWG adopts 4522's wording before this paper clears SG16 → LEWG → LWG, you have two conflicting wording changes targeting the same paragraph. Schultke calls this out directly in the paper, but it depends on sequencing that the committee does not always get right. Something to watch.
SG16 generally moves faster on these coordination issues than the wider committee does. The realistic risk is if LWG4522 ships in C++26 and then this paper also targets C++26 — but given R1 only just appeared in the 2026-02 mailing, C++29 seems more likely. Plenty of time to sort the sequencing.
Looking at section 5.2: as someone who has actually shipped C++ on z/OS with EBCDIC, yes, the _Encode<charT>(char32_t code_point) approach from the libc++ review is the right call. On EBCDIC systems, wchar_t does not encode the Basic Latin block at the same code points as Unicode. You cannot just write static_cast<wchar_t>('+') and expect it to be U+002B PLUS SIGN; it is EBCDIC 0x4E on the wire.
The paper's implementation sketch using compile-time dispatch on the character type to invoke an EBCDIC encoder is how you actually make this work portably. It is nice to see it addressed instead of footnoted.
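A loose sketch of what compile-time dispatch on the character type could look like. This is not the libc++ _Encode implementation; encode_basic and table_index are made-up names, and the sketch only covers the characters to_chars emits. The idea it illustrates: for char and wchar_t, let the compiler encode a literal in that type's own literal encoding instead of casting a Unicode value.

```cpp
#include <type_traits>

namespace sketch {  // hypothetical names for illustration

// Position of a supported code point in the literal tables below, or -1.
constexpr int table_index(char32_t cp) {
    if (cp == U'+') return 0;
    if (cp == U'-') return 1;
    if (cp == U'.') return 2;
    if (cp >= U'0' && cp <= U'9') return 3 + static_cast<int>(cp - U'0');
    if (cp >= U'a' && cp <= U'z') return 13 + static_cast<int>(cp - U'a');
    return -1;
}

template <class CharT>
constexpr CharT encode_basic(char32_t cp) {
    const int i = table_index(cp);
    if (i < 0) {
        return CharT{};  // not covered by this sketch
    }
    if constexpr (std::is_same_v<CharT, char>) {
        // Ordinary literal: EBCDIC values on an EBCDIC target.
        return "+-.0123456789abcdefghijklmnopqrstuvwxyz"[i];
    } else if constexpr (std::is_same_v<CharT, wchar_t>) {
        // Wide literal: whatever the wide literal encoding is.
        return L"+-.0123456789abcdefghijklmnopqrstuvwxyz"[i];
    } else {
        static_assert(std::is_same_v<CharT, char16_t> || std::is_same_v<CharT, char32_t>,
                      "sketch handles char, wchar_t, char16_t and char32_t only");
        // UTF-16 and UTF-32 encode Basic Latin as the code point value.
        return static_cast<CharT>(cp);
    }
}

}  // namespace sketch
```

On an ASCII-based platform all branches return the same numeric value, so the difference only shows up on targets like z/OS.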
Appreciate the confirmation from someone who has actually operated in that environment. The paper notes the libc++ EBCDIC path was reviewed separately — good to know the approach holds up.
Great paper. Now when do we get <networking> so I can actually send the numbers I just serialized somewhere?
One thing I missed in my earlier comment: the floating-point overloads have a conditional constexpr dependency on P3652R1. Integer conversion is unconditionally constexpr for all five character types (that already holds for char). But float conversion is constexpr for char only after P3652R1 ships, and the new template overloads follow the same condition.
If P3652R1 stalls, you end up with a split API: integer to_chars is constexpr across all char types, float to_chars is not constexpr for any of them. That is probably fine, since it mirrors the existing state, but it means the "upgrade path" for constexpr float formatting in non-char contexts depends on two papers landing in the right order. Worth tracking if you care about compile-time number formatting.
Edit: the paper does acknowledge this dependency explicitly in section 3.6. Not a hidden issue, just easy to miss on first read.