P3876R1 — Extending support to more character types
(11 items)
SG16
This paper proposes extending std::to_chars and std::from_chars to support all character types (char8_t, char16_t, char32_t, and wchar_t) in addition to the existing char support. New function templates and corresponding result types (u8to_chars_result, u16to_chars_result, etc.) with alias templates to_chars_result_t and from_chars_result_t are introduced to avoid an explosion of non-template overloads. The proposal also re-specifies std::format's wchar_t formatting path to call the new wchar_t to_chars overload directly rather than transcoding from char output.
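For context, the char-to-wchar_t transcoding path that the proposal aims to remove can be sketched with the existing char-only std::to_chars. This is a minimal sketch, not text from the paper: the helper name int_to_wstring is hypothetical, and the per-character widening assumes an ASCII-compatible execution encoding in which digits, lowercase letters, and '-' have the same value as char and as wchar_t.

```cpp
#include <cassert>
#include <charconv>
#include <string>
#include <system_error>

// Hypothetical helper (not part of P3876R1): formats an int into a
// std::wstring by calling the char-only std::to_chars and then widening
// each produced character.
inline std::wstring int_to_wstring(int value, int base = 10) {
    assert(base >= 2 && base <= 36);  // precondition of std::to_chars

    char buf[66];  // room for a 64-bit value in base 2 plus a sign
    const std::to_chars_result r =
        std::to_chars(buf, buf + sizeof buf, value, base);
    if (r.ec != std::errc{}) {
        return {};  // buffer too small; not expected with this size
    }

    // Assumption: the characters produced (0-9, a-z, '-') widen to
    // wchar_t by value, which holds for ASCII-compatible encodings.
    std::wstring out;
    for (const char* p = buf; p != r.ptr; ++p) {
        out.push_back(static_cast<wchar_t>(*p));
    }
    return out;
}
```

With a wchar_t overload of the kind the paper proposes, the intermediate char buffer and the widening loop would presumably be unnecessary.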
- §3.5.2 Summary, second code block (from_chars synopsis) — Floating-point from_chars template constrains U with integer-type-concept instead of floating-point-type-concept, inconsistent with the corresponding to_chars floating-point templates in the same section. [1]
- §3.5.2 Summary, second code block (from_chars synopsis) — New from_chars templates take T* first, T* last (non-const), but the existing overload in the same block and the formal wording in [charconv.from.chars] use const charT*. Should be const T*. [2]
- §6.4 [charconv.from.chars], newly inserted paragraph — Paragraph says from_chars uses an "output style," but from_chars is an input/parsing function. The analogous to_chars paragraph correctly uses "output style" because to_chars produces output; from_chars should reference the analyzed pattern instead. [3]
- §3.5.1 Result type, code block — using u8to_chars_result = basic_to_chars_result { }; is ill-formed -- a using alias declaration cannot have braces after the type-id. Should be either a plain alias or a derived struct. [4]
- §3.6, note block — Closing comment delimiter is written as /* instead of */, producing /* .../*) where the closing parenthesis is swallowed into the comment body. [5]
- §5.1 Implementation survey, table row from_chars (integer), libc++ column — Display text reads to_chars_integral.h but the row is for from_chars and the URL points to from_chars_integral.h. [6]
- §3.3.1 Unicode error handling, paragraph after the example block — Extraneous word "code" before "only" makes the sentence ungrammatical. Should read "only code points in the Basic Latin block." [7]
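Two of the findings above (the ill-formed alias in §3.5.1 and the missing const in the from_chars synopsis) can be illustrated with a short sketch. Everything here is an assumption made for illustration: the member layout of basic_to_chars_result mirrors std::to_chars_result, the template argument is guessed, char16_t stands in for char8_t so the code also compiles before C++20, the _alias and _derived suffixes are invented names, and parse_uint is a toy parser rather than the proposed from_chars.

```cpp
#include <cassert>
#include <system_error>
#include <type_traits>

// Hypothetical stand-in for the paper's class template; the members
// mirror std::to_chars_result, which is an assumption.
template <class charT>
struct basic_to_chars_result {
    charT* ptr;
    std::errc ec;
};

// Ill-formed, because an alias-declaration takes a type-id only:
//   using u16to_chars_result = basic_to_chars_result<char16_t> { };

// Fix A: a plain alias names the same type as the specialization.
using u16to_chars_result_alias = basic_to_chars_result<char16_t>;

// Fix B: a derived struct is a distinct type that inherits the members.
struct u16to_chars_result_derived : basic_to_chars_result<char16_t> { };

static_assert(std::is_same<u16to_chars_result_alias,
                           basic_to_chars_result<char16_t>>::value,
              "the alias is the same type");
static_assert(!std::is_same<u16to_chars_result_derived,
                            basic_to_chars_result<char16_t>>::value,
              "the derived struct is a different type");

// Toy decimal parser (hypothetical, no overflow handling). Taking
// const T* lets callers pass read-only input such as string literals;
// a T* parameter would reject them.
template <class T>
const T* parse_uint(const T* first, const T* last, unsigned& value) {
    assert(first <= last);
    unsigned result = 0;
    const T* p = first;
    for (; p != last && *p >= T('0') && *p <= T('9'); ++p) {
        result = result * 10 + static_cast<unsigned>(*p - T('0'));
    }
    if (p != first) {
        value = result;  // only written when at least one digit was read
    }
    return p;  // one past the last consumed character
}
```

Whether the paper intends Fix A or Fix B matters for overload resolution and name mangling, since only the derived struct introduces a new type.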
References — Anthropic Citations API
[1]
"using u8to_chars_result = basic_to_chars_result { }; // maybe?"
[2]
"template from_chars_result_t from_chars(T* first, T* last, U& value, chars_format fmt = chars_format::general);"
[3]
"template constexpr from_chars_result_t from_chars(T* first, T* last, U& value, int base = 10);"
[4]
"All Unicode encodings are designed so that code only code points in the Basic Latin block can be encoded with code units in the range [0, 0x7f)."
[5]
"The output style of all functions named from_chars is specified in terms of characters in the basic character set (and thus in terms of their Unicode code points) or directly in terms of code..."
[7]
"A possible implementation is to call the `constexpr to_chars(char*, /* .../*)`"
Summary: Proposes class templates basic_to_chars_result and basic_from_chars_result plus new function templates for to_chars and from_chars that accept char8_t, char16_t, char32_t, and wchar_t, extending beyond char. Seven defects were found spanning ill-formed syntax, incorrect template constraints, missing const qualifiers, and wording inconsistencies.
Pipeline: Discovery (Anthropic Opus + Citations API) → Verification Gate (OpenRouter Opus) → Report Writer (OpenRouter Opus)
Provenance: All references are machine-verified character positions from the Anthropic Citations API — deterministic, exact substrings, not model-generated quotes.