C++ string formatting library.
Sean Middleditch sean@middleditch.us
formatxx is a modern C++ string formatting library. Its intended goals are to offer fast compilation times, minimal binary bloat, and reasonable speed. Final measurements of these goals are not yet available; the library is still in preliminary development.
The library supports writing primitive types as well as user-defined types into string formatting buffers. The libray has as little dependence on the C++ standard library as possible, which is intended to make it a very light header to include throughout a larger C++ project.
The included string writers allow for formatting into a std::string
-compatible type, a fixed-
size buffer with guaranteed no allocations, or a buffer that initially has a fixed-size buffer
but can grow to accomodate larger strings. The combination of these buffers allow for easy use
in three major cases: quick creation of std::string
values, use in assert handlers that
cannot allocate, and use in log systems where allocation should be avoided but is allowed when
necessary. Users can easily write their own buffer systems as well.
The underlying method of operation of formatxx is to collect a list of arguments via variadic
templates, lookup a format_value
function for each of those arguments, and then pass the format
string, an array of format functions, and an array of void
pointers to arguments into the
the actual formatting function. Builtin C++ types go through a slightly different mechanism for
the sake of avoiding excessing ADL noise. The header mechanisms that generate these lists of
functions and pointers are intended to be light-weight on the compiler to the extent reasonable.
The actual formatting work is all implemented in a source file and not the header, to keep the
header small and cheap to include.
formatxx can write into user-defined buffers, or a user may use one of the provided buffer
types. Formatting is support for any type that has an appropriate format_value
free function with
the signature void format_value(formatxx::IWriter&, TheType, formatxx::format_options const&)
.
For instance:
#include <formatxx/format.h>
#include <string>
struct Foo { int value };
void format_value(formatxx::writer& out, Foo const& foo, formatxx::format_options const& opts) {
format_to(out, "Foo({})", foo.value);
}
int main() {
std::cout << formatxx::format_as<std::string>("testing {0}", Foo{123});
}
The above will print testing Foo(123)
to standard output.
The options
argument are additional options passed to the formatter. These are parsed from
format string options. In the format_to
case user-provided arguments may be provided,
otherwise they are interpreted by Python or printf rules.
The functions formatxx::parse_format_spec
and formatxx::parse_printf_spec
can be used
to interpret format specs from string inputs.
The formatxx::format_as<ResultT>(string_view, ...)
template can be used
for formatting a series of arguments into any result type that implements a string
-like
append
method. Including formatxx/std_string.h
also provides a format_string
function
that defaults to returning std::string
results.
The formatxx::format_to(formatxx::writer&, string_view, ...)
template can be used to
write into a write buffer. This is the recommended way of formatting.
The provided write buffers are:
formatxx::append_writer<StringT>
- writes to astring
-like object usingappend
.fmt::container_writer<ContainerT>
- writes to a container usinginsert
at the end.fmt::span_writer<CharT>
- writes to a pre-allocated buffer.
All three of the provided write buffers guarantee NUL-terminated strings, but support use with string types that are not NUL-terminated (another important use case for formatxx).
The library that motivated this author to write formatxx is the excellent
fmtlib (previously cppformat) by Victor Zverovich and
contributors. fmtlib does much of what this library does, has excellent runtime speed, and
responsive and talented authors. Unfortunately at the time the cppformat library failed to
meet a few needs that formatxx seeks to address: it had relied on some heavy standard headers,
its error handling mechanisms were and are limited to C++ exceptions or abort
, the
compilation time overhead was large, and its support for formatting user-defined types relied
on complicated and expensive mechanisms like std::ostream
wrappers. For those reasons,
research into formatxx began.
The initial design did not use a format_value
function for each type. Instead, an enumeration
for the basic categories of primitives (bool
, signed long
, double
, etc.) was computed via
a template, values would be cast into their desired "native" representation, and that would
be passed into the format function. Unfortunately, this added a lot of heavy template machinery
to the header: the enum selection, a std::tuple
for storing the converted inputs, etc. It
also proved to be difficult to get good support for format_value
functions for user-defined
types with clean and concise error messages.
We also try to very heavily minimize header dependencies, especially on certain problematic
C++ headers. For example, the core library tries to avoid including <string>
due to that
header's particularly heavy cost in some implementations. This likewise precludes formatxx
from using other standard library types like std::string_view
or so on. In some cases we
try to avoid standard headers but fallback on including them for wide compatibility; for
example, we attempt to use the common __builtin_strlen
support on major compilers but will
fallback to std::char_traits
when necessary.
The current header relies on a function template wrapper around the real formatting functions.
This is one template more than is desired that will lead to object file bloat, and for
unoptimized debug builds essentially means that all format_value
functions get an extra
unnecessary "trampoline" that can be very heavyweight with many compilers' debug checks
enabled. Ideally, these templates would be externed or converted away from wrappers. One
solution would be to make the templates' functions for primitive types directly be the
formatters for those types and then make format_value
be the template wrapper (since it is
expected to be more rarely used; it mostly exists to keep the same API between primitives and
user-defined types for generic user code).
Each format_value
is responsible currently for its own formatting and even its own format
specifier parsing. This is not necessarily ideal and may change in the long run to standardize
better for at least things like alignment and padding.
A particular limitation with printf
-like features right now is that the argument array
abstracts away integral arguments such that argument precision (e.g. %.*s
) cannot be
supported. A fix for this may be to bind a set of simple types instead of binding format
functions directly to instead form a simple struct with unique identifiers for primitive types
and only bind the format functions for user-provided types. This would allow the format code
to parse out integral types efficiently, and possibly also better-optimize primitive formatting.
To the point possible, we use modern C++ and only work with recent compilers. In some cases, we're held back to slightly older compilers. We currently require C++17 support and so require recent Visual C++ 19.x (2015 SP 3 or later), Clang 3.9, and GCC 6.3
A final note is that floating point formatting currently bounces through snprintf
which
necessarily means that our floating point performance is slower and more complicated than
calling the CRT directly. Writing a correct floating point formatter is incredibly complex. C++
now includes helpers like to_chars
but it is not yet widely available on enough standard
library implementations for us to rely on exclusively.
- Performance pass
- Benchmarks
- Correctness to wide/unicode char support
- u8/u16/u32?
- maybe just remove?
This is free and unencumbered software released into the public domain.
Anyone is free to copy, modify, publish, use, compile, sell, or distribute this software, either in source code form or as a compiled binary, for any purpose, commercial or non-commercial, and by any means.
In jurisdictions that recognize copyright laws, the author or authors of this software dedicate any and all copyright interest in the software to the public domain. We make this dedication for the benefit of the public at large and to the detriment of our heirs and successors. We intend this dedication to be an overt act of relinquishment in perpetuity of all present and future rights to this software under copyright law.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
For more information, please refer to http://unlicense.org/