C++20 Format Library for Custom Types

Support the C++20 format library for your custom types

For a long time before C++20’s born, first-party support for string formatting is indeed a mess for C++. printf family inherited from C is known for security concerns and a lack of support for custom format strings and custom types, and the <iostream> library is criticized for ugly grammars and performance issues. C++20 brings us <format> a better way to address string formatting.

<format> Basics

format library is based on the popular placeholder-based syntax for string formatting used by various languages like Python and C#. It provide a type-safe way with great extensibility.

Function std::format is an important part of the library. Here is a simple example:

cout << std::format("{1} {2} {0}", "world", "hello", 1);

Which gives the ouput

hello 1 world

Also, the function std::format_to used with iterators can put formatted strings into containers and streams. For example,

std::ofstream file{ "format.txt" };
std::format_to(std::ostream_iterator<char>(file), "hello, {}!", "world");

will gives a file with content “hello, world!”.
Here, we work with simple types like integers and strings. The post will discuss how to cope with custom types which is commonly seen in any C++ program.

Formatters

In fact, to support std::format, the traditional way for <iostream> library of defining operator<< is still viable, but it is less configurable and also introduce the same performance overhead of the <iostream> library. The newer and better way is to write a custom formatter that specialize the std::formatter template. Formatters have been provided for built-in types and some library components, as listed in the document in Standard specializations for basic types and string types:

template<> struct formatter<char, char>;
template<> struct formatter<char, wchar_t>;
template<> struct formatter<wchar_t, wchar_t>;

template<> struct formatter<CharT*, CharT>;
template<> struct formatter<const CharT*, CharT>;
template<std::size_t N> struct formatter<const CharT[N], CharT>;
template<class Traits, class Alloc>
  struct formatter<std::basic_string<CharT, Traits, Alloc>, CharT>;
template<class Traits>
  struct formatter<std::basic_string_view<CharT, Traits>, CharT>;

template<> struct formatter<ArithmeticT, CharT>;

template<> struct formatter<std::nullptr_t, CharT>;
template<> struct formatter<void*, CharT>;
template<> struct formatter<const void*, CharT>;

To support one for our custom type, say the struct product,

struct product
{
	string brand;
	int value;
	string name;
};

two functions, parse and format should be provided to support parsing the format specifications in the placeholder and to format the given value, repectively.
Typically, format should be always provided, but parse can be simply inherited from std::formatter if no special format specifications is required for the custom type. Here, we support 2 format specifications, d,b, to control the detailed or breif output, respectively, for our product struct. So we implement the formatter as follows:

template<>
struct std::formatter<product>
{
	bool detailed{ false };

	constexpr auto parse(std::format_parse_context& context)
	{
		auto it = context.begin(), end = context.end();
		if (it != end && (*it == 'b' || *it == 'd')) detailed = (*it++) == 'd';

		if (it != end && *it != '}') throw format_error("Invalid format");

		return it;
	}

	template<class FormatContext>
	auto format(
		const product& p,
		FormatContext& context)
	{
		if (detailed)
		{
			return format_to(context.out(), "product({},{},{})", p.brand, p.name, p.value);
		}
		else
		{
			return format_to(context.out(), "product {}", p.name);
		}
	}
};

parse is resposible to parse the format specification and store the settings in data members for format method to use. It should return the iterator past the end of the parsed range according to the requirement. format should put the result in context.out(). So the std::format_to function mentioned above is used.

With the custom formatter, we can format the product struct freely. The following code

product p{ "kanon",5000,"d550" };

cout << std::format("{:d}", p) << endl;
cout << std::format("{:b}", p) << endl;

Will print

product(kanon,d550,5000) product d550

and

cout << std::format("{:c}", p) << endl;

Will raise an exception because c is not a format specification that we support.

In my work, I use <format> with a library magic_enum to support a elegant way for logging enums, which provides a convenient way to debug. For example, in the compiler project clox, I implement a custom formatter for op_code in opcode.h

#include <magic_enum.hpp>

namespace magic_enum
{
template<>
struct customize::enum_range<clox::interpreting::vm::op_code>
{
	static constexpr int min = (int)clox::interpreting::vm::op_code::OPCODE_ENUM_MIN;
	static constexpr int max = (int)clox::interpreting::vm::op_code::OPCODE_ENUM_MAX;
};
}


#include <string>
#include <format>

namespace std
{
template<>
struct std::formatter<clox::interpreting::vm::op_code> : std::formatter<std::string>
{
	auto format(clox::interpreting::vm::op_code op, format_context& ctx)
	{
		std::string op_str{ magic_enum::enum_name(op) };

		return formatter<string>::format(
				op_str, ctx);
	}
};


template<>
struct std::formatter<clox::interpreting::vm::secondary_op_code> : std::formatter<std::string>
{
	auto format(clox::interpreting::vm::secondary_op_code sop, format_context& ctx)
	{
		return formatter<string>::format(
				to_string(clox::helper::enum_cast(sop)), ctx);
	}
};

}

So I can print the name for enum values only by its value. It really helps when dumping the bytecodes.

<format> in the Future

As is said in Microsoft’s blog in Visual Studio 2019 version 16.10, C++23 will likely add compile time format checking to format literals, which will make the new library even faster and avoid runtime exceptions. Also, std::print may come to the library to directly print the format result.

Built with Hugo
Theme Stack designed by Jimmy