August 29, 2012
A tagged union (also known as a variant) is an aggregation of two or more types. An object of such a type can be any one of a predefined set of types. They are insanely useful for modelling many real-world concepts. A network connection might be either ready, open or disconnected – and each of these states has different pieces of data attached to them. A message that comes over that network connection might be a heartbeat, broadcast packet, or an RCP call – and again, each type is associated with a certain set of data fields. Tagged unions pop up everywhere in the real world, and the core C++ language has no decent way to model this concept.
Enter Boost Variant, a library that provides tagged unions to C++. Conceptually, it sounds like just what we want – a Boost Variant is a template type that you can instantiate to hold a client-defined subset of types, and it provides protections to make sure you correctly unwrap the variant to the right type. The problem is, to take advantage of this type safety involves writing a lot of ugly code!
I will present in this post my library function that sprinkles in C++11 lambdas to make Boost Variants a little more palatable.
Take as a case study, the "motivating example" from the Boost Variant documentation:
#include <boost/variant.hpp>
#include <iostream>
class my_visitor : public boost::static_visitor<int>
{
public:
int operator()(int i) const
{
return i;
}
int operator()(const std::string & str) const
{
return str.length();
}
};
int main()
{
boost::variant< int, std::string > u("hello world");
std::cout << u; // output: hello world
int result = boost::apply_visitor( my_visitor(), u );
std::cout << result; // output: 11 (i.e., length of "hello world")
}
Besides the obvious weight of the class definition, notice that the logic you want to run is now removed from the original context in which you wanted to run it.
To make it better, we turn to C++11 lambdas. Here is the code rewritten to use the library presented here:
#include <boost/variant.hpp>
#include "inline_variant.hpp"
#include <iostream>
int main()
{
boost::variant< int, std::string > u("hello world");
std::cout << u; // output: hello world
int result = match(u,
[](int i) -> int { return i; },
[](const std::string & str) -> int { return str.length(); });
std::cout << result; // output: 11 (i.e., length of "hello world")
}
The magic part is the call to the match
function, which accepts the variant
followed by the handler functions. The functions can be specified in any order.
A compiler error will be thrown if the functions do not match the types of the
variant, if the return types are not all the same, or if a non-unary function is
supplied.
This is a C++11 only library because it's quite pointless without lambda functions. I've tested it with gcc 4.7. It doesn't work on gcc 4.6, which can't handle the variadic templates.
If you think this sounds useful, check the inline_variant_visitor project on my Github.
This was quite a fun library to implement – behind the scenes, it involves Boost MPL and Fusion metaprogramming, as well as C++11 variadic templates. I will try to explain what each of these mean in the rest of this post.
A metaprogram is a program that operates on a program. In this case, we are talking about compile-time metaprogramming on types. We dive into a world where our program runs during compilation, and the inputs and outputs are types.
Boost MPL is like the STL for metaprogramming. It gives you a library of type
containers – such as vectors and maps of types. For example,
boost::mpl::vector<int, string, char*>
is a vector of the types int
,
string
and char*
. For my variant visitor, I use an MPL vector to store a
list of the function types that are passed into match
.
Boost Fusion is the bridge between the compile-time world and the runtime world.
Like MPL, it gives you containers of types, but these containers can be
instantiated at runtime to hold values. For example,
boost::fusion::vector<int, string, char*>
not only represents the list of
types, but can be instantiated to actually hold an int
, string
and char*
with specific values. It's actually equivalent to a tuple. But it doesn't stop
there – more interesting is the Fusion map. To illustrate, take this Fusion data
structure (I will omit the boost::fusion
namespace for concision):
typedef map<
pair<int, double>,
pair<char, std::string> > example_map_type;
This means that the type int
is associated with some value which is a
double
, and the type char
is associated with some value which is a string
.
Let that sink in for a bit. The left side (the key) of the map is a type. The
right side of the map is a runtime value. What value? You decide that by
constructing it:
example_map_type m(
make_pair<int>(3.14)
, make_pair<char>("Pie"));
So we've just made a map that says int
maps to 3.14, and char
maps to the
string "Pie". How does this help us? Inside my match function (instantiated in
the example above), lives a type that looks something like this:
map<
pair<int, std::function<int (int)> >
pair<std::string, std::function<int (std::string)> >
This associates the type int
with a function that accepts an int
and returns
an int
, and it associates the type std::string
with a function that accepts
a std::string
and returns an int
. Given this compile-time/runtime data
structure, all that's left in order to perform the function dispatch is to
determine the type held in the variant and look up that type in the map to get
the right function, which is then invoked.
I hope that made sense, because it's a heavily abridged account of what the code does. Perhaps a more thorough walkthrough of the implementation could be a topic for a future post.