My good friends Claudio Russo and Andrew Kennedy have been kind enough to send me a draft paper about Generalized Algebraic Data Types (GADTs) and Object-Oriented Programming. GADTs generalize the datatypes of ML and Haskell by permitting constructors to produce different type-instantiations of the same datatype. One of Andrew and Claudio’s examples is a slightly modified version of Peter Sestoft’s example of representing abstract syntax trees for typed expressions in Generic C#. To get a better feeling of the difference between the generics in C#/Java and the generics in C++ I have rewritten the the example in C++.
To recap the example: we want to represent abstract syntax trees for a tiny language of simple expressions. Futhermore, we want representation to be type safe. That is, we do not want to be able to represent the nonsense expression “true + 42“. We shall represent the abstract syntax trees the standard OO way with an abstract base class Exp for the general type of expressions and the concreate subclasses for each node type. To enforce the type safety of our embedded language we use generics. That is, a value with type Exp<R> is an abstract syntax tree for an expression that evaluates to a value of type R.
First the base class.
Peter’s Generic C# version:
abstract class Exp<R> { // R = result type
abstract public R eval();
}
C++ version:
template< typename R > // R = result type
class Exp {
public:
virtual R eval() const = 0;
virtual ~Exp() {}
};
Note that in C++ we need to remember to define a virtual destructor because we plan to inherit from Exp and we also want to be able to dynamically allocate subclasses of Exp.
Next up is the class for literals.
Peter’s Generic C# version:
class Lit<R> : Exp<R> {
private readonly R v;
public Lit(R v) {
this.v = v;
}
public override R eval() {
return v;
}
}
C++ version:
template< typename R >
class Lit : public Exp<R> {
public:
virtual R eval() const {
return val;
}
explicit Lit(R val_) : val(val_) {}
private:
R const val;
};
No surprises here.
But for binary expressions like plus, for instances, it starts to get a bit more tricky. Here I’ll use a simpler version of the class for representing plus nodes than the version Peter uses.
Generic C# version:
class Plus : Exp<int> {
private readonly Exp<int> lhs, rhs;
public Plus(Exp<int> lhs, Exp<int> rhs) {
this.lhs = lhs; this.rhs = rhs;
}
public override int eval() {
return lhs.eval() + rhs.eval();
}
}
First cut at a C++ version:
class Plus : public Exp<int> {
public:
virtual int eval() const {
return lhs.eval() + rhs.eval();
}
explicit Plus(Exp<int> const lhs_, Exp<int> const rhs_)
: lhs(lhs_), rhs(rhs_)
{}
private:
Exp<int> lhs, rhs;
};
Alas, this is not valid C++ because we are not allowed to declare fields and parameters of type Exp<int> because this class has an abstract virtual function, namely eval (and besides if we could, for example by modifying the class Exp, the code would not do what we expects). These problems has nothing to do with the generics in C++, they are just the usual C++ OO quirks. And the solution is straightforward: we must use a pointer to values of type Exp. But then we have to make decisions about ownership: should we make a deep copy of subexpressions? should we share subexpressions? and if so who should take take of freeing resources? should we use ref counting? or what? Here we shall just use one of the the wonderful smart pointers from Boost: shared_ptr which will take care of the refcounting and deletion of expressions. Thus, first we make a couple of convenient typedefs for integer expressions and shared pointers to integer expressions (and similar for boolean expressions):
typedef Exp< int > IntExp;
typedef boost::shared_ptr< IntExp > IntExpPtr;
Then the class for plus nodes looks like this:
class Plus : public IntExp {
public:
virtual int eval() const {
return lhs->eval() + rhs->eval();
}
explicit Plus(IntExpPtr const & lhs_, IntExpPtr const & rhs_)
: lhs(lhs_), rhs(rhs_)
{}
private:
IntExpPtr lhs, rhs;
};
and a wrapper for constructing shared pointers for new plus expressions:
IntExpPtr plus(IntExpPtr const & lhs, IntExpPtr const & rhs) {
return IntExpPtr( new Plus( lhs, rhs ) );
}
Similar to the Plus class we can define a class for representing condtional expressions:
Generic C# version:
class Cond<R> : Exp<R> {
private readonly E<bool> cond;
private readonly E<R> truth, falsehood;
public Cond(E<bool> cond, E<R> truth, E<R> falsehood) {
this.cond = cond; this.truth = truth; this.falsehood = falsehood;
}
public override R eval() {
return cond.eval() ? truth.eval() : falsehood.eval();
}
}
C++ version:
template < typename R >
class Cond : public Exp<R> {
public:
typedef boost::shared_ptr< Exp<R> > SubExpPtr;
virtual R eval() const {
return cond->eval() ? truth->eval() : falsehood->eval();
}
explicit Cond(BoolExpPtr const & cond_,
SubExpPtr const & truth_,
SubExpPtr const & falsehood_) :
cond(cond_), truth(truth_), falsehood(falsehood_)
{}
private:
BoolExpPtr cond;
SubExpPtr truth, falsehood;
};
The interesting things to note about this class are that Cond contains subexpressions of different types, that Cond is polymorphic in the result type, that Cond enforces that the two branches of a condtional expression must have the same type, and the only one of the branchs is evaluated based on the evaluation of the condition.
Using these classes we can define a function that builds an expression for calculating the n‘th fibonacci number:
IntExpPtr fib(int n) {
IntExpPtr fib0( lit(0) ), fib1( lit(1) );
switch ( n ) {
case 0: return fib0;
case 1: return fib1;
default:
for(int i = 1; i != n; ++i) {
IntExpPtr tmp( fib1 );
fib1 = plus( fib0, fib1 );
fib0 = tmp;
}
return fib1;
}
}
This function builds an expression that takes linear, O(n), space, but it will use exponential time, O(2n), to be evaluated.
Conclusion
Non-surprising this GADT example can be translated more or less straight forward from generinc C# to C++. The things that are a bit tedious are just the normal C++ oddities and has nothing to do with generics. In fact, we would have had the exact same problems without generics.
Next up is to try and translate a few more of Claudio and Andrews examples. The statically typed printf and the typed LR parsing looks nifty.
Recent Comments