[GpGiki 대문으로]

Cpp Expression Template


표현식 템플릿(Expression Templates) - Todd Veldhuizen

관련 자료:

표현식 템플릿은 David Vandervoorde에 의해 독립적으로 발명되었음( Expression templates were invented independently by David Vandevoorde).


This paper was presented at the 1994 C++ World Conference in Austin, Texas,and later appeared in the June 1995 issue of C++ Report Magazine http://www.sigs.com/publications/cppr/.It has been reprinted in the book "C++ Gems"edited by Stanley Lippman.

T. Veldhuizen, "Expression Templates," C++ Report, Vol. 7 No. 5June 1995, pp. 26-31

C++ Report (ISSN 1040-6042) is published nine times per year by SIGS Publications Inc., 71 West 23rd St., 3rd Floor, New York, NY 10010. ? Copyright 1995 SIGS Publications Inc.

Back to theBlitz++ Home Page http://oonumerics.org/blitz/

Back to Todd's Home Page http://extreme.indiana.edu/~tveldhui


Expression Templates 표현식 템플릿

저자: Todd Veldhuizen

Abstract: Expression Templates is a C++ technique for passing expressions as function arguments. The expression can be inlined into the function body, which results in faster and more convenient code than C-style callbackfunctions. This technique can also be used to evaluatevector and matrix expressions in a single pass withouttemporaries. In preliminary benchmark results, one compilerevaluates vector expressions at 95-99.5% efficiency of hand-coded C using this technique (for long vectors). The speedis 2-15 times that of a conventional C++ vector class.

Introduction

Passing an expression to a function is a common occurrence.In C, expressions are usually passed using a pointer to acallback function containing the expression. For example,the standard C library routines qsort(), lsearch(), andbsearch() accept an argument int (*cmp)(void*, void*) whichpoints to a user-defined function to compare two elements.Another common example is passing mathematical expressionsto functions. The problem with callback functions is thatrepeated calls generate a lot of overhead, especially if theexpression which the function evaluates is simple. Callbackfunctions also slow pipelined machines, since the functioncalls force the processor to stall.

The technique of expression templates allows expressions tobe passed to functions as an argument and inlined into thefunction body. Here are some examples:


          // Integrate a function from 0 to 10
          DoublePlaceholder x;
          double result = integrate(x/(1.0+x), 0.0, 10.0);

          // Count the elements between 0 and 100 in a
          collection of int
          List<int> myCollection;
          IntPlaceholder y;
          int n = myCollection.count(y >= 0 && y <= 100);

          // Fill a vector with a sampled sine function
          Vector<double> myVec(100);
          Vector<double>::placeholderType i;
          myVec.fill(sin(2*M_PI*i/100));  // eqn in terms of element #

In each of these examples, the compiler produces a functioninstance which contains the expression inline. The techniquehas a similar effect to Jensen's Device in ALGOL 60 [1].Expression templates also solve an important problem in thedesign of vector and matrix class libraries. The techniqueallows developers to write code such as


          Vector y(100), a(100), b(100), c(100),
          d(100);
          y = (a+b)/(c-d);


and have the compiler generate code to compute the result ina single pass, without using temporary vectors.

So how does it work?The trick is that the expression is parsed at compile time,and stored as nested template arguments of an "expressiontype. Let's work through a simplified version of the firstexample, which passes a mathematical functionto an integration routine. We'll invoke a functionevaluate() which will evaluate f(x) at a range of points:



          int main()
          {
              DExpr<DExprIdentity> x;         // Placeholder
              evaluate(x/(1.0+x), 0.0, 10.0);
              return 0;
          }

For this example, an expression is represented by aninstance of the class DExpr<A> (short for doubleexpression). This class is a wrapper which disguises moreinteresting expression types. DExpr<A> has a templateparameter (A) representing the interesting expression, whichcan be of type DBinExprOp (a binary operation on twosubexpressions), DExprIdentity (a placeholder for thevariable x), or DExprLiteral (a constant or literalappearing in the expression). For example, the simplesubexpression "1.0+x" would be represented by the expressiontype shown in Figure 1. These expression types are inferredby the compiler at compile time.

Figure 1.

(A) Parse tree of "1.0+x", showing the similarityto the expression type generated by the compiler (B). Thetype DApAdd is an applicative template class whichrepresents an addition operation. In (C), the type name isshown in a non-expanded view. For this implementation, typenames are similar to a postfix traversal of the expressiontree.To write a function which accepts an expression as anargument, we include a template parameter for the expressiontype. Here is the code for the evaluate() routine. Theparenthesis operator of expr is overloaded to evaluate theexpression:


template<class Expr>
void evaluate(DExpr<Expr> expr, double start, double end)
{
    const double step = 1.0;

    for (double i=start; i < end; i += step)
        cout << expr(i) << endl;
}


When the above example is compiled, an instance ofevaluate() is generated which contains code equivalent tothe line


              cout << i/(1.0+i) >> endl;

This happens because when the compiler encountersx/(1.0+x) in


              evaluate(x/(1.0+x), 0.0, 10.0);

it uses the return types of overloaded + and / operators toinfer the expression type shown in Figure 2. Note thatalthough the expression object is not instantiated until runtime, the expression type is inferred at compile time. Aninstance of this expression type is passed to the evaluate()function as the first argument. When the fragment expr(i)is encountered, the compiler builds the expression inline,substituting i for the placeholder x. The mechanics of thisare explained further on.

Figure 2.

Template instance generated by x/(1.0+x): DExpr<DBinExprOp< DExpr <DExprIdentity>,DExpr<DBinExprOp<DExpr<DExprLiteral>, DExpr<DExprIdentity>,DApAdd>>, DApDivide>>As an example of how the return types of operators cause thecompiler to infer expression types, here is an operator/()which produces a type representing the division of two subexpressions DExpr<A> and DExpr<B>:


          template<class A, class B>
          DExpr<DBinExprOp<DExpr<A>, DExpr<B>, DApDivide> >
          operator/(const DExpr<A>& a, const DExpr<B>& b)
          {
              typedef DBinExprOp<DExpr<A>, DExpr<B>,DApDivide> ExprT;
              return DExpr<ExprT>(ExprT(a,b));
          }

The return type contains a DBinExprOp, which represents abinary operation, with template parameters DExpr<A> and DExpr<B> (the two subexpressions being divided), andDApDivide, which is an applicative template classencapsulating the division operation. The return type is a DBinExprOp<...> disguised by wrapping it with a DExpr<...>class. If we didn't disguise everything as DExpr<>, wewould have to write eight different operators to handle thecombinations of DBinExprOp<>, DExprIdentity, and DExprLiteral which can occur with operator/(). (More if wewanted unary operators!)

The typedef ExprT is the type DBinExprOp< ... , ... ,DApDivide> which is to be wrapped by a DExpr<...>.Expression classes take constructor arguments of theembedded types; for example, DExpr<A> takes a const A& as aconstructor argument. So DExpr<ExprT> takes a constructorargument of type const ExprT&. These arguments are neededso that instance data (such as literals which appear in theexpression) are preserved.

The idea of applicative template classes has been borrowedfrom the Standard Template Library (STL) developed byAlexander Stepanov and Meng Lee, of Hewlett Packard Laboratories [2]. The STL has been accepted as part of theISO/ANSI Standard C++ Library. In the STL, an applicativetemplate class provides an inline operator() which appliesan operation to its arguments and returns the result. For expression templates, the application function can (andshould) be a static member function. Since operator()cannot be declared static, an apply() member function isused instead:


          // DApDivide -- divide two doubles
          class DApDivide {
          public:
              static inline double apply(double a, double b)
              { return a/b; }
          };

When a DBinExprOp's operator() is invoked, it uses theapplicative template class to evaluate the expressioninline:


          template<class A, class B, class Op>
          class DBinExprOp {
              A a_;
              B b_;

          public:
              DBinExprOp(const A& a, const B& b)
                : a_(a), b_(b)
              { }

              double operator()(double x) const
              { return Op::apply(a_(x), b_(x)); }
          };

It is also possible to incorporate functions such as exp()and log() into expressions, by defining appropriatefunctions and applicative templates. For example, anexpression object representing a normal distribution can beeasily constructed:


          double mean=5.0, sigma=2.0;   // mathematical
          constants
          DExpr<DExprIdentity> x;
          evaluate( 1.0/(sqrt(2*M_PI)*sigma) * exp(sqr(x-mean)/
              (-2*sigma*sigma)), 0.0, 10.0);

One of the neat things about expression templates is thatmathematical constants are evaluated only once at run time,and stored as literals in the expression object. Theevaluate() line above is equivalent to:


          double __t1 = 1.0/(sqrt(2*M_PI)*sigma);
          double __t2 = (-2*sigma*sigma);
          evaluate( __t1 * exp(sqr(x-mean)/__t2), 0.0,
          10.0);

Another advantage is that it's very easy to pass parameters(such as the mean and standard deviation) -- they're storedas variables inside the expression object. With pointers-to-functions, these would have to be stored as globals (messy),or passed as arguments in each call to the function(expensive).

Figure 3 shows a benchmark of a simple integration function,comparing callback functions and expression templates tomanually inlined code. Expression template versions ran at95% the efficiency of hand-coded C. With largerexpressions, the advantage over pointers-to-functionsshrinks. This advantage is also diminished if the functionusing the expression (in this example, integrate()) doessignificant computations other than evaluating theexpression.

It is possible to generalize the classes presented here toexpressions involving arbitrary types (instead of justdoubles). Expression templates can also be used insituations where multiple variables are needed -- such asfilling out a matrix according to an equation.Since most compilers do not yet handle member functiontemplates, a virtual function trick has to be used to passexpressions to member functions of a class. This techniqueis illustrated in the next example, which solves asignificant outstanding problem in the design of matrix andvector class libraries.

Optimizing Vector Expressions

C++ class libraries are wonderful for scientific andengineering applications, since operator overloading makesit possible to write algebraic expressions containingvectors and matrices:


          DoubleVec y(1000), a(1000), b(1000), c(1000),
          d(1000);
          y = (a+b)/(c-d);

Until now, this level of abstraction has come at a highcost, since vector and matrix operators are usuallyimplemented using temporary vectors [3]. Evaluating theabove expression using a conventional class librarygenerates code equivalent to:


          DoubleVec __t1 = a+b;
          DoubleVec __t2 = c-d;
          DoubleVec __t3 = __t1/__t2;
          y = __t3;

Each line in the code above is evaluated with a loop. Thus,four loops are needed to evaluate the expression y=(a+b)/(c-d). The overhead of allocating storage for the temporaryvectors (__t1, __t2, __t3) can also be significant,particularly for small vectors. What we would like to do isevaluate the expression in a single pass, by fusing the fourloops into one:


          // double *yp, *ap, *bp, *cp, *dp all point into the vectors.
          // double *yend is the end of the y vector.
          do {
              *yp = (*ap + *bp)/(*cp - *dp);
              ++ap; ++bp; ++cp; ++dp;
          } while (++yp != yend);


By combining the ideas of expression templates anditerators, it is possible to generate this codeautomatically, by building an expression object whichcontains vector iterators rather than placeholders.Here is the declaration for a class DVec, which is a simple"vector of doubles". DVec declares a public type iterT,which is the iterator type used to traverse the vector (forthis example, an iterator is a double*). DVec also declaresStandard Template Library compliant begin() and end()methods to return iterators positioned at the beginning andend of the vector:


          class DVec {

          private:
              double* data_;
              int length_;

          public:
              typedef double* iterT;

              DVec(int n)
                  : length_(n)
              { data_ = new double[n]; }

              ~DVec()
              { delete [] data_; }

              iterT begin() const
              { return data_; }

              iterT end() const
              { return data_ + length_; }

              template<class A>
              DVec& operator=(DVExpr<A>);
          };

Expressions involving DVec objects are of type DVExpr<A>.An instance of DVExpr<A> contains iterators which arepositioned in the vectors being combined. DVExpr<A> letsitself be treated as an iterator by providing two methods:double operator*() evaluates the expression using thecurrent elements of all iterators, and operator++()increments all the iterators:


          template<class A>
          class DVExpr {

          private:
              A iter_;

          public:
              DVExpr(const A& a)
                  : iter_(a)
              { }

              double operator*() const
              { return *iter_; }

              void operator++()
              { ++iter_; }
          };

The constructor for DVExpr<A> requires a const A& argument,which contains all the vector iterators and other instancedata (eg. constants) for the expression. This data isstored in the iter_ data member. When a DVExpr<A> isassigned to a Dvec, the Dvec::operator=() method uses theoverloaded * and ++ operators of DVExpr<A> to store theresult of the vector operation:


          template<class A>
          DVec& DVec::operator=(DVExpr<A> result)
          {
              // Get a beginning and end iterator
          for the vector
              iterT iter = begin(), endIter = end();

              // Store the result in the vector
              do {
                  *iter = *result;   // Inlined expression
                  ++result;
              } while (++iter != endIter);

              return *this;
          }

Binary operations on two subexpressions or iterators A and Bare represented by a class DVBinExprOp<A,B,Op>. Theoperation itself (Op) is implemented by an applicativetemplate class. The prefix ++ and unary * (dereferencing)operators of DVBinExprOp<A,B,Op> are overloaded to invokethe ++ and unary * operators of the A and B instances whichit contains.

As before, operators build expression types using nestedtemplate arguments. Several versions of the each operatorare required to handle the combinations in which DVec,DVExpr<A>, and numeric constants can appear in expressions.For a simple example of how expressions are built, considerthe code snippet


          DVec a(10), b(10);
          y = a+b;


Since both 'a' and 'b' are of type DVec, the appropriateoperator+() is:


DVExpr<DVBinExprOp<DVec::iterT,DVec::iterT,DApAdd> >
operator+(const DVec& a, const DVec& b)
{
    typedef DVBinExprOp<DVec::iterT,DVec::iterT,DApAdd> ExprT;
    return DVExpr<ExprT>(ExprT(a.begin(),b.begin()));
}

All the operators which handle DVec types invokeDVec::begin() to get iterators. These iterators are bundledas part of the returned expression object.As before, it's possible to add to this scheme so thatliterals can be used in expressions, as well as unaryoperators (such as x = -y), and functions (sin, exp). It'salso not difficult to write a templatized version of DVecand the DVExpr classes, so that vectors of arbitrary typescan be used. By using another template technique called"traits", it's even possible to implement standard typepromotions for vector algebra, so that adding a Vec<int> toa Vec<double> produces a Vec<double>, etc.Figure 4 shows the template instance inferred by thecompiler for the expression y = (a+b)/(c-d). Although longand ugly type names result, the vector expression isevaluated entirely inline in a single pass by theassignResult() routine.

Figure 4.

Type name for the expression (a+b)/(c-d):DVExpr<DVBinExprOp<DVExpr< DVBinExprOp<DVec::iterT,DVec::iterT, DApAdd>>, DVExpr<DVBinExprOp< DVec::iterT,DVec::iterT, DApSubtract>>, DApDivide>>

Benchmark Results

To evaluate the efficiency of the expression templates approachto evaluating vector expressions, benchmark results wereobtained comparing its performance to hand-crafted C code, andthat of a conventional vector class. The performance wasmeasured for the expression y=a+b+c.

Figure 5 shows the performance of the expression templatetechnique, as compared to hand-coded C, using the Borland C++ V4.0 compiler and an 80486 processor. With a few adjustments of compiler options and tuning of the assignResult() routine,the same assembly code was generated by the expressiontemplate technique as hand coded C. The poorer performancefor smaller vectors was due to the time spent in theexpression object constructors. For longer vectors (eg.,length 100) the benchmark results were 95-99.5% the speed ofhand coded C.

Figure 6 compares the expression template technique to aconventional vector class which evaluates expressions usingtemporaries. For short vectors (up to a length of 20 orso), the overhead of setting up temporary vectors caused theconventional vector class to have very poor performance, andthe expression template technique was 8-12 times faster.For long vectors, the performance of expression templatessettles out to twice that of the conventional vector class.

Conclusions

The technique of expression templates is a powerful andconvenient alternative to C style callback functions. Itallows logical and algebraic expressions to be passed tofunctions as arguments, and inlined directly into thefunction body. Expression templates also solve the problemof evaluating vector and matrix expressions in a single passwithout temporaries.

References

[1] Rutihauser, H. Description of ALGOL 60, volume 1a of Handbook for Automatic Computation. Springer-Verlag, 1967.

[2] Stepanov, A. and Meng Lee, The Standard Template Library. ANSI X3J16-94-0095/ISO WG21-NO482.

[3] Budge, K. G. C++ Optimization and Excluding Middle-Level Code, Proceedings of the Second Annual Object-Oriented Numerics Conference, pp 107-121, Sunriver, Oregon, April 24-27, 1994.

Web Site

More information on these techniques is available from theBlitz++ Project Home Page, at "http://oonumerics.org/blitz/".


분류 번역 분류 프로그래밍


제일 위로
최종 수정 일시: 08월 11일(2002년) 10:27 AM 편집 | 정보 | 차이 | 비슷한 페이지 DebugInfo
유용한 페이지들: 분류 분류 | 자유로운 연습장 SandBox | 무작위 페이지들 RandomPages | 인기있는 페이지들 MostPopular