Archive for the 'C++' Category

Templates are the Gateway Drug for Macros

If you’ve read a half-inch-thick C++ book in the last few years, you’ve probably heard that you should never use macros. Instead, you should use inline functions and templates (with a few exceptions, such as use of __FILE__ and __LINE__).

Well, I’m here to say that there is nothing in C++ which will drive someone to write more macros than templates.

Templates merely whet my appetite for compile-time code generation, and then almost always fall short. Ultimately, my devotion to the DRY principle is stronger than my allegiance to Sutter or Meyers — so I write a macro and get on with my life.

Surprising Effects of Volatile Qualifier

The classic semantics of C/C++ volatile are: “Please, Mr. Compiler, do exactly the reads and writes I specify.” Consider the following program:


   int x = 0;
   void foo( int c )
   {
      while( --c )
         x += c;
   }

In this case x is just a regular integer, so the compiler is free to register allocate and thus generates this:


      sub     ecx, 1
      je      end ; zero-trip loop
      mov     eax, DWORD PTR x
   looptop:
      add     eax, ecx
      sub     ecx, 1
      jne     looptop
      mov     DWORD PTR x, eax
   end:

Notice how x is loaded before the loop begins, allocated to the register eax, and has its final value stored back to memory after the loop. Now let’s see what happens when x is qualified as volatile:
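As a quick hedged sketch (exact codegen varies by compiler): once x is volatile, the compiler must issue a genuine load and store of x on every iteration, rather than caching it in eax.

```cpp
#include <cassert>

volatile int x = 0;   // volatile: every read and write of x must really happen

void foo( int c )
{
   while( --c )
      x += c;   // now a load, add, and store each iteration;
}                // the compiler may no longer hoist x into a register
```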
Continue reading ‘Surprising Effects of Volatile Qualifier’

Stop Vim From Indenting Public/Private/Protected

It always bugged me that Vim indents the code following public, private, or protected. It seems like a waste of space. Here’s how to turn it off.

Assuming you are already using cindent, simply add this to your vimrc:


    set cinoptions=h0

See the official vim indent documentation for full details.

VLA Redux

After discovering C99’s Variable Length Array feature, I was discouraged by the apparent lack of a C++ equivalent. To review, a VLA has a dynamic size at creation time, but is otherwise fixed. This allows it to use stack memory for blazingly fast allocation and deallocation. The fully-dynamic, growable, heap-allocated std::vector is heavyweight by comparison.

In an ideal world, C++ would have a VLA template class which would confer all the benefits of C99’s VLA, plus additional type safety and (for the love of god) a .size() accessor.

In wrestling with this, I stumbled onto the idea of putting alloca() into a forceinline function. Depending on your mood, this is either a clever trick or a fragile hack.

Let me enumerate the problems here:

  1. The alloca() function is not standard.
  2. Neither is forceinline.
  3. If the compiler fails to inline your alloca function, the memory is freed when that function returns, and you have a bug.
  4. Compilers can never guarantee inlining of arbitrary functions (recursion, etc).

That being said, here’s a nascent implementation which seems to work on gcc and VC++:


#include <cstddef>
#include <stdlib.h>

#if defined( __GNUC__ )
   #include <alloca.h>
   #define FORCEINLINE inline __attribute__ ((always_inline))
   #define ALLOCA alloca
#elif defined( _MSC_VER )
   #include <malloc.h>
   #define FORCEINLINE __forceinline
   #define ALLOCA _alloca
#endif

template< typename T >
class dynamic_array
{
   private:
   T* elems_;
   std::size_t size_;

   public:
   // Must inline so the alloca lands in the caller's stack frame.
   FORCEINLINE explicit dynamic_array( std::size_t size ) :
      elems_( static_cast<T*>( ALLOCA( sizeof(T) * size ) ) ),
      size_( size )
   {}
   T& operator[]( std::size_t x ) { return elems_[x]; }
   std::size_t size() const { return size_; }
};

Now we can say things like:


template< typename A >
void fib( A& array )   // by reference, so the caller's array is filled in
{
   array[0] = 0;
   array[1] = 1;
   for( unsigned i=2; i<array.size(); ++i )
     array[i] = array[i-1] + array[i-2];
}

void both(size_t size)
{
   dynamic_array<int> d(size);  // cheap
   std::vector<int> v(size);
   fib(d);
   fib(v);
}

If we had a caller_alloca() or the ability to write if( inline ), we might even be able to do this safely.

Kevin and I talked about this idea at length and, being the uber-coder that he is, he whipped up a full-fledged version here. Check it out.

Herb Sutter’s Concur @ NWCPP

Last week NWCPP hosted Herb Sutter (and I cajoled Kevin into bringing his camera). Herb talked about the Concur project, which is a handful of language and library extensions to dramatically simplify correct concurrent C++ programming. Slides Here.

Here’s some video that makes my blog look cool on Google’s bandwidth dime:

Variable Length Arrays

Last week, my buddy Kent sent some mail, lamenting the fact that this code didn’t work quite as well as the traditional countof macro:


template <typename T, unsigned N>
inline unsigned countof( T const (&)[N] ) { return N; }

int main()
{
        int a[10];
        int b[countof(a)]; // BOOM!
}
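For contrast, here is a minimal sketch of the traditional countof macro the template was replacing. It works in this position because sizeof arithmetic is an integral constant expression, so the result is a legal array bound:

```cpp
// The classic preprocessor version: a compile-time constant,
// but with no type safety (it silently accepts pointers).
#define countof(a) ( sizeof(a) / sizeof((a)[0]) )

int a[10];
int b[countof(a)];   // fine: countof(a) is the constant 10
```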

The heart of the matter lies in the fact that the standard does not allow this:


int array[foo()];

We simply can’t force compilers to stop and analyze foo — we can’t standardize the optimizer. Some might do better analysis and prove more than others, leading to a portability nightmare. Furthermore, it’s not guaranteed that foo’s definition will even be available at this point. We can’t require whole-program optimization (a.k.a. LTCG).

It all made sense until Bheeshmar chimed in with the following shocker: “It works on GCC”. I was confused until I realized that GCC already supports C99-like variable length arrays.

Normal C arrays are terribly stupid. Their size must be known at compile time (yet they manage to completely forget this information after their point of definition). VLAs don’t have this restriction because they simply do an alloca-style stack allocation at runtime.
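A minimal sketch of a VLA in action (standard in C99; GCC also accepts it in C++ as an extension):

```cpp
// n is a runtime value, yet the array lives on the stack, alloca-style.
int sum_first( unsigned n )
{
   int a[n];   // size chosen at runtime, fixed thereafter

   for( unsigned i = 0; i < n; ++i )
      a[i] = i;

   int s = 0;
   for( unsigned i = 0; i < n; ++i )
      s += a[i];
   return s;   // 0 + 1 + ... + (n-1)
}
```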

OK so VLA support enabled Kent’s code to build. But wait, Kent’s array has a compile-time constant size. What’s going on here?

With VLAs, we can define arrays without fear. The compiler thinks to itself, “No big deal, I can call alloca() later if I must.” Because of this safety net, compilation can proceed to the optimizer, which may eventually prove that the array actually has a compile-time constant length. A smart optimizer might then turn the VLA back into a regular ol’ static array, saving some stack manipulation.

Let me say that again in case you missed it: VLA support enables us to discover that we don’t need VLA support.

Console Resize Utility for Windows

For you command-line junkies, I present size.exe. You can use it to resize a console window programmatically. Here’s an example:

  C:\scripts >type vimdiff.cmd
  @echo off
  size 60 120
  vim -d %*
  size 60 80

I wrote this in C++ using Boost. If you’re interested, the source is available here.

Update 4/8/2008:
Turns out that Windows actually does have built-in support for this; it’s just hidden in a dusty corner, as usual. Good thing, too, because my program didn’t work worth a damn. Here’s the official method:

   mode 120,60

(Credit to Sahil Malik)

More Musings on Versioned C++

A couple of weeks ago I was bemoaning the lack of a C/C++ analog to Perl’s require statement. Today, I realized we’ve already solved this problem. Just google for __GNUC__ or _MSC_VER to see what I mean.
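A sketch of what that de facto “require” looks like in practice — gating on compiler-version macros. The thresholds below are arbitrary examples, not real minimums:

```cpp
// The closest thing C/C++ has to Perl's "require 5.8":
// refuse to build on compilers older than some known-good version.
#if defined( __GNUC__ ) && ( __GNUC__ >= 4 )
   #define COMPILER_OK 1
#elif defined( _MSC_VER ) && ( _MSC_VER >= 1400 )   // VC++ 2005 or later
   #define COMPILER_OK 1
#else
   #error "need gcc 4+ or VC++ 2005+"
#endif
```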

Let me back up for a second. Do you have copies of the C and C++ language specifications? Which versions? More to the point, I’d bet $10 you don’t really understand them (I sure as heck don’t).

But it’s OK — it doesn’t matter. Despite the noble efforts of our beloved standards body, we are only truly beholden to the de facto standard that is our compiler and library implementation.

Simply moving from the ACME C/C++ 8.0 to ACME C/C++ 8.1 may well require so many source changes as to be considered porting. And don’t feel safe just because you compiled and linked! We’ve got to test the whole shebang all over again. The newsgroups are rife with tiny snippets of code which are enough to invoke the specter of “undefined behavior.” Not to mention the fact that compilers have bugs too. Yes, even your code could be the next addition to ACME’s compiler QA suite!

So why fret over backwards compatibility? People with existing source code have existing compilers and libraries. They can just continue using them. Case in point:


   mark@turing ~ $ file `which gcc`
   /usr/bin/gcc: symbolic link to `gcc-4.0'
   mark@turing ~ $ ls /usr/bin/gcc-*
   /usr/bin/gcc-2.95
   /usr/bin/gcc-3.2
   /usr/bin/gcc-3.3
   /usr/bin/gcc-4.0

Maybe we can just forget this whole versioning thing and get on with improving C++.

Memoize.cpp

Here’s my attempt at writing a higher-order function in C++ which allows one to memoize a function. It’s not complete: it doesn’t work for functions with signatures much more complicated than int fib(int).

Perhaps most disappointing is that there appears to be no way to generically memoize the recursive calls, the way Memoize.pm can.


#include <boost/function.hpp>
#include <map>
#include <iostream>

using namespace std;

template< typename Signature >
struct memoized_function;

template< typename R, typename A >
class memoized_function< R (A) > : public boost::function< R (A) >
{
   private:
   mutable std::map< A, R > m;
   typedef boost::function< R (A) > super;

   public:
   template< typename T >
   memoized_function( const T& x ) : super(x) {}

   R operator()( const A& a ) const
   {
      if( m.find(a) != m.end() )
         return m[a];

      m[a] = super::operator()(a);
      return m[a];
   }
};

template< typename R, typename A >
memoized_function< R (A) > memoize( R (*f)(A) )
{
   return memoized_function< R (A) >(f);
}

int fib( int n )
{
   cout << "called fib(" << n << ")\n";

   if( n < 2 )
      return n;

   return fib(n-1) + fib(n-2);
}

int main()
{
   cout << fib(5) << endl << endl;

   typedef boost::function< int (int) > func;
   func fastfib = memoize( fib );
   cout << fastfib(5) << endl;
   cout << fastfib(5) << endl;  // just a look up
}

Memoize.pm

Wow, Perl’s Memoize.pm is cool. Try this:


   use Memoize;
   sub fib {
      my $n = shift;
      print "called fib($n)\n";
      return $n if $n < 2;
      fib($n-1) + fib($n-2);
   }
   print fib(5) . "\n\n";
   memoize( 'fib' );
   print fib(5) . "\n\n";

It does two different things:

  1. It wraps the function with a caching version.
  2. It replaces all calls to the function with calls to the memoized version. This includes the recursive calls inside the function itself.

It does all of this without requiring any source-level changes to the memoized function. This makes it easy to accelerate any referentially transparent function. Just don’t do something dumb like memoize print() or time().

With C++ I can mimic #1, but not #2. It’s a shame, because #2 is the cool part. With my C++ implementation, I only speed up identical repeated calls:


   fib(5);  // slow: makes 14, mostly-redundant, recursive calls
   fib(5);  // fast: purely a look up
   fib(4);  // slow
   fib(6);  // slow

Perl does much better:


   fib(5);  # fast: makes 5 recursive calls
   fib(5);  # fast: purely a look up
   fib(4);  # fast: purely a look up
   fib(6);  # fast: picks up where fib(5) left off

I’m somewhat bugged that I couldn’t duplicate this behavior with C++. Perl needn’t reach into its “dynamic language” bag of tricks to achieve this. It just provides access to the symbol table:


   sub a() { print "a\n"; }
   sub b() { print "b\n"; }
   *a = \&b;
   a();     # prints b