#include “irony”

The other day, I had this great (read: stupid) idea: I wanted to write a Perl script to process a .c file, find all the #include directives, and replace them with the included file’s contents. I wanted to do this recursively, such that the resulting .i file would have zero #include lines.

Notice that this is not the same as just running the C preprocessor on the file. That would have processed lines like #ifdef _AMD64_ and resulted in a non-portable .i file. All I wanted to do was “flatten” a complicated program down to one (still-portable, possibly enormous) file.

Please take my word for the fact that I have a totally reasonable justification for wanting to do this. Honestly. I could convince you, but it would take another page of text.

I wrote the Perl program in about 20 minutes, and turned it loose on my .c file.

A few minutes and 26 megabytes of .i file later, I decided the program would never terminate. That’s about the time it hit me: include guards.

In CS-nerd speak, all the files in a C program form a directed graph where the edges represent inclusion. This graph can contain cycles. That means that A.h can include B.h, which can include A.h, and so forth, until your brain explodes. You typically fix this problem with so-called include guards:


#ifndef __FOO_H
#define __FOO_H
//... text of foo.h goes here ...
#endif

Beginner C/C++ programmers often learn about include guards the hard way. For me, they are reflexive — which is why I totally forgot about them.

So here’s the irony: I want only to process #include directives, and not #define directives, but I can’t successfully do the former without doing the latter too.
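To make the failure concrete, here is a minimal sketch in Python (not the original Perl script, which isn't reproduced here). It expands only quoted #include lines and passes every other directive through untouched, so the guards never get a chance to stop anything. The names flatten and FILES, and the idea of keeping file contents in an in-memory dict instead of on disk, are assumptions made purely for illustration. The one liberty it takes is tracking the chain of files currently being expanded, so that it reports the cycle instead of filling the disk.

```python
import re

# Matches only quoted includes such as: #include "foo.h"
# (angle-bracket includes are left alone in this sketch)
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*"([^"]+)"')


def flatten(name, files, stack=()):
    """Recursively expand quoted #include lines; ignore all other directives.

    `files` is a hypothetical in-memory mapping of file name -> file text.
    `stack` holds the files currently being expanded, outermost first.
    """
    if name in stack:
        # A real preprocessor survives this because the guard's #ifndef is
        # false the second time around. This sketch never evaluates #ifndef,
        # so all it can do is give up.
        raise RecursionError("include cycle: " + " -> ".join(stack + (name,)))
    out = []
    for line in files[name].splitlines():
        m = INCLUDE_RE.match(line)
        if m and m.group(1) in files:
            out.extend(flatten(m.group(1), files, stack + (name,)))
        else:
            out.append(line)  # guards, #define, code: copied verbatim
    return out


# Two guarded headers that include each other, as in the A.h / B.h case.
FILES = {
    "a.h": '#ifndef A_H\n#define A_H\n#include "b.h"\n#endif\n',
    "b.h": '#ifndef B_H\n#define B_H\n#include "a.h"\n#endif\n',
    "main.c": '#include "a.h"\nint main(void) { return 0; }\n',
}

try:
    flatten("main.c", FILES)
except RecursionError as err:
    message = str(err)  # names the chain of files that forms the cycle
```

Without the stack check, the same recursion would simply keep going, which is presumably what produced my 26 megabytes of output.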


5 Responses to “#include “irony””


  1. David Bremner, October 26, 2005 at 1:09 pm

    Mark,

    have you tried using gcc -save-temps?

  2. Mark, October 27, 2005 at 2:16 pm

    David,

    No, I haven’t, but mostly because I need to #include "windows.h", etc., and I have no idea how to use MinGW or Cygwin to do this with gcc on Windows.

    From reading the man page, it sounds rather similar to VisualC’s -P switch. In other words, it gives me the preprocessed output file. What I wanted, but later discovered was impossible, was to only process #include directives, and leave all the #ifdef stuff alone.

    -Mark

  3. molo, October 28, 2005 at 1:49 pm

    This can’t be that hard. For each #include, get the inode. If the inode is in an existing hash, replace the #include by a blank line. If the inode is not in the hash, this is the first time you’re seeing it and can safely #include it.

    -molo

    PS: Had problems with spam bots posting here? What’s with the captcha?

  4. Mark, October 28, 2005 at 6:06 pm

    Chris,

    No, it’s impossible. What you suggest assumes that nobody will ever purposely #include something twice. However, this is not true. For example, consider this foo.h:

    #ifdef MAKE_FOO_BE_BAR
    # undef MAKE_FOO_BE_BAR
    # define foo bar
    #else
    # undef foo
    # define MAKE_FOO_BE_BAR
    #endif

    Now imagine a foo.c that includes this multiple times:

    #include <stdio.h>
    void foo() { printf( "foo\n" ); }
    void bar() { printf( "bar\n" ); }
    int main()
    {
    #include "foo.h"
    foo();
    #include "foo.h"
    foo();
    #include "foo.h"
    foo();
    }

    This prints:
    foo
    bar
    foo

    Admittedly, this is a contrived example. But I’ve heard of at least one assert.h that used a similar trick, so that including it multiple times toggled assertions on and off. (My system’s assert.h does *not* exhibit this behavior.)

    -Mark

  5. Mark, October 28, 2005 at 6:07 pm

    Oh, and yes, I was getting a lot of comment spam, so I added the captcha.
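
The exchange above about expanding each file only once can be sketched roughly as follows, in Python. This is a hedged illustration, not anyone's actual script: flatten_once and FILES are made-up names, file contents live in an in-memory dict, and the file name stands in for the inode that the comment proposes as the identity key. It is run here on the foo.h and foo.c example from the thread (with stdio.h and the function bodies left out for brevity).

```python
import re

# Matches only quoted includes such as: #include "foo.h"
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*"([^"]+)"')


def flatten_once(name, files, seen=None):
    """Expand each quoted #include the first time it is seen; blank out repeats.

    `files` is a hypothetical in-memory mapping of file name -> file text.
    The name plays the role of the inode in the comment's proposal.
    """
    if seen is None:
        seen = set()
    seen.add(name)
    out = []
    for line in files[name].splitlines():
        m = INCLUDE_RE.match(line)
        if m is None or m.group(1) not in files:
            out.append(line)   # not an include we can resolve: keep as-is
        elif m.group(1) in seen:
            out.append("")     # already expanded once: replace with a blank line
        else:
            out.extend(flatten_once(m.group(1), files, seen))
    return out


FILES = {
    "foo.h": (
        "#ifdef MAKE_FOO_BE_BAR\n"
        "# undef MAKE_FOO_BE_BAR\n"
        "# define foo bar\n"
        "#else\n"
        "# undef foo\n"
        "# define MAKE_FOO_BE_BAR\n"
        "#endif"
    ),
    "foo.c": (
        '#include "foo.h"\nfoo();\n'
        '#include "foo.h"\nfoo();\n'
        '#include "foo.h"\nfoo();'
    ),
}

flat = flatten_once("foo.c", FILES)
# foo.h's text appears once; the second and third includes became blank lines.
```

This version does terminate on include cycles, which is the attraction. The catch is the one described in comment 4: after flattening, foo.h takes effect only once, so MAKE_FOO_BE_BAR is defined a single time and foo is never redefined to bar. All three calls would go to foo(), whereas the original program alternates between foo and bar.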





