Tutorial: Demystifying GCC

This tutorial was supposed to be taught by Morgan Deters, Kenneth Zadeck, and Ron Cytron. For various reasons, the talk ended up being just Morgan. This was somewhat disappointing, because Ron and Ken are co-inventors of SSA.

One additional note: this was designed as a 6 hour talk, but was reduced to 3 hours without changing the scope of the talk. (Uh oh)

I sat near Kang Su Gatlin (a familiar face from Microsoft), as well as Arno Hasse from SE-Radio.net.

Morgan began by explaining what gcc is, and is not. It is a collection of compilers, for many languages. It is not an assembler, nor is it packaged with one (see binutils). Gcc produces assembly which must be consumed by an assembler to be of much use.

Morgan extolled the virtues of using gcc as a research compiler. He thinks the advantages outweigh the disadvantages. I find that this is an interesting contrast to Microsoft’s Phoenix project, which will probably have better internal design, but is also more closed.

Next we took a look at the high level structure of the compiler. The gcc program itself is, of course, just a driver (as are g++ and gcj). These drivers call the real compiler, assembler, linker, etc.

Inside the compiler, after parsing we have so-called generic trees for a while. (Eventually we end up with GIMPLE trees. More on this later.) There are a bunch of tree optimizations, many of which are language-neutral. Later, the expander is run which creates register transfer language (RTL). RTL is not language specific, but begins to be architecture-specific. Some further optimization is done on the RTL, then it gets further lowered, and ultimately instruction selection is done.

Next Morgan walked us thru a very helpful explanation of how to build the gcc source. The main requirement is another c compiler for bootstrapping.

As with many Unix apps, you start by kicking off a ./configure. This takes some important options, two of which Morgan highlighted.

  1. –enable-languages: takes a comma separated list of languages to build
  2. –enable-checking: adds extra internal consistency checks (a good idea if you plan to modify things)

In the same vein, Morgan suggested building gcc with -O0 -ggdb3 (no optimization, maximum symbols), to make debugging easier.

Gcc has a three-stage bootstrap as follows:

  1. Compiles a minimal version of itself, using host c compiler
  2. Builds a full gcc using the minimal version from step 1
  3. Rebuilds itself once more in typical compiler-sanity-check style

In practice, you won’t need step 1 if you are using a recent gcc host compiler to begin with. In fact, invoking make with no parameters will do just this. If you are bootstrapping with a non-gcc host compiler, it’s probably a good idea to invoke make bootstrap. Another possibility is to use make profiledboostrtap which apparently builds the compiler with dynamic, feedback-driven, optimizations.

Note: if you happen to be building a cross-compiler, Morgan highly recommends Dan Kegel’s Crosstool. (I got the impression that Morgan’s GCC expertise was born from his work compiling Java to embedded PPC with real-time GC constraints.)

Morgan suggested the following useful driver options for debugging. He also suggested looking at info gccint, which is home to lots of gcc internals documentation (although is slightly out of date).

  • -E: show preprocessed output
  • -S: spit out assembly
  • -H verbose header inclusion
  • -save-temps
  • -v: verbose (very useful)

When hacking gcc one of the first practical problems encountered is: how do I add a compiler switch to enable my stuff? Morgan explained the procedure. In the gcc dir, the options.opt text file defines the command line options for the driver. This is a simple text file, with sets of three-line stanzas for each switch. The first line is the flag name, and the third is a text description of the flat. The middle line indicates what global compiler variable is affected by the switch. It’s ugly but, hey, it works.

Additionally, Morgan showed us “spec” files, which is how gcc handles interactions between multiple command-line switches (for example, to complain when incompatible options are used simultaneously). Also, the spec file controls how the driver command line arguments are munged and passed on to the actual compilers.

The individual compilers have their own switches. In general, look for .opt files to see the options (there are different files for language-specific options)

After a quick coffee break, we began to delve into the front-end. Some (many?) gcc-supported languages have a bison-generated parser (see .y files). However, some may have a C parser that appears to have been generated once upon a time (C++?). The Java front end is in gcc/java, and its entry-point is java_parse_file.

Morgan informed us that all the front-ends are written in C, which is unfortunate since the tree-structures in the front end would be a good place to use object-oriented programming. We’d like to have a tree node base class with subclasses for expressions, etc. But, since we have only C, gcc employs discriminated unions (tree_node), so there is no real type safety (although the –enable-extra-checks thing apparently helps here).

In the source, tree is simply a typedefd pointer to a tree node. The tree.def file includes definitions for all the different kinds of trees which are supported.

As a final front-end note, Morgan pointed out some useful front-end functions:

  • build: makes a new tree node
  • fold: Simple restructuring and optimization
  • gcc_assert: What you expect.

Morgan said that the fold function, above, does some simple constant folding. I thought this was a little odd to do this in the front end. I asked if this applied to floating-point numbers (which would require a target-accurate floating-point simulator) and Morgan indicated that he believed that fold only did architecture-neutral stuff.

One very useful hint from Morgan: to add new language support to gcc, look at the “treelang” front-end. This is a simple tree-based language and it’s a good example / starting point.

After syntax checking, raising errors, etc, the front-end proceeds to gimplification. Technically gimplification happens between the front-end and the middle-end. (Yes, gcc has an oxymoronic middle-end). Apparently GIMPLE is based on SIMPLE. I’m not entirely sure, but I think this is where we go from a tree-based form into a linear form. See Morgan’s slide 53 for more details (link at the bottom of this article).

The middle end has tree optimizations, expansion into RTL, some RTL optimization passes, and register allocation. The RTL portions of the compiler are much older than the tree parts.

Morgan showed us one particularly important file, passes.c. This controls the ordering of passes over the tree representation.
I guess all this tree stuff is the “tree ssa” thing that is fairly new in gcc. See all the files with prefix “tree-ssa” for the juicy details. (Author’s Note: I’m a bit confused here, because I thought we were already out of tree-world and into linear-land. Maybe not.)

In the passes.c file, tree_opt_pass describes a particular pass, what it requires, destroys, produces, etc. To debug tree passes, see the -fdump command line arguments: -fdump-tree-(name of pass), or -fdump-tree-all. This will create lots of output files like foo.c.t87.dse2, where the “t number” is the sequential pass number. Output is done after the pass, so diff against the previous number to see changes.

At this point Morgan described SSA and referenced the original paper. I didn’t realize that this idea originated in functional languages, but apparently it did. There was a realization that some optimizations are easier on functional languages because variables are never mutated. (In other words, they are purely single-assignment.) The SSA idea was to generate so-called phi nodes, which indicate the merger of singly-assigned variables, but do not result in any code generation.

Boy do I wish one of the SSA originators was here for this part of the talk.

In gcc, the important SSA phases are: pass_build_ssa, and pass_del_ssa. According to Morgan, SSA-based optimizations are done between these two passes.

When looking at the source files in the gcc directory, there is a useful naming convention to be aware of. Files prefixed with “tree-” are all the optimizations which run on trees. If, however, the prefix is “tree-ssa-“, these are passes referenced above which require SSA to function.

With 10 minutes left, we got to back-end, RTL stuff.

Morgan explained how the RTL optimizations can be broken in half. Many run prior to register allocation on pseudo registers. Then RA runs. A handful run after register allocation.

The machine description facilitates the transition to RTL and eventually assembly. It has two parts. The first essentially says, please map this standard gcc name (ie, jump) to some RTL (set pc). The second maps the RTL to an assembly language like string (for the .s file, I guess).

As you can probably tell, there was way too much detail in this talk to be crammed into 3 hours. Still, Morgan did a good job and gave me a decent overview of how gcc works internally.

Link to Slides


%d bloggers like this: