Does your compiler handle indirect function call folding and inlining?

(Nhat Minh Lê (rz0) @ 2012-06-09 10:16:32)

Every once in a while, I ask myself this question: does my C1 compiler correctly optimize it? And today, it is indirect function call inlining. I usually poke around the output of various compilers I use, every few months or so, to see how their understanding of my code has evolved… but this time around I decided to write something a bit more structured so I could share it with whoever might be interested in the same data.

You can find the code and listings on my Git page and the HTML-formatted results on its own web page. But what is this all about?

1 This whole test was designed for C and C compilers. Though it is also probably relevant to C++ to the extent that most of the tested compilers also process C++, the test code and patterns are not what you’d typically find in C++. Hence, how meaningful this information may be to the average C++ programmer is questionable.

What is being tested

The tests exercise various scenarios involving inlining in the presence of indirect function calls through function pointers. Inlining can occur at different levels: for the outer function being directly called, and for the function arguments called from within the first inlined function. To make this completely clear, let’s look at an example:

static inline int foo(int (int));
static inline int f0(int);

static inline int
foo(int f(int))
{
    return f(42);
}

When foo(f0) is called somewhere, and if the compiler decides to inline it, then it also gets an opportunity to fold and inline f0 at its call site.

I won’t go into the details of what each test does; you should really read the description on the results page and then the code for yourself. Things to look out for:

  1. Inlining of direct calls (outer function and subsequent calls in the direct chain between call_* functions). This should be trivial.

  2. Folding of known function constants. This is where things are likely to go wrong. The tests exhibit various ways to pass the function pointer constants around: as arguments or in a constant structure (compound literal or const variable); with explicit typing or hidden behind a void pointer.

  3. Elimination of indirect calls (either into a direct call or inline call when possible), after the folding pass. This is not difficult per se once constant propagation has been done, but some compilers "forget" to do so.

One thing that has not been tested is indirect call chaining, as it is really not that common in C (maybe due to the cumbersome syntaxe associated).

Why it matters

A rather difficult question, really. Truth be told, it may not matter to you. The kind of scenario described here most prominently arises when inline functions are used for their generative properties, rather than simply to make some call faster.2

For example, you could imagine having two versions of qsort: the default one, which exists as an external definition, and a specializable one, which expands inline. Defining specialized versions of qsort would then be a matter of wrapping the inline call in a definition of your own.

Whether this is "good", "bad", "interesting", or downright "awful", is for you to decide. But if, like me, you think that this may have fun applications, then you should check first that whatever compilers you wish to build with do the job. And this is why I wrote this little test.

2 In a sense, as opposed to how we used to write macros to get "faster functions", we can now — with enough compiler support — write inline functions instead of macros to generate code. Inlining replaces macro expansion, and constant propagation does a similar job to argument substitution.

Conclusions

From the results, I think we can draw a few general conclusions:

Overall, if you want to cater to the widest range of (optimizing) compilers, keeping to immediate parameters is the safest choice. For modern C99 compilers only, I believe compound literals will become (if they’re not already) a viable alternative in the near future.