Recursive macros in C, demystified (once the ugly crying stops 😭)
In which it becomes clear, the C Preprocessor was designed by a Kafka fan
So you have heard rumors whispered between peers, that a rare few people somehow manage to make compile-time recursion work in C? And you want to have some insight into how that might be possible??
I should warn you, you’re risking your sanity… but I’ll indulge you.
Wait, did I really just say that? I must be a glutton for punishment, because the macro system is, by far, the thing I like least about C.
C has many advantages that have led to its longevity (60 years as perhaps the most important language). It has many quirks due to its age, most of which are easy to look past. But despite 30 years writing C, I still bristle at C’s macro system.
That’s not just because there are many languages (including C++) with more modern takes on compile-time execution. Macros appear simple, but have subtleties that make them poorly suited for anything other than light wrappers.
Still, being C’s only compile-time execution capability (currently), it is still both critical and important. Critical, in that many venerable critical systems heavily depend on them, and wouldn’t compile without them. Important, in that it’s often the only way to abstract out complexity that would lead to safety or security issues if exposed, such as automatically adding sentinels or static type checks.
C macros being hard to use does discourage their overuse. It’s easy for too much abstraction to make it too difficult for other people to maintain the code, so in some ways, as painful as they are, I can find some things to appreciate, and do occasionally find reason to use them for something non-trivial.
But it doesn’t take much for a macro to be non-trivial, because, while C macros can look like functions, they cannot be called recursively (at least, not easily, as we will see).
I have never been able to find out why recursion is limited in C macros. It could have started off intentional, but compile time execution wasn’t really on people’s minds then; the challenge was abstracting over many platform differences as cheaply as possible.
I suspect the system evolved as needed in the early days, without really thinking about it as something that perhaps should support recursion. Certainly at some point, the question would get raised; but it’s easy to imagine:
- The organic evolution of the macro system coupled with early success made it brittle, and hard to evolve.
- People were worried about build issues like hanging compile times due to infinite loops, or crashing with no diagnostics, due to infinite recursion.
Depending on which of these two was more prominent, I could equally imagine the lack of recursion being an accident, or being an intentional choice.
However it happened, that die was cast in a completely different era—people have definitely woken up to the value of pushing as much into compile time as possible.
Either way, it all seems archaic and unnecessary now.
So, let’s roll up our sleeves and learn to cope with the issue!
Motivation
If you have anything interesting you need done with the preprocessor, you probably need to generalize over multiple items.
Maybe you want to add automatic sanity checks around parameters, or add automatic type checking. You might want to pre-fill arrays, or generate a list of functions based on some data. Generally, we should be able to do such things at compile time, and in cases like type checking, it is often incredibly challenging to defer the work till runtime. So the lack of a good compile time solution for such things is problematic.
I tend to reach for macros when they can remove the potential for human error; for example, calling an API wrong. That can indeed be adding automatic casts, enforcing that null terminators get added on variable argument arrays, etc.
In all of those scenarios, we would need something to move macros toward Turing completeness at compile time. But macros do not advertise support for either of the things we’d be looking for there: iteration, and recursion.
Thankfully, we can get there. But it’s not going to be easy.
To frame our discussion, let’s pick a simple, but highly valuable goal. We’re going to build a macro that counts the number of variable arguments in a function-like macro that accepts variable arguments.
Why that problem? Because from there it’s a short jump to dealing with some common issues, where we can remove large sources of human error:
- Variadic functions (aka varargs) are error prone, because the implementation has to figure out where the arguments stop; the language doesn’t give you a way to know. The caller has to remember to follow your convention (like adding a null terminator). And that convention can often back fire, for example, when a null value is a valid argument . Being able to count the number of variadic arguments allows us to provide a single, general approach to dealing with this, and to not put the burden on the caller to have to count correctly.
- There are plenty of cases where we’d want to apply a transformation to each argument of a variable argument function, like statically checking that parameters are all the same type, or automatically adding a layer of sanity checking to a third party API, so that the people calling your function can remain blissfully unaware. These don’t directly require counting, but once we can count recursively, it’s a small change to give ourselves a more general purpose
mapconstruct to make such transformations.
You might be surprised that the language doesn’t provide a way to count variable arguments in a macro, especially if you noticed the recent addition of a pre-defined macro called __COUNTER__ in the draft C2Y standard. The new __COUNTER__ macro is intended to make it easier to provide uniqueness for situations where macros need to generate identifiers and labels; it definitely isn’t for counting variadic macro arguments.
Apparently math is hard?
If there’s no primitive to count variadic macro arguments, well, we need to create one, right? And if we have to create one, that means it should be pretty easy, one would hope?
🦗🦗🦗
I see you’re skeptical. But an optimist who wanted to delve into compile-time coding in C for the first time might give it a go. But they will quickly find that the obvious approach below, that feels like it should work, absolutely does not work:
// The first macro... counts one argument, then we'd like it to recurse.
#define _COUNT_ONE(x, ...) + 1 _COUNT_TOP(__VA_ARGS__)
#define _COUNT_TOP(...) __VA_OPT__(_COUNT_ONE(__VA_ARGS__))
#define COUNT(...) (_COUNT_TOP(__VA_ARGS__) + 0)
Try to call this code, and…
🤮
Yup, it’ll barf.
What this doe-eyed attempt is trying to do, is generate an expression (at compile time) of the form (+ 1 + 1 + … + 0), by iterating over each argument at a call site (via recursion). It doesn’t care what the arguments are, it just wants to generate a + 1 for each valid argument, and a + 0 at the end, both to make it a valid expression and to handle the case where there are no arguments passed.
If we’re successful, the addition will all happen at compile time, and will be what C calls an integer constant expression. That means, the compiler will, at compile time, fold this into a single static integer. So, even if it feels inefficient, there is no run-time cost involved.
Why three macros? That seems a bit excessive, right?
Unfortunately, C currently doesn’t have an easy way to do the equivalent of an if() statement at compile time. It’s possible to create something with macros, but that’s just extra hackery.
Instead, we split the primary body into its own macro— _COUNT_ONE(), which adds the 1 + and then triggers recursion (we wish, anyway; again, the recursion part won’t work this easily).
The intent the behind _COUNT_TOP() macro is evaluating an exit condition for our recursion. Specifically, we want to stop when we have no more arguments left in the function. The builtin macro __VA_OPT__() allows us to do exactly that— the text inside the parentheses gets expanded only if there are arguments. And when there are no arguments, the text inside the parentheses is discarded.
This gives us a lightweight way to separate the one argument case from the two argument case, without need for a kind of if statement; the 0-case won’t generate a recursive call. To combine this with _COUNT_ONE(), we’d need a primitive that allowed us to specify a replacement that only expands when there are no arguments, which doesn’t come out of the box.
The outer COUNT() macro could be factored out trivially; I leave it because it keeps what’s going on a bit clearer. This macro is the actual entry point, adds the ‘0’ a single time the end, and wraps the whole thing in parentheses, which helps avoid running afoul of operator precedence rules.
If we directly combine it with _COUNT_TOP() in the obvious way, it would compute the right thing, but it would do it by adding the + 0 after EVERY term, and parenthesizing the expression in a way that would come off as odd if you were looking at the resulting C code. For instance, we’d be aiming for a three argument function to generate:
( + 1 ( + 1 ( + 1 ( + 0 ) + 0 ) + 0 ) + 0 )
If we were to generate this code, it’d be ugly, but most people would declare success and move on. But unfortunately, the above code does not work, and will never work, no matter how many iterations of the standard are released between now and the heat death of the universe— changing the behavior would be too likely to impact plenty of real code.
For example, let’s attempt to use the above implementation like so:
#include <stdio.h>
int
main()
{
printf("COUNT() = %d\n", COUNT(1, 2, 3));
return 0;
}
With clang, I get:
tmp.c:8:30: error: expected ')'
8 | printf("COUNT() = %d\n", COUNT(1, 2, 3));
| ^
tmp.c:4:28: note: expanded from macro 'COUNT'
4 | #define COUNT(...) (_COUNT_TOP(__VA_ARGS__) + 0)
| ^
tmp.c:3:39: note: expanded from macro '_COUNT_TOP'
3 | #define _COUNT_TOP(...) __VA_OPT__(_COUNT_ONE(__VA_ARGS__))
| ^
tmp.c:2:32: note: expanded from macro '_COUNT_ONE'
2 | #define _COUNT_ONE(x, ...) + 1 _COUNT_TOP(__VA_ARGS__)
| ^
tmp.c:39:30: note: to match this '('
tmp.c:4:27: note: expanded from macro 'COUNT'
4 | #define COUNT(...) (_COUNT_TOP(__VA_ARGS__) + 0)
Wow, that’s a lot of error messages, saying little that makes sense.
I’m sure you can already get the feeling that, when you have a problem with your macros, it’s incredibly challenging to translate the resulting errors into what’s actually wrong. Here, it’s complaining about balancing parentheses, and with just a bit more complexity in our macros, it’d be incredibly easy for someone to spend 10 minutes trying to figure out where the parenthesis is missing, when no parenthesis is missing whatsoever.
Or, we could have taken advantage of the fact that C is perfectly happy to accept + + 0, and write 1 + instead of + 1.
Making that microscopic change, merely transposing two tokens, completely changes the error clang produces:
tmp.c:8:30: error: call to undeclared function '_COUNT_TOP'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
8 | printf("COUNT() = %d\n", COUNT(1, 2, 3));
Hey, at least that message is concise. Never mind that it’s totally different, and also unrelated to the real issue.
Not counting on it
The C Preprocessor (which I will usually call CPP) is responsible for macro expansion and processing lines with a leading #. As we’ve said, it does not fully support recursion. As you might expect, that’s the core of the actual problem in our first attempt. Yet, the preprocessor happily thinks it did its job. We’ll see in more detail what’s going on, but the crux of this particular problem is how recursion is disallowed, not that it IS disallowed.
The problem here is that C macros are their own programming language, being used to generate C code. The macro language doesn’t model most of the interesting parts of the language, and it is quite easy to produce code that the preprocessor finds acceptable, that the compiler cannot understand (as we will see).
In both these cases, the preprocessor feels like it’s done its job, and passes off its work to the C compiler. The C compiler gets the generated code, and has no idea that macros were used. It calls the error as it sees it.
This disconnect between the preprocessor and the compiler is one of the things that makes macros in C so unfriendly.
If a macro expansion (basically the same as an ‘evaluation’) is recursive, the CPP decides “they can’t possibly have wanted recursion here, because that might loop forever, so this must be plain old text I have to substitute”. As a result, the emitted code will still contain the unexpanded macro.
How can we confirm this? If we can’t understand the resulting transformations, we’re going to end up stumbling around in the dark.
Many developers don’t know how to see what the C preprocessor actually produces. If the preprocessor successfully exits, we can see it’s output by stopping the compiler after the preprocessing phase, generally with the -E flag. If we don’t give a file name (via the -o flag), we should see the results on the terminal. And at least in the case of clang, we will even get output up to the point that we did something so wrong that the preprocessor gives us an error.
For the code above, running cc -E tmp.c works without errors, and dumps the output of CPP to my terminal.
That consists of a lot of stuff you might not expect to see. The output contains our code after the preprocessor has fully expanded it. But that full expansion includes the results of it preprocessing all of the header files we pulled in, which in our case was stdio.h, and any cascading dependencies it might have.
However, our code is easy to find in that noise. The last thing output will be our fully translated main() function, ready to be input into the C compiler. For the case where we add +1 at the beginning of the macro, we will see:
int
main()
{
printf("COUNT() = %d\n", (+1 _COUNT_TOP(2, 3) + 0));
return 0;
}
Here, the compiler doesn’t have to try to look up the symbol _COUNT_TOP; it knows that it doesn’t make sense to have a function call after a number with no operator in between.
When we reverse the + and the 1, the line of code is valid, as long as there’s a function C can resolve called _COUNT_TOP. Because there isn’t, the compiler bails.
That explains why we get two different errors, for such a minor change.
Because CPP and the compiler itself are oblivious to the execution of the other, and because we have to live with the fact that recursive macros aren’t errors to the CPP (silently passing them through unexpanded), it’s quite a bit of work for any compiler to even try to tell you that your problem here is attempting to use recursion in a macro. It could be done, and maybe it should be done, because otherwise, compilers are effectively trying to gaslight you into believing you have a syntax error of some sort.
Because the preprocessor is much more permissive than the compiler, without the compiler having any awareness of macros even existing is perhaps the most significant reason why writing non-trivial macros is so hard to do.
Again, we can hack our way around the recursion problem. The semantics are arcane and intricate enough that, even knowing the rules (and doing macro work with a copy of the standard at the ready), macro development is incredibly challenging the second you have any problem at all. Decades later, I often feel like I’m stumbling around in the dark when a macro I write blows up on me.
If this is all too intimidating, absolutely we can plagiarize our way to success. Though, personally, I really prefer not to cut-and-paste code from Stack Overflow, especially if it’s code I don’t understand. Similarly, while Claude and I are casual acquaintances, I do not tjava his code. It’s my unwillingness to using code I don’t understand in production that keeps me learning and growing. Instead, I avoid non-trivial macros, unless (as I said above), I make an exception when they will be a huge net positive for helping the developer, usually by removing potential failure modes, or with significant clarity improvements. Meaning, if you can use them to provide an abstraction that makes the code more robust, and is also not going to be hard to maintain if the need arises, then I’d consider it (even if you have to get Claude to write it). Here, automating size detection statically feels like a good enough use case for my tastes, because macros will not only make variadic functions easier to write, but also make it far easier to call them correctly.
So, I’d like to help those interested understand how to navigate through the pain, and shine a light on it, in the hopes that this is another area of the language that the modern standards committee can make massively better.
That doesn’t count
If we want to know how to circumvent the recursion restriction, we probably need to understand the detection mechanism we are attempting to evade.
It sure would be nice if we could debug by having our compiler give us intermediate expansions, up until the point that it breaks. This is not directly built into any compiler as far as I know. And my experience with macro debuggers has been that they have a hard time matching compiler semantics. I dusted one of them off when working on this article, and it was easy to get it to expand macros as valid that CPP barfed on, and vice versa. Despite the lack of tooling, I’ll walk through the expansion process in detail so we can all understand.
The rules for C macro evaluation are hard to explain in a way that’s simultaneously precise and clear. But for function-like macros, the main process of evaluating a macro boils down to:
- Replace the macro text. Stashing aside the arguments used to call the macro, we replace the full macro with the textual body, within the larger token stream we’re processing.
- Add placeholder tokens. Instances of ‘arguments’ in the body get replaced with placeholder tokens, to prevent them from being evaluated as macros in any nested argument expansion we might have. This includes
__VA_ARGS__and__VA_OPT__()invocations. - Evaluate preprocessor operators in the body that take operands , particularly
#and##. We don’t make good use of these operators in this article, so we won’t cover in too much depth. Note, however, that__VA_OPT__()is also a preprocessor operator that takes arguments. We inhibited expansion within the replacement text, when we were evaluating operators, but at the end of this phase, we put it back; it can get expanded in the next step. - Rescanning the body. The body is then scanned for more macros to expand, starting from our cursor in the token stream. The “input” head moves forward a token at a time, until we mind a macro to expand, or reach the end of the macro we’re evaluating. When we find a macro to expand, we recursively apply the algorithm.
There are some pretty large subtleties here. As long as a macro invocation starts in the scope of a rescan, the scanning head position can move past the end of the original macro. For example, consider this basic scenario:
#define CONCAT(X, Y) X ## Y
int PRINT_INT = 100;
int
main()
{
printf("%d\n", CONCAT(PRINT_, INT));
}
The ## operator, seen used in the CONCAT macro, appends two tokens together, turning them into a single preprocessor token.
When the preprocessor evaluates CONCAT(), it will result in the token PRINT_INT. If PRINT_INT were a macro, the preprocessor would do further expansion. But it is not, so the preprocessor outputs PRINT_INT. The C compiler does not complain, because it sees a variable named PRINT_INT.
So far, depending on your background, the semantics may or may not being intuitive. But either way, what do you think should happen in this slightly more complex scenario?
#define CONCAT(X, Y) X ## Y
#define PRINT_INT(N) printf("look an integer %d\n", (N));
int PRINT_INT = 100;
int
main()
{
printf("%d\n", CONCAT(PRINT_, INT));
CONCAT(PRINT_,INT)(100);
}
It would be reasonable to think there’s an error here, but this code will compile and run. Why? In both cases, the preprocessor will generate the token PRINT_INT. In the first case, everything will happen the same way it did in our first example, and the compiler will see the variable PRINT_INT.
But, with the second use of CONCAT, the preprocessor will see that there is a parenthesis immediately following the token PRINT_INT. Since it has a function-like macro with that name, it will prefer the macro interpretation.
That’s true, even though we didn’t directly write PRINT_INT in the code, there. The effective result of the preprocessor’s expansion would look like this:
#define CONCAT(X, Y) X ## Y
#define PRINT_INT(N) printf("look an integer %d\n", (N));
int PRINT_INT = 100;
int
main()
{
printf("%d\n", 100);
printf("look an integer %d\n", 100);
}
That’s because C’s macros come in two flavors, with slightly different semantics:
- Function-like macros, which take arguments, and syntactically LOOK like functions. As we see here, if the preprocessor sees a token with the same name as a function like macro, but it’s not used like a function like macro, it will pass it through, letting the C compiler resolve the token.
- Object-like macros, the definitions of which look like variables, and do not take arguments. If the C preprocessor sees a left parenthesis after an object-like macro, that token will just be passed through directly to the compiler.
That is why the following code does NOT error. In fact, it runs quite happily:
#define OBJECT_LIKE_MACRO printf
#include <stdio.h>
int main()
{
OBJECT_LIKE_MACRO("Hello, world!\n");
}
Meaning, if we we have an object-like macro and it looks like we’re trying to call it, the C preprocessor isn’t going to call it. It’s just going to pass the token through, and let the compiler figure it out.
As you can see, the preprocessor’s philosophy is to do its job, and nothing else.
Another preprocessor subtlety that’s easy to miss, yet important to understand is:
| ℹ️️ | Arguments to function-like macros are expanded at the call site. For expansions that trigger inside the body where those arguments are used, the contents of the expanded arguments will not be available to be part of any additional expansion. |
The preprocessor’s approach of substituting arguments with placeholder tokens during evaluation is pretty effective at stopping a bunch of accidental recursion that would be non-intuitive. Although, it’s not the problem for the recursion we’re trying to solve. The barrier we’re hitting is a subtlety that we haven’t discussed yet, but we’ll get to it soon enough.
Before that, let’s solidify our understanding by walking through the relevant steps with our invocation of COUNT(1) .