boring c++ porn 

After this discussion with my arch-nemesis form functional.cafe, where I made some bold claims about what's possible in C, I went down the rabbit hole of tail/sibling calls.

Found a few discussions in compiler mailing lists, where the consensus seems to be that they are happy to add something explicit in the backend but not is C frontend, since it makes little sense there, so I decided to see how far can I push my idea and if it can be viable.

TLDR of my idea, if you want to jump into a procedure/function without storing the return address in the stack, just call it and if there is nothing to return to, nobody will be returning. The challenge is to make sure of this of course, without explicit syntax.

In C it isn't too difficult, as long as you stick to other well know functional programming good practices, (minimal mutability, maximal purity), you should be good to go. However it's otherwise a poor choice since it lacks necessary abstraction power to make this kind of technique viable. For example, to do anything remotely polymorphic, you would usually turn to type erasure and manual rtti, which will inevitably defeat the compiler.

C++ on the other hand seems more promising, there is a lot more you can do there at compile time, however ensuring that there is "nothing to return to" and the optimization will stick is harder. It's likely to be very unstable and finicky way to write code, but I find myself in pursuit of it anyway for some reason.

First thing I tried is to beef up my previous example with everyone's favourite these days - sum types (std::variant). Absolutely destroyed the optimization, calling into some random address, that wasn't even in code/text segment, labelled "...vtable...". The culprit was std::visit, which turns out does some form of type matching on steroids, to match/visit multiple variants at once, passing them to multi-parameter(what's the word I'm looking for?) function, I guess to be able to advertise it as a full-blown replacement for virtual member functions. Obviously the compiler would have a hard time seeing through all of that across translation units, so I had to write my own visit_first that would just match/visit one and forward the rest. That's the ugliest part of it all, but hey! it can go in the library (you are not a C++ programmer if you don't have a library :v).

Here is what I got so far:
git.sr.ht/~namark/funky/tree/5

I'm now tempted to write something a bit more realistic utilizing this technique, maybe a small game, so to be continued perhaps...

namark  

re: true co-routines in C 

boring c++ porn 

of course gcc 10.2 doesn't optimize my visit first anymore... it does if I inline it manually though... what is this regression? T_T

boring c++ porn 

seems like forwarding in conjunction with std::get is what is confusing it... without forwarding it works fine... now to isolate and figure out whether it's a bug or a feature -_-

Follow

boring c++ porn 

While I could maybe chase after this to a potential bug report, I think it wouldn't be taken seriously, especially if I'm pressed on a use case, which I myself consider dubious. It's also somewhat indicative that maybe my approach was not necessarily the best for this kind of an experiment. I went with forwarding, in my implementation of visit_first, since that's the standard for any kind of transparent wrapper function that's supposed to entirely disappear in most cases. Being a c++ standard however it has to handle references well, that can be used as mutable out parameters sometimes (the heresy!) and that is likely the crux of the problem. References to local variables make the job of the compiler much harder, so we're going to go with another common pattern, explicit value and move semantics - pass everything by value, use std::move to avoid copies. Getting rid of mutable references simplifies things quite a bit, but in general has a nuanced cost of the move constructor (though I might not be able to use any moveable objects either, due to non-trivial destructors, and have to stick with trivial value types).

This puts me back on track with gcc-10.2 now properly optimizing all the calls. Unfortunately this also means that visit_first is no longer a generic library function, it will have to stay in this project forever, and no longer living up to its current name, instead become funky_visit:
git.sr.ht/~namark/funky/commit

Also seems like clang doesn't do tail call optimization well unless you change the calling conventions (whatever that entails), so it's a win-win!

Sign in to participate in the conversation
Qoto Mastodon

QOTO: Question Others to Teach Ourselves
An inclusive, Academic Freedom, instance
All cultures welcome.
Hate speech and harassment strictly forbidden.