C/C++ ABI Compatibility: Single-Element Structs?

I have the concept that natives return something called a Bounce.

A Bounce is a superset of a Value*, and in C it's just a const void* (because there aren't any other polymorphic base classes that don't violate strict aliasing rules).

So in C:

typedef const void* RebolBounce;

That's dangerous in the type system, because you can return a pointer to any type and it will compile, when only a certain set of things that the system can "sniff" and discern are actually legal.

So the C++ build is more clever. It makes Bounce into a structure with specialized construction:

struct Bounce {
    const void* p;  // the single element in the struct

    Bounce (Value* v) { ... }  // final result from an API value
    Bounce (Level* L) { ... }  // continuation of a level
    Bounce (const char* cp) { ... }  // delegation to scanned source code
    Bounce (nullptr_t) { ... }  // return a ~null~ antiform
    ...  // and so on
};
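As a rough sketch of what the constructors buy you, here is a compilable version with the elided bodies filled in by an assumption: each one simply stores the pointer (the real bodies presumably do more, e.g. for the ~null~ case). Value and Level are opaque stand-ins. The static_asserts show that the legal sources are accepted and an arbitrary pointer type is rejected at compile time:

```cpp
#include <cstddef>      // std::nullptr_t
#include <type_traits>  // std::is_constructible

struct Value;  // opaque stand-ins for the real types (assumption)
struct Level;

struct Bounce {
    const void* p;  // the single element in the struct

    // Assumption: each constructor just stores the pointer it was given.
    Bounce (Value* v) : p (v) {}
    Bounce (Level* L) : p (L) {}
    Bounce (const char* cp) : p (cp) {}
    Bounce (std::nullptr_t) : p (nullptr) {}
};

static_assert(std::is_constructible<Bounce, Value*>::value, "Value* is legal");
static_assert(std::is_constructible<Bounce, const char*>::value, "text is legal");
static_assert(!std::is_constructible<Bounce, int*>::value, "int* is rejected");
```

So a native that tries to `return &some_int;` would fail to compile in this build, where the typedef version would accept it.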

Unfortunately, these two forms are not "ABI-compatible".

What that means is that if you compile the core with the struct definition of Bounce and an extension with the typedef... or if you compile the core with the typedef and the extension with the struct... these won't work together.

That has caused a rift in the definitions--where the "external" part of the system uses different expectations (the typedef) than what's internal to the core (the struct definition if it's a C++ build and it's a debug build).
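Why don't they match, when both are one pointer wide? Layout isn't the issue; the calling convention is. As far as I know, 32-bit x86 System V returns every struct through a hidden pointer, and the x64 MSVC convention only returns a user-defined type in RAX if it has no user-defined constructors. Here's a sketch of checks that all pass while still not guaranteeing a matching return convention (PointerBounce, StructBounce and to_pointer are hypothetical names for illustration):

```cpp
#include <type_traits>

typedef const void* PointerBounce;  // stand-in for the C typedef

struct StructBounce {  // stand-in for the C++ struct
    const void* p;
    StructBounce (const char* cp) : p (cp) {}
};

// Same size and alignment as the raw pointer, and memcpy-able...
static_assert(sizeof(StructBounce) == sizeof(PointerBounce), "same size");
static_assert(alignof(StructBounce) == alignof(PointerBounce), "same alignment");
static_assert(std::is_trivially_copyable<StructBounce>::value, "trivially copyable");
static_assert(std::is_standard_layout<StructBounce>::value, "standard layout");

// ...but none of that says both come back in the same register (or in a
// register at all). Reading the member out is the only portable bridge.
inline PointerBounce to_pointer(StructBounce b) { return b.p; }
```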

There's No Way Around This, Unless...

The only real way to work around this would be to say that Bounce is always a single-element struct, and if you want your code to compile as C, your natives have to make a struct manually.

Returning Text Strings Would Become rebDelegate()

Instead of writing:

rebElide("append block data");
...
return "first block";

You could write:

rebElide("append block data");
...
return (Bounce){"first block"};

But that's awkward. Returning a bare string was always a shortcut for rebDelegate() passed a single string (rebDelegate() is variadic and can do composition, so it's much more powerful):

rebElide("append block data");
...
return rebDelegate("first block");

And if you were committed to using C++, you could bypass this and just use the string as before.

Returning Values Could Be rebOut(v) Or Similar

So instead of writing:

 Value* v = rebValue(...);
 ...
 return v;

You could again write the painful:

 Value* v = rebValue(...);
 ...
 return (Bounce){v};

Or use something that does that for you, like rebOut():

 Value* v = rebValue(...);
 ...
 return rebOut(v);

It could also be called something like rebResult(v), or rebReturn(v) with the shorthand rebRet(v), etc. (What worries me a bit about "rebReturn(v)" is that it suggests a hooked RETURN would be executed, like a synonym for rebDelegate("return @", v);)

Once again, if you committed to a C++ build for your extension you could just say return v; as usual.
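For what it's worth, the rebOut() idea could be sketched like this, written in the common subset so it compiles as both C and C++. This is only an illustration of the "always a struct" scenario: the plain-struct Bounce, the opaque Value stand-in, and the body of rebOut() are all assumptions, since no such function exists yet:

```cpp
typedef struct ValueStruct Value;  // opaque stand-in (assumption)

typedef struct {
    const void* p;  // always a struct here, in both C and C++
} Bounce;

// Hypothetical helper: wraps an API value so C code doesn't have to write
// the (Bounce){v} compound literal by hand.
static inline Bounce rebOut(Value* v) {
    Bounce b;
    b.p = v;
    return b;
}
```

It's a one-liner either way, so the cost is all at the call sites: every native has to name the wrapper.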

Worth It (Or Does It Lose The "Magic"?)

I'm torn. It really would make the mechanics better if the Bounce was always a struct. :face_with_diagonal_mouth:

I kind of hate the idea of writing return rebOut(v);. The effort expended to slice up the bit-space to differentiate between legal leading bytes for UTF-8 strings and the various internal entities feels like it's going to waste somewhat if you have to use a function that produces what looks like a different "type".

But I think I just can't sacrifice the cool C trick.

There's also another concern to weigh:

There are generic macros in my "fake Rust" library that require return results to be able to be constructed from the 0 integer literal in C...and this simply can't be done with structs.
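To illustrate the kind of macro I mean, here's a made-up one (RETURN_ZERO_IF and both functions are hypothetical, not the library's actual macros). It bails out of the enclosing function with the literal 0, which works for any pointer or arithmetic return type:

```cpp
typedef const void* RebolBounce;

// Hypothetical generic macro: returns the literal 0 from whatever function
// it is used in, regardless of that function's return type.
#define RETURN_ZERO_IF(cond) \
    do { if (cond) return 0; } while (0)

RebolBounce pointer_native(int fail) {
    RETURN_ZERO_IF(fail);  // here 0 becomes a null pointer
    return "first block";
}

int integer_helper(int fail) {
    RETURN_ZERO_IF(fail);  // here 0 is just an int
    return 1;
}

// In C, a function returning `struct { const void* p; }` could not use the
// macro, because `return 0;` does not compile for a struct return type.
```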

So the headers have to diverge in the definitions. Functions like rebDelegate() must return a const void* in the C build and a RebolBounce struct in the C++ build.

These mechanisms already exist and are working (the definitions for API function wrappers are different in C and C++ in the header already.)

I Think I Get What Needs To Be Done

To accomplish ABI compatibility the natives would do something like:

#if defined(__cplusplus)

    #define DECLARE_NATIVE(name)                                      \
        static RebolBounce cpp_impl_##name(RebolContext* binding);   \
        extern "C" const void* native_##name(RebolContext* binding) { \
            return cpp_impl_##name(binding).p;                        \
        }                                                             \
        static RebolBounce cpp_impl_##name(RebolContext* binding)

#else

    #define DECLARE_NATIVE(name) \
        const void* native_##name(RebolContext* binding)

#endif

Anyway, it's really just another epicycle of things I've already looked at.

TL;DR: The Nifty "Magic" C Syntax Stays As-Is

It's just not very type-safe unless you also build it as C++ (at least sometimes, e.g. in your CI, to check it).

There's a bit more #ifdef'ing that I need to add to the headers, making the C++ and C builds slightly more different than they already are.
