Intrinsics: Functions without Frames

Redbol's historical type system really had only one design point: be fast. There were 64 fundamental datatypes, and parameters of a function could either accept each datatype or not. So a simple bitset of 64 bits was stored alongside each parameter, and checked when the function was called. That was it.
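To make that tradeoff concrete, here is a minimal C sketch of that kind of scheme. The type indices and names here are invented for illustration (not Redbol's actual ones); the point is that each parameter check is a single bit test against a 64-bit typeset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sketch: each of the 64 fundamental datatypes gets a
   bit index, and each parameter stores a 64-bit mask of which
   datatypes it accepts.  Indices below are made up. */
enum { TYPE_INTEGER = 3, TYPE_WORD = 7, TYPE_BLOCK = 12 };

static bool param_accepts(uint64_t typeset, int type_index) {
    return (typeset >> type_index) & 1;  /* one bit test per call */
}
```

That's the whole check: no function calls, no lists to walk... but also no way to express a constraint finer than "one of these 64 kinds".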

Ren-C's richer design explodes the number of "types" in the system. Not only are there more fundamental types, but antiform isotopes like ~null~ are variations on WORD!. You don't want every function that takes a WORD! to accept nulls... and you don't want the type checking to be so broad as to take [antiform!] just because you want to accept nulls (because that would also let in splices, packs, etc.)

This isn't the only reason Redbol's type checking was too simplistic, but it's the one that forced my hand in coming up with some sort of answer. I couldn't think of any better idea than Lisp's, which does type checking via functions ("predicates"). So I rigged it up so that if you want to say a function can take an integer or null, you write [null? integer!]. You can freely mix LOGIC-returning functions with fundamental types, and we're no longer stuck with the 64 fundamental type limit.

Isn't It Slow To Call A List of Functions For Typechecking?

It can be. And in particular, it can be if you have to go through calling those functions twice.

Why twice? Because of "coercion". For example, if you pass a pack to a function that expects packs, you'll get the meta-pack:

>> foo: func [^x [pack?]] [probe x]

>> foo pack [1 "hi"]
~['1 '"hi"]~

But if your function didn't want packs, but wanted the type the pack decays to, it has to work for that as well:

>> bar: func [^x [integer?]] [probe x]

>> bar pack [1 "hi"]
'1 

Did the function want the meta form or the meta-decayed form? There's no way of knowing for sure in advance. The method chosen is to offer the meta form first, and if that doesn't match then the decayed form is offered.

The type checker doesn't know, before walking through the block of typechecking functions, that the pack isn't going to be accepted. So it has to go through the block once offering the pack, and then again offering the decayed integer.
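As a rough illustration of that two-pass offering (not Ren-C's actual code, and using a toy value representation where a "pack" just wraps an integer), a C sketch might look like:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy value representation for illustration (not Ren-C's cells):
   a value is either a plain integer or a pack whose first item
   is an integer. */
typedef enum { KIND_INTEGER, KIND_PACK } Kind;
typedef struct { Kind kind; int first; } Value;

typedef bool (*Predicate)(const Value *v);

static bool integer_q(const Value *v) { return v->kind == KIND_INTEGER; }
static bool pack_q(const Value *v) { return v->kind == KIND_PACK; }

/* Decay a pack to its first item (mimicking pack decay). */
static Value decay(const Value *v) {
    if (v->kind == KIND_PACK) {
        Value d = { KIND_INTEGER, v->first };
        return d;
    }
    return *v;
}

/* Two passes: offer the value as-is to every predicate first; only
   if none accepts it, decay it and walk the predicates again. */
static bool typecheck(const Value *v, Predicate *preds, size_t n) {
    for (size_t i = 0; i < n; ++i)
        if (preds[i](v))
            return true;  /* accepted undecayed (e.g. a pack) */
    Value d = decay(v);
    for (size_t i = 0; i < n; ++i)
        if (preds[i](&d))
            return true;  /* accepted only after decay */
    return false;
}
```

Each of those predicate calls is a full function invocation, which is exactly the cost the rest of the post is about reducing.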

But I Noticed Something About These Functions...

Typically these functions are very simple:

  • They take one argument.

  • They can't fail.

  • They don't require recursive invocations of the evaluator.

This led me to wonder how hard it would be to define a class of actions whose implementations were a simple C function with an input value and output value. If you weren't in a scenario where you needed a full FRAME!, you could reach into the ACTION's definition and grab the simple C function out of it. All these functions would use the same dispatcher--that would be a simple matter of proxying the first argument of a built frame to pass it to this C function.

I decided to call these "intrinsics", named after a trick compilers use: when they see calls to certain known functions, they implement those calls via direct code inlining. It's not a perfect analogy, but it's similar in spirit.

It Wasn't All That Hard To Implement (relatively speaking :roll_eyes: )

All of the native function implementations were assumed to have the same type signature, taking a frame as an argument. I took away that assumption and added an /INTRINSIC refinement to the NATIVE function generator. If it was an intrinsic, then the C function in the native table would take a single value argument and an output slot to write to.

So it's still one C function per native. But if it's an intrinsic, that function is not itself a dispatcher... the shared Intrinsic_Dispatcher() is used, and the C function is poked into the properties of the action.

Callsites that want to optimize for intrinsics just look to see if an action has the Intrinsic_Dispatcher(), and if so they have to take responsibility for procuring an argument and type checking it. But if they do, they can just call the C function directly with no frame overhead.
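Here's a hedged sketch of that shape in C. The struct layouts and names below are invented stand-ins (only Intrinsic_Dispatcher() comes from the actual design): an intrinsic is one value in, one value out; the shared dispatcher proxies a built frame's single argument; and a callsite can recognize intrinsics by their dispatcher and skip the frame entirely.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { int kind; long payload; } Value;

typedef struct Action Action;
typedef struct {
    const Action *action;
    Value arg;   /* the single built frame argument */
    Value out;
} Frame;

typedef void (*IntrinsicFun)(Value *out, const Value *arg);
typedef void (*Dispatcher)(Frame *f);

struct Action {
    Dispatcher dispatcher;
    IntrinsicFun intrinsic;  /* poked in when it's an intrinsic */
};

/* Shared dispatcher: if a full frame *was* built anyway, just proxy
   the frame's single argument into the intrinsic's C function. */
static void Intrinsic_Dispatcher(Frame *f) {
    f->action->intrinsic(&f->out, &f->arg);
}

/* A sample intrinsic: one argument, can't recurse. */
static void even_q(Value *out, const Value *arg) {
    out->kind = 0;  /* logic */
    out->payload = (arg->payload % 2 == 0);
}

/* Fast path at a callsite: recognize intrinsics by their shared
   dispatcher and call the C function directly, with no frame.  The
   caller takes over typechecking the argument itself. */
static bool Try_Call_Intrinsic(const Action *a, Value *out, const Value *arg) {
    if (a->dispatcher != Intrinsic_Dispatcher)
        return false;  /* caller must build a full frame instead */
    a->intrinsic(out, arg);
    return true;
}
```

The design choice here is that the dispatcher pointer doubles as the "is this an intrinsic?" tag, so no extra flag is needed on the action.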

This helps make the switchover to functions in type spec blocks much more palatable. It's never going to be as fast as the bitset checking, but it's fast enough to allow things to make progress.


I realized that I could use a normal native dispatcher, and simply have the parent Level that's calling the intrinsic put a flag on itself (LEVEL_FLAG_DISPATCHING_INTRINSIC). If the native sees this flag is set, then it knows the Level* it is receiving is not its own Level (with its own arguments in a frame)... but the parent's Level.

Then--by convention--there are two cells' worth of data in the parent. One is the SPARE cell (used for intermediate GC-safe calculations). At the moment of calling an intrinsic, the parent commits the SPARE cell to holding the single argument to the intrinsic.

The second cell is called the SCRATCH. This has a specific purpose in the evaluator--to hold the currently evaluating value. So that's a good place for the intrinsic native to look to find its own ACTION! value... letting it pick out any instance data (e.g. a typechecker which has only one C function implementing the checks for all types can look at that ACTION! instance to get the per-typechecker information of which type it's supposed to check).

Because all Levels have SPARE and SCRATCH, this means any of them can call an Intrinsic and pass their own Level* to the native dispatcher function.
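A simplified sketch of that convention (the structures here are invented stand-ins, not Ren-C's real Level): a native's dispatcher checks the flag to decide whether its argument lives in the parent's SPARE cell or in its own frame.

```c
#include <assert.h>

/* Heavily simplified illustration: every Level carries SPARE and
   SCRATCH cells, so any Level can host an intrinsic call by reusing
   those cells instead of building a child frame. */
typedef struct { int kind; long payload; } Value;

enum { LEVEL_FLAG_DISPATCHING_INTRINSIC = 1 << 0 };

typedef struct {
    unsigned flags;
    Value spare;    /* during an intrinsic call: the single argument */
    Value scratch;  /* during an intrinsic call: the action value */
    Value *args;    /* frame arguments, when a real frame was built */
    Value out;
} Level;

/* A native written to run either way: if the flag is set, the Level*
   it receives is the *parent's* Level and the argument is in SPARE;
   otherwise it has its own frame and looks in args. */
static void N_negate(Level *L) {
    const Value *arg = (L->flags & LEVEL_FLAG_DISPATCHING_INTRINSIC)
        ? &L->spare
        : &L->args[0];
    L->out.kind = arg->kind;
    L->out.payload = -arg->payload;
}
```

Since the two cells already exist on every Level, the intrinsic call costs no allocation at all--just the flag check.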

So I realized that not being able to fail was a bad constraint.

In particular, it's bad because intrinsics need to implement their own typechecking on their single argument. Having to call into the generalized typechecking system as a sub-step of calling a fundamental typechecker is not good... especially when that call can't itself be made intrinsically (because you're in the middle of an intrinsic call that's already using the SPARE and SCRATCH!)

So failing is now allowed, simply by making the functions that need stack awareness know about LEVEL_FLAG_DISPATCHING_INTRINSIC. There aren't very many such functions... really just the code that puts a stack trace on failures. (It needs to be able to peek in the SCRATCH cell for the action and label to report, vs. the usual place... and to not skip the contribution of the parent Level that is being reused.)

Anyway--it's snappier. While I generally push back against doing optimizations, intrinsics are pretty much required for the type-constraints-as-functions to be tolerable at all.