"Extension Types" Implementation

The new thing that has come on the scene isn't what I'd really call "user defined datatypes" as much as "extension defined datatypes". It's for C programmers to implement types like IMAGE! or GOB! with a DLL or statically linked module...without those being built in a priori.

The feature's goal was to get past a historical property that limited Rebol to 64 built-in datatypes, which had to be named in the core interpreter and could not be changed or extended. Ren-C wanted to be much more modular...to avoid carrying the weight of things like GOB! to the JavaScript build (or a redundant IMAGE! datatype that was handled by a browser's canvas.) Then the web build could choose its own extension types, perhaps some kind of JAVASCRIPT-OBJECT! proxy or a CANVAS!, etc.

This was needed while breaking the project up into independently selectable extensions--of which there are now 31. See the README.md for a few notes:

https://github.com/metaeducation/ren-c/tree/master/extensions

(At this time, the web build uses only JavaScript, Console, and Debugger.)

Implementation Details

A "value cell" in Rebol and Red is four platform pointers in size. Of these, the first platform-pointer slot is used as bits for a "header". How the other three pointers are interpreted depends on a byte in that header...which was called the VAL_TYPE() in R3-Alpha (though Ren-C calls this the "cell heart").

Of this byte, only 64 of the states are used in R3-Alpha--and, I believe, in Red. 64 was chosen instead of 256 in order to limit the number of kinds that need to be handled in a TYPESET! to 64 bits...making typesets small enough to fit in the rest of the cell. Ren-C has broken this barrier, allowing something nearer to 256 fundamental types (plus built-in typesets), but that's still just a finite number of built-in things.
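To make the 64-kind limit concrete, here is a minimal sketch of a typeset stored as a single 64-bit mask, one bit per kind. The kind numbers and function names are made up for illustration and are not the ones R3-Alpha or Ren-C actually use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical kind numbers, purely for illustration. */
enum { KIND_INTEGER = 1, KIND_BLOCK = 2, KIND_STRING = 3, KIND_LIMIT = 64 };

/* One bit per kind: a typeset can only describe kinds 0..63. */
typedef uint64_t Typeset;

/* Return a copy of `ts` with `kind` added to it. */
static Typeset typeset_add(Typeset ts, unsigned kind) {
    assert(kind < KIND_LIMIT);  /* a 65th kind has no bit to live in */
    return ts | (UINT64_C(1) << kind);
}

/* Test whether `kind` is a member of `ts`. */
static bool typeset_has(Typeset ts, unsigned kind) {
    assert(kind < KIND_LIMIT);
    return (ts & (UINT64_C(1) << kind)) != 0;
}
```

Because the whole set is one 64-bit integer, it fits comfortably in the non-header part of a cell, which is the trade-off described above.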

So "extension types" are a primordial implementation of a strategy to reserve one heart byte to mean "this cell gives up one of its three non-header platform-pointer-units to be a heap pointer to information about its type and its behaviors". That allows an arbitrary number of these to be added. They can't pack quite as much data into their cells as the built-in types, since they only have two pointers instead of three to work with. But given that you can always point to some allocated data (and usually need to), it's not a big problem.
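As a rough picture of that layout, here is a sketch in C. Every name in it (Cell, ExtTypeInfo, HEART_EXTENSION, the helper functions) is hypothetical, as is the choice to keep the heart in the low byte of the header; the real Ren-C definitions are different and more involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct Cell Cell;

/* Hypothetical heart values; one value is reserved for all extension types. */
enum { HEART_INTEGER = 1, HEART_EXTENSION = 255 };

/* Heap-side description of an extension type (illustrative fields only). */
typedef struct ExtTypeInfo {
    const char *name;                          /* e.g. a URL naming the type */
    int (*equal)(const Cell *a, const Cell *b); /* one example behavior hook */
} ExtTypeInfo;

struct Cell {
    uintptr_t header;                 /* assumed: low byte holds the heart */
    union {
        uintptr_t builtin[3];         /* built-in types get three slots */
        struct {
            const ExtTypeInfo *type;  /* extension types spend one slot here */
            uintptr_t data[2];        /* ...leaving two for their own data */
        } ext;
    } payload;
};

/* Extract the heart byte from the header. */
static unsigned cell_heart(const Cell *c) {
    return (unsigned)(c->header & 0xFF);
}

/* Initialize a cell as an instance of the given extension type. */
static void init_extension_cell(Cell *c, const ExtTypeInfo *type,
                                uintptr_t a, uintptr_t b) {
    c->header = HEART_EXTENSION;
    c->payload.ext.type = type;
    c->payload.ext.data[0] = a;
    c->payload.ext.data[1] = b;
}

/* Look up the type information an extension cell points to. */
static const ExtTypeInfo *cell_ext_type(const Cell *c) {
    assert(cell_heart(c) == HEART_EXTENSION);
    return c->payload.ext.type;
}
```

On common platforms where uintptr_t is pointer-sized, the cell stays at four platform pointers either way; the extension case just trades one payload slot for the type pointer.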

Open Questions

How datatypes will participate in a naming ecology is not known. Right now the theory is that they register via a URL!. That is to say that asking for the type of foo could give back something that includes http://example.com/types/matrix. While that's a bit drawn out, one idea that came up in error IDs was that there might be a form of comparison function that lets you get as specific as you want about that, e.g.

>> /matrix submatches http://example.com/types/matrix
== #[true]

>> /types/matrix submatches http://example.com/types/matrix
== #[true]
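One possible reading of that comparison, sketched in C: a path matches if it is a trailing run of whole segments of the URL. This is only a guess at the semantics suggested by the two examples above; SUBMATCHES is an idea rather than an existing function, and the C function here is purely illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if `path` (which must begin with '/') matches the tail of
 * `url`. Since the path starts with '/', a plain suffix comparison already
 * guarantees the match begins on a segment boundary.
 */
static bool submatches(const char *path, const char *url) {
    assert(path != NULL && url != NULL);
    size_t plen = strlen(path);
    size_t ulen = strlen(url);
    if (plen == 0 || path[0] != '/' || plen > ulen)
        return false;
    return strcmp(url + (ulen - plen), path) == 0;
}
```

Under this reading, /matrix and /types/matrix both match http://example.com/types/matrix, while a partial segment such as /trix does not.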

There's still plenty to worry about. But the first-tier goal has been achieved: being able to build variants of Rebol without GOB! or IMAGE! or VECTOR! or STRUCT! (and without mentioning them in the built-in type table), while still keeping all those features working.
