BLANK! 2022: Revisiting The Datatype

hostilefork · August 25, 2022, 1:50pm

In historical Redbol, the NONE! datatype had the bad habit of looking like a WORD!:

rebol2>> 'none
== none

rebol2>> none
== none  ; same in R3-Alpha and Red

But it wasn't a word:

rebol2>> type? 'none
== word!

rebol2>> type? none
== none!

It was a distinct type, which also happened to be falsey (while WORD!s are truthy):

rebol2>> if 'none [print "Truthy word!"]
Truthy word!

rebol2>> if none [print "Falsey none!"]
== none

And as we can see, NONE!s served the purpose of signaling "soft failures": branches that didn't run, or FINDs that didn't find, or SELECTs that didn't select... etc.

rebol2>> find "abcd" "z"
== none

rebol2>> select [a 10 b 20] 'c
== none

Ren-C Divided NONE!'s Roles Across NULL, VOID, and BLANK!

NULL - an "antiform" state of WORD! that couldn't be put in BLOCK!s. Anywhere NONE! would have been used to signal a soft failure (as in FIND or SELECT), Ren-C uses ~null~ instead.

>> null
== ~null~  ; anti

>> find "abcd" "z"
== ~null~  ; anti

>> select [a 10 b 20] 'c
== ~null~  ; anti

>> append [a b c] null
** Error: APPEND doesn't allow ~null~ isotope

BLANK! was represented by a lone underscore ( _ ) and could be put into blocks:
```
>> append [a b c] _
== [a b c _]
```
At the outset, it retained the choice to be falsey:
```
>> if _ [print "Won't print because blanks are falsey"]
```

VOID - another "antiform" state, but not one you can store in a variable... hence an "unstable antiform". So decisions need to be made on how to handle them. Some places make them vanish, and when functions like APPEND get them as an argument they are treated as no-ops:

>> void  ; is a function that returns a void (can't store void in variable)
== ~[]~  ; anti (unstable)

>> when 1 < 0 [print "WHEN is IF variant that returns VOID not null"]
== ~[]~  ; anti

>> compose [abc (when 1 < 0 ['def]) ghi]
== [abc ghi]

>> append [a b c] void
== [a b c]

>> for-each void [1 2 3] [print "no variable"]
== ~null~  ; anti

Question One: Should BLANK! Just Be A WORD! ?

Ren-C allows you to use underscores internally to words, so it feels a little bad to take away one word.

Outside of historically being hardcoded as falsey, what makes BLANK! fairly "built in" is that in the path mechanics, it fills in the empty slots:

>> to path! [_ a]
== /a

>> as block! 'a/b/c/
== [a b c _]

There are other places the blank is used, such as to opt out of multi-returns.

>> [_ value]: transcode/next "abc def"
== " def"

>> value
== abc

Question Two: Does BLANK! Still Need To Be Falsey?

My feeling is that having blank be falsey doesn't have all that much benefit. NULL does a better job of it, and really what it does is mess with its usefulness as a placeholder:

>> append [a b c] opt all [1 < 2, 3 < 4, _]
== [a b c]  ; falsey blank: doesn't make sense to me

>> append [a b c] opt all [1 < 2, 3 < 4, _]
== [a b c _]  ; truthy blank: this makes sense to me

Thinking of BLANK! as being "null-like" in terms of non-valuedness is generally a hassle. It makes you wonder about whether something like DEFAULT should think of it as being assigned or not:

>> item: _

>> item: default [1 + 2]
== ???

In practice, I prefer that DEFAULT overwrite only non-array-element states (NULL, TRASH, etc.). This is because NULL is far more useful than BLANK! when it comes to representing something that you think of as "not being assigned"... you'll get errors when you try to use it in places (e.g. in APPEND). Trying to use blank to represent nothingness invariably leads to stray appearances in blocks. (Shixin wrote a lot of code to try to filter them out in Rebmake, prior to it being switched to NULLs.)
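
The "stray appearances" problem can be modeled outside of Ren-C. The following is only a rough Python sketch, not Ren-C code: `BLANK`, `NullNotAllowed` and this `append` are hypothetical names invented for the illustration, and Python's `None` stands in for NULL.

```python
BLANK = "_"  # hypothetical placeholder: a legal block element


class NullNotAllowed(Exception):
    """Raised when the NULL stand-in (None) is appended to a block."""


def append(block, value):
    """Append an element, refusing None the way APPEND refuses a null."""
    if value is None:
        raise NullNotAllowed("append does not accept the NULL stand-in")
    block.append(value)
    return block


# Using a placeholder to mean "unassigned" leaks silently into the data.
leaky = append(["a", "b"], BLANK)

# Using None to mean "unassigned" fails loudly at the point of misuse.
try:
    append(["a", "b"], None)
    null_was_caught = False
except NullNotAllowed:
    null_was_caught = True

print(leaky)            # ['a', 'b', '_']
print(null_was_caught)  # True
```

The placeholder ends up in the data and has to be filtered out later, while the NULL stand-in is rejected at the point where it would have been stored.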

This makes more sense, and I think it bolsters the argument that BLANK! is less of a falsey-NULL relative...but more of a placeholder value. I've said "blanks are to blocks what space is to strings". And space is truthy:

>> if second "a b" [print "Space is truthy"]
Space is truthy

>> if second [a _ b] [print "So why shouldn't blank be truthy?"]
???

So Either Way, I Suggest The Removal of BLANK! From Being Falsey

This creates some incompatibility in Redbol emulation (which has been using BLANK! as a "NONE!" substitute). But it's something that can be worked around.

rgchris · August 26, 2022, 1:55am

There's a lot to ponder here. I think on the one hand it's important to explore all of the possibilities, on the other it seems to be getting awfully convoluted and lacking a comprehensive narrative.

I'm not up to speed with much of what has changed in this realm for some time, so I apologise if this glosses over some since settled items, though judging by this post, there's much still unsettled.

For me (using the family name) Rebol's first obligation is to represent data--both in language and the way the language is interpreted in memory. Specifically BLANK! and its underscore literal is a huge win (this is from me, the ultra-conservative sceptic) in representing positive nothingness--that a thing exists but lacks assignation: [name: "Thing" link: _]. Despite that positivity, I do think that as it represents the known absence of a value in data, it should be falsey in the general flow, since data should primarily determine that flow.

What it becomes in a dialect or within the general flow as distinct from NULL is of lesser importance as I see it. If NULL is the evaluator's ultimate representation of nothingness, then there should be a way to access that in internal dialects, such as SET-BLOCK! or PATH! and the like or it is not really fulfilling its role.

I have this sense that the BLANK-NULL-VOID-ERROR story has too many actors with overlapping roles. I don't have anything tangible to back that up with at this time.

hostilefork · August 26, 2022, 6:36am

Well, that's something.

Hence you are on the side of "Taking underscore away from the word pool does more good than harm."

I'm trying to make a general engine... so it will be possible to do Redbol compatibility, and if you want different rules you should be able to have them. But the core as I see it is the "default" distribution which should be based on what is reasonably determined the "best" and most coherent.

It is a work in progress...but...I believe there's plenty of evidence that things are pointing toward a solid outcome.

The proof comes from the code: the contrast between what approaches lacking these distinctions can't do (and how catastrophically they regularly fall down) vs. what approaches that have them can do cleanly and correctly.

UPARSE is a giant piece of evidence, but I think there's quite a lot more.

hostilefork · March 1, 2024, 12:38pm

I can argue pretty strongly for all the behaviors as being shades of distinction that are important.

I'll mention that I just addressed a weakness, which was that because it was unstable, VOID didn't have a "good" representation in a block that identified it in the class of "weird states". e.g. it didn't have a quasiform. Now it does: as a quasiform of the empty block.

This means these are your stock "reified" options for shades-of-nothingness:

 [name: "Thing" link: _]
 [name: "Thing" link: ~]  ; ~ is "quasi-blank" a.k.a. "trash"
 [name: "Thing" link: ~null~]
 [name: "Thing" link: ~[]~]  ; can't actually assign variable this way!

Each of these forms has different behaviors once you evaluate it (or DEGRADE it, e.g. degrade fourth [name: "Thing" link: ~null~] narrowly turns the quasiforms to antiforms without doing any transformation of other types):

 >> x: _
 >> print either x ["truthy"] ["falsey"]
 truthy
 >> append [a b c] x
 == [a b c _]

 >> x: ~
 >> print either x ["truthy"] ["falsey"]
 ** Error: x is not set (~ antiform), see GET/ANY
 >> print either get:any 'x ["truthy"] ["falsey"]
 ** Error: TRASH (~ antiform) is neither truthy nor falsey
 >> append [a b c] x
 ** Error: x is not set (~ antiform), see GET/ANY
 >> append [a b c] get:any 'x
 ** Error: APPEND expects [<opt-out> element? splice?] for its value argument

 >> x: ~null~
 >> print either x ["truthy"] ["falsey"]
 falsey
 >> append [a b c] x
 ** Error: APPEND expects [<opt-out> element? splice?] for its value argument

 >> x: ~[]~
 ** Error: VOID (~[]~ antiform) is unstable and can't be assigned to variables
 >> print either void ["truthy"] ["falsey"]
 ** Error: VOID (~[]~ antiform) is neither truthy nor falsey
 >> append [a b c] void  ; VOID is a function, not a variable
 == [a b c]

So...I'm afraid that BLANK!'s relationship to nullness and falseness has basically gone away. Instead, it's the "space unit" of BLOCK!s--the moral equivalent of a space character in a TEXT!.

BLANK! has important uses that make it good to be a non-reassignable unit type, taken away from WORD!. Crucially it now provides the heart of the TRASH antiform to represent unset variables--using a similarly light-looking antiform/quasiform of ~

UPDATE 2025: It is now also the basis of the "sigils" [@ ^ $]:

>> pin _
== @

>> tie _
== $

>> lift _
== ^

Also, I've mentioned BLANK!'s application in PATH! and TUPLE!:

>> to path! [_ x]
== /x

>> to path! [x y _]
== x/y/

...as well as in multi-returns:

>> [_ {end}]: find "abcdef" "cd"  ; opt out of main find result, just get tail
== "ef"

>> end
== "ef"

Having it be out-of-band and not a WORD! is a strength in these areas.

Having SPREAD of a BLANK! return a VOID or an empty splice may be a good thing... though not completely sure on the merits of choosing one over the other. An empty splice may be "more coherent" in the sense that one probably wouldn't want foo spread _ to give back null if foo spread [] would not. Considering it "EMPTY?" in certain contexts may be appropriate as well...but not falsey.

A year down the road here, things are clicking in place. Though it would have been massively helpful to have a time machine and send a few of these posts a few years back. I'm patching a bootstrap executable to be compatible with many of the conventions, and it's pretty quick work when you know what the decisions are.

(I was just watching a video laying out the difficulties in creating the blue LED (recommended)... and it's so comprehensive on semiconductor technology that sending that one video back in time would have radically changed the course of history.)

I've got a pretty solid sense that the best substrate comes when everything you can PICK out of a block is inert and truthy, where this kind of thing holds:

backup: copy block1
block2: copy []
while [value: try take block1] [  ; you can also shim TAKE as synonym for TRY TAKE
    append block2 value
]
assert [block2 = backup]  ; always true, for any and every BLOCK! (GROUP!, etc.)

So it's upon you--the interpreter of the block--to give it meaning. If you want to know if something is blank, you say BLANK?. If you want to DEGRADE things, you do that.
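
To make the invariant concrete outside of Ren-C, here is a rough Python model. It is not Ren-C code: `Blank`, `take` and `drain` are hypothetical names invented for the sketch, and Python's `None` stands in for NULL. The loop mirrors the `while [value: try take block1]` idiom and only round-trips when every element is truthy.

```python
class Blank:
    """Hypothetical placeholder element; truthiness is configurable."""

    def __init__(self, truthy=True):
        self.truthy = truthy

    def __bool__(self):
        return self.truthy

    def __eq__(self, other):
        return isinstance(other, Blank)

    def __repr__(self):
        return "_"


def take(block):
    """Remove and return the first element, or None if the block is empty."""
    return block.pop(0) if block else None


def drain(block):
    """Model of the WHILE/TAKE loop: it stops at the first falsey value."""
    out = []
    while (value := take(block)):
        out.append(value)
    return out


truthy_case = drain(["a", Blank(truthy=True), "b"])
falsey_case = drain(["a", Blank(truthy=False), "b"])

print(truthy_case)  # ['a', _, 'b']
print(falsey_case)  # ['a']
```

With a falsey placeholder the loop ends early and the remaining elements are never copied, which is the failure mode that an "everything you can PICK is truthy" rule is meant to rule out.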

I've surveyed the most code written by the most people...and maintained giant systems after those who wrote them wandered off. And I've tried a lot of things. What it's converging on is what I believe to be the best direction for this medium.

But the proof should be in your code, as well. I've held off on advocating people spin their wheels porting scripts while things are in flux, but as the flux diminishes I think it's worth it to do some porting of key Rebol2 scripts and document the experience.

A little more work on binding first... but... the time is coming.

hostilefork · March 6, 2024, 1:19pm

Okay, I think this kind of sums up the difference here:

BLANK!s aren't null-equivalents or falsey, they are EMPTY?

You (@rgchris) hopefully don't expect an empty block to be falsey.

EMPTY? is a test which can work across both blanks and empty blocks (and empty strings, binaries...), to say they are values that are intentionally empty. And then EMPTY? on null can be an error.

I'm a bit reluctant to say that a blank can be passed anywhere you'd pass a void. Rather, blanks can be passed anywhere you can pass an empty block (or empty string?), and give you back the same meaning. That's actually an interesting point: if the meaning for an empty block and empty string would be different when passed to a routine, then I don't think blank should play favorites in acting like either, because it doesn't connote any particular kind of emptiness.

I have some other thoughts here about how BLANK! seems to be useful as a way of fitting into places that want to say they are conceptually holding series, but want to avoid the creation of a series identity. The issue being that you wouldn't so much mind writing [] in these slots except for the fact that what you really need is copy [] which gets ugly...and with just _ you push the responsibility of making the series to whoever starts expanding it. But then, if you're making a prototype of an object that's going to get copied that isn't enough to get new copies in the instances... which points to a deeper problem that BLANK! is only papering over. That needs a bigger discussion, but other things need to be sorted out related to objects first.

hostilefork · September 9, 2024, 9:12pm

Taking the idea that BLANK!s are EMPTY? a step further, we can ask "What is the LENGTH OF a BLANK!?"

Trying to shape up the semantics for consistency, I think the LENGTH OF a BLANK! is 0.

BrianH didn't like the idea of R3-Alpha LENGTH? of a NONE being 0. But having split out the roles of nothingness to a finer granularity, we can say:

LENGTH OF BLANK is 0
LENGTH OF QUASAR is ~error~
LENGTH OF VOID (antiform) is NULL
LENGTH OF NULL (antiform) is ~error~
LENGTH OF TRASH (antiform) is ~error~

This heeds my policy of saying that if what the routine did would be different e.g. for a string or a block, then blank shouldn't give an answer. But here, both an empty string and an empty block say 0, so I think the length should be 0.
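
The policy can be phrased as a small check. This is a hedged Python sketch rather than anything in Ren-C: `blank_answer` and `NoCommonAnswer` are hypothetical names, with Python's empty `str` and empty `list` standing in for an empty string and an empty block.

```python
class NoCommonAnswer(Exception):
    """Raised when an empty string and an empty block would disagree."""


def blank_answer(operation):
    """Answer for a blank only if both kinds of emptiness agree.

    Applies `operation` to an empty string and to an empty list. If the
    results match, a blank may reuse that result; otherwise it refuses
    to play favorites.
    """
    from_text = operation("")
    from_block = operation([])
    if from_text != from_block:
        raise NoCommonAnswer(f"{from_text!r} vs {from_block!r}")
    return from_text


# Length agrees for both kinds of emptiness, so a blank can say 0.
length_of_blank = blank_answer(len)

# A textual rendering differs ("''" vs "[]"), so a blank gives no answer.
try:
    blank_answer(repr)
    renderings_disagree = False
except NoCommonAnswer:
    renderings_disagree = True

print(length_of_blank)      # 0
print(renderings_disagree)  # True
```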

What About REMOVE-EACH and BLANK!

So this is an interesting one, because here you're asking to modify the input in a way that only removes elements from the input...and then returns it and the count.

>> s: [1 2 3 4 5]

>> [series count]: remove-each num s [even? num]
== [1 3 5]

>> series
== [1 3 5]

>> count
== 2

While you can't APPEND to a BLANK! meaningfully, it would be reasonable to argue that you can REMOVE-EACH from a BLANK!...because there are no elements you can remove, and so you can give back the blank and 0.

>> [result count]: remove-each x _ [fail "this part never runs"]
== _

>> result
== _

>> count
== 0

We can do that...but should we?

I'm not sure, but I do feel like this is helping shape the policy on what blanks do. You don't pass blanks into routines and get nulls out when an empty series would not do that. (This is what REMOVE-EACH was doing previously for blank, and I think that was wrong.)

hostilefork · May 15, 2025, 8:26am

Some Good News On This...

...the basic concepts in virtual binding that power RebindableSyntax should be able to provide the hooking I have sought, whereby a construct like IF or CASE can defer to a concept of conditional testing that comes from its calling environment.

The default notion that would drive the mezzanine and such will be that only antiform NULL is "falsey". But you could redefine this for your script, or even at the granularity of what is useful within a certain function.

The technique has been shown to work, and it just needs to get grafted into more natives (and the name for the function chosen). I have some reasons why I'd like this function to be called CONDITIONAL or COND, and return either a VETO definitional error or VOID (vs. ~null~ or ~okay~)...but that's an explanation for another post.

I don't know that being able to override this will turn out to be as useful as you might think. But my hope here has always been to let everyone have their own way within their customization environment. I seek to align the foundations in order to make a system whose internals mechanically work across an infinite number of arbitrary programs. But after that, there's no rule that the definitions used for any given script have to "work" for any more or less than the particular problem it is tackling.

(But the more general the problem and generic your script or library is, the more likely you'll want to be using the default choices...they were picked for a reason.)

Some More News That I Think Is Very Good...

I think that underscore needs to be the lexical form for the space character.

_ would thus not be BLANK! as a distinct type, but SPACE? (as a type constraint of the RUNE! fused issue!/char! type).

This won't opt out of enumerations in the way I was envisioning BLANK! might, which makes it not fit for some intents of nothingness.

But an empty splice antiform... ~()~ will be the new BLANK

Unlike VOID, an empty splice can be stored in variables.

>> var: blank
== ~()~  ; anti (blank)

>> append [a b c] var
== [a b c]

>> append "abc" var
== "abc"

An empty splice will always be found in a series, while a VOID never will. So you might think of blank as a kind of "positive nothingness", while void is "negative nothingness"

>> find "abc" void
== ~null~  ; anti

>> find "abc" blank
== "abc"
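
Python strings follow the same "empty is always found" convention: searching for an empty string succeeds at the head. A small model of the contrast, as an illustration only: this `find` wrapper and the `VOID` sentinel are hypothetical, and `None` stands in for NULL.

```python
VOID = object()  # hypothetical sentinel for "negative nothingness"


def find(series, pattern):
    """Return `series` from the match onward, or None if nothing matches."""
    if pattern is VOID:
        return None  # a void is never found
    index = series.find(pattern)
    return None if index == -1 else series[index:]


found_empty = find("abc", "")   # empty pattern matches at the head
found_void = find("abc", VOID)  # never matches
found_miss = find("abc", "z")   # ordinary miss

print(found_empty)  # abc
print(found_void)   # None
print(found_miss)   # None
```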

If we added BLANK to the things that DEFAULT was willing to overwrite, it could do a decent job of being variable-assignable-nothingness that still was legal to fetch and wouldn't give errors when using in series operations the way a null would.

I'm hoping you'll find that you were mistaken about there being too many actors: the design has just the right number... even though it's far more when you add it all up, including TRASH (antiform issue), QUASAR (quasi issue), QUASI-BLANK (quasi empty splice), QUASI-NULL (quasi word)... packs and splices, the whole lot.

Everything is related through a coherent system, and comes together beautifully...

hostilefork · May 17, 2025, 5:34am

This Variable Intentionally Left Blank...

The "opt in with nothing" behavior falls out naturally in a lot of places that already take splices.

Consider ENVELOP w.r.t. NULL, where it can be voided to opt out:

>> var: null
== ~null~  ; anti

>> envelop [(* *)] var
** Error: ENVELOP's CONTENTS argument is ~null~ antiform

>> envelop [(* *)] opt var
== ~null~  ; anti

But BLANK! as an empty splice works like other splices:

>> var: blank
== ~()~   ; anti (blank)

>> envelop [(* *)] var
== [(* *)]

>> envelop [(* *)] opt var  ; opt should not voidify empty splices
== [(* *)]

>> envelop [(* *)] [a b c]
== [(* [a b c] *)]

>> envelop [(* *)] spread [a b c]
== [(* a b c *)]

This applies in places like MOLD as well.

>> mold void
== ~null~  ; anti

>> mold [a b c]
== "[a b c]"

>> mold spread [a b c]
== "a b c"

>> mold blank
== ""  ; consequence of blank as empty splice

There may be "unnatural" places, e.g. spots that wouldn't know what to do with a splice that would be interested in just the blank intent. I haven't found them yet, but will keep my eyes open.

It's very satisfying that as I pushed on what properties BLANK needed to have, the empty SPLICE! turned out to meet them...just by its nature.

I'm quite glad that (_) => SPACE so that the good name BLANK could be taken for empty splice, vs a bad name (e.g. HOLE)

(I should have known that with the properties I was describing, BLANK had to be an antiform.)

hostilefork · May 19, 2025, 11:14am

Although _ has moved to being the character literal for SPACE, I'm starting to believe that the need to think of a RUNE! ("issuechar!") as having iterability or length may be outweighed by the usefulness of _ serving its positive-nothingness role in various places.

>> stuff: [10 20 30]

>> variables: [x _ z]

>> for-each '$var variables [set var stuff.1, stuff: stuff:next]

>> x
== 10

>> z
== 30

(Note I snuck in two little proposals there... the idea that if your iteration variable is TIED!, then it will use the block's context to bind it during the FOR-EACH. Also that instead of next of stuff you could say stuff:next)

Anyway, this steps a bit back from the precipice of saying you always have to perform some unreifying operation to get the "blank" behavior. Though it's going to be case-by-case. My example of ENVELOP clearly doesn't make sense to use _ to "opt-in-with-nothing", and it never has made sense... the empty splice is what you want:

>> envelop [] _
== [_]

>> envelop [] blank
== []

As for something like FOR-EACH, I dunno...

>> for-each 'x _ [print "What about this?"]
== ~[]~  ; antiform (void)  <-- is this the best answer?

@earl was always adamant that you shouldn't get strings back as character elements from strings, e.g.:

>> second "abc"
== "b"

So he would probably say that if you could pick elements out of a RUNE! at all, you wouldn't be picking runes, but integer codepoints or something...

>> second #"a b"
== 32

We could say that if you really want to sub-index into a rune! for some reason, you could do that by aliasing it as text:

>> for-each 'x (as text! #a) [print mold x]
#a

>> for-each 'x #a [print mold x]
** Error: Can't iterate RUNE!, alias as text if you want that

This could allow _ to take on more of the desired properties of a "blank", such as being EMPTY? and having (length of _) be 0.

Maybe there's a sensibility to saying that # has a length of 1, so that # and _ can function as a kind of positive-space and negative-space counterpart?

>> for-each 'x _ [print mold x]
== ~[]~  ; antiform (void)  <-- e.g. it never ran

>> for-each 'x # [print mold x]
#

I think we may want to go this route--limiting the operations on rune! in order to open up "space" for these kinds of dialect-enabling behaviors. Taking the SET-BLOCK dialect as one example, I'm definitely seeing the power of building on top of a SET that already knows how to pass thru assignments on _ without me having to explain how to do that.

And I understand the desire to avoid REIFY/DEGRADE or lifting/unlifting whenever possible, so I'm definitely trying to stay in tune with that motivation.