Mitigating Drift of Underscore Meanings in R3C

So I know that I was the one who propagated the idea of _ being a representation of "nothing here", with the BLANK! datatype... an invention from the early days of Ren-C.

I wanted to reserve it, so it couldn't be reassigned. That's still true today: underscore can reliably be used in dialects without concern that it has been given a value:

>> for-each _ [a b c] [probe _]  ; signal for "no iteration variable"
_  ; not "a"
_  ; not "b"
_  ; not "c"

It's thus unreassignable, and it's used in other places for this purpose. Here's that emphasized with a little multi-return demo of the niceness of Ren-C's FIND:

>> pos: find "abcdef" "cd"

>> pos
== "cdef"

>> [start end]: find "abcdef" "cd"

>> start
== "cdef"

>> end
== "ef"

>> [_ end]: find "hijklm" "jk"

>> _
== _

>> end
== "lm"

So underscore still is serving its roles, as it did when it was BLANK!.

HOWEVER...

_ is No Longer BLANK!, It's A Lexical SPACE Character

Underscore is now a "RUNE!" (the unified class of ISSUE! and CHAR!), and it represents a space.

>> form _
== " "

>> rune? _
== \~okay~\  ; antiform

See "Reified Unreassignable Nothinginess: SPACE RUNE!s"

Anything that has only underscores in its representation is still a RUNE!, just with more spaces in it:

>> tab: ____

>> rune? tab
== \~okay~\  ; antiform

>> form tab
== "    "

So you can persist and sense how many underscores are there.

But if a token is not all underscores, then underscore is a word character.

for-each 'w [_foo_ _bar baz_ foo_bar _bar_baz] [assert [word? w]]
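To make the rule concrete, here is a rough Python model of it. This is not Ren-C's actual scanner: the function name `classify_token` and the letters-and-underscores token alphabet are assumptions made purely for illustration.

```python
import re

# Toy alphabet (an assumption): tokens made only of letters and underscores.
TOKEN = re.compile(r"[A-Za-z_]+")

def classify_token(token: str) -> tuple:
    """Return (kind, payload) for a token, following the rule described above.

    All underscores -> ("rune", one space per underscore)
    Anything else   -> ("word", the token itself; underscores are word characters)
    """
    if not TOKEN.fullmatch(token):
        raise ValueError(f"outside the toy alphabet: {token!r}")
    if set(token) == {"_"}:
        return ("rune", " " * len(token))
    return ("word", token)

# A lone underscore is a single space; four underscores are four spaces.
assert classify_token("_") == ("rune", " ")
assert classify_token("____") == ("rune", "    ")

# Mixed tokens stay words, underscores included.
for w in ["_foo_", "_bar", "baz_", "foo_bar", "_bar_baz"]:
    assert classify_token(w) == ("word", w)
```

The count of underscores survives as the length of the payload, which is the "persist and sense how many underscores are there" property.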

I've had no regrets about this choice. And since BLANK! had stopped being falsey a while beforehand, the transition to truthy space was no problem.

(A singular falsey state of null--that is an antiform and can't appear in blocks--is also something I have no regrets about. I'm eager for people to see all the grand benefits this has brought.)

My Regret Is My Original Choice Influenced @rgchris Strongly

Here's what he said in the README.md of Ren-C scripts in GitHub rgchris/Scripts:

[I found it beneficial that Ren-C added] "... the literal _ for NONE! values. This has proven effective on syntactic, semantic, and cognitive levels providing a subtle and intuitive solution to a longstanding omission in Rebol grammar. Surveying code and data—technically the same thing in Rebol—is greatly enhanced by this change and should be a part of the Rebol lexical canon."

He convinced Oldes to make it a notation for NONE!, so this is giving rise to underscores everywhere:

    copy #[
        type document
        name _
        public _
        system _
        form _
        head _
        body _
        parent _
        first _
        last _
        warnings _
    ]

    intersect node either 'document = node/type [
        #[type _ name _ public _ system _]
    ][
        #[type _ name _ value _]
    ]

Only One Single-Character Alternative

The only single-character alternative I have to offer at the moment would be antiform space, the ~

    intersect node either 'document = node/type [
        #[type ~ name ~ public ~ system ~]
    ][
        #[type ~ name ~ value ~]
    ]

But under evaluation, that produces something that is not falsey... antiform runes are trash.

I will say that in Ren-C programming, I do encourage thinking about choosing between null and trash. These are distinct intentions:

obj: make object! [
     field1: ~
     field2: null  ; or ~null~ if you want a quasiform vs. variable lookup
]

Both are considered empty for the purposes of DEFAULT. But accessing field1 will error, while field2 will not error on fetch, and will be falsey.

One should choose wisely...and also note that since trash is an antiform RUNE!, it can hold a string, which turns out to be very useful. See "Labeled Trash RUNE!s in the Wild" for examples.
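The difference in intent between trash and null can be sketched in Python. This is only a model of the behavior described above, not how Ren-C implements it: `Trash`, `fetch`, and `default` are hypothetical names, and a plain dict stands in for the object.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Trash:
    """Stand-in for trash; the optional label models the string it can hold."""
    label: str = ""

NULL = None  # stand-in for null: fetchable, and falsey

def fetch(obj: dict, name: str):
    """Model of variable access: trash errors on fetch, null does not."""
    value = obj[name]
    if isinstance(value, Trash):
        raise LookupError(f"{name} is trash: {value.label or 'unset'}")
    return value

def default(obj: dict, name: str, make_value):
    """Model of DEFAULT: trash and null both count as empty and get filled in."""
    if obj[name] is NULL or isinstance(obj[name], Trash):
        obj[name] = make_value()
    return obj[name]

obj = {"field1": Trash("not initialized yet"), "field2": NULL}

assert not fetch(obj, "field2")        # null: no error, and falsey
try:
    fetch(obj, "field1")               # trash: access is an error
    raise AssertionError("expected LookupError")
except LookupError:
    pass

assert default(obj, "field1", lambda: 10) == 10   # both were "empty"...
assert default(obj, "field2", lambda: 20) == 20
assert default(obj, "field2", lambda: 30) == 20   # ...and are not overwritten once set
```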

Immovable Objects, Unstoppable Forces

Underscore cannot represent null intent directly: it's unreassignable, by design.

The only answer would be some kind of object/map-making dialect. Especially since the emerging pattern is to not use SET-WORD!s either.

Let's say this dialect is actually what {...} does (or maybe {{...}}?). Bare RUNE! literals aren't allowed as values, since underscore is taken to mean null there. So let's say you put RUNE!s in groups, and groups evaluate:

    intersect node either 'document = node/type [
        {type _ name _ public (#a) system _}
    ][
        {type _ name _ value _}
    ]

Anyway, porting modern @rgchris code is likely best going this route, vs. disrupting it more.

Hopefully we can agree on a good dialect here. But even if we can't ultimately agree, remember that with RebindableSyntax, you can customize what {...} does (per user, per module, per function...)


UPDATE: Moved CONSTRUCT dialect for {...} discussion here:

https://rebol.metaeducation.com/t/construct-dialect-for-objects-maps/2569

...so this thread can remain on topic specifically about underscores and R3C.



One behavior where underscore acts like a "nothing" is with SET.

>> set _ 1020
== 1020

The reason this is allowed is for consistency with the aggregate assignment:

>> set [a _ c] pack [1 2 3]
== \~('1 '2 '3)~\  ; antiform (pack!)

>> a
== 1

>> c
== 3

So if you think of that as 3 separate SET instructions:

>> set $a 1 
== 1

>> set _ 2
== 2

>> set $c 3
== 3

This formulation helps us reason about why setting a space works. Unlike in APPEND where the "thing-ness" of space is of prime importance, the variable-to-set slot is a place where this thing-ness is not what's important about it. It's ergonomically easier to allow you to just use the space there than to force people to transform it into a void.
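The "separate SET instructions" reading is easy to model in Python. This is a sketch under assumptions, not Ren-C's implementation: `set_one` and `set_many` are hypothetical names, a dict stands in for the variable environment, and `SPACE` is a sentinel standing in for the underscore.

```python
SPACE = object()  # sentinel standing in for the `_` in a variable-to-set slot

def set_one(env: dict, target, value):
    """Model of SET on one slot: a space target stores nothing,
    but the value still passes through as the result."""
    if target is not SPACE:
        env[target] = value
    return value

def set_many(env: dict, targets: list, values: list) -> list:
    """Model of aggregate assignment as one set_one call per slot."""
    for target, value in zip(targets, values):
        set_one(env, target, value)
    return values

env = {}
assert set_one(env, SPACE, 1020) == 1020   # like: set _ 1020
assert env == {}                           # nothing was stored

set_many(env, ["a", SPACE, "c"], [1, 2, 3])   # like: set [a _ c] pack [1 2 3]
assert env == {"a": 1, "c": 3}
```

Because the aggregate form is defined in terms of the single-slot form, allowing the space in one place and not the other would need a special case.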

I'm not sure how far and wide the tolerance of space as meaning nothing should go.

One extreme notion: if you pass a space to a function whose parameter doesn't accept spaces, so the typecheck fails, but that parameter is marked <opt>, then the function accepts the space as if you'd passed a void.

This may systemically bring the property that @rgchris wants to the places where it can be brought to bear. APPEND wouldn't count, because a space passes its typecheck. But SET might not have to explicitly say it accepts space: when the typecheck for the space fails, it could translate that into the behavior as if you had passed void.

I'm not going to jump straight to that because it feels a little bit dangerous; the fact that you don't take a space today may not mean you won't take it tomorrow. And if people have been assuming they can pass a space just as they could a void, then adding RUNE! to the typecheck later would break their callsites.

So for now SET will explicitly mention it takes space.
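Here is a Python sketch of that "extreme notion" and of why it feels dangerous. Everything in it is an assumption made for illustration: `Param`, `check_arg`, `Rune`, and `Void` are hypothetical names, and Python types stand in for a parameter's typecheck.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rune:
    text: str

class Void:
    """Hypothetical marker for "nothing was passed"."""

SPACE = Rune(" ")
VOID = Void()

@dataclass(frozen=True)
class Param:
    accepts: tuple      # Python types standing in for the parameter's typecheck
    opt: bool = False   # stands in for the <opt> annotation

def check_arg(param: Param, value):
    """Return what the function body would see for this argument."""
    if value is VOID:
        if param.opt:
            return VOID
        raise TypeError("parameter is not optional")
    if isinstance(value, param.accepts):
        return value                      # typecheck passed: the value is used as-is
    if param.opt and value == SPACE:
        return VOID                       # the fallback: a failed space acts like void
    raise TypeError(f"argument not accepted: {value!r}")

# Today the parameter doesn't accept runes, so a space quietly becomes void.
today = Param(accepts=(str,), opt=True)
assert check_arg(today, SPACE) is VOID

# Tomorrow rune is added to the typecheck. The same callsite now delivers an
# actual space to the function body, and its behavior changes without warning.
tomorrow = Param(accepts=(str, Rune), opt=True)
assert check_arg(tomorrow, SPACE) == SPACE
```

Stating the space in the typecheck explicitly avoids that silent switch, at the cost of each function having to opt in.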

I am adamant that RUNE! is the right choice for "blanks" and that falsey was always the wrong choice for a reified in-block non-antiform value.

But I've made another change, and hopefully @rgchris will like it... to go back to using the name "blank" for space runes... there's just more of them now!

Any RUNE! which is all space characters answers truthy to the question BLANK? Everything else answers with null, so it makes blank kind of like a datatype...

>> blank? _
== \~okay~\  ; antiform

>> blank? #a
== \~null~\  ; antiform

>> blank? ____
== \~okay~\  ; antiform

>> blank? [a b c]
== \~null~\  ; antiform

This means that you can point at an underscore and say "that's a blank" and be telling the truth, as you would be if you said "that's a space". Either term can apply.
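As a rough Python model of the test (not the real implementation; `Rune` and `is_blank` are hypothetical names, and treating a zero-length rune as not blank is my assumption, since the examples don't cover it):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rune:
    """Stand-in for a RUNE! value; `_` is Rune(" ") and `____` is Rune("    ")."""
    text: str

def is_blank(value) -> bool:
    """Model of BLANK?: true only for a rune made entirely of space characters."""
    return (
        isinstance(value, Rune)
        and value.text != ""            # assumption: an empty rune is not blank
        and set(value.text) == {" "}
    )

assert is_blank(Rune(" "))              # blank? _
assert not is_blank(Rune("a"))          # blank? #a
assert is_blank(Rune("    "))           # blank? ____
assert not is_blank(["a", "b", "c"])    # blank? [a b c]
```

So "blank" names a subset of RUNE! values rather than a separate datatype, which is why either term can apply to an underscore.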

Much Better Than Empty SPLICE! Being "Blank" 🤢

When underscore became SPACE and the name BLANK "became available", I tried using it for empty splices.

That was unpleasant and confusing. I kept mixing it up, and kept wanting to call underscores blanks. Correcting myself got tiresome, so I decided it was better to call empty splice a "none".

There are lots of ways to make empty splices:

>> spread []
== \~[]~\  ; antiform (splice!) "none"

>> first [~[]~]
== ~[]~   ; <-- not a none, but a "lifted" none (quasiform block!)

>> eval [~[]~]
== \~[]~\  ; antiform (splice!) "none"

>> ~[]~
== \~[]~\  ; antiform (splice!) "none"

But it's nice to have a WORD! for it:

>> none
== \~[]~\  ; antiform (splice!) "none"

>> append [a b c] none
== [a b c]

>> append "abc" none
== "abc"

>> append #{AABBCC} none
== #{AABBCC}

Fewer Fundamentals, But: More Parts, More Power

If you accept that a language needs to have a representation for a space character literal, then you won't begrudge us the "too many notes" aspect of having "blank".

And this speaks also to some @rgchris criticisms:

The roles are just about fully crystallized, and the decisions of when-to-use-what flow automatically.

While I admit that it's been an imperfect journey, I think I've been on a clearly evolving path, which has had the right goals as the design has moved through the necessary decisions and inventions. (The shuffling of words has been a necessary part of that journey, albeit I'm sure it hasn't helped anyone trying to grasp it. I've tried to retcon the forum posts to make the actual steps of the invention process possible to follow.)

In updating a response to a question from @LkpPo with similar criticisms, I see how really strong the choices are:

Clarity + Brevity vs. NULL, BLANK, TRASH, VOID... - #3 by hostilefork

...but also, strangely, how things are starting to point very much toward a "NULL and VOID" duality as the real fundamental parts of "nothing"...with things like empty splices and empty packs or trash serving their well-defined roles.

...bringing it in line kind of with historical NONE! and UNSET!, just in a very futuristic isotopic way!