Why have an "Unset State" in Rebol Languages?

Could you explain in layman's terms why unset! exists in Rebol-type languages?

To my brain, something like none makes more sense to use for 'no thing'. I get that unset removes the reference between word and value(s), but unset pops up for "no return value", even though it's a return value.

UNSET!'s existence ostensibly comes from wanting to distinguish a variable that is not set (and should cause an error if accessed) from one that is set. Rebol would be a less safe language if such a state were not available. You'd just go typing if my-varable = 1 [...] when you meant to say my-variable, and there'd be no indication of a problem.

Note that the reason you don't get a "no binding" error for a misspelling is the somewhat liberal choice to bind words dynamically in the user context just because they appear. That's also why you can write:

>> print-foo: does [print foo]  ; gets bound here
>> print-foo
** error (it's in bound state but not set)
>> foo: 10
>> print-foo
10

The foo in print foo gets a binding to the user context, even though there was no foo defined there beforehand. If that weren't the case, and the user context expanded only on seeing set-words, there'd be more safety against typos... which might make it somewhat less critical to have a "defined-but-not-set" state.

Yet I'd still argue that typos aside, it's valuable to have knowledge of when you've used something before you've explicitly assigned it. So this "bound yet triggers an error in casual usage" state is generally useful.
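As a rough analogy only (this is Python, not how any Rebol interpreter is implemented), the "bound yet triggers an error in casual usage" state can be sketched with a private sentinel. The names Context, UnsetError, bind and is_set are hypothetical, invented for this illustration:

```python
class UnsetError(Exception):
    """Raised when a bound-but-unset word is read casually."""


_UNSET = object()  # private sentinel; never handed out as an ordinary value


class Context:
    """Hypothetical stand-in for a user context of word -> value slots."""

    def __init__(self):
        self._slots = {}

    def bind(self, word):
        # Binding only creates the slot; it does not give it a value.
        self._slots.setdefault(word, _UNSET)

    def set(self, word, value):
        self._slots[word] = value

    def is_set(self, word):
        # The status is a question asked about the variable, not a value.
        return self._slots.get(word, _UNSET) is not _UNSET

    def get(self, word):
        if word not in self._slots:
            raise KeyError(f"{word} has no binding")
        value = self._slots[word]
        if value is _UNSET:
            raise UnsetError(f"{word} is bound but not set")
        return value


user = Context()
user.bind("foo")  # like FOO merely appearing in loaded code

try:
    user.get("foo")
    outcome = "no error"
except UnsetError:
    outcome = "unset error"  # this branch runs: foo is bound but not set

user.set("foo", 10)  # after this, user.get("foo") succeeds
```

If the slot defaulted to None instead of the sentinel, the first read would quietly succeed, which is the typo hazard described above.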

But I agree with you on the next point... that conveying this state of a variable as a "value" which has a "type", and can be inserted into a block, is a bad idea.

One could imagine it not being conveyable at all except through an is-variable-set? test for those concerned with the status...and other forms of access would be errors. But one could also designate it a special transitional status that simply can't be put in blocks.... which is what Ren-C does.

So Ren-C's TRASH and NULL and VOID are not thought of as "array elements". "set?" or "unset?" are questions asked of variables, e.g. SET? 'X or UNSET? 'X.Y. The system protects against any instances of a [trash or isotope] making it into the body of a block or other array.

Now once you have non-array-element things, it would be a bit of a shame to waste their unique status. Ren-C uses NULL as the outcome of a failed conditional, while forcing all successful conditionals to some value:

>> if 1 = 2 ['d]
== \~null~\  ; antiform

>> block: copy [a b c]

>> append block if 1 = 1 ['d]
== [a b c d]

>> append block if 1 = 2 ['d]
** Panic: Can't append \~null~\ antiform to a list

>> append block lift if 1 = 2 ['d]
== [a b c d ~null~]  ; quasi of word! null is meta of null

Since you never meant to intentionally append nullness itself to a block (as that is impossible) you have this extra dimension of flexibility.

Operations take advantage of this out-of-band nature to differentiate between no value and a legitimate blank value:

>> block: copy [a _]

>> take block
== a

>> take block
== _

>> take block
** Error: Could not TAKE from empty block, use TRY if intentional

>> try take block
== \~null~\  ; antiform

Similar solutions help one put blank values in maps, and distinguish that from the absence of a value.
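The map case has a familiar analogue in other languages. Here is a minimal Python sketch, assuming None plays the role of a stored blank and a private sentinel plays the role of the out-of-band state. The names ABSENT, settings and describe are made up for the example:

```python
ABSENT = object()  # out-of-band marker; by convention never stored in the map

settings = {"color": None}  # None stands in for a legitimately stored blank


def describe(mapping, key):
    # dict.get with a sentinel default separates "no such key"
    # from "key present, holding a blank".
    value = mapping.get(key, ABSENT)
    if value is ABSENT:
        return "absent"
    if value is None:
        return "present but blank"
    return "present"
```

This only works because ABSENT is never used as a real value. Once the marker can be stored in the map, the distinction collapses again.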

To summarize: the TRASH state which is helpful in denoting a transient situation of "absence of value" becomes much more useful when you can confidently say it's never stored in blocks or other data structures. Then it can really indicate some out-of-band quality. Blanks (like historical "nones") are too casually used as actual placeholders to serve these purposes.

Here is some of what Carl had to say on the topic in his post UNSET! is Not First Class (5-May-2010)

I understand that Ren-C's "unset state" ("TRASH!") can be "lifted":

>> obj: make object! [x: 10, y: ~, z: 20]
== &[object! [x: 10 y: '~ z: 20]]

>> obj.y
** Panic: OBJ.Y is TRASH!

>> lift get:any $obj.y
== ~

The lifted state is a quasiform, and this fits into the other "out-of-band" states in the system.

And I've read Carl's notes above, too.

But I still don't see why exposing this state to the user is better than just keeping it as an internal flag, and always generating an error?

Having used Red, I think the simplest answer would be to drop UNSET! entirely, vs. trying to find ways to manipulate it. A system without unset exposed to users--at all--seems like a safer system.

This comes up every couple of years or so. Getting rid of variables being able to have an "unset state" doesn't help...it only hurts.

Let's say I define an object with some function as an optional member, and the only options you give me are "it's not there" or "it's there and it's none". So either references to it error as if I had never defined anything at all, or, if I mention it while it's unused, the none passes right through the evaluator without an error. Not good.

And what of things like this?

for-each [x y] [1 2 3] [
   probe :x
   probe :y
]

Does it make sense for error messages to be unable to distinguish a variable that was never defined at all from one that is defined but has no value? And if Y were simply given a none here, the result would conflate with what you'd get if the block had been [1 2 3 #[none]], so you couldn't tell the difference.

HOWEVER... this highlights that getting rid of "unset!" as a state that can occur in blocks is VERY helpful. That was actually one of the first things Ren-C did (and Red would benefit greatly from taking at least this step).

(Note that I call the unset state "trash!" because I think it makes more sense to say "variables are unset" vs. "values are unset".)

Here's a bit of terminology comparison with Rebol2/Ren-C/Red:

"Unset" vs. "Unresolvable"

For first level picks out of objects, Red doesn't seem to make the unset/unresolved distinction in the same way Rebol2 or R3-Alpha do:

red>> obj: make object! [set/any quote x: #(unset)]
== make object! [
    x: unset
]

red>> unset? :obj/x
== true

red>> unset? :obj/asdf
== true  ; Rebol2 and R3-Alpha error here

(Note: Red's SELECT is inconsistent here: (select obj 'x) gives an unset, while (select obj 'asdf) gives none)

But if you were to extend the path you'd get an error, which I would say comes from "lack of resolution"... so at least that's still "unresolved".

red>> unset? :obj/asdf/jkl
*** Script Error: asdf is unset in path :obj/asdf/jkl

Clearly, You Can't Get Rid of "Unresolvedness"!!!

Hence when people speak about wanting to "get rid of unsetness", what they actually want is to:

  1. turn situations where you would "legitimately want an unset" into unresolvedness

  2. declare any other situation "illegitimate/unnecessary" and use a NONE!.

By definition, nothing you can put in a variable can directly denote unresolvedness. All you can do is query about resolvability. If you get a state back and store it in a variable, then what you store is either conflated or a "meta-representation".

To survey the options, imagine:

  • UNRESOLVABLE? => gives you a logic
  • GET-AS-NONE-IF-UNRESOLVABLE => a conflating operation
  • GET-AS-BLOCK-IF-RESOLVABLE => some meta-operation

So:

>> unresolvable? 'obj/asdf/jkl
== true

>> get-as-none-if-unresolvable 'obj/asdf/jkl
== none

>> get-as-block-if-resolvable 'obj/asdf/jkl
== none

>> obj/x: 1020

>> unresolvable? 'obj/x
== false

>> get-as-none-if-unresolvable 'obj/x
== 1020

>> get-as-block-if-resolvable 'obj/x
== [1020]

>> obj/x: [10 20]

>> get-as-none-if-unresolvable 'obj/x
== [10 20]

>> get-as-block-if-resolvable 'obj/x
== [[10 20]]

>> obj/x: none

>> get-as-none-if-unresolvable 'obj/x
== none  ; conflated

>> get-as-block-if-resolvable 'obj/x
== [#(none)]  ; not conflated

(Ren-C is more elegant w.r.t. meta-operations, but conceptually similar.)
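For readers who think better in another language, the three imagined operations can be sketched in Python over nested dicts. This is an illustration of the idea, not a port of any Rebol, Red or Ren-C behavior. The function names mirror the imagined ones, while Unresolvable, resolve and world are assumptions of the sketch:

```python
class Unresolvable(Exception):
    """Raised when a path cannot be followed to a value."""


def resolve(root, path):
    # Walk nested dicts; any missing step makes the whole path unresolvable.
    current = root
    for step in path:
        if not isinstance(current, dict) or step not in current:
            raise Unresolvable("/".join(path))
        current = current[step]
    return current


def is_unresolvable(root, path):
    # Pure query: the answer is only ever a boolean.
    try:
        resolve(root, path)
    except Unresolvable:
        return True
    return False


def get_as_none_if_unresolvable(root, path):
    # Conflating: a stored None and a failed lookup look identical.
    try:
        return resolve(root, path)
    except Unresolvable:
        return None


def get_as_block_if_resolvable(root, path):
    # Meta-representation: success is wrapped in a one-element list,
    # so a stored None comes back as [None] rather than None.
    try:
        return [resolve(root, path)]
    except Unresolvable:
        return None


world = {"obj": {"x": None}}
```

With world as defined, get_as_none_if_unresolvable(world, ["obj", "x"]) and the same call on ["obj", "asdf", "jkl"] both give None, while get_as_block_if_resolvable gives [None] for the first and None for the second.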

You might feel vindicated looking at that, and think "getting rid of unset" makes sense, because you're in the same boat whether you have unset or not. Queries and meta-representations are always going to be the tools of last resort.

So why introduce another tier in this hierarchy for "resolvable, but initialized to trash"?

Technically Possible To Drop Unset, but Flying Blind

People who promote "no unsets, only unresolveds" would say that if you can enumerate something (such as an object), then everything a handed-back key maps to should be friendly... e.g. at least a none. Your binary choice is: it's there and it's initialized to something that you can access without raising an error, or it's not there to find in the first place.

Being able to define an object with slots in it--including slots for functions--where those slots hold "ornery" content on WORD!-fetch is an extremely useful invention. So useful that Ren-C made its TRASH! able to hold a message, so you can be more descriptive about why something is unset.
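To make the "ornery slot with a message" idea concrete, here is a small Python sketch. It is only an analogy: Trash, TrashError, Slots and fetch are hypothetical names for this example and say nothing about how Ren-C represents TRASH! internally.

```python
class TrashError(Exception):
    """Raised when a slot holding trash is fetched casually."""


class Trash:
    """Marks a slot as defined but not meaningfully set, with a reason."""

    def __init__(self, message=""):
        self.message = message


class Slots:
    """Hypothetical object whose keys can be enumerated even when ornery."""

    def __init__(self, **slots):
        self._slots = dict(slots)

    def keys(self):
        # Trash slots still show up when enumerating the object.
        return list(self._slots)

    def fetch(self, name):
        if name not in self._slots:
            raise KeyError(name)  # "not there to find in the first place"
        value = self._slots[name]
        if isinstance(value, Trash):
            # The slot exists, but casual access explains why it is unusable.
            raise TrashError(f"{name} is trash: {value.message}")
        return value


plugin = Slots(name="demo", on_close=Trash("no close handler was supplied"))
```

Here plugin.keys() lists on_close alongside name, but plugin.fetch("on_close") raises a TrashError carrying the message, which is a third outcome distinct from both a missing key and a quiet None.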

"Defined but not holding a meaningful value" is useful. You shouldn't throw the baby out with the bathwater... stop the unset state from getting into blocks, but keep it as a useful tool for variables.