Do We Need GHOST!, Or Do Invisibles Another Way?

hostilefork · August 1, 2025, 6:35pm

Over time, new capabilities come online... new insights are formed... and it's good to go back to old designs and review them.

So let's start by asking: Is GHOST! Really Necessary?

What Is GHOST!'s Value Proposition?

While GHOST! was conceived for COMMENT, that doesn't actually get much use. The main value of GHOST! by far is ELIDE...something that has side effects and yet discards the result. Its role in PARSE is particularly important, but it's useful pretty much everywhere.

>> parse "aaabbb" [some "a" elide some "b"]
== "a"

ASSERT is also very nice to be able to use in the middle of CASE statements, etc.:

case [
   condition1 [some code]
   condition2 [more code]
   assert [something true if we got here]
   condition3 [more code]
   condition4 [more code]
]

I'm unwilling to give up the benefits of these "invisible" constructs for the sake of simplification. They're just too powerful.

Why Are VOID (empty PACK!) and GHOST! Distinct?

It used to be that empty PACK! had GHOST!'s vanishing behavior.

It was fewer parts. But it did mean that a length-0 PACK! would break the pattern of what PACK! in general did.

>> "abc" pack [1 2]
== \~['1 '2]~\  ; antiform

>> "abc" pack [1]
== \~['1]~\  ; antiform

>> "abc" pack []
== "abc"

That isn't particularly pleasing.

Splitting the intents out allowed VOID to fit better in with its family, and solidly serve its role as the "opting out":

>> thing: null

>> opt thing
== \~[]~\  ; antiform (pack!, void)

>> append [a b c] (print "For instance..." opt thing)
For instance...
== [a b c]

If VOID vanished, it couldn't do things like that.

Slightly Lame: `()` Has Been GHOST! and not VOID

Evaluating an empty BLOCK! or GROUP! has been giving you a GHOST!, and not a VOID:

>> ghost? eval []
== \~okay~\  ; antiform

>> ghost? ()
== \~okay~\  ; antiform

It would be rather neat if () produced VOID, especially because synthesizing a void right now requires you to say either ~[]~ or ^void:

 >> replace [foo baz foo bar foo] 'foo ^void
 == [baz bar]

The empty group would be slicker, and fairly semiotic:

 >> replace [foo baz foo bar foo] 'foo ()
 == [baz bar]

Given new rules about non-^META void assignments, it would also provide a cool way to unset a variable:

var: ()

But under today's mechanics, () has to be GHOST!...not VOID...given that (comment "hi") is GHOST!. It's the "translucent" initial state.

Major Bummer: Vanishing Generalized Evaluations

The biggest problem with inventing a vanishing state is when it vanishes unexpectedly.

To recap this oft-talked-about point, imagine this fairly innocent code:

^result: (code: codemap.key, eval code)

Now imagine that eval code produces a GHOST!. You want that ghost to be stored in result, but instead it vanishes. So you end up assigning code to the result (since it's the result of code: codemap.key).
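
To make the hazard concrete, here's a toy model in Python (not Ren-C, and not how the evaluator is actually implemented). The names `GHOST`, `eval_steps`, `assign_code` and `eval_code` are made up for illustration. It only assumes the rule described above: in a multi-step evaluation, a step that produces a ghost vanishes, so the previous step's product survives.

```python
# Toy model only: each "step" is a zero-argument callable.
GHOST = object()  # hypothetical stand-in for the GHOST! antiform


def eval_steps(steps):
    """Run steps in order; a GHOST result vanishes, so the prior result is kept."""
    result = GHOST  # the "translucent" initial state
    for step in steps:
        value = step()
        if value is not GHOST:
            result = value
    return result


code = ["comment", "hi"]  # pretend this block evaluates to a ghost


def assign_code():
    return code  # models `code: codemap.key`


def eval_code():
    return GHOST  # models `eval code` synthesizing a ghost


result = eval_steps([assign_code, eval_code])
print(result is code)  # True: the ghost vanished, so `code` leaked into result
```

The caller wanted whatever `eval code` made, and instead silently got the block itself.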

We could try and solve this with something like eval:ghostable, such that if you don't provide the :GHOSTABLE refinement then EVAL either panics or provides a non-ghost placeholder value if it synthesizes a ghost.

>> eval []
== \~#ghost~\  ; antiform (some substitution, here a TRASH! rune)

>> eval:ghostable []
== \~,~\  ; antiform (ghost!)

I don't like the EVAL:GHOSTABLE idea...

...because this is a general problem that affects any operation that returns ghosts sometimes... and I think it needs a general solution, not having to add a :GHOSTABLE refinement to every function that fits the pattern.

So my thinking would be that functions like EVAL would be written as true to themselves... returning a GHOST! if they meant GHOST!. But then if the type signature doesn't suggest "ghosts always", some distortion would happen. And you'd move the GHOSTABLE to the outside somehow.

This inversion winds up looking something like:

>> eval [comment "HI"]
== \~#ghost~\  ; antiform

>> ghostable eval [comment "HI"]
== \~,~\  ; antiform (ghost!)

While I used a TRASH! as a placeholder there to sort of "help hint at what was lost", it could probably be argued that the closeness of VOID and GHOST! means you'd be better served by getting a void substitution... it might not be as obvious what went wrong, but a lot of times it might not even make a difference.

- A flag could be stored on the void to say it was a ghost substitute, and reported in the console just as well as a trash could signal it to you.

I used a GHOSTABLE function, but this might not be able to be a normal function... because it has the exact properties of the function it's trying to "fix"... e.g. it returns GHOST! sometimes, but not always.

- We're running out of succinct non-WORD!-operators, but for the sake of argument I'll pick ^#

Taking on those adjustments:

>> eval [comment "HI"]
== \~[]~\  ; antiform (pack!) "void"

>> ^# eval [comment "HI"]
== \~,~\  ; antiform (ghost!)

Now let's look back at our "oblivious" situation, did it get helped? Seems you have two options now:

^result: (code: codemap.key, eval code)

^result: (code: codemap.key, ^# eval code)

If you were trying to get a GHOST! into ^result, neither of these is going to help you. But at least the non-^# version had a shot at working (if VOID was interchangeable with GHOST! for your purpose). And the code didn't do something bonkers by default.

We can imagine that the rules would be relaxed, if you rewrote your code as:

code: codemap.key
^result: eval code

The direct targeting of a ^META assignment could have an implicit ^# in the mechanic, to accept a GHOST!. But then, what about the overall expression?

(code: codemap.key, ^result: eval code)

Kind of back to the same question again, if setting propagates the value.

In Such A World, () Defaulting as VOID Makes Good Sense

You could use the same operator to ask to remove the safeguard:

>> (comment "HI")
== \~[]~\  ; antiform (pack!) "void"

>> ^# (comment "HI")
== \~,~\  ; antiform (ghost!)

This would give parity in a world where the default disposition of eval [comment "hi"] was to give back a void for safety:

>> eval []
== \~[]~\  ; antiform (pack!) "void"

>> ^# eval []
== \~,~\  ; antiform (ghost!)

We'd basically just be saying that a GROUP! kind of crosses the line--as a synonym for EVAL--into "some black box that doesn't always return a GHOST!"

What About Plain Old `^var` When It Holds GHOST! ?

It seems it would probably be beholden to the same rules as function calls that may-or-may-not return a GHOST!.

Note that your standard "top-level" evaluation step in a multi-step expression isn't "ghostable". Contrast with--say--an argument to a function:

>> ghost? ^ghost
== \~okay~\  ; antiform

There's no reason to make a single-step evaluation to fulfill a function argument non-ghostable. Ghosts only vanish in multi-step evaluations (the vanishing happens in the "evaluator executor", not the "stepper executor").

So you could also do direct assignments:

^var: ^ghost  ; transfers the ghost

But in a multi-step-operation which is accruing an evaluative product... if you didn't bless it with ^# it would turn into a void:

 >> 1 + 2 ^ghost
 == \~[]~\  ; antiform (pack!, void)

 >> 1 + 2 ^# ^ghost
 == 3

My instinct here is that ghost-to-void safety is better than panicking.

(Note this is not "decay", it's a distinct evaluator mechanic that's built into the multi-step expression evaluator, e.g. EVAL as opposed to EVAL:STEP. This is also the point where it decides whether to promote an ERROR! to a panic, if the error wasn't consumed as an argument to anything.)
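
As a rough sketch of that mechanic, here is a toy model in Python (again not Ren-C, and not the real evaluator). `Antiform`, `Vanishable`, `eval_steps` and `fetch_ghost` are all hypothetical names; `Vanishable` plays the role of the `^#` decoration. It covers only steps that may or may not produce a ghost, such as a `^ghost` fetch, and leaves out the exemption for functions that only ever return GHOST!.

```python
class Antiform:
    """Tiny named sentinel so printed output is readable."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name


GHOST = Antiform("ghost")  # stand-in for GHOST!
VOID = Antiform("void")    # stand-in for VOID (empty PACK!)


class Vanishable:
    """Hypothetical wrapper that marks a step as blessed with `^#`."""

    def __init__(self, step):
        self.step = step


def eval_steps(steps):
    result = VOID  # in the proposed world, an empty evaluation gives VOID
    for step in steps:
        blessed = isinstance(step, Vanishable)
        value = step.step() if blessed else step()
        if value is GHOST:
            if blessed:
                continue   # allowed to vanish: prior result is kept
            value = VOID   # safety: an unblessed ghost becomes void
        result = value
    return result


def add():
    return 1 + 2


def fetch_ghost():
    return GHOST  # models fetching ^ghost


print(eval_steps([add, fetch_ghost]))              # void
print(eval_steps([add, Vanishable(fetch_ghost)]))  # 3
```

The two printed lines correspond to the `1 + 2 ^ghost` and `1 + 2 ^# ^ghost` transcripts above.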

What About The `~,~` Quasiform Evaluating To GHOST! ?

I question the wisdom of making an exception for ~,~ to evaluate and act like a function that only returns GHOST!

Consider generated code like:

 eval compose [some stuff (lift ^var)]  ; and var happens to be a GHOST!

It's a fine line. If you had composed in code that was like comment "hi" that would be one thing, but synthesizing a lifted value from an arbitrary expression feels less specific.

Cases like this seem like they should have protection against vanishing, and you need to explicitly mark "vanishing okay" if vanishing is truly okay:

 eval compose [some stuff ^# (lift ^var)]  ; potential vanishing expected

But protection is not guaranteed if it's a function:

 eval compose [some stuff (spread [comment "hi"])]

That compose'd-in COMMENT is allowed to vanish without needing a ^# to confirm it, because that's the baseline behavior we want from invisible functions.

This Looks Promising, Especially `()` As VOID

Hopefully it all makes sense, that "functions that only return GHOST!" get the vanishing exemption... and that exemption is not extended to ^META-fetches or ~,~ quasi-evaluations.

(The alternative would be to say you always have to use ^# to decorate things that vanish, functions included. At which point, we might as well just make ^# the vanish operator.)

Note it doesn't make a difference here if you are using a quasiform that makes ghost, or the product of a function that only returns ghosts... so you'd see this:

>> 1 + 2 ^var: comment "same behavior"
== \~[]~\  ; antiform (pack!, "void")

>> ghost? ^var
== \~okay~\  ; antiform

>> 1 + 2 ^var: ~,~
== \~[]~\  ; antiform (pack!, "void")

>> ghost? ^var
== \~okay~\  ; antiform

It does mean ghosts would have a tendency to get dropped on the floor. But while making GHOST!-returning functions and combinators should be easy, writing ghost-aware code (UPARSE or the evaluator itself) is going to involve some special skills.

hostilefork · August 7, 2025, 11:55pm

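This thread asks: "Is GHOST! Really Necessary?", so it's worth debunking something I say above, namely that if everything that vanishes had to be decorated with ^#, "we might as well just make ^# the vanish operator."

Obviously it would be awkward-looking if the only invisible operation was ^# as ELIDE: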

>> 1 + 2 ^# print "caret-pound becomes elide"
caret-pound becomes elide
== 3

But it's not just awkwardness. There really does have to be a state of invisibility, otherwise vanishing cannot be done in a clearly identifiable step...only merged in with a prior or next step. Being forced to choose one or the other may break semantics.

I won't write it all out again. See the origin of GHOST! for the full story.

But the punch line is something like this:

>> case [
        1 = 1 [print "branch"]
        ^# print "reached here first :-("  ; imagine ^# as ELIDE
        1 = 2 [fail "Unreachable"]
    ]
^# got left as [print "branch"]
reached here first :-(
branch

The invisible operation needs its own discrete step. That step has to have a product.

So long as a GHOST! state exists, you might as well use it... and not force simple ELIDE or ASSERT invocations to be decorated with the ^#.

hostilefork · August 1, 2025, 7:21pm

In the above analysis, I looked at the balance between two different needs:

Parenthesizing expressions that vanish:
```
>> 1 + 2 (comment "hi")
== 3
```
Having the comfort of things like eval code not vanishing out from under you by default:
```
^result: (code: codemap.key, eval code)  ; imagine code is [] or similar
```
It seems way too slippery to wind up with ^result: (code: codemap.key) being the behavior due to code you didn't anticipate returning GHOST!

Default sanity wins in my mind... requiring you to intervene with some operator to make generalized multi-step evaluations vanish, vs be "safely" converted to a void (and void may be close enough to a GHOST! for most contexts anyway):

>> 1 + 2 (comment "hi")
== \~[]~\  ; antiform (pack!) "void"

>> 1 + 2 ^# (comment "hi")
== 3

Why The Hideous `^#` Operator, Not A Function?

The argument against a function (like ghostable or vanishable) was:

this might not be able to be a normal function... because it has the exact properties of the function it's trying to "fix"... e.g. it returns GHOST! sometimes, but not always.

But might it be that there's a special function flag, which indicates that "if it says GHOST!, it means it"?

In practice the codebase suggests some functions may fit this rule.

Consider PROBE... doesn't it seem like PROBE of a GHOST! should mirror the ghost, vs be "safely" changed into a VOID just because probe can return arbitrary non-ghost values in the general case?

>> 1 + 2 elide print "hi"
hi
== 3

>> 1 + 2 probe elide print "hi"
hi
\~,~\
== 3  ; isn't this what you likely want?

Or there might be functions similar to LET, where let x would vanish, but let x: 10 would be 10, and let x: ^ghost should be safely turned into void unless you said vanishable let x: ^ghost

Vanishing GROUP!s Would Be Hard For Functions...

This may not seem that hard to hack in a function property for:

>> ghost-if-even 1
== 1

>> ghost? ghost-if-even 2
== \~okay~\  ; antiform

>> <test> ghost-if-even 2
== \~[]~\  ; antiform (pack!) "void"  ; safety effect

>> <test> vanishable ghost-if-even 2  ; use specially flagged function
== <test>

But it would have to be a strange variadic in order to have the property of working with GROUP!s, because the GROUP! would be evaluated before it ran:

>> <test> vanishable (1 + 2)
== 3

>> <test> vanishable (comment "oops")
== \~[]~\  ; antiform (pack!) "void"

You'd generally expect functions to receive the GROUP!'s product, and plain GROUP! evaluations don't produce ghosts anymore... so they'd have to be modified somehow.

Complications In UPARSE Drive My Concerns, Here

I'm trying to keep all the very nice UPARSE demos working.

So there's a lot to balance, in terms of enabling people to write their own coherent "safe" behaviors in their dialects...hopefully building easily on top of mechanics that are available in the evaluator.

This is definitely challenging to design correctly, so I'm pulling all the thoughts together here on this thread.

hostilefork · August 30, 2025, 3:17am

I don't love the idea of such things, but there is precedent (I'm looking at you, infix).

And honestly, the same set of problems comes up in UPARSE if you want infix combinators. The attribute has to be marked and reflectable and something that is understood.

There are those who feel Rebol having infix is a mistake of its design. I'm not one of those people, and I think that adding "GHOST! means vanish--not voided for safety in multi-step evaluations" may just be one of those... things.

I'm honestly looking for things to cut for a minimum viable product. I really am.

But I'm way too happy with ELIDE and its unique brethren to let them go, so... I think this is just another function property in the vein of infixness. I've been trying to find ways to guess it from the return type but I don't think that's the most composable way of doing it... UPARSE needs to know if a combinator means to vanish or not, and it doesn't want to reinvent the mechanism. That's what makes what I'm doing different... I want users to be as powerful as the native authors.
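
For the sake of argument, here's what "a function property in the vein of infixness" might look like in a toy Python model (not Ren-C; every name here is hypothetical, including the `vanishable` attribute). The only assumption is that the property is marked on the function and reflectable, so the evaluator reads it instead of guessing from the return type.

```python
class Antiform:
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name


GHOST = Antiform("ghost")
VOID = Antiform("void")


def vanishable(fn):
    """Hypothetical marker: a GHOST from this function means 'really vanish'."""
    fn.vanishable = True
    return fn


@vanishable
def elide(value):
    return GHOST  # always invisible, whatever it was given


def ghost_if_even(n):
    return GHOST if n % 2 == 0 else n  # ghost only sometimes, and unflagged


def tag():
    return "<test>"


def eval_steps(calls):
    """calls is a list of (function, args) pairs; each pair is one step."""
    result = VOID
    for fn, args in calls:
        value = fn(*args)
        if value is GHOST:
            if getattr(fn, "vanishable", False):
                continue  # flagged: the ghost is taken at face value
            value = VOID  # unflagged: voided for safety
        result = value
    return result


print(eval_steps([(tag, ()), (ghost_if_even, (2,))]))  # void
print(eval_steps([(tag, ()), (elide, ("hi",))]))       # <test>
print(eval_steps([(tag, ()), (ghost_if_even, (1,))]))  # 1
```

Because the flag lives on the function rather than inside `eval_steps`, a dialect with its own stepping loop (as UPARSE would have for combinators) could consult the same attribute without reinventing the mechanism.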

So I think the right way to look at this is to probably do a couple of infix combinators, and look for the commonalities in mechanism.