Catching Mistakenly Discarded Values, Arity Bugs, Etc.

I've been bitten a lot over the years when a value that has no side effects just gets dropped on the ground.

For example: say I added a branch to an IF but forgot to add an ELSE (or change it to EITHER):

>> if 1 = 2 [print "then"] [print "else"] print "Boo!"
Boo!

The else didn't get printed. And from experience, I know that mistakes of this form can be surprisingly hard to debug.

I finally got around to adding a simple feature that catches this... sometimes:

>> if 1 = 2 [print "then"] [print "else"] print "Woo!"
Woo!
** PANIC: Non-discardable value discarded: [print "else"]

WHY "SIMPLE"? (AND WHY "SOMETIMES"?)

An interesting aspect is what discardability turned out to not be: it's not some CELL_FLAG_NODISCARD bit that gets piped around on all cells, that everyone in the universe starts to have to triage and forward.

Internals don't get involved. A function result is simply not discardable if its interface says PURE, and discardable otherwise:

>> index of [a b c] print "INDEX-OF is PURE"
INDEX-OF is PURE
** PANIC: Non-discardable value discarded: 1

>> all [1 2 3] print "ALL isn't PURE: not using 3 isn't an error"
ALL isn't PURE: not using 3 isn't an error

So when you evaluate a whole block or a whole function, there's no concept of that being discardable or not. It's an implementation detail of eval stepping, not something that escapes as a flag on full eval products.
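
As a rough illustration, here is a toy model of such a stepper in Python. It is only a sketch of the idea, not the real evaluator: `Call`, `Literal`, `eval_block`, `index_of` and `if_` are names made up for this example, arguments are assumed to be gathered already, and vanishing is ignored. What it tries to show is that the discardability bit is a local variable of the stepping loop and is never attached to the value that gets returned.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Union


class DiscardError(Exception):
    """Stands in for the PANIC shown in the transcripts."""


@dataclass
class Call:
    """One evaluation step: a function call whose arguments are already gathered."""
    func: Callable[..., Any]
    args: tuple = ()
    pure: bool = False  # read from the function's interface, never from its result


@dataclass
class Literal:
    """One evaluation step: an inert value sitting in evaluative position."""
    value: Any


Step = Union[Call, Literal]


def eval_block(steps: List[Step]) -> Any:
    """Run each step in order and return the last product as a bare value."""
    out, discardable = None, True  # `discardable` is local stepper state only
    for step in steps:
        prev, prev_discardable = out, discardable
        if isinstance(step, Literal):
            out, discardable = step.value, False
        else:
            out = step.func(*step.args)
            discardable = not step.pure
        if not prev_discardable:  # a later step replaced a product nobody consumed
            raise DiscardError(f"Non-discardable value discarded: {prev!r}")
    return out  # no flag travels with the result


def index_of(series: list) -> int:
    """Toy PURE function: reports position 1, as for a series at its head."""
    return 1


def if_(condition: bool, branch: List[Step]) -> Any:
    """Toy non-PURE IF: runs the branch with the same stepper, or returns None."""
    return eval_block(branch) if condition else None


if __name__ == "__main__":
    try:
        # Models: index of [a b c] print "INDEX-OF is PURE"
        eval_block([
            Call(index_of, (["a", "b", "c"],), pure=True),
            Call(print, ("INDEX-OF is PURE",)),
        ])
    except DiscardError as error:
        print("** PANIC:", error)
```

In this model a block that drops out of the inner `eval_block` as the result of the non-PURE `if_` comes back as a plain value, so the outer loop has nothing to complain about.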

The implication is that there are things it just doesn't catch. For example, if a value drops out as the result of a non-pure function, there's no alert when you don't use it:

>> if 1 = 1 [if 1 = 2 [print "then"] [print "else"]] print "Boo!"
Boo!

A little unfortunate, but... we can't ask the internals of a function to do bookkeeping on some invisible flag. Especially a flag that--by definition--gets cleared when you assign it!

There are FAILURE! antiforms for this... but they are for exceptional cases, and you only deal with them in error handling.

If "nodiscard" carried the baggage of FAILURE!, it would be like asking all code everywhere to pay the triage tax as if they were dealing with exception handling...just for this discard safety!

(And it would require giving up a precious cell flag, for CELL_FLAG_DISCARDABLE, as well.)

Nope. It's best kept simple like this.


DESPITE SIMPLICITY, THIS DESIGN DOES CATCH BUGS

For instance: all the branches below in some test code were intended to return a string, but the last branch is missing a SPACED:

case [
    expected-id [
        spaced ["did not error, but expected:" (mold quasi expected-id)]
    ]

    result = '~null~ [
        "test returned null"
    ]

    quasi? result [
        "test returned antiform:" (mold:limit result 40)
    ]
]

So it caught that "test returned antiform:" was just dropped on the floor.


It Does Disallow "Ignoring Inert Values As A Feature"

I've had some weird ideas in the past. Consider, for example, a modification of EITHER that lets you label the branches, say to indicate which one is taken more often (along the lines of [[likely]] and [[unlikely]] in C++20):

my-either condition [<unlikely> ...code...] [<likely> ...code...]

Skipping the tag silently might be considered a feature. :thinking:

But when all is considered, I believe that forcing constructs to skip the tags before executing is better practice.

You might wonder why it waited until after "Woo!" was printed to inform you that the result wasn't used. This is because of vanishing.

If the next statement disappeared, the result might drop out to whoever was consuming it (the console, in this case):

>> if 1 = 2 [print "then"] [print "else"] elide print "Vanish!"
Vanish!
== [print "else"]

But if something non-vanishable comes after the vanishing, you get the error again:

>> if 1 = 2 [print "then"] [print "else"] elide print "Vanish!" 300 + 4
Vanish!
** PANIC: Non-discardable value discarded: [print "else"]
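
A rough Python sketch of that deferral, assuming each step either produces a value or vanishes. `VANISH`, `elide` and `eval_steps` are hypothetical names, and each step simply carries its own discardability so the example stays small. It is a toy model, not the actual stepper.

```python
from typing import Any, Callable, List, Tuple

VANISH = object()  # sentinel: the step produced no value at all


class DiscardError(Exception):
    """Stands in for the PANIC shown in the transcripts."""


# A step is (thunk, discardable): the thunk runs the expression, and the
# bool says whether its product may be silently replaced by a later one.
Step = Tuple[Callable[[], Any], bool]


def elide(thunk: Callable[[], Any]) -> Any:
    """Run something for its side effects, then vanish (toy ELIDE)."""
    thunk()
    return VANISH


def eval_steps(steps: List[Step]) -> Any:
    """Run the steps, complaining only when a product is really overwritten."""
    out, out_discardable = None, True
    for thunk, discardable in steps:
        result = thunk()
        if result is VANISH:
            continue  # `out` and its status are left exactly as they were
        if not out_discardable:  # only now do we know the old product is lost
            raise DiscardError(f"Non-discardable value discarded: {out!r}")
        out, out_discardable = result, discardable
    return out  # whatever survived drops out to the caller
```

Because the check can only happen once the next step is known to have produced something, the side effects of that step (the printing) have already taken place by the time the error is raised.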

I have used this feature

  • for blocks as a quick way to comment out big chunks of code during debugging, and

  • for strings as an experiment to get "live" comments which could be inspected during runtime.

I am not sure how these usages weigh against finding the class of errors you mention above.


So I found that I had also used TAG! to get middle-of-line "comments" in the bootstrap process:

depends: reduce [
    either user-config.main [
        gen-obj user-config.main (<directory> null) (<options> [])
    ][
        gen-obj file-base.main (join src-dir %main/) (<options> [])
    ]
]

This particular use of labeling arguments is anachronistic considering APPLY exists...which lets you name parameters or not (if you pass required parameters in order):

depends: reduce [
    either user-config.main [
        gen-obj // [user-config.main, directory: null, options: []]
    ][
        gen-obj // [file-base.main, join src-dir %main/, options: []]
    ]
]

(Commas there are optional.)


How About: Exempt TRASH! From Nodiscard Rules

Given how few functions take TRASH! parameters... maybe there's no need to report when it gets dropped on the floor:

depends: reduce [
    either user-config.main [
        gen-obj user-config.main (~<directory>~ null) (~<options>~ [])
    ][
        gen-obj file-base.main (join src-dir %main/) (~<options>~ [])
    ]
]

This way, if you have something light to say--to where a full-on COMMENT or ELIDE in an evaluative context would be too verbose--a quasi-TAG! gives you a pretty wide range of expression. A quasiform tag can even contain > and < inside. ~<<like><so>>~

That's a compelling possibility... to say that you don't mind "silently throwing out the TRASH!"... because the odds you were actually trying to pass it somewhere are very low.


Okay, This is Where "SuperTAG!" May Come In...

I've pointed out how with string interpolation, we need to match up open and close things to get the necessary structure.

If it didn't enforce those rules, then you would not be able to do things like:

interpolate <The thing is (if 1 > 2 [thing])>

The > has to not end the TAG!.

This means that you really could use ~<[...]>~ to "comment out arbitrary code", because inside the TAG! it has to enforce the rules of the scanner!

So long as the tag is obeying a nested scan, you've got your commenting tool... if we allow it to discard.

It won't vanish, but if vanishing was important to you THEN you would use COMMENT or ELIDE or whatever.

(But note you'll be storing the whole thing as UTF-8 source code but not structure. If that bothers you, use COMMENT [...] and it will transcode the block.)
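
To make "nested scan" a little more concrete, here is a small Python sketch. It is not the real scanner and makes simplifying assumptions that are mine, not the design's: `scan_tag` is a hypothetical helper, strings and escapes are ignored, and `<` / `>` only count as delimiters when the innermost open delimiter is itself a `<` (inside parentheses or brackets they are treated as ordinary characters).

```python
CLOSERS = {"(": ")", "[": "]", "{": "}", "<": ">"}


def scan_tag(source: str, start: int = 0) -> str:
    """Return the tag text beginning at source[start], delimiters included."""
    if source[start] != "<":
        raise ValueError("tag must start with '<'")
    stack = []  # currently open delimiters, innermost last
    for i in range(start, len(source)):
        ch = source[i]
        in_angle = not stack or stack[-1] == "<"
        if ch == "<" and in_angle:
            stack.append(ch)
        elif ch in "([{":
            stack.append(ch)
        elif ch == ">" and stack[-1] == "<":
            stack.pop()
            if not stack:  # closed the outermost '<': the tag is complete
                return source[start:i + 1]
        elif ch in ")]}":
            if CLOSERS[stack[-1]] != ch:
                raise ValueError(f"mismatched {ch!r} at offset {i}")
            stack.pop()
        # anything else (including '>' inside parens) is ordinary content
    raise ValueError("unterminated tag")
```

Under those assumptions, the `>` in `(if 1 > 2 [thing])` does not end the tag, and `<<like><so>>` scans as a single tag.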

Seems to be a great idea. :slight_smile:

I'm not following...

Once the interpreter clears the stack and knows the arguments weren't passed to a function call or word/path assignment, why can't it realize it's not by design?

Sure, it's possible for IF to know. But this knowledge has to cross the barrier from "inside IF's implementation" to "outside of IF". I'm arguing that it's a bad idea to go down the slippery slope of building that communication pipe.

If you do that, you're saying that the machinery of every function does some grinding and gives you back a pair of things: the Cell representing the return value, and a flag as to whether it is discardable.

(And the inevitable optimization is to say "make the flag a bit in the Cell header".)

Now, imagine a complex function with lots of branches and intermediate variables trying to do the bookkeeping to know whether to return this flag. That bookkeeping is going to depend on proxying the flag off of component operations. So you've manifested a "concern" into the world with this bit...and it gets nasty for native code, and nigh impossible for usermode code. Considered holistically, I think it doesn't work out to be worth it.

So I'm preferring to say this concern is laser-focused in the mechanics of eval-stepping, and doesn't escape to the rest of the system.
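
To illustrate the kind of bookkeeping being avoided, here is a hypothetical Python sketch of the rejected design, where every result travels with a flag. Nothing like this exists in the codebase; `Product`, `flagged_if`, `data_block` and the two wrappers are invented purely for the illustration.

```python
from typing import Any, Callable, NamedTuple


class Product(NamedTuple):
    """Hypothetical: every function result paired with a discard flag."""
    value: Any
    discardable: bool


def flagged_if(condition: bool, branch: Callable[[], Product]) -> Product:
    """IF under this design has to forward whatever flag its branch produced."""
    if condition:
        return branch()
    return Product(None, True)


def data_block() -> Product:
    """A branch whose product is an inert block, so it is marked non-discardable."""
    return Product(["data", "block"], False)


def naive_wrapper(condition: bool) -> Product:
    """Usermode code that stores the result in a variable along the way."""
    block = flagged_if(condition, data_block).value  # assignment clears the flag
    return Product(block, True)  # ...and nothing re-marks it on the way out


def careful_wrapper(condition: bool) -> Product:
    """Same logic, but the author proxies the flag through by hand."""
    result = flagged_if(condition, data_block)
    return Product(result.value, result.discardable)
```

The two wrappers return the same value, but only the one whose author remembered to forward the flag keeps the protection. Multiply that by every branch and intermediate variable in a real function and the cost becomes clear.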

Arguments might be made that "just because it would be a mistake to fully generalize doesn't mean you couldn't have natives sneak out a flag here and there." Perhaps my hard-line architectural rule is throwing out some small helpful things that a simple native could do. :man_shrugging:

But... having implemented it in a way I find satisfying, I don't like that direction. My gut reaction is to say it's better to live with the hard rule.

But why should if bother? The interpreter itself knows that that block didn't go into the if arguments. I think the interpreter alone has enough knowledge to trip this case up: no function call frame is open, but a passive (block) value is being pushed onto the stack - that's not good.

But BLOCK! is a valid IF product... this is a (currently) idiomatic way to make a block:

if condition [[data block]]

My point remains: in order to make a decision about whether that block gets used or not, you have to communicate a flag across the IF function call barrier. I've laid out my misgivings about that flag.

However... we could make the assumption that plain BLOCK! is never valid if not an argument to a function. So I could force you to say either:

if condition [$[data block]]  ; if you want bound in current context

if condition ['[data block]]  ; if you want unbound

And Ren-C does allow shorthands that dodge making an evaluator level altogether (literal branching):

if condition $[data block]

if condition '[data block]

So the loss of [[data block]] as a branch wouldn't be all that catastrophic....

But now this is a new evaluator flag to track... EVAL_FLAG_OUT_NO_DROPOUT. So BLOCK! would not set EVAL_FLAG_OUT_IS_DISCARDABLE, but would set EVAL_FLAG_OUT_NO_DROPOUT. While $BLOCK! and 'BLOCK! would set neither...hence allowing dropout.

I'm skeptical this would be worth it. Introduces another nuance in the implementation...and makes you put decoration in source where you might not want it.

It feels too prescriptive (and if it seems too prescriptive to me, most other people will almost certainly think so).

I get it now.

So [data block] can be a return value. And the outer function can also use it as a return value. And so on.

We're not always using return values: we may be uninterested in them, and don't want to always explicitly state that disinterest. And as long as there was a side effect, we can't assume the return value was the goal and warn about losing it.

So if I write:

if 1 = 1 [print "A" [some block] [other block]]

...then [some block] can trip the warning, but [other block] can't. Right?


Yup.

I think this is the correct tradeoff.

The bad news is: it doesn't catch everything.

The good news is: it doesn't pollute the world with some transitive concern or require a cell flag.

EVAL:STEP might offer to return the flag for those interested... but that would be as far as it leaked out.