What Could Be A Shorthand for `inline (opt rule)` ?

Today, rules that look up to NULL will PANIC in parse--right at the moment of fetching the word:

>> prefix: null, suffix: ")"

>> parse "aaa)" [prefix, some "a", suffix]
** PANIC: (prefix is null, and we raise errors for that in parse)

If we didn't panic, there would really only be two other options:

  1. Make null always succeed, keeping the parse position where it is (synonym for [])

  2. Make null always be an unsuccessful combinator match, but not cause a failure (synonym for VETO)

The behavior I'm looking for in this post is optionality, which would correspond to the "always succeed" behavior (e.g. if there is no prefix, I want to skip matching it). But I feel it is pretty obvious that this should not be the default behavior for encountering a null variable.

(If null variables did anything other than error, they should probably do (2) and not match... but I don't think that's very wise compared with panic'ing.)
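To make the three possible behaviors concrete, here is a toy model in Python. It is only a sketch under stated assumptions: parsers are plain functions from a position to a new position (or `None` on mismatch), Python's `None` stands in for NULL, and the names `rule_from`, `null_policy`, `sequence`, and `ParsePanic` are invented for illustration. None of this is how Ren-C's PARSE is actually implemented.

```python
from typing import Callable, Optional

# A parser takes (data, position) and returns the new position, or None on mismatch.
Parser = Callable[[str, int], Optional[int]]


class ParsePanic(Exception):
    """Stands in for a PANIC: it aborts the whole parse, it is not a mismatch."""


def literal(text: str) -> Parser:
    def parser(data: str, pos: int) -> Optional[int]:
        return pos + len(text) if data.startswith(text, pos) else None
    return parser


def rule_from(value: Optional[str], null_policy: str = "panic") -> Parser:
    """Build a parser from a variable's value; None models a NULL variable."""
    if value is not None:
        return literal(value)
    if null_policy == "succeed":      # option 1: behaves like an empty block
        return lambda data, pos: pos
    if null_policy == "fail":         # option 2: never matches
        return lambda data, pos: None

    def panic(data: str, pos: int) -> Optional[int]:
        raise ParsePanic("rule variable is null")
    return panic                      # today's behavior: error when the rule is used


def sequence(*parsers: Parser) -> Parser:
    def parser(data: str, pos: int) -> Optional[int]:
        for p in parsers:
            result = p(data, pos)
            if result is None:
                return None
            pos = result
        return pos
    return parser
```

With `null_policy="succeed"`, `sequence(rule_from(None, "succeed"), literal("aaa"), literal(")"))` matches all of `"aaa)"`. With `"fail"` the same sequence is a quiet mismatch, and with the default policy it raises, so an accidentally unset rule can't be mistaken for either outcome.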

Optionality From Null Today: inline (opt rule)

If you don't care whether a variable you're going to use as a rule is "falsey" or not, you've always had the option of an empty block (and now we have an even more flexible option with the empty splice, a.k.a. NONE!)

INLINE will take the synthesized product of the following rule (usually a GROUP!) and use it like a rule--as if it had been COMPOSE'd into the stream of operations.

If you inline a NULL, that's presumed to be an accident so it panics. But if you inline a VOID, that is considered a no-op, so it does nothing. Hence if you OPT a NULL to get a VOID and inline that, it will succeed with no effect:

>> prefix: null, suffix: ")"

>> parse "aaa)" [inline (opt prefix), some "a", inline (opt suffix)]
== ")"

It works, but it's wordy. :-/

One can wonder about a combinator which was more succinct to accomplish the same thing, without the parentheses and without the OPT.

Since we don't have a name for it, let's call it PERHAPS for a moment:

>> prefix: null, suffix: ")"

>> parse "aaa)" [perhaps prefix, some "a", perhaps suffix]
== ")"

It seems like a close parallel to OPT in the evaluative world, where NULL is turned into VOID with other things passed thru...

...but PARSE's OPT is entrenched as being about the optionality of whether an existing rule matches... not the optionality of the rule itself. So this is a fundamentally different intent.

PERHAPS's Problem: We Can't "COMBINATE" NULL

Unless quoting is involved, the combinator that would be getting the NULL from a word lookup initially would be the WORD! combinator.

So a parser produced by the WORD! combinator will PANIC when another combinated parser calls it and the word turns out to be null. It can't return an ERROR!, because that would just be interpreted as a rule that didn't match.

So the only way I can see a null-tolerant PERHAPS fitting in is for it to quote its argument, so the WORD! combinator doesn't get involved. It would then do the WORD! fetch itself, turning into a no-op combinator (one that succeeds without advancing the input) if it fetched null.

That may seem to work, but...

Compositional Problems With PERHAPS

Let's say you wanted this:

"if there's a prefix, match some non-zero number of instances, but if prefix is null then don't worry about matching":

INLINE can do it:

>> parse "aaa)))" [
       inline (if prefix '[some prefix])
       some "a"
       inline (if suffix '[some suffix])
   ]
== ")"

You can actually leverage VETO and COMPOSE here, to get the COMPOSE to cut itself short:

>> parse "aaa)))" [
       inline (opt compose [some (if not prefix [^veto])])
       some "a"
       inline (opt compose [some (if not suffix [^veto])])
   ]
== ")"

You can shorthand that to get something brief that doesn't need to repeat PREFIX/SUFFIX. But it would still be kind of long.

But what if we tried to do that with the hypothetical PERHAPS...could it work?

>> parse "aaa)))" [some perhaps prefix, some "a", some perhaps suffix]
; infinite loop!

There's a problem here: `perhaps prefix` just succeeds and doesn't advance the input when prefix is null. If you combine that with `some`, the null case will just match nothing in perpetuity, causing an infinite loop.

This may look familiar, because if you write `some opt [...anything...]` you'll always get an infinite loop. But in that case it's just wrong thinking: since `some` repeats until it sees an eventual non-match, you must have intended either `some [...anything...]` (at least one) or `opt some [...anything...]` (zero or more).
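Here is a small Python sketch of why the loop never ends. Everything in it is hypothetical: `perhaps` and `some` are toy stand-ins for the combinators discussed here, `None` models a null variable, and the `max_iterations` cap exists only so the demonstration terminates instead of hanging.

```python
from typing import Callable, Optional

# A parser takes (data, position) and returns the new position, or None on mismatch.
Parser = Callable[[str, int], Optional[int]]


class RunawayLoop(Exception):
    """Raised by the demo cap; a real SOME would simply never return."""


def literal(text: str) -> Parser:
    def parser(data: str, pos: int) -> Optional[int]:
        return pos + len(text) if data.startswith(text, pos) else None
    return parser


def perhaps(value: Optional[str]) -> Parser:
    """Hypothetical PERHAPS: match the value, or succeed in place if it is None."""
    if value is None:
        return lambda data, pos: pos   # success, but the position does not move
    return literal(value)


def some(rule: Parser, max_iterations: int = 1000) -> Parser:
    """One or more: keep applying the rule until it stops matching."""
    def parser(data: str, pos: int) -> Optional[int]:
        first = rule(data, pos)
        if first is None:
            return None
        pos = first
        for _ in range(max_iterations):
            nxt = rule(data, pos)
            if nxt is None:            # the only exit: an eventual non-match
                return pos
            pos = nxt
        raise RunawayLoop("rule keeps succeeding without consuming input")
    return parser
```

`some(perhaps(")"))` stops normally once the input runs out of `)` characters. `some(perhaps(None))` never sees a non-match, so without the cap it would spin forever at the same position.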

:thinking:

NOTE THAT HISTORICAL PARSE HAS NO GOOD ANSWER FOR THIS

Rebol2 treats NONE! as a no-op which just succeeds but doesn't advance the input. So the following gives you an infinite loop:

 rebol2>> prefix: none suffix: ")"

 rebol2>> parse "aaa)))" [some prefix some "a" some suffix]   
 ; infinite loop

The hackish "must make progress" rules in R3-Alpha actually make the above "work as intended", because the SOME will bail out after one non-advancing match. I don't consider that a "good" answer--more a random effect.
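As a loose illustration of that "must make progress" effect, here is a Python sketch. It models only the behavior described above (SOME stops iterating after a match that doesn't advance); it is not R3-Alpha's actual implementation, and `some_with_progress`, `always_succeed`, and `sequence` are names made up for the example.

```python
from typing import Callable, Optional

# A parser takes (data, position) and returns the new position, or None on mismatch.
Parser = Callable[[str, int], Optional[int]]


def literal(text: str) -> Parser:
    def parser(data: str, pos: int) -> Optional[int]:
        return pos + len(text) if data.startswith(text, pos) else None
    return parser


def always_succeed(data: str, pos: int) -> Optional[int]:
    """Models a NONE! rule: succeeds without advancing."""
    return pos


def some_with_progress(rule: Parser) -> Parser:
    """SOME that gives up iterating as soon as a match consumes nothing."""
    def parser(data: str, pos: int) -> Optional[int]:
        matched = False
        while True:
            nxt = rule(data, pos)
            if nxt is None:
                return pos if matched else None
            if nxt == pos:
                return pos             # matched, but no progress: stop here
            matched = True
            pos = nxt
    return parser


def sequence(*parsers: Parser) -> Parser:
    def parser(data: str, pos: int) -> Optional[int]:
        for p in parsers:
            result = p(data, pos)
            if result is None:
                return None
            pos = result
        return pos
    return parser


rule = sequence(
    some_with_progress(always_succeed),   # the null-ish prefix
    some_with_progress(literal("a")),
    some_with_progress(literal(")")),
)
print(rule("aaa)))", 0))  # prints 6: the whole input was consumed
```

The loop terminates, but only because the progress check quietly reinterprets a non-advancing match as "done", which is the kind of incidental effect the paragraph above is wary of.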

`some perhaps` Is Semantically Perilous

Parsers can only succeed or fail right now. So for `some perhaps` to work, you'd have to be able to say something else: "opt out above me as far as you can, but be fundamentally successful".

How far would that make sense to go? It would certainly have to stop at the BLOCK! combinator to be of any use. But what makes the BLOCK! combinator special to squash the "bubble up opt-out success" idea?

The reason that INLINE can work with a cobbled together rule is because it can be specific about where the point of opting-out should stop. If PERHAPS is nested inside and trying to signal that outward, it doesn't work.

Is PERHAPS Still Useful Enough To Make?

A quoting combinator that glosses over null rules seems useful--even if it can't be used in composition. Though given that OPT is taken for something much more common and relevant to parsing, it's hard to give it a good name.

The alternative is to use a [] or none (the ~()~ antiform) rule as the state of your rule when it's not applicable, if that's clearer or faster. But the whole point is that you might want null as that state, because its conditional falseyness is useful in other parts of your code.

Perhaps it isn't necessary. But it's been on my mind a while, and I wanted to write it up.


Wild But Awesome Idea: ^META Splicing

Let's imagine that anything ^META is inlined, but inlined by the BLOCK! combinator itself.

This would give it powers that a combinator would not have.

>> keyword: either 1 = 1 ['some] ['opt]

>> parse "aaaaa" [^keyword "a"]
== "a"

You could SPLICE! in partial combinator rules:

>> parse "aaaaa" [^(either 1 = 1 ~[repeat 5]~ ['opt]) "a"]
== "a" 

What this buys you over a COMPOSE is that it runs the code each time it processes the rule. So it can integrate knowledge accrued in the parse, and won't eval/inline any ^(groups) that it doesn't hit as relevant to the match.
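The difference from COMPOSE can be sketched with a toy interpreter in Python. This is an assumption-heavy model, not the proposed Ren-C design: `Meta` is a hypothetical stand-in for a ^META item (a thunk returning items to splice), `run_block` plays the role of the BLOCK! combinator, keywords are the strings `"some"`, `"opt"`, and `"repeat"`, and any other string is matched literally.

```python
from collections import deque
from typing import Callable, Optional

# A parser takes (data, position) and returns the new position, or None on mismatch.
Parser = Callable[[str, int], Optional[int]]


class Meta:
    """Hypothetical ^META item: its thunk's result is spliced into the rules."""
    def __init__(self, thunk: Callable[[], list]):
        self.thunk = thunk


def literal(text: str) -> Parser:
    return lambda data, pos: pos + len(text) if data.startswith(text, pos) else None


def repeat(inner: Parser, lo: float, hi: float) -> Parser:
    def parser(data: str, pos: int) -> Optional[int]:
        count = 0
        while count < hi:
            nxt = inner(data, pos)
            if nxt is None:
                break
            pos, count = nxt, count + 1
        return pos if count >= lo else None
    return parser


def run_block(rules: list, data: str, pos: int = 0) -> Optional[int]:
    feed = deque(rules)   # toy only: a real design would avoid copying the block

    def expand() -> None:
        # The block does the splicing itself, so a spliced keyword can
        # still take the items after it as its arguments.
        while feed and isinstance(feed[0], Meta):
            meta = feed.popleft()
            feed.extendleft(reversed(meta.thunk()))

    def take_parser() -> Parser:
        expand()
        item = feed.popleft()
        if item == "some":
            return repeat(take_parser(), 1, float("inf"))
        if item == "opt":
            return repeat(take_parser(), 0, 1)
        if item == "repeat":
            expand()
            n = feed.popleft()
            return repeat(take_parser(), n, n)
        return literal(item)

    while True:
        expand()
        if not feed:
            return pos
        result = take_parser()(data, pos)
        if result is None:
            return None
        pos = result


state = {"keyword": "some"}
rules = [Meta(lambda: [state["keyword"]]), "a"]
print(run_block(rules, "aaa"))   # prints 3: the thunk produced "some"
state["keyword"] = "opt"
print(run_block(rules, "aaa"))   # prints 1: same rules, re-evaluated as "opt"
```

The same `rules` list gives different results on each run because the thunk is evaluated when the block reaches it, not when the list is built. A thunk placed after a rule that fails is never called at all.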

Got a variable that's null, which should either opt out or be used as a rule?

Easy peasy.

>> prefix: null, suffix: ")"

>> parse "aaa)" [^(opt prefix) some "a" ^(opt suffix)]
== ")"

(Of course you could say prefix: none now and the empty splice would "just work", but you might want null for other reasons.)

Shorthand For Numbers

When INTEGER! was moved to being literal in PARSE, that meant you had to start putting it into a group to use it:

>> count: 2

>> parse "aa" [repeat (count) "a"]
== "a"

But if you could bring in arbitrary rule fragments with ^META, you could say that a little more succinctly:

>> count: 2

>> parse "aa" [repeat ^count "a"]
== "a"

That's of non-trivial benefit. And it's good coverage...

>> count: 2

>> parse "222" [repeat 3 @count]
== 2

We still have $ free, but here's an idea... what if it gave back the variable's content itself as the rule's result, without matching it against the input?

>> word: 'bar

>> parse [foo foo] [some 'foo $word]
== bar

That would be a synonym for (word), but could be useful... in terms of being faster and not competing with potential uses of parentheses for composition, etc.

Anyway, back to the ^ proposal...

This May Require Exposing the FEED! Mechanism...

We don't want to copy the rule block unnecessarily if there aren't going to be any splices made into it. So it would be nice if there were some sort of "on-demand" data structure that corresponded to a stream of evaluation, into which pieces could be spliced as you go...

But hey, we have that. It just isn't exposed.

Internally to the evaluator, FEED! is a way of maintaining a notion of a token stream into which injections can be made, and it sees one token at a time. This is how things like INLINER work.

UPARSE could build on that same mechanic. And things a debugger would use to introspect the feed to show you the current expression and such would be no better or worse than they would be for the evaluator itself.
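To picture what such an on-demand structure might look like, here is a minimal Python sketch. It is a guess at the general shape only, not the evaluator's real FEED!: the class name `Feed`, its `splice` method, and the `"^keyword"` token are all invented for the example.

```python
from collections import deque
from typing import Any, Iterable, Iterator


class Feed:
    """Toy token stream: reads the source lazily, with a side buffer for splices."""

    def __init__(self, source: Iterable[Any]):
        self._source: Iterator[Any] = iter(source)   # the original block is never copied
        self._pending: deque = deque()               # spliced-in tokens waiting to be seen

    def splice(self, items: Iterable[Any]) -> None:
        """Queue items so they are seen before the rest of the source."""
        self._pending.extendleft(reversed(list(items)))

    def __iter__(self) -> "Feed":
        return self

    def __next__(self) -> Any:
        if self._pending:
            return self._pending.popleft()
        return next(self._source)


rules = ["^keyword", "a"]
feed = Feed(rules)
seen = []
for token in feed:
    if token == "^keyword":
        feed.splice(["repeat", 5])   # inject a rule fragment at this point
        continue
    seen.append(token)

print(seen)    # prints ['repeat', 5, 'a']
print(rules)   # prints ['^keyword', 'a'] (the original is untouched)
```

The consumer only ever sees one token at a time, and the cost of splicing is paid only when a splice actually happens, which is the property the paragraph above is after.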