Parsing Alternates: Should "Must Match To End" Be Considered?

hostilefork · June 26, 2022, 1:17pm

A more general idea might be a shorthand for [<end> |], a new kind of alternate.

Right now || is taken for another purpose, which is useful: it effectively puts all the rules to its left into their own block. I've called it the Inline Sequencing Operator:

["a" | "b" || "c"] <=> [[ "a" | "b" ][ "c" ]]

We could have -| to say a match only applies if it's at the end of the data, and |- to say a match only applies if it's at the beginning of the data. :-/ Or =| and |=. Or <|, which is the spelling used in the sketch below. Or the infamous "flags".

circled: lambda [block [block!]] [
    parse block [return [
        thru into group! [<any> <| (fail "Circle One")]
        maybe [thru group! (fail "Circle One")]
    ]]
]

One of the risks of doing this with a new kind of alternate is that it would also wind up being used at the very end of a rule, stuck onto the last alternate, where there is nothing left to alternate with:

 [word! <| word! word! <|]
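
For comparison, here is roughly how that rule has to be spelled out today, with an explicit <end> on each alternate. This is only a sketch of the longhand, assuming the <| shorthand stands for "this alternate must reach <end>, otherwise try the next one"; the sample input [a b] is just an illustration.

```
parse [a b] [
    word! <end>            ; one word, and nothing may follow it
    | word! word! <end>    ; otherwise two words, and nothing may follow them
]
```

On [a b] the first alternate matches `a` but then fails at <end>, so the parse backtracks and the second alternate gets its chance. The repeated <end> is the noise the shorthand is trying to remove.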

Alternatively, something could be picked as a symbolic shorthand for <end>, and used in more places:

 [word! # | word! word! #]

We could reverse the doubled-blocks semantics and say a doubled block explicitly means keep running alternates if the end is not matched:

 [[word! | word! word!]]

@rgchris has historically asked that PARSE not enforce reaching the end as a default, so perhaps:

>> parse [a b] [word!]
== a

>> parse [a b] [[word!]]
; null

Another option would be the @block type. That use has never been suggested before, and it doesn't really go with anything else:

>> parse [a b] @[word!]
; null

I think I would still favor having it be implicit from the kind of parser situation you're in, and using [[...]] to break out of the rule. But I will have to try it. I just wanted to throw out a few more options.