Exact Matching of Variables with the @ Types In UPARSE

hostilefork · August 2, 2021, 6:58pm

I mentioned that the @ types were slated for use for matching the contents of a variable exactly. The most frequent example I have given is:

>> block: [some "a"]

>> uparse [[some "a"] [some "a"]] [some @block]
== [some "a"]  ; success gives result of last matching rule

So that's different than [some block], which would treat block as a rule.

Works with all types:

>> num: 1

>> uparse [1 1 1] [some @num]
== 1

I didn't mention things like @(gr o up) but those work too:

>> uparse [1 1 1] [some @(3 - 2)]
== 1

I realized I actually do not know how to write the above two cases in Red or Rebol2. You can't use the number as a plain variable in Red, since an integer acts as a repeat rule. (UPARSE prohibits that: because it's a rule that takes an argument, you must use REPEAT for such behavior.)

red>> num: 1

red>> parse [1 1 1] [some num]
*** Script Error: PARSE - invalid rule or usage of rule: 1

Also in Red, I'm not clear on why the following isn't an error, since the GROUP! product is just discarded:

red>> parse [1 1 1] [some (3 - 2)]
== false

This is something that would work in R3-Alpha, but doesn't in Red or Rebol2:

red>> parse [1 1 1] [some quote (3 - 2)]
== false

Your guess is as good as mine. Whatever the answer in their world is, it's not obvious. But I think the @ types give a clean answer in UPARSE.

But What About @[bl o ck] ?

We might say that it means match a block literally:

>> uparse [[some "a"] [some "a"]] [some @[some "a"]]
== [some "a"]

That would be wasteful, since we already have a way to match blocks literally by quoting them:

>> uparse [[some "a"] [some "a"]] [some '[some "a"]]
== [some "a"]

But UPARSE has changed the game for why @[...] and [...] can mean different things...because block rules synthesize values. And who's to say you might not want to match a rule and use its product as the literal thing to match against?

>> uparse [1 1 1 2] [@[some '10, (10 + 10) | some '1 (1 + 1)]]
== 2

In other words your rule can match and provide an answer for the thing to match next. We have zero experience with how often that might be useful, but maybe it is?
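
To make the mechanics of that last example concrete, here is a rough model in Python rather than Ren-C. It is only a sketch: it assumes the block takes the first alternate that matches, that the trailing group's product becomes the literal to match next, and that the parse must reach the end of the input. The names `some` and `match_product_of_rule` are made up for the illustration and are not part of UPARSE.

```python
def some(value, items, pos):
    """Consume one or more items equal to `value`, starting at `pos`.

    Returns the position after the run, or None if nothing matched.
    """
    end = pos
    while end < len(items) and items[end] == value:
        end += 1
    return end if end > pos else None


def match_product_of_rule(items):
    """Hypothetical model of [@[some '10, (10 + 10) | some '1 (1 + 1)]]."""
    # Each alternate: (literal to repeat, product of the trailing group).
    alternates = [(10, 10 + 10), (1, 1 + 1)]
    for literal, product in alternates:
        end = some(literal, items, 0)
        if end is None:
            continue  # this alternate failed, try the next one
        # The block matched; its product is now the literal to match next,
        # and that item must also be the last one in the input.
        if end < len(items) and items[end] == product and end + 1 == len(items):
            return product
        return None  # assumption: no backtracking into the other alternates
    return None


print(match_product_of_rule([1, 1, 1, 2]))  # 2
```

In the model, `[1, 1, 1, 2]` takes the second alternate, which yields 2, and that 2 is then what has to appear next in the input.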

hostilefork · November 8, 2024, 8:06pm

I think I've finally decided to declare @ to be a shorthand for "match any item at this position", and not take an argument.

This combinator's "long" form is called ONE. It replaces the historical SKIP for two reasons: reading [x: skip] and expecting that to store an item in a variable sounds like the opposite of skipping, and an arity-0 SKIP doesn't fit with the rest of the system. (UPARSE instead has a SKIP combinator that takes how much to skip.) For example:

>> parse [#foo <bar>] [issue! one]
== <bar>

So now, it simply has a shorthand:

>> parse [#foo <bar>] [issue! @]
== <bar>

To justify why this isn't a fully arbitrary choice: something like @var matches at the current position, under the constraint of the provided variable:

>> block: [some "a"]

>> parse [[some "a"] [some "a"]] [some @block]
== [some "a"]

So it doesn't seem too crazy that when you take away the variable name that's being looked up for the constraint, you'd get a combinator that matches anything.

For Quoting, There's JUST and LITERAL

This means @ doesn't behave like it does in the main evaluator as an arity-1 operator for literalizing the subsequent argument.

But you have other options. JUST will "just" synthesize the value (without matching it), while LITERAL will match it (and synthesize it if matched).

>> parse [] [just x]
== x

>> parse [''x] [literal ''x]
== ''x
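
The difference between the two can be sketched in Python (not Ren-C). This is only a model of the behavior described above: a combinator here is a made-up function taking the input and a position and returning the synthesized value plus the new position, or None on failure. Real combinators have a different interface.

```python
def just(value):
    """Model of JUST: synthesize `value` without looking at the input."""
    def combinator(items, pos):
        return value, pos  # position is unchanged, nothing is consumed
    return combinator


def literal(value):
    """Model of LITERAL: match `value` at the current position, then synthesize it."""
    def combinator(items, pos):
        if pos < len(items) and items[pos] == value:
            return value, pos + 1  # consumed exactly one item
        return None  # no match at this position
    return combinator


print(just("x")([], 0))        # ('x', 0)
print(literal("x")(["x"], 0))  # ('x', 1)
print(literal("x")(["y"], 0))  # None
```

The point of the model is that JUST succeeds even on empty input, while LITERAL only succeeds if the item is actually there.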

LITERAL is nice when the thing you are matching has more than one quote level, because otherwise it can feel a little confusing:

>> parse [''x] ['''x]
== ''x

It's also nice if something has a quote mark in the name:

>> foo': "foo prime"

>> parse [foo'] ['foo']  ; hrrrm
== foo'

>> parse [foo'] [literal foo']
== foo'

As a shorthand, there's LIT.

>> parse [foo'] [lit foo']
== foo'

hostilefork · November 16, 2024, 1:51pm

One downside I discovered of using a SIGIL! for "match anything" is that if you try to apply it more broadly outside of PARSE, you run into trouble when a sequence itself is the matching template.

For example, suppose you wanted a.1.2 to match against @.1.2. The @ isn't in the first position of the sequence; it's a decoration on .1.2

Of course, there's going to be some trouble no matter what you pick... if it's legal to occur in that position, then you have to deal with the case that it's literally there.

But if it were * then it would at least afford:

*.1.2  ; matches a.1.2

['*].1.2   ; matches *.1.2

['[*]].1.2  ; matches [*].1.2

etc.

This is sort of a tangentially related thing, because if you try and apply the logic of PARSE to this matching scenario, then pretty much everything has to be in a block to quote it literally.

So whatever this "sequence-globbing" domain is, would be different.

Also, given that I'm talking about something that doesn't exist, what would * really mean?

a.b.c.1.2  ; would this match *.1.2 but not ?.1.2

If we were to say that PARSE needed to bow to this, then it kind of suggests that ? would be "match any one item".
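
Since none of this exists, here is just one possible reading, sketched in Python rather than Ren-C: `?` matches exactly one item and `*` matches a run of one or more items. The function name `glob_match` and both of those rules are assumptions made for the sketch, and it sidesteps the escaping question entirely (a pattern element `"*"` or `"?"` is always a wildcard here).

```python
def glob_match(pattern, sequence):
    """Return True if `sequence` matches `pattern` (both are lists of items).

    Assumed rules: "?" matches exactly one item, "*" matches one or more
    items, anything else must be equal to the item at that position.
    """
    if not pattern:
        return not sequence  # pattern used up: succeed only if input is too
    head, rest = pattern[0], pattern[1:]
    if head == "*":
        # Let "*" swallow every possible non-empty run, then match the rest.
        return any(
            glob_match(rest, sequence[i:]) for i in range(1, len(sequence) + 1)
        )
    if not sequence:
        return False
    if head == "?" or head == sequence[0]:
        return glob_match(rest, sequence[1:])
    return False


print(glob_match(["*", 1, 2], ["a", "b", "c", 1, 2]))  # True
print(glob_match(["?", 1, 2], ["a", "b", "c", 1, 2]))  # False
print(glob_match(["?", 1, 2], ["a", 1, 2]))            # True
```

Under that reading, a.b.c.1.2 would match \*.1.2 but not ?.1.2, while a.1.2 would match both.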

Anyway, just making the point here... that SIGIL!s are slippery. If PARSE is trying to set a precedent for a systemic recognizable idea of "match single item" then maybe it shouldn't be done with a sigil.