Killing Off Historical "SKIP" in PARSE

SKIP suggests you're not using the result. Yet historical SKIP doesn't really do that:

rebol2> parse [1] [set x skip]
rebol2> x
== 1  ; How was this "skipped" exactly?

After contemplating many possibilities for what this might be (including *, or <*>, or ?, or just a plain period .), I settled on <any>.

>> parse [1] [x: <any>]
>> x
== 1

I'm quite happy with it--especially in light of removing ANY as a looping combinator from the default combinator set. It brings ANY to its coherent systemic meaning of ANY-ONE-OF... as opposed to ANY-NUMBER-OF.

But @IngoHohmann suggested that SKIP might be related to ELIDE. So I tried:

>> uparse "ab" ["a" skip]
== "a"

This would make SKIP equivalent to ELIDE <any>.

But SKIP as ELIDE <any> Is Rarely Useful, and Confusing

If you want ELIDE <ANY> just write that.

The general skip takes an argument of how much to skip, and having a PARSE analog to SKIP that takes no argument is just confusing.

It may be that SKIP taking an argument is worth having:

>> uparse "aaab" [skip (3) "b"]
== "b"

But then we have to decide what SKIP returns:

>> uparse "aaab" [skip (3)]
== ???

And you can write that particular case as 3 <any>, which is shorter.

Either Way, the Historical Use Is Pointless

So SKIP is not in UPARSE. Maybe it will have a reinvention some day with a new meaning.

Seems to me that skip would be the natural name for your elide.


Since I'm making a new bootstrap executable to facilitate FENCE!, I'm patching it to modern behaviors for PARSE3. This includes the behavior of SKIP as being arity-1 and taking how much to skip.

I'm quite pleased with most UPARSE redesigns, including that one. But the substitute for "match any single item" being the <any> tag gave me pause to ask "does this feel right, in light of modern understandings?"

I think I've come to believe that TAG! combinators are all arity-0, which <any> is. But most other tag combinators synthesize things from midair without advancing the input. Might not advancing the input be a good rule for a tag combinator? Perhaps that's over-restrictive.

Might it be better if it were just one?

>> parse [1020] [x: one]
>> x
== 1020

It might suggest a class of combinators that capture more items (two, three, etc.). Since those would synthesize blocks, they would be a worse choice than e.g. skip 2 if you didn't want the values, so hopefully people wouldn't reach for them unless they really wanted blocks. But that would create an irregularity: ONE wouldn't make a block when the others did (if ONE gave back a block, it would mean the captured thing was itself a block). Or maybe it's a PICK where the series is implicit?

>> parse [1020 304] [x: two]
>> x
== 304

(That would make it seem like it should be FIRST and SECOND instead of ONE and TWO, but I don't particularly like FIRST for this... I don't think.)

I think I like ONE. It's easy to type, and I think it seems more meaningful than <any>. Especially when there is an ANY combinator that does something more related to the non-PARSE ANY.

We could also have an arity-0 NEXT combinator that could return the next position, if you were trying to convey more that you were just skipping ahead and not intending to take any input.


Seems simply that TWO would give back... a SPLICE! (if the input is a block; a string if the input is a string, or a blob if it's binary)

>> parse [1 2 3 d e] [three two]
== \~(d e)~\  ; antiform (splice!)

>> append [a b c] parse [1 2 3 d e] [three two]
== [a b c d e]
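To make the splice idea concrete, here is a rough Python model of it. This is not Ren-C code and not how the combinators are implemented: `Splice`, `take`, `run`, `append`, and `ParseFail` are hypothetical names invented for the sketch, and failing when fewer than the requested number of items remain is an assumption.

```python
from dataclasses import dataclass
from typing import Any, List, Tuple


class ParseFail(Exception):
    """Raised when a combinator cannot match at the current position."""


@dataclass
class Splice:
    """Hypothetical stand-in for a SPLICE!: items meant to be merged, not nested."""
    items: List[Any]


def take(count: int):
    """Build a combinator like ONE/TWO/THREE that captures `count` items."""
    def combinator(data: List[Any], pos: int) -> Tuple[Splice, int]:
        # Assumption: too few remaining items is a parse failure.
        if pos + count > len(data):
            raise ParseFail(f"need {count} items at position {pos}")
        return Splice(data[pos:pos + count]), pos + count
    return combinator


two, three = take(2), take(3)


def append(target: List[Any], value: Any) -> List[Any]:
    """Append one value, or merge in the items when given a Splice."""
    if isinstance(value, Splice):
        target.extend(value.items)
    else:
        target.append(value)
    return target


def run(rules, data: List[Any]) -> Any:
    """Run combinators in order; the last result is the overall result."""
    pos, result = 0, None
    for rule in rules:
        result, pos = rule(data, pos)
    return result


result = run([three, two], [1, 2, 3, "d", "e"])
print(append(["a", "b", "c"], result))
```

In this model THREE consumes the integers, TWO's splice becomes the overall result, and `append` merges its items rather than nesting them.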

Ooooh. Don't know how high one needs to go. The PICK specializations go to TENTH.

This makes sense to me.

So if you didn't assign the result of old-SKIP to anything, you use NEXT. If you did assign it, use ONE.

If you think about SKIP as its series analogue, it suggests synthesizing the position after the skip:

>> parse [a b c d] [pos: skip 2 to <end>]

>> pos
== [c d]

So I think I've decided SKIP and NEXT should synthesize the adjusted position.

(It might seem that SKIP 2 is "cheaper" than TWO because it doesn't synthesize a new series, but that's a drop in the bucket compared to using a parameterized combinator vs. an unparameterized one. So TWO would be way more efficient, if you were in a situation where you're not using the result so either would suffice to skip two things.)
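The split between "synthesize the item" and "synthesize the adjusted position" can be sketched in Python. This is only a model under stated assumptions, not Ren-C code: `one`, `next_`, `skip`, and `ParseFail` are hypothetical names, each combinator returns a `(result, new_position)` pair, and failing at the end of input is assumed rather than taken from the thread.

```python
from typing import Any, List, Tuple


class ParseFail(Exception):
    """Raised when a combinator cannot match at the current position."""


def one(data: List[Any], pos: int) -> Tuple[Any, int]:
    """Synthesize the item at the current position and advance past it."""
    if pos >= len(data):
        raise ParseFail("no item left to capture")
    return data[pos], pos + 1


def skip(data: List[Any], pos: int, count: int) -> Tuple[List[Any], int]:
    """Advance `count` items and synthesize the series at the new position."""
    if pos + count > len(data):
        raise ParseFail(f"cannot skip {count} items from position {pos}")
    new_pos = pos + count
    # A Python slice copies; in the real thing a position would be a cheap view.
    return data[new_pos:], new_pos


def next_(data: List[Any], pos: int) -> Tuple[List[Any], int]:
    """Arity-0 variant: same as skipping exactly one item."""
    return skip(data, pos, 1)


rest, pos = skip(["a", "b", "c", "d"], 0, 2)
print(rest, pos)

item, pos = one([1020], 0)
print(item, pos)
```

Both styles advance the input by the same amount; they differ only in what they hand back, which is why old-SKIP usages divide into "assigned" (ONE) and "not assigned" (NEXT).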


As we've come to think more about iteration, I've deemed that AT seems like it should be an arity-1 operation that accepts iterators and gives you the thing at the position. It would also accept series as stand-ins for iterators.

Series operations in PARSE take the input implicitly, so they drop an argument from their arity (so NEXT is arity-0, and SKIP is arity-1 to take just the skip count, etc...)

This makes AT seem like a shoo-in for the "give me the thing at the current position" operation:

>> parse [1020] [x: at]
>> x
== 1020

It's coherent and it's short.

"But AT Doesn't Suggest Changing The Series Position"

Neither does SKIP.

I've proposed ideas like a convention for passing the series in and automatically getting the update:

>> block: [a b c d]

>> skip block 2
== [c d]

>> block
== [a b c d]

>> skip $block 2
== [c d]

>> block
== [c d]

This would happen with AT as well. If so, then you would understand PARSE as being analogous to the "mutating" versions that pass the variable to update, e.g. the series position.
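A rough Python model of that convention, for anyone who wants to poke at it. None of this is Ren-C: `Series`, `Var`, and this `skip` are hypothetical stand-ins, with `Var` playing the role of the `$block` form (a reference to the variable rather than its value). Clamping the position to the series bounds is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Any, List, Tuple, Union


@dataclass(frozen=True)
class Series:
    """Immutable (data, index) pair: a position within shared underlying data."""
    data: Tuple[Any, ...]
    index: int = 0

    def items(self) -> List[Any]:
        """Items from the current position to the tail."""
        return list(self.data[self.index:])


@dataclass
class Var:
    """Hypothetical stand-in for passing the variable itself, like $block."""
    value: Series


def skip(target: Union[Series, Var], count: int) -> Series:
    """Return the series advanced by `count`; update the variable if given one."""
    series = target.value if isinstance(target, Var) else target
    # Assumption: positions clamp to the head and tail instead of failing.
    new_index = max(0, min(series.index + count, len(series.data)))
    moved = Series(series.data, new_index)
    if isinstance(target, Var):
        target.value = moved  # the "mutating" form writes the new position back
    return moved


block = Var(Series(("a", "b", "c", "d")))

print(skip(block.value, 2).items())  # plain form: variable is untouched
print(block.value.items())

print(skip(block, 2).items())        # variable form: position is written back
print(block.value.items())
```

Under this reading, a combinator inside PARSE behaves like the second form, with the parse position as the implicit variable being updated.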

Note that if you don't want to actually do the advance but still get the current item, you can say:

x: ahead at

But if we decided to implement the "TAG! combinators don't advance the position" convention, that could be the meaning of <at> as well.
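The difference between AT and AHEAD AT can be sketched as a wrapper that runs a parser and then throws away the advance. This is a Python model, not Ren-C: `at`, `ahead`, and `ParseFail` are hypothetical names, and combinators are assumed to return `(result, new_position)`.

```python
from typing import Any, Callable, List, Tuple

Parser = Callable[[List[Any], int], Tuple[Any, int]]


class ParseFail(Exception):
    """Raised when a combinator cannot match at the current position."""


def at(data: List[Any], pos: int) -> Tuple[Any, int]:
    """Synthesize the item at the current position and advance past it."""
    if pos >= len(data):
        raise ParseFail("no item at the current position")
    return data[pos], pos + 1


def ahead(parser: Parser) -> Parser:
    """Wrap a parser so it keeps its result but leaves the position unchanged."""
    def lookahead(data: List[Any], pos: int) -> Tuple[Any, int]:
        result, _ = parser(data, pos)  # still fails if the inner parser fails
        return result, pos
    return lookahead


print(at([1020], 0))
print(ahead(at)([1020], 0))
```

A non-advancing `<at>` would then just be a built-in spelling of `ahead(at)` in this model.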

We Can Still Have ONE, TWO, THREE...

...in fact it means that ONE can become a consistent member of the family, by returning a SPLICE! containing one item.

They could even be generic operations for making splices out of series:

>> two [a b c d]
== |~[a b]~|  ; antiform (splice!)

Inside PARSE, not taking a series argument would then fit the overall pattern of dropping arity by one.

I'll point out that if you're not going to use the item at the current position, this gives you at least 4 ways to say "skip exactly one thing":

[... skip 1 ...]
[... next ...]
[... at ...]
[... one ...]

Since ONE synthesizes a new Stub node for a BLOCK! (to back the SPLICE!) that you don't need, it leaves something wasteful for the GC to clean up. Though, it's a drop in the bucket.

A much bigger deal is that SKIP 1 takes an argument, which adds overhead... running the INTEGER! combinator and supplying a parser to SKIP (though it's the kind of thing that can be optimized).

So NEXT and AT are the cheaper choices. It's definitely weird to think of AT being used purely for the purposes of advancement when you aren't using the value, which is a bit of what makes it uncomfortable even when you are using the value.

"What About @ for AT?"

It might seem like previous questions about @ being used for "get the current item" would be strengthened by this.

But I've been reluctant to give away decorated things like @foo to iterator operations in the language as a whole, because it seems like typing at foo is short enough. There are too many uses that are more interesting.

So I doubt [x: @] would be a synonym for [x: at] in PARSE.