Making Red Tests Useful: Starting UPARSE on the Right Foot

hostilefork · May 16, 2022, 10:39am

In light of the recent revelations regarding DID and DIDN'T, it was time to promptly dispose of the question-mark-bearing UPARSE? and PARSE?.

But rather than just blindly replace every UPARSE? in the tests with DID UPARSE, I decided to do something very labor-intensive...and codify what each expression actually evaluated to.

So if a test was something along the lines of:

assert [uparse? "aacc" [some "a" some "c"]]

I went and actually made it say something more like:

assert ["c" == uparse "aacc" [some "a" some "c"]]

For all the mind-numbingly redundant tests from Red, this was no picnic: it took something like a thousand hand-made changes. (Really, so many of the tests are formulaic and should be produced by scripts...but something in the Rebol DNA makes people write out 1 = 1, 2 = 2, 3 = 3 all the way up to 100 = 100 instead of finding a way to generate the tests dynamically.)
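To make that parenthetical concrete, here's a rough sketch of what table-driven generation could look like in Red, using the positive cases from the #4863 commit quoted further down. The names `cases` and `failures` are my own invention for illustration, and it assumes a Red build recent enough to include the #4863 behavior; it isn't how Red's test suite is actually organized.

```red
Red []

; Hypothetical table of input/rule pairs (taken from the #4863 tests).
cases: [
    "word"             [word!]
    "   word"          [word!]
    "123"              [integer!]
    "    123"          [integer!]
    "hello 123 world"  [word! integer! word!]
]

failures: 0

; Walk the table two items at a time and run the same check on each pair.
foreach [text rule] cases [
    unless parse to-binary text rule [
        failures: failures + 1
        print ["FAILED:" mold text mold rule]
    ]
]

print ["failures:" failures]
```

Adding a case is then a one-line change to the table, and tightening the check (say, comparing against an expected result instead of just success) only has to happen in one place.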

I figured so long as I'd done everything else, I'd incorporate any new tests Red had added in the past year and a half.

How many, you ask? Just two commits. Here's one:

--test-- "#4863"
	--assert parse to-binary "word" [word!]
	--assert parse to-binary "   word" [word!]
	--assert parse to-binary "123" [integer!]
	--assert not parse to-binary "123.456" [integer!]
	--assert parse to-binary "    123" [integer!]
	--assert parse to-binary "hello 123 world" [word! integer! word!]
	--assert parse to-binary "hello 123 world" [word! space integer! space word!]

And here's the other...which deleted a test:

	--assert error? try [parse #{}[collect into x4197 []]]   ;-- deleted

But added this:

	--assert parse #{}[collect into x4197 []]		;-- changed by #4732
	--assert x4197 == #{}

Either Red's PARSE is nearing perfection, or there's not enough sophisticated usage being explored to generate compelling tests. It's anyone's guess which.

Ren-C doesn't believe in INTO, so really it's only the first set of tests that applies. That's the one where Red now lets you name a DATATYPE! in a rule when you are parsing a BINARY!. (We can do it for strings, too...at the same price...thanks to UTF8-Everywhere!)

Like I say, it's good for UPARSE tests to be more explicit and test more than just "it succeeded", so here's that spin:

[https://github.com/red/red/issues/4863
    ('word == uparse to-binary "word" [word!])
    ('word == uparse to-binary "   word" [word!])
    (123 == uparse to-binary "123" [integer!])
    (didn't uparse to-binary "123.456" [integer!])
    (123 == uparse to-binary "    123" [integer!])
    ([hello 123 world] == uparse to-binary "hello 123 world" [
        collect [keep ^ word!, keep integer!, keep ^ word!]
    ])
    ([hello 123 world] == uparse to-binary "hello 123 world" [
        collect [keep ^ word!, space, keep integer!, space, keep ^ word!]
    ])
]

Red's test checks only that parse to-binary "123" [integer!] succeeded; there's no guarantee you actually got an integer out of the process. Or if you did, that it's the integer 123.

Bringing more formality to that--and leveraging UPARSE's results--is what this is about.

(Note: Still working on the story for whether you're allowed to KEEP a WORD! without some modifier like ONLY or meta, so deep thought on that forthcoming...)