Strict Equality, Lax Equality, Equivalence, Sameness

hostilefork · March 22, 2025, 6:05pm

It's been many years, with only a small amount of movement on this...

But it's long past time to kill off ==.

I Think It Has Emerged That Plain = Should Be Strict

For some time now, Ren-C has been case-sensitive in terms of binding. It was a change I made to see how it would work out... and I haven't seen any evidence that it hasn't worked out. So I doubt this is going to be reversed.

I've written about the best defense of case-insensitivity I could think of, but even having done so... there are a lot of problems associated with it.

If case-insensitive binding were to come back, it is likely that all casing would need to be canonized to the lowercase form, in binding-relevant places, e.g.:

>> make object! [Foo: 1020]
== #[object! [foo: 1020]]

But there's a lot of downside to that, and you're looking at incompatibility with JSON and basically any other modern language you might like to model. I think it's been more of a liability than an asset (again, having played devil's advocate pretty strongly with my writeup).

Anyway, even in the unlikely event of case-insensitivity returning to Ren-C binding... I still am fairly positive: EQUAL? (=) should mean equality of casing and equality of datatype.

LIKE/UNLIKE Seem A Decent "Relaxed Equality" Pair

LIKE is not an insane amount of typing for when that's your intent. And UNLIKE may look a little weird, but it's not that bad:

>> "a" unlike <a>
== ~null~  ; anti

You could always write if not "a" like <a> [...] if you find the UN offputting.

There's a bit of existing understanding from SQL of LIKE as being more akin to "globbing", which would be:

>> "abc" like "a*"
== ~okay~  ; anti

But we use a lot of words differently and aren't owed a debt to that. And if we have globbing, there's no reason to not just call it GLOB.

>> glob "a*" "abc"
== ~okay~  ; anti
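
For anyone who hasn't run into globbing, here is a rough sketch of the kind of match meant, written in Python rather than Ren-C since GLOB is only a hypothetical name at this point. It leans on the standard library's `fnmatch` module, and the argument order (pattern first) is an assumption that mirrors the console example above:

```python
from fnmatch import fnmatchcase

def glob(pattern: str, text: str) -> bool:
    """Hypothetical stand-in for a GLOB function: case-sensitive wildcard match."""
    # fnmatchcase takes (name, pattern), so the arguments are flipped here
    # to read the same way as `glob "a*" "abc"`.
    return fnmatchcase(text, pattern)

print(glob("a*", "abc"))  # True
print(glob("a*", "cba"))  # False
```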

How Alike Counts As LIKE?

Well...there's being different types, but having the same content:

>> "a" like <a>
== ~okay~  ; anti

Then there's being the same type, but having different casing:

>> "FoO" like "fOo"
== ~okay~  ; anti

I'm skeptical that an operator combining both of these would be terribly useful... when does this really come up?

>> <FoO> like "fOo"
== ~okay~  ; anti

Despite being the default Redbol (or at least Red) behavior for equality, it's rare enough that I don't mind if it's not particularly easy to express.
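
To keep the axes straight, here is a toy Python model (not Ren-C code) that treats a value as a datatype name plus a spelling. The `Value` class and the four function names are made up for illustration. It only shows that "different type, same content" and "same type, different casing" are independent relaxations, and that the Redbol-style default relaxes both at once:

```python
from typing import NamedTuple

class Value(NamedTuple):
    kind: str      # stand-in for the datatype, e.g. "text" or "tag"
    spelling: str  # the content, e.g. "FoO"

def strict_equal(a: Value, b: Value) -> bool:
    # Same datatype and same casing (the proposed meaning of plain =)
    return a.kind == b.kind and a.spelling == b.spelling

def same_content(a: Value, b: Value) -> bool:
    # Types may differ, spelling must match exactly
    return a.spelling == b.spelling

def same_type_caseless(a: Value, b: Value) -> bool:
    # Types must match, casing is ignored
    return a.kind == b.kind and a.spelling.casefold() == b.spelling.casefold()

def fully_relaxed(a: Value, b: Value) -> bool:
    # Both relaxations at once
    return a.spelling.casefold() == b.spelling.casefold()

print(same_content(Value("text", "a"), Value("tag", "a")))            # True
print(same_type_caseless(Value("text", "FoO"), Value("text", "fOo")))  # True
print(fully_relaxed(Value("tag", "FoO"), Value("text", "fOo")))        # True
print(strict_equal(Value("text", "a"), Value("tag", "a")))            # False
```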

(In the few cases I don't want to test for exact equality...) ...it is overwhelmingly more common for me to want to compare unlike types with the same content than to do case-insensitive comparisons. The biggest example is wanting to see if foo: and foo match spellings, which comes up very often...but the more you get into dialecting, the more you have cases like foo: and <foo>

It's nice when an operator takes the guesswork out of the efficiency of conversions. Consider if you have a TEXT! you want to compare with a WORD!. The "smart" way to do that is to alias the WORD! as a TEXT!:

(as text! word) = text

That's smart because due to the way the internals work, it can use the same memory allocation that backs the UTF-8 of the word and alias it as a read-only TEXT!. If you did it the other way:

word = (as word! text)

You're having to hash the text to look up if there's a symbol registered for it, and it creates the symbol if it's not there. So you're paying extra for the lookup -and- possibly creating a useless symbol table entry for the text. But word like text could just be naturally efficient under the hood.
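
The cost difference can be sketched with a toy symbol table in Python. This is only an illustration of the argument, not how Ren-C's internals are actually laid out; `Symbol`, `intern`, and both comparison functions are hypothetical names:

```python
class Symbol:
    """Toy interned word: one object per distinct spelling."""
    def __init__(self, spelling: str):
        self.spelling = spelling

symbols: dict = {}  # toy global symbol table, spelling -> Symbol

def intern(spelling: str) -> Symbol:
    # Hash lookup, creating the entry if it isn't there yet
    sym = symbols.get(spelling)
    if sym is None:
        sym = symbols[spelling] = Symbol(spelling)
    return sym

def interned_matches(word: Symbol, text: str) -> bool:
    # Like `word = (as word! text)`: pays for the lookup and may add an entry
    return word is intern(text)

def spelling_matches(word: Symbol, text: str) -> bool:
    # Like `(as text! word) = text`: reuses the word's existing spelling
    return word.spelling == text

foo = intern("foo")
print(spelling_matches(foo, "bar"), len(symbols))  # False 1
print(interned_matches(foo, "bar"), len(symbols))  # False 2
```

Both comparisons give the same answer, but only the second one leaves a useless "bar" entry behind in the table.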

As someone who really wants to advocate for the interests of dialect authors, I've spoken about how important it is to be able to push values around between the different "parts of speech" to get them out of band of one another without losing information. And this ties into that. I fully expect the surrounding code to have already established what the types are:

 if (tag? item) and (item like wordtable.(n)) [...]

But Then, What About Case-Insensitivity?

It seems a bad plan to pick other "wishy-washy" words such as x similar y to mean potentially different cases... with x like y to mean potentially different types.

One new tool in the box is that infix functions can now have refinements. And we could imagine both EQUAL?/= and LIKE?/LIKE having a refinement that affords case insensitivity.

According to the AIs (and my life experience) there is no standardized term for differently-cased variants (in the vein of words like homograph or synonym). Technical documents seem to go with:

Case-variant or casing variant — if you want a concise and fairly technical phrase.
Case-insensitive match — if you want to highlight that the relationship depends on ignoring case.
Case-folded equivalent — if you want to imply they are equivalent after normalizing the case.

I once called differently-cased variations of WORD! "synonyms". But with the binding becoming case-sensitive, they are particularly -not- synonyms any more. And that term was specifically for WORD! and not longer TEXT!. (Would you call "the quick brown fox" and "The QUICK brown fox" synonyms, anyway?)

In the spirit of folded, there is also canon.

>> "foo" =:canon "FOO"
== ~okay~  ; anti

CANON-EQUAL? is better than FOLDED-EQUAL? (when I think of folding in programming I'm thinking of left folds and right folds as higher order functions, not "folding cases").

There's :UNCASE or :NOCASE or :UNCASED (I think it's too random to say :CASE means "ignore case")

>> "foo" =:uncase "FOO"
== ~okay~  ; anti

>> "foo" =:uncased "FOO"
== ~okay~  ; anti

>> "foo" =:nocase "FOO"
== ~okay~  ; anti

I've used :RELAX often to ask for less strict versions of things:

>> "foo" =:relax "FOO"
== ~okay~  ; anti

>> "FoO" like:relax <fOo>
== ~okay~  ; anti

While that doesn't convey casing, we might argue this is a more generalized relaxing... in the case of EQUAL? meaning "relax on anything you can about this, that isn't the type".

Maybe these are common enough to deserve a shorthand of some kind:

>> "foo" =* "FOO"
== ~okay~  ; anti

>> "FoO" like* <fOo>
== ~okay~  ; anti

Or:

>> "foo" *= "FOO"
== ~okay~  ; anti

>> "FoO" *like <fOo>
== ~okay~  ; anti

Or:

>> "foo" ?= "FOO"
== ~okay~  ; anti

>> "FoO" ?like <fOo>
== ~okay~  ; anti

The AIs of course suggest using ~= but I'm very hesitant to allow tilde in WORD!s at all. I feel like quasiforms "own" tildes, and I'm not crazy about quasi-words like ~~=~. But it seems like it would be technically possible to say that as long as your tildes are asymmetric in a WORD! and you hit a delimiter before another tilde, it's a WORD!

>> "foo" ~= "FOO"
== ~okay~  ; anti

>> "FoO" ~like <fOo>
== ~okay~  ; anti

I think my instincts are screaming a little too strongly not to use tildes this way, and to let quasiforms own them completely... so if your eye is ever drawn to a tilde (that's not in a string) you are looking at a quasiform.

Raised Errors Could Do... Something Here?

There's a germ of an idea in one of the first thoughts I had about LIKE. That first thought was that it would be for testing casing variations...but if you tried it with two different types you'd get an error:

>> "a" like "A"
== ~okay~  ; anti

>> "a" like <a>
** (Raised) Error: Not even the same type

And if that's a raised error then you could TRY it if you want to say "I meant to do that, but want a falsey result if they're not the same type..."

>> try "a" like <a>
== ~null~  ; anti

Then, the option to :RELAX this could relax the types.

>> "a" like:relax <A>
== ~okay~  ; anti

It's interesting, BUT... this version of LIKE:RELAX is something I never want (caseless comparisons of differing types). Plus it makes LIKE something I don't use all that often (comparing two strings caselessly).
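
Here is that thought experiment modeled in Python, purely as a sketch. Ren-C's raised errors are not the same thing as Python exceptions, so the exception plus the `attempt` wrapper are only a loose analogy for a raised error plus TRY, and every name in it is hypothetical:

```python
from typing import Callable, NamedTuple, Optional

class Value(NamedTuple):
    kind: str      # stand-in for the datatype
    spelling: str  # the content

class NotSameType(Exception):
    """Stand-in for the raised 'Not even the same type' error."""

def like(a: Value, b: Value, relax: bool = False) -> bool:
    # Caseless comparison; differing types are an error unless relaxed
    if a.kind != b.kind and not relax:
        raise NotSameType("Not even the same type")
    return a.spelling.casefold() == b.spelling.casefold()

def attempt(thunk: Callable[[], bool]) -> Optional[bool]:
    # Loose analogue of TRY: turn the type mismatch into a null-ish result
    try:
        return thunk()
    except NotSameType:
        return None

text_a, tag_a = Value("text", "a"), Value("tag", "a")
print(like(text_a, Value("text", "A")))                   # True
print(attempt(lambda: like(text_a, tag_a)))               # None
print(like(text_a, Value("tag", "A"), relax=True))        # True
```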

Yet that thought experiment did give rise to the idea of prescriptively disabling equality comparisons of floating point numbers completely, but allowing you to apply approximating operators, hence "still use =". Since that only came to my mind today, it's hard to say how much potential it really has... it does go hard, e.g. [1.0 foo bar] = [1 foo bar] would raise an error just for trying to compare.
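
A quick Python sketch of how "going hard" on floats might behave, under the assumption that the check recurses into blocks (modeled here as lists). `strict_equal`, `approx_equal`, and `FloatEqualityError` are hypothetical names, and the approximation simply borrows `math.isclose`:

```python
import math

class FloatEqualityError(TypeError):
    """Raised when plain equality is attempted on a floating point value."""

def strict_equal(a, b) -> bool:
    # Refuse outright if either side is a float
    if isinstance(a, float) or isinstance(b, float):
        raise FloatEqualityError("use an approximating comparison for floats")
    if isinstance(a, list) and isinstance(b, list):
        # Element-wise, so a float anywhere in same-length lists raises
        return len(a) == len(b) and all(strict_equal(x, y) for x, y in zip(a, b))
    return type(a) is type(b) and a == b

def approx_equal(a: float, b: float, rel_tol: float = 1e-9) -> bool:
    # The explicit opt-in for comparing floats
    return math.isclose(a, b, rel_tol=rel_tol)

print(strict_equal([1, "foo", "bar"], [1, "foo", "bar"]))  # True
print(approx_equal(0.1 + 0.2, 0.3))                        # True
try:
    strict_equal([1.0, "foo", "bar"], [1, "foo", "bar"])
except FloatEqualityError as e:
    print("error:", e)
```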

But I do think this should probably be part of LIKE. If you say 'foo like 1020 I don't think we're doing anyone any favors by not raising an error... INTEGER! could never have an equivalence with WORD!. I don't know if 1020 like [1020] is within the intended range of application, but... maybe it is?

How Will (A LIKE B) Deal With CHAIN!/etc?

I mentioned needing an easy way to check if foo: and foo line up without having to coerce their types (again, generally assuming we already know for instance that one is a set-word and the other is a word).

But there are a lot more devils in the details these days. foo: is a CHAIN! now (not a fundamental SET-WORD! type). The reversibility requirement of TO has introduced some equivalence-class problems. (to word! 'a:) and (to word! ':a) can't both be reversible. This throws a bit of a wrench into the idea of being able to say that a is "like" either :a or a:

Right now there's a function UNCHAIN which will give you a from either :a or a:. And RESOLVE is something I've been using as a more general thing, that can give you a.b from more complex things like /a.b:. Might we say that LIKE is more resolve-like than it is to-equivalence-like?

>> (first of [/a.b:]) like (first of [a.b])
== ~okay~  ; anti

On the surface this feels more useful, but it starts to make LIKE seem as if it's gotten feature creep and gone way out of control. If you mean x = resolve y maybe you should just say that.
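
To make the "resolve-like" reading concrete, here is a deliberately crude Python sketch that works on spellings as plain strings. Real CHAIN! and PATH! values are structured, not strings, so this is an assumption-laden toy: `resolve` here just strips leading and trailing `/` and `:` decorations, and `like_resolved` is a hypothetical name for the behavior being questioned:

```python
def resolve(spelling: str) -> str:
    # Toy RESOLVE: drop / and : decorations from both ends, keep the core
    return spelling.strip("/:")

def like_resolved(a: str, b: str) -> bool:
    # LIKE in the resolve-like sense: equal once both sides are resolved
    return resolve(a) == resolve(b)

print(resolve("/a.b:"))                # a.b
print(resolve(":a"), resolve("a:"))    # a a
print(like_resolved("/a.b:", "a.b"))   # True
print(like_resolved("a:", "b"))        # False
```

Written out this way, it is just x = resolve y applied to both sides, which is the argument for saying that explicitly.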

But this narrows LIKE a fair bit, to where foo won't be like foo: after all... and just as I suggested that foo like 1020 is probably better raising an error than being falsey, it would likely be best to raise an error comparing WORD! with CHAIN! if they could never actually be "like".

So There's Some Long... Thoughts

I had hoped to tie this up with sparkling clarity, but should have expected that it's still a can of worms. However, the dimensions of the can are bounded better and there's a sort of an approximation of how many worms are in it.