What deserves to be a datatype?

This is that thread.

I’ll begin by observing that in Rebol, the complexity of the lexer vs the parser is ‘reversed’ compared to other programming languages. In Rebol, the actual syntax is highly minimalistic: there are only a few constructs which provide explicit grouping, and none provide anything more than a simple list of items. By contrast, the lexer is exceedingly complicated: nearly every datatype has its own literal form, oftentimes more than one.

Language design ends up ‘reversed’ in a similar way. In most languages, discussion centres around questions like ‘which new syntactic constructs should we add’. By contrast, Rebol (and especially Ren-C) more often poses the question: ‘which new datatypes do we want to include, with which literal syntax?’.

At the moment, I still feel uncomfortable discussing such questions. I don’t feel that I fully understand the kind of criteria we should consider to know whether a datatype is worth including or not. Or, more concisely, I don’t understand how to decide: what deserves to be a Ren-C datatype?


One obvious criterion is simply: datatypes should represent common types of data. This is why we have things like MONEY! and FILE! and DATE! and so on. Ultimately this stems from Rebol’s heritage as a data-transfer format, but obviously these types are far more broadly useful.

Another obvious criterion is syntax which is important for programming. This gives us GROUP! and GET-WORD! and PATH! and so on. These exist as datatypes ultimately because Rebol is homoiconic, but their presence has suggested a wide range of uses beyond simple programming.

This accounts for most of the types in Ren-C. And, if that were all there was to it, I’d have no objections.


But, unfortunately, there are some other types, whose presence is explained by neither of those criteria. As I’ve said previously, the ones which make me feel most uncomfortable are THE-* and TYPE-*. Neither of these represent common types of data that one would want to pass around. And, with the possible exceptions of THE-WORD! and TYPE-BLOCK!, they’re basically useless in ‘regular’ programming.

Despite this, @hostilefork has lobbied pretty hard for both of these. Hopefully it should be clear now why I find this viewpoint confusing. I can’t say the existence of these types is problematic, as such, but I feel this indicates a gap in my understanding of the language.

The closest to an explanation I’ve found is that these types are useful in dialecting. That is, they may not be useful for programming per se, but having the syntax around is useful for constructing new languages. (For instance, using TYPE-WORD!s in PARSE dialect, or THE-WORD!s for module inclusion.) The problem with this is, as we’ve established, that there’s a huge number of syntaxes which would be ‘useful in dialecting’: clearly, this is too low a bar for deciding ‘what deserves to be a datatype’.

(And, incidentally, this also establishes that we’re quite willing to reject datatypes that don’t seem to be of sufficiently general usage.)

Another argument is simply consistency: other sigils have versions for words, blocks, tuples, etc., so THE-* and TYPE-* should as well. But this doesn’t strike me as particularly convincing — there’s nothing intrinsic in Ren-C which requires sigils to generalise to all possible types. Indeed, we’re quite willing to avoid doing so when it would make no sense. (For instance, we don’t have ISSUE-TEXT!, ISSUE-BINARY!, ISSUE-EMAIL!… we just have a single textual ISSUE! type, because doing otherwise would be silly.)

So, when all is said and done, we have a set of types which don’t seem to be of general use, and have no convincing reason to exist, but are nonetheless kept in the language. And I want to know why that is, because I can’t figure it out.


A Few Opening Thoughts...

@rgchris wrote the tagline of the 2019 conference as:

Rebol • /ˈrɛbəl/ “It’s about language […] We believe that language itself, when coupled with the human mind, gains significant productivity advantages over traditional software technologies.”

He's elsewhere given the high-level bullet points of:

  • Data/messaging come first, and words without qualifying symbols [are] the premium currency.
  • Source should as much as possible resemble the molded representation of said source loaded.

So Rebol's spiritual inspiration is more-or-less English. In that sense, it's important not to drift too far into "symbol soup" in the general practice. There's probably some sweet spot of the percentage of what your dialects can do with WORD!...and that percentage should be high. (So regardless of the merit of the underlying ideas in Raku, it's a poster child for what we don't want common Rebol code to look like.)

It is certainly possible to not like the premise, e.g. Joe Marshall rejects it. And you can pick what at least look like bad examples easily. But if he stopped and reflected on the reflexive and fluid mode his mind was in while writing the paragraph of English critiquing the Rebol, he should grok that what's being pursued is to put you in that same "zone". We're trying to tap into that Language Instinct (X-bar Theory) that research has shown we all carry in our heads.

So with all this in mind, it's important to realize that Rebol embraces its outgrowth from 10-fingered creatures and QWERTY keyboards, vs. fighting that. Since the inspiration is English, it's an inevitable outcome that it's going to be at odds with the kind of clean and orthogonal model sought by languages which draw their inspiration from other places (e.g. math).

@BlackATTR has said: "[Rebols] are a bit like the family of soft-body invertebrates in the language kingdom. Their special traits don't necessarily shine in common computing domains, but... On a second level I think Domain Specific Languages remains an open frontier largely uncracked by the hidebound languages who originally mapped the territory."

There isn't a "no silliness rule" in effect. What curbs the existence of things like--say--SET-ISSUE! is trying to balance competing meanings, and pick the most useful one.

rebol2>> type? first [#this:is:a-valid-issue]
== issue!

Given that : is legal internally to the ISSUE!, that's one of the points guiding us toward ruling out SET-ISSUE! (as well as ruling out issues as words more generally), and favoring that a colon at the end is literally part of the content.

Note that if colons are internal to things that look word-ish, they are actually URL! (more specifically, "Uniform Resource Names"):

>> type? first [urn:isbn:0451450523]
== url!

Silly or not, apostrophes are legal inside words:

rebol2>> type? first ['abc'def]
== lit-word!

rebol2>> to word! first ['abc'def]
== abc'def

I actually don't think that's silly, because I want the words.

>> find "abcd" "bc"
== "bcd"

>> did find "abcd" "bc"  ; logic coercion
== ~okay~  ; anti

>> didn't find "abcd" "bc"  ; DIDN'T is a sensible complement to DID
== ~null~  ; anti

if x is 10 [...]  ; there is debate on what this means, but I won't digress
if x isn't 10 [...]  ; natural complement

Terminal apostrophes lead to a weird looking thing when quoted, seemingly enclosed in quotes like a string type. But my desire to be able to have "name-prime" or "name-double-prime" style words like foo' or foo'' (especially for variables that hold lifted states) makes me tolerate the consequence of 'foo' and learn to read it correctly... though I'll choose @ foo' or the foo' instead when writing in source.

What does this imply for edge cases, like lone apostrophe? Rebol2 and Red call it illegal. Ren-C had a usage as quoted void which was dropped, so it's back on the table.

As documented here recently, my own philosophy on how far we should be willing to go with WORD! has faced reckonings. The introduction of SIGIL! solved the problems with wanting @ and ^ as words, and said "no, they should not be words" and went in a new direction with that.

We've gone into the reasoning for why $foo: does not exist, and why arrays are the "API" for letting you pick these apart as $[foo:] or [$foo]: etc. So again it's nothing to do with avoiding silliness... it's mechanically motivated, with me simply not knowing how to implement the underlying bytes and a pleasing API for destructuring it otherwise.

I am not by-and-large concerned about things that aren't competing for the same lexical space. If there needs to be special code in the lexer to rule something out that otherwise works fine, then in my calculus that error branch "has a cost" and is (some) addition of complexity.

The error branches are there to stop ambiguities and align with the rules of the implementation--not to keep parts out of your hands.

If that sounds like it's not enough of a constraint, well... it is a big constraint. Ambiguity-wise, Ren-C has nearly total lexical saturation for ASCII (insofar as its rules permit). It's less ambiguous than Rebol for the most part, though it introduces some ambiguities of its own... e.g. what is ~/abc/~... is that a quasi-PATH!, or a plain PATH! with lifted trash in the first and last positions? (Right now this is a leading argument for ruling out path isotopes, because I want tildes in paths more than I want antiform or quasiform paths.)

Anyhoo... my advice to you would be to get a bit more experience in the medium...and "Find The Game", as we say in improv. It's of course perfectly valid and desirable to scrutinize the design and the datatypes. But if you had found the game, then I think you'd see these aspects as more of a tangential detail when weighed against bigger design issues... plus be more targeted in what things needed critique.

For myself, I'm bothered by things like:

>> $1
== $1.00

I have little use for the MONEY! type, and when it can't serve correctly as DOLLAR-INTEGER! it becomes basically completely useless to me.

Some of these questions... like preserving quote-style string vs. non... register on the needle in ways that trying to assassinate THE-BLOCK! or TYPE-TUPLE! does not... especially when I have compelling applications for them.


It’s late here, and that’s a fairly comprehensive post, so it’ll take me a while to absorb it fully. I’ll continue thinking over it until I reach some more definite conclusion.

But, until then, here’s my immediate thoughts:

You’re right; in the scheme of things, it isn’t a significant objection. But from a personal perspective, it’s important for me: it’s a place where I clearly haven’t understood the language, and that annoys me.

(Also, I suspect you misunderstood me. I singled out THE-BLOCK! and TYPE-BLOCK! as two types which do have compelling applications. It’s the other types in their family which confuse me.)

[EDIT: I got confused there, see below]

This is one of the big things I haven’t fully absorbed, I think. ‘You should be able to do most things using WORD!’ is a good summary of the aims.

As it happens, I have a (quite intense) side interest in linguistics. By and large, I strongly reject Chomskyanism, including the ‘language instinct’ idea. As for X-bar theory, that was more or less a fad which has by now long passed. (The Chomskyanists obsess over Minimalism now, though I’m sure that too will pass.)

Insofar as I subscribe to any linguistic theory at all, I tend to sympathise the most with construction grammar… which, interestingly, strikes me as being remarkably close to how we think about Rebol programs. It’s certainly a closer fit to Rebol than generative approaches are: essentially, ‘building sentences [or programs] out of smaller, idiomatic parts’.

Perhaps ‘silliness’ wasn’t quite the right word here — it’s that same sense of ‘most-useful-ness’ which I was trying to get at. SET-ISSUE! is of minimal use, so it gets trashed in favour of the more useful datatype.

That being said, I hadn’t fully appreciated the extent to which these kinds of collisions between syntaxes were possible. There are more ‘competing meanings’ here than I had thought.

By the way, this is legal in Haskell too. It’s particularly common for making ‘primed’ symbols, which as you note is thoroughly useful.

Sure, and I agree with that, which is why I didn’t object to these in my original post.

I don’t see any problem with this; could you elaborate?


Actually… re-reading this, I got confused here. It’s THE-WORD! and TYPE-WORD! which I can see the need for. By contrast, out of all the types we have, THE-BLOCK! is the one which feels most redundant and useless. (And TYPE-TUPLE! and TYPE-BLOCK! aren’t far behind.)

Imagine I decide to use $1 and $2 etc. to be some kind of positional substitution notation in a dialect:

>> substitute [a b $2 c d $1 e f] [<some> <thing>]
== [a b <thing> c d <some> e f]

If I reflect it out to the user in any way, it will carry the decoration I don't want unless I get involved in removing the extra digits:

>> substitute/trace [a b $2 c d $1 e f] [<some> <thing>]
DEBUG: $1.00 is <some>
DEBUG: $2.00 is <thing>
== [a b <thing> c d <some> e f]

Being a headache in that way--and having to decide things like whether you round down $1.01 or raise an error--means it's not a fit for such purposes. It wasn't intended to be used that way, but my point was just that it's one of my pet peeves about the type, because I would use DOLLAR-INTEGER! more than MONEY! in the kinds of things I'd use Rebol stuff for.
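As a rough sketch of the positional idea outside Ren-C, here's what such a SUBSTITUTE could look like in Python (the function and the "$1"/"$2" markers are hypothetical, just mirroring the example above):

```python
def substitute(template, values):
    """Replace hypothetical positional markers "$1", "$2", ...
    in a list with the corresponding 1-based items from values."""
    result = []
    for item in template:
        if isinstance(item, str) and item.startswith("$") and item[1:].isdigit():
            result.append(values[int(item[1:]) - 1])
        else:
            result.append(item)
    return result

# substitute(["a", "b", "$2", "c", "d", "$1", "e", "f"], ["<some>", "<thing>"])
# gives ["a", "b", "<thing>", "c", "d", "<some>", "e", "f"]
```

Note how in a language without a DOLLAR-INTEGER! literal, the markers have to live in strings--which is exactly the kind of decoration headache the post is complaining about.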

Where do you find the time for all these interests? :slight_smile: My point was just that whatever it is that's innate in us which lets us see structure in streams of words, we want to leverage that "zone", as I called it. Most languages don't try to go there.

Rebol does so on purpose, letting you organically decide when you want to delimit e.g. with BLOCK! in parse rules ([try some integer!] vs. [try [some integer!]]) or GROUP! in evaluator code. As you get more comfortable with things, the training wheels fall off and you tend to delimit less...or at least much more purposefully. As with English.

One place I see THE-BLOCK! coming into great use is in a dialect that is not purely mechanical (the way PICK and FOR-EACH are), where leaving the @ off of a block signals that you would like the "INSIDE / IN" binding to be applied automatically.

e.g. some variation of the CIRCLED dialect might work like so:

>> x: 10 y: 20

>> var: circled [x (y)]
== y

>> get var
== 20

>> var: circled @[x (y)]
== y

>> get var
** Error: y is not bound

That's just an idea. (It's a less obvious idea now that @[...] blocks bind on evaluation more generally, but it shows promise).

I've mentioned the other cues here that are useful, such as when ANY and ALL don't want to evaluate but only to run the predicate:

>> any/predicate [4 + 3 10 + 20] even?/
== 30

>> any/predicate @[4 + 3 10 + 20] even?/
== 4

This is important if someone else ran a reduce step and you still have questions about the data... it keeps you from having to do something like MAP-EACH item to a QUOTED! version just to suppress evaluation in the ALL. It could be done with a refinement, but the single-character notation as a convention--which aligns with not adding binding--seems a salient solution.

So it's far from useless in my eyes. And I will reiterate that an important application of TYPE-TUPLE! is when you have a predicate function inside an object or module:

 >> obj: make object! [lucky?: number -> [number = 7]]

 >> parse [7 7 7] [some &obj.lucky?]
 == 7

And then TYPE-PATH! when your function has refinements.

Starting in the middle:

This is simply going back to what I said originally: ‘The closest to an explanation I’ve found is that these types are useful in dialecting’.

But then, if you’re willing to accept that as sufficient reason for syntax to exist… well, you can use that to justify basically any syntax, as indeed I once tried to do:

The lesson I took out of that discussion is that adding syntax purely for usage in dialected code is a bad idea, because it’s impossible to know where to stop.

That being said, beyond dialecting, it is pretty useful to have this distinction of ‘this list represents code, evaluate please’ vs ‘this list is just a list, do not evaluate’. After all, it’s nice to be able to pass around lists without worrying that they’ll be evaluated randomly. And it makes sense to store that distinction on the datatype, rather than as a refinement (cf. /ONLY vs antiforms). So, on balance… yeah, I think THE-* is starting to make much more sense to me now.

Ah-ha, I keep on forgetting that Rebol uses TUPLE!s for access within an object. (In my defense, historical Rebol didn’t use dot-syntax, and neither do some of the languages I use daily, so I sometimes forget it exists.)

Although… then again, this looks like it’s yet another instance of ‘syntax which is only useful in dialects’. That still bothers me, for the reasons I already mentioned.

OK, I didn’t even think of that possibility. That makes sense.

I disagree with it though, for two reasons:

  • I think MONEY! is actually very useful. Working with money requires specialised requirements (e.g. fixed-point storage), which can be a bit painful — so having that type built-in to the language eliminates a whole class of subtle errors. And, of course, all kinds of software requires working with money.
  • I think positional substitution is a particularly annoying kind of substitution. I use it in Bash, and hate it. I’d much rather do something like substitute [a b $bar c d $foo e f] {foo: <some> bar: <thing>}, which is less error-prone and more descriptive.

So, I’d rank them the other way around: MONEY! is most useful, and DOLLAR-INTEGER! is less useful.

Admittedly, I would do the same. But for a general-purpose language I think MONEY! is important and very useful.
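The fixed-point concern can be illustrated in Python, where binary floats mishandle cent arithmetic that decimal fixed-point gets right (a generic illustration of why a built-in money type helps, not a claim about Ren-C's internals):

```python
from decimal import Decimal

# Binary floating point cannot represent 0.10 or 0.20 exactly,
# so even a single addition of cents picks up representation error:
float_sum = 0.1 + 0.2
print(float_sum)    # 0.30000000000000004

# Fixed-point decimal arithmetic keeps monetary values exact:
money_sum = Decimal("0.10") + Decimal("0.20")
print(money_sum)    # 0.30
```

A dedicated MONEY! type bakes this choice in, so users can't accidentally reach for binary floats.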

I couldn’t really tell you; I’ve always just had very broad interests. And linguistics has always been one of my favourite areas.


Returning back to my original post…

The conclusion I’ve taken from this discussion is that: no, I actually didn’t have a gap in my understanding of Ren-C as a language. Instead, the gap was that I hadn’t realised how widely useful THE-* is. Now that I understand that, it fits neatly into my criteria for deciding when datatypes are useful to have.

Admittedly, this still leaves TYPE-* without a motivation. (Other than ‘useful in dialecting’, and I’ve already explained why I dislike that.) However, on reflection, the whole type system of Ren-C does seem to be in flux at the moment… to take but one example, currently we only use TYPE-BLOCK!s which are one element long (at least as far as I’m aware). So perhaps the design just needs to be refined a bit, and then it will become something which makes more sense.


Glad you see the applications of THE-XXX! now. :slight_smile:

There's an uneasy question of how far dialects which interact with the evaluator should align with it, and how to balance that. I've already mentioned "circling"... in multi-return the feature was initially thought of as:

>> [a b]: multi-returning-thing ...
== 10

>> a
== 10

>> b
== 20

>> [x @y]: multi-returning-thing ...
== 20

>> x
== 10

>> y
== 20

(Aside: it really should be fascinating to people that a popular language feature like multi-returning would be implemented in this way... with a part like SET-BLOCK!, that isn't reserved by the system but is free for other designs... and with an "unstable isotope" of antiform block carrying the values, such that it decays to the first element on a normal variable assignment. While Rebol did put the overall weird idea of an evaluator of this style "out there", Ren-C has dialed it up to 11... and this really is where we're talking about the "new artistic medium, unlike anything else".)

But here's a problem: if we take @ generally meaning "use binding, don't create a new one" then it's an obvious choice for let [x @y]: ... to mean "create a new binding for x, but use the existing one for y".

Does a dialect need to follow the evaluator exactly? No. But how divergent it can be depends on what percentage of the evaluator it's going to interact with. (In this particular case, there are already problems with @ not coupling well with wanting a meta result, e.g. [x ^y]:, how would you circle the Y in that case? GROUP!s escape into evaluation to synthesize the variable name, so they're not available. This seems a good fit for FENCE! for circling, e.g. [x {^y}]:, and is the sort of thing driving the "braces are too valuable to waste on a non-array type" mentality.)

[block] (group) {fence}

Another thing I'll bring up is the current meaning for @ in PARSE, "I mean it literally, not as a rule":

 >> block: ["a" "b"]

 >> parse ["a" "b" "a" "b" "a" "b"] [some block]
 == "b"

 >> parse [["a" "b"] ["a" "b"]] [some @block]
 == ["a" "b"]

Is that good or bad? Is there some other meaning that's more related to the evaluator behavior that's getting elbowed out, here? We want this as a keyword, anyway... maybe ACTUAL:

 >> parse [["a" "b"] ["a" "b"]] [some actual block]
 == ["a" "b"]

Having lately brought some of the philosophy points to the forefront, I might have been over-concerned about symbol meanings in dialect...and should retrain my focus onto those mechanically essential aspects, like carrying the sigil to someone who will be looking at it.

As mentioned, that should often be done by a word too, probably:

>> block: [a b c]

>> inert block
== @[a b c]

All of this is mad science, but can be very addictive once you get into it.

Very much so! I'm glad you have a pretty full grip on it (what's there is not complicated, outside of maybe isotopes... but you understand those too).

With the time I have, I'm pushing on a lot of things...FENCE! is one. But I hope the type concept gets a shot in the arm like binding has, which has gotten it out of the stalled state.

Well… this behaviour is hard-coded in the evaluator, right? So I don’t see how it’s any less ‘reserved by the system’ than the behaviour of all the other vocabulary which Ren-C gives you.

(But also, there are parallels in other languages: e.g. Haskell doesn’t have ‘true’ multi-returns, but if you pattern-match on a returned tuple, that looks very much like a multi-return.)
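For comparison, the tuple-unpacking analogue looks like this in Python (names here are mine, echoing the multi-return example above):

```python
def multi_returning_thing():
    # An ordinary tuple return; unpacking at the call site is the
    # closest mainstream analogue of Ren-C's [a b]: multi-return.
    return 10, 20

a, b = multi_returning_thing()   # bind both values
x, _ = multi_returning_thing()   # keep only the first, discard the second
```

The difference is that in those languages the destructuring syntax is reserved by the grammar, rather than being an ordinary SET-BLOCK! value the system happens to interpret.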

VAR-WORD! instead of THE-WORD!. At least in the case of FOR-EACH, it makes sense with the semantics of ‘use existing variable’. I think I can justify it with SET-BLOCK! too: ‘use as main variable’. (It does make processing more complicated though, since FOR-EACH would have to see what has bindings and what doesn’t.)

I thought you said we could handle this case by doing [x @[^y]]: and so on? It’s a little ugly, but logical.

The one major thing I don’t really understand yet is the various kinds of ‘non-valued intents’ (as you’ve put it). But that can go in a separate thread.

A lot can change in... 7 months :face_with_diagonal_mouth:

The emergence of the CHAIN! type changes the landscape, here. Because on balance, it seems we get more expressivity by allowing these ISSUE! (token?) things to be in sequences.

We might disallow them at the head, but it doesn't seem we have to make that disallowance:

>> x: '#foo:bar:#baz

>> type of x
== &(chain)

>> first x
== #foo  ; issue!

>> second x
== bar  ; word!

>> third x
== #baz  ; issue!

>> y: #"foo:bar:#baz"
 
>> type of y
== &(issue)

It does mean that the character literals for sequence interstitials have to be #"/" and #":" and #"." instead of #/ and #: and #. -- but you already had to do that for things like #"[" so I don't know if it's any great argument against it.

Like I said, there's not a no-silliness rule. :crazy_face: So if something falls out of the design and it's mechanical and unambiguous and consistent, I lean to supporting it (albeit sometimes grudgingly).

...not anymore in Ren-C... that's a CHAIN! too.

Well, now they are words (along with / and // and /// and . and .. and ... etc)

But they're in the class of things you have to use a SET-BLOCK! to set:

>> [:]: 1020

>> :
== 1020

I am leaning to saying that using SET-BLOCK!s is probably the more sane answer than trying to make exceptions for /: to be a CHAIN! with a WORD! in the first slot. The exception shouldn't be made for :: as a CHAIN! with a word in the first slot, and it's too confusing compared to [/]:

Then if you want to assign a function, instead of it being a strange exception you'd have to distribute the leading slash:

/[/]: infix get $divide

So @bradrn ... if you're still out there, somewhere... you may be glad to know several types are getting the axe.

We've already discussed the demotion of TYPE-XXX!, once it was decided that trailing slash did a pretty fine job of letting you do type constraints:

>> parse ['abc 'def 'ghi] [some lit-word?/]
== 'ghi

So it became WILD-XXX! with no uses (while we thought about whether & should be a SIGIL! at all).

Then something fairly major happened, in which things like ^foo.^bar became important... and important to consider as a TUPLE! consisting of META-WORD! elements:

Should sequences permit non-head SIGILs? - #2 by hostilefork

In one fell swoop, that killed off: (VAR-TUPLE!, VAR-CHAIN!, VAR-PATH!, META-TUPLE!, META-CHAIN!, META-PATH!, THE-TUPLE!, THE-CHAIN!, THE-PATH!) as fundamental types.

But as I note about the optimization profile, even if decorations live "under" the sequence type, we still want to sometimes have cases where the interpretation is as if they were on the whole type.

>> $foo.$bar
== foo.$bar

In order for that to be done efficiently, the best implementation is to have the guts under the hood of the Cell act as if there are these distinct "type bytes" (because that's the best place to store the variation). And right now, the most reasonable way to think of this is that the HEART_BYTE() is modulo 64 to get the secret sigil variation... permitting exactly 3 SIGILs: VAR, META, and THE.

So there's no room for the & Sigil in the most straightforward implementation. Paring things down to an even 3 variations.

Obviously I do not love the idea of etching in stone the number of sigils based on what fits in a byte. But ASCII is not going anywhere, and we're at a bit of a saturation point for the medium.
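A minimal sketch of the byte-packing idea in Python (the encoding and names are illustrative guesses at the shape of the technique, not the actual Ren-C cell layout):

```python
# Pack a "heart" (fundamental type, 64 slots -> low 6 bits) together with
# a sigil variation (none/$/^/@ -> high 2 bits) into one type byte, so the
# byte modulo 64 recovers the heart, and exactly 3 sigils fit alongside it.
SIGILS = [None, "$", "^", "@"]

def pack(heart, sigil):
    assert 0 <= heart < 64 and 0 <= sigil < len(SIGILS)
    return (sigil << 6) | heart

def unpack(type_byte):
    return type_byte % 64, type_byte >> 6
```

With only two spare bits, a fourth sigil variation simply has nowhere to go--which is the "no room for &" constraint in miniature.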

So maybe this will give you some peace in terms of a rationale, especially since your least favorite Sigil is gone. :slight_smile:

Um… this forum seemed to vanish for a while, and I only just discovered you changed its domain name. Looks like I've missed a lot.

Anyway, this design makes much more sense to me. I knew the system would eventually collapse into something more organised! And, as usual in these cases, it’s only become more powerful in the process.


:frowning: ...When I noticed you hadn't logged in for a while, I sent a mail to your registered address for the forum (Wed, Apr 16, 3:22 PM).

(It should have also been delivered as an announcement from the forum remailer...posts to "Announcements" are supposed to go out by default as notifications.)

The change was due to whoever runs the domain letting it break unannounced, so something had to be done to keep it up.

But welcome back!

I got a new M4 mac, and hunkered down to make bootstrap executables for all the platforms... but I wanted to nail things down as best I could before doing so. (Last bootstrap executables were from 2018, and having that relatively stable has been helpful as the design has undergone its evolution.)

Little did I know that going through that bootstrap executable and pinning things down would lead to a cascade of insights. When you are forced to solve a problem in a way that it works in both an "old" world and a "new" world... sometimes the "compromise" you come up with actually turns out to have properties that are just good on an absolute scale.

The idea of making VOID a purely unstable state--taking the empty antiform PACK! ("none")--and having vanishing done only by antiform COMMA! (now "VOID!") nailed down longstanding problems. It caused a flurry of adjustments, but those adjustments are putting everything in its right place.

It was a revelation to make (^foo: xxx) store a lifted form and (^foo) fetches the unlifted form. It also means we can probably avoid putting trailing apostrophes on the name, because ^foo' and ^foo are redundant, if all assignments and fetches carry the caret. This also is the final piece in the puzzle for how to deal with (for-each [key value] obj [...]) potentially picking up value as an action antiform and running it accidentally via reference: don't let FOR-EACH pick up antiform actions unless you use ^value... and there you have it.

I got over my squeamishness about COMPOSE defaulting to picking up the context from the callsite. So you can use plain old COMPOSE to interpolate strings--and blocks using the callsite context. Then the arity-2 version is COMPOSE2, so you can pass in a pattern that also can carry an environment.

I decided that --[...]-- Strings Look Better than --{...}-- Strings, since we have a choice. But there will likely be stringlike uses for all the asymmetric delimiters when dashed.

Anyway those are some of the biggest things. Lots of little things, and things are starting to move at a breakneck pace... it's actually rather exciting at the moment.

It may be that everything is allowed to carry a SIGIL! :face_with_diagonal_mouth:

Because it is now useful... for instance... to be able to say:

>> block: [a b c]

>> block.^2: void 
== ~[]~  ; anti

>> block
== [a ~[]~ c]

But rather than calling ^2 a "META-INTEGER!" as a fundamental type, it would presumably be a METAFORM! and you'd use some destructuring operation to get the integer out of it.

The implementation I've gotten for this (multiplexing the 3 sigil options and no sigil into two bits, and cutting fundamental types to 64) doesn't allow for [^ $ @] on quasiforms. But that goes along with my instincts that things with these decorations shouldn't have quasiforms.

Ah-ha. It went to Junk. (I don’t often get email from people I haven’t previously emailed…)

I never quite wrapped my head around the various kinds of unset states… I’ll have to review that thread!

This one I really like. It feels very much in keeping with the nature of meta forms.

(Except… doesn’t this mean that ^quasi is an antiform while ^ anti is a quasiform? Or am I misremembering what the ^ operator does?)

As for the rest, I don’t really have a strong opinion on any of them, but progress is always good to see!


Good memory, and good insight! :clap:

You are quite right that this raises questions about what the ^ operator should do.

At first glance it would appear that UNLIFT makes a reasonable amount of sense, so that:

^ some complete expression

and...

^(some complete expression)

...mean the same thing. There's no assignment in effect, so it should be unlift-ing whatever you looked up.

It's a little weird for it to be unpaired, such that there's only a standalone unlift symbol and no standalone lift operator symbol. But there is a fundamental asymmetry in play with the evaluator, because you have to put expressions on the left inside a group to assign them... while you don't need to on the right. So maybe it's just a natural outcome of that asymmetry, to be able to exploit it by dodging some parentheses on the right, which you wouldn't be able to dodge on the left anyway.

It has no meaning at the moment, while I look to see if there's any higher third "non-lift/non-unlift" purpose.

I'm a bit surprised I didn't come to this conclusion sooner. I already had it so that META-WORD!s would automatically "raise" the representation on the left in a SET-BLOCK! assignment.

I guess names just have a powerful grip over our thinking ("it's a META-WORD!, every time it's in an evaluative context you have to META the contents of the WORD!...")


It occurs to me that if everything is allowed to carry a SIGIL!, then we will be getting $ on strings.

And the SIGIL!s have been pushed under CHAIN! and TUPLE! etc.

That means you should be able to write:

>> $"REN_C_DIR": "/home/hostilefork/ren-c/"
== "/home/hostilefork/ren-c/"

>> $"REN_C_DIR"
== "/home/hostilefork/ren-c/"

Since ENVIRONMENT! comes from an extension, I don't know exactly how this would get wired into the "RebindableSyntax" mechanism. SET-VAR-TEXT! wouldn't be a fundamental type... $"REN_C_DIR": is a CHAIN! with a VARFORM! (BINDFORM! ?) of a TEXT! in it followed by BLANK!.

But by hook or by crook, I want it. Environment variables are very important in many of the sorts of domains that Ren-C would be good in.

It's a little sad that environment variables don't have a mark on them as being LOAD-able syntax, e.g. strings don't have quotes on them to say they should be interpreted as text. If they did, we could round-trip types:

>> $"REN_C_DIR": %/home/hostilefork/ren-c/
== %/home/hostilefork/ren-c/

>> $"REN_C_DIR"
== %/home/hostilefork/ren-c/

>> $"BLOCKLIKE": [a <b> c]
== [a <b> c]

>> $"BLOCKLIKE"
== [a <b> c]

It may be worth having this be a higher-level layer that assumes environment variables starting with % are files, and starting with [ or ( or { are blocks/fences/groups. I don't know how far it should go. Maybe just the % as file trick is enough.
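That higher-level layer could be sketched like so in Python (the classification rules follow the speculation above; `load_env` is a name I made up):

```python
import os

def load_env(name):
    # Interpret an environment variable's text by its leading character:
    # "%" marks a file, while "[", "(", "{" mark block/group/fence-like
    # structured values; anything else is treated as plain text.
    value = os.environ.get(name)
    if value is None:
        return None
    if not value:
        return ("text", value)
    if value.startswith("%"):
        return ("file", value[1:])
    if value[0] in "[({":
        return ("structured", value)
    return ("text", value)
```

The tradeoff is visible even in the sketch: without a mark, a genuine text value that happens to start with % or [ would be misclassified, which is why "maybe just the % as file trick is enough".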


2 posts were split to a new topic: ([1]: ...) For Unpack, ([...]: ...) Assign Itemwise?

