Usefulness of String Interpolation

hostilefork · January 11, 2024, 1:39am

To pick a random example from the build helpers for "CScape" interpolation of some generated C code:

emit --{
    #define ${MAYBE PREFIX}INCLUDE_PARAMS_OF_${NATIVE-NAME} \
        $[Items]; \
        assert(Get_Series_Info(level_->varlist, HOLD))
}--

The use of ${} (instead of $() or $<>) means that the result of the expression should be turned into a valid C identifier name... so dashes are converted to underscores, etc.
The use of all capitals in the ${} escaping means that the strings generated by the expressions evaluated should be made all uppercase.
The use of $[] means that items is an array, and its elements should be printed one line at a time...repeating the boilerplate leading and trailing on each line (in this case an indent on the left, and a semicolon and backslash on the right)

The template looks something like the result:

#define INCLUDE_PARAMS_OF_IF \
    DECLARE_PARAM(1, return); \
    USED(ARG(return)); \
    DECLARE_PARAM(2, condition); \
    DECLARE_PARAM(3, branch); \
    assert(Get_Series_Info(level_->varlist, HOLD))
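
The three rules above can be modeled in a few lines. This is only a rough sketch in Python, not the actual CScape code: `cscape` and `to_c_identifier` are hypothetical names, slots are limited to plain variable names (so the `MAYBE PREFIX` expression is left out), and variables are passed in a dict.

```python
import re

def to_c_identifier(text):
    """Replace every character that is not legal in a C identifier with '_'."""
    return re.sub(r"[^A-Za-z0-9_]", "_", text)

def cscape(template, variables):
    """Expand ${name} and $[name] slots (hypothetical model of the rules)."""
    out_lines = []
    for line in template.splitlines():
        # $[name]: emit the line once per item, repeating the text to the
        # left and right of the slot on every copy.
        list_slot = re.search(r"\$\[([^\]]+)\]", line)
        if list_slot:
            prefix, suffix = line[:list_slot.start()], line[list_slot.end():]
            for item in variables[list_slot.group(1).lower()]:
                out_lines.append(prefix + item + suffix)
            continue

        def fill(match):
            name = match.group(1)
            value = to_c_identifier(str(variables[name.lower()]))
            # An all-caps slot name asks for an uppercased result.
            return value.upper() if name.isupper() else value

        out_lines.append(re.sub(r"\$\{([^}]+)\}", fill, line))
    return "\n".join(out_lines)

template = "\n".join([
    "#define INCLUDE_PARAMS_OF_${NATIVE-NAME} \\",
    "    $[Items]; \\",
    "    assert(Get_Series_Info(level_->varlist, HOLD))",
])
variables = {
    "native-name": "if",
    "items": [
        "DECLARE_PARAM(1, return)",
        "USED(ARG(return))",
        "DECLARE_PARAM(2, condition)",
        "DECLARE_PARAM(3, branch)",
    ],
}
print(cscape(template, variables))
```

Because the slot sits inside the line, the indent on its left and the `; \` on its right come along for free on every repeated line, which is the part that is hard to eyeball in a block-based dialect.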

Without interpolation, we fall back on LOAD-able code... where spaces and quotes are required by the language itself. This starts to lose the ability to keep track of actual spaces in the interpolated thing, plus you keep having to start and stop string delimiters on the string portions.

I'm not quite sure how it would come together dialected via regular code, but it would drift away from looking like C code. At best it might look like:

emit [
    "#define " <c> (MAYBE PREFIX) "INCLUDE_PARAMS_OF_" <c> (NATIVE-NAME) " \"
    "    " @[Items] "; \"
    "    assert(Get_Series_Info(level_->varlist, HOLD))"
]

I'd be hard-pressed to say the spacing was correct on inspection. We've lost the intuition about where the unspaced parts are. You can imagine it getting worse when you're building unspaced material inside a string literal. Strings can simply be the least noisy medium when you want to see something that looks close to the result.

Anyway, with strings carrying binding, we wouldn't have to do what we do today... which is actually pass the variables (that don't live in LIB) in a block to emit:

emit [prefix native-name items] --{  ; <-- ack
    #define ${MAYBE PREFIX}INCLUDE_PARAMS_OF_${NATIVE-NAME} \
        $[Items]; \
        assert(Get_Series_Info(level_->varlist, HOLD))
}--

So I look forward to getting rid of that.

Rebol And Scopes: Well, Why Not?

And it’s even easier in Rebol than it is in Haskell, because there’s already a single built-in function to do everything for you:

>> x: 10 y: "foo"
== "foo"
>> print ajoin ["Scopes? " x " " x " " x " " y " " y " " y]
Scopes? 10 10 10 foo foo foo
>> foo: func [x] [local: 20 ajoin ["The sum is " (x + local)]]
>> foo 30
== "The sum is 50"

I strongly prefer this approach over string concatenation, since by using sensible data structures it integrates much better with the rest of the language. (It also reduces the risk of errors from malformed strings, and potentially the equivalent of SQL injection attacks.)

Note that Ren-C has DELIMIT (and UNSPACED, SPACED) instead of AJOIN... which hopefully you'll like even better.

bradrn · January 11, 2024, 7:05am

hostilefork:

To pick a random example from the build helpers for "CScape" interpolation of some generated C code:
emit --{
    #define ${MAYBE PREFIX}INCLUDE_PARAMS_OF_${NATIVE-NAME} \
        $[Items]; \
        assert(Get_Series_Info(level_->varlist, HOLD))
}--

OK, this is a lot more powerful than the string interpolation I’m used to. I can see why you’d want this — it fits in very well with the general idea of dialecting.

(Personally, I’m not at all averse to templating via concatenation, as in this code of mine from two days ago. But Haskell isn’t Rebol.)

Brett · November 5, 2021, 11:11pm

Reminds me of StringTemplate - which I have never used, but thought was interesting.

hostilefork · March 25, 2025, 11:47pm

So now there is INTERPOLATE.

>> num: 1000

>> interpolate "Hello (num + 20) World!"
== "Hello 1020 World!"

It's the "bad boy" environment-capturing primitive, that does what most natives should avoid--sniffing the evaluator's concept of "current context" and reacting to that. (You usually don't want functions to be doing this, they should be reacting solely to their parameters.)

Besides capturing the context, it's a synonym for writing compose2 @() ... yourself. COMPOSE2 is what you want to use to change the delimiters or do other customizations. Just use an @list form to capture the binding and tell COMPOSE2 you want to use the binding of that list:

>> num: 1000

>> compose2 @{{}} "Hello {{num + 20}} World!"
== "Hello 1020 World!"
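
As a rough model of the relationship between the two (in Python, not Ren-C): the sketch below treats the delimiters as a parameter and takes the variable context as an explicit dict, which stands in for the binding that the @list carries. `compose_text` is a hypothetical name, and Python's `eval` stands in for the evaluator.

```python
import re

def compose_text(template, env, opener="(", closer=")"):
    """Evaluate each delimited expression in `template` against `env`."""
    pattern = re.escape(opener) + r"(.*?)" + re.escape(closer)

    def fill(match):
        # The explicit `env` dict plays the role of the captured binding;
        # builtins are hidden so only names in `env` are visible.
        return str(eval(match.group(1), {"__builtins__": {}}, env))

    return re.sub(pattern, fill, template)

env = {"num": 1000}
print(compose_text("Hello (num + 20) World!", env))
print(compose_text("Hello {{num + 20}} World!", env, "{{", "}}"))
```

The non-greedy match means this sketch does not cope with nested delimiters inside an expression. The point is only that the "sniff the current context" step is separable from the "substitute between delimiters" step.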

Taking INTERPOLATE Stratospheric

One of the first things I did was fix the historical behavior of URL!

rebol2>> load "http://(domain)/example.txt"
== [http:// (domain) /example.txt]

So URL! now matches parentheses:

>> rev: 'info.rebol.forum

>> url: https://(reverse of rev)/t/usefulness-of-string-interpolation/2114

>> interpolate url
== https://forum.rebol.info/t/usefulness-of-string-interpolation/2114

That works now, but I also plan to adjust the scanner to let you put parentheses or other delimiters before the :// ...

>> url: transcode "[protocol]://example.com"
== [protocol]://example.com

>> protocol: 'https

>> compose2 $[] url
== https://example.com

This made me realize that my dreams of giving FILE! "teeth" may be actualizable... INTERPOLATE can have special behavior if what you're working with is a file.

What I've always wanted to enforce is that if you put a FILE! into a FILE! where a slash is, the thing you put in has a slash:

>> some-dir: %home/

>> interpolate %(some-dir)/something.txt
== %home/something.txt

>> interpolate %(some-dir)something.txt
** Error: FILE! interpolation slash calculus mismatch: %home/

>> bad-dir: %home  ; no slash, not a dir

>> interpolate %(bad-dir)/something.txt
** Error: FILE! interpolation slash calculus mismatch: %home

>> interpolate %(bad-dir)-something.txt
== %home-something.txt

The idea is that you know from the template whether you're introducing a directory or not. This is all to bring sanity to the process--if you don't want sanity, use a plain string and turn it into a FILE! later. I just want "sanity out of the box".

The same rules would apply to PATH! if you're using them as a proxy for files:

>> path-dir: 'home/

>> interpolate %(path-dir)/something.txt
== %home/something.txt

>> interpolate %(path-dir)something.txt
** Error: FILE! interpolation slash calculus mismatch: home/

I don't know that it would apply to WORD! substitutions, as they are kind of another beast. They can't have internal slashes so they don't have the same kinds of problems, so this is probably legal:

>> word-dir: 'home

>> interpolate %(word-dir)/something.txt
== %home/something.txt

I think TEXT! could similarly duck the rules, as long as it didn't have any internal slashes in the substitution:

>> text-dir: "home"

>> interpolate %(text-dir)/something.txt
== %home/something.txt

>> text-dir: "home/hostilefork"

>> interpolate %(text-dir)/something.txt
** Error: FILE! interpolation slash calculus mismatch: home/hostilefork

>> text-dir: "home/hostilefork/"

>> interpolate %(text-dir)/something.txt
== %home/hostilefork/something.txt
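
To make the FILE! and TEXT! cases above concrete, here is a minimal Python sketch of the rule as I read it from those examples. It is not the Ren-C implementation: `File`, `SlashMismatch` and `interpolate_file` are hypothetical names, slots are plain variable names looked up in a dict, and PATH! and WORD! are not modeled.

```python
import re
from dataclasses import dataclass

@dataclass
class File:
    """Stand-in for a FILE! value; a plain str stands in for TEXT!."""
    path: str

class SlashMismatch(ValueError):
    pass

def interpolate_file(template, env):
    """Fill (name) slots in a file template, checking directory-ness."""
    def fill(match):
        value = env[match.group(1)]
        wants_dir = match.group(2) == "/"  # slot is followed by a slash
        if isinstance(value, File):
            text, strict = value.path, True
        else:
            text, strict = value, "/" in value  # slash-free text is exempt
        is_dir = text.endswith("/")
        if strict and is_dir != wants_dir:
            raise SlashMismatch(f"slash calculus mismatch: {text}")
        # The template supplies the slash, so drop the value's own.
        return (text[:-1] if is_dir else text) + match.group(2)

    return File(re.sub(r"\(([^)]+)\)(/?)", fill, template))

env = {"some-dir": File("home/"), "bad-dir": File("home"), "text-dir": "home"}
print(interpolate_file("(some-dir)/something.txt", env))
print(interpolate_file("(bad-dir)-something.txt", env))
print(interpolate_file("(text-dir)/something.txt", env))
```

The check is deliberately driven by the template: the slash after the slot says "a directory goes here", and the value either agrees or the call fails.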

Default TAG! Scanning Needs "Supertag"

Historical Rebol/Red tag doesn't enter a paired scan mode when it sees parentheses and such:

red>> tag: <foo=(1 > 2) bar=(3 < 4)>
*** Syntax Error: (line 1) missing ( at ) bar=(3 < 4)>

It thinks the first tag is <foo=(1 > and gets confused after that.

When I was first giving feedback on the design of Rebol/Red, I proposed something I called SuperTag!. People didn't seem to like it, but only Ladislav remarked on it, without defending the objection:

SuperTAG!: upon a " { ( [ < in tag content, validate substring via Rebol lexer · Issue #2234 · metaeducation/rebol-issues · GitHub

Given my proposal that PRINT be cued by TAG! to do interpolation, I think we should indeed make SuperTag the default:

>> print <Supertag should be the (if 2 > 1 [reverse "tluafed"])!>
Supertag should be the default!

But only if you are using the basic no-dashes delimiter form of tag. That means you can still make weird tags, you just have to use -<...>-.

>> <)))>
** Error: Unexpected ) in TAG! scanning (use -<...>- if intentional)

>> -<)))>-
== -<)))>-
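
The scanning rule can be sketched as a depth counter. This is a simplified Python model, not the real scanner: `scan_tag` is a hypothetical name, and it only pairs parentheses, whereas the SuperTag proposal validates strings, braces and blocks through the lexer as well.

```python
def scan_tag(source):
    """Return the tag at the start of `source`, angle brackets included.

    A '>' only ends the tag when no parentheses are open.
    """
    if not source.startswith("<"):
        raise ValueError("tag must start with <")
    depth = 0
    for index, char in enumerate(source[1:], start=1):
        if char == "(":
            depth += 1
        elif char == ")":
            if depth == 0:
                raise ValueError("Unexpected ) in TAG! scanning")
            depth -= 1
        elif char == ">" and depth == 0:
            return source[:index + 1]
    raise ValueError("tag was never closed")

print(scan_tag("<foo=(1 > 2) bar=(3 < 4)> trailing"))
```

With the depth check in place, the `>` inside `(1 > 2)` no longer ends the tag, and a stray `)` is reported rather than silently accepted.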

This gives me pause to wonder if we should also make SuperString the default (?)

>> "}}}"
** Error: Unexpected } in TEXT! scanning (use -"..."- if intentional)

You'd then be able to use strings inside escapes:

>> interpolate "You could do cool (reverse "ffuts")!"
== "You could do cool stuff!"

It would make literals for delimiters longer than we'd like. You already can't use things like #[ as a character constant and have to say #"[" instead, so it's usually preferred to say "[". But now you'd have to say -"["-, which is 5 characters to get 1.

But I've been thinking maybe we should let #[ and #} be character literals, and sacrifice the #[...] "construction syntax" and the #{...} for BINARY!. It's very speculative and requires a bit of a perception change, but it may be a better idea than it seems on the surface.

Crazy Fun Times

This is introducing some constraints on the types. But without constraints, if you just say "oh, they're just arbitrary strings", you can't get that much leverage.

philosophy : "Freedom To" and "Freedom From" in Software Architecture

If you want an arbitrary string, use TEXT!.

As with the -<...>- tag form, there can be workarounds for the "putting anything you want" syntaxes.

BlackATTR · March 26, 2025, 12:21am

Wowzers, cool stuff. This is immensely useful for the kind of scripting I do -- file munging/wrangling.

hostilefork · March 26, 2025, 12:42am

Functional interpolation gets closer to being able to compete with things like bash.

I think Ren-C is starting to look like a contender for this domain, in a way that it wasn't before.

A variant of INTERPOLATE that treats $VAR as "lookup in environment" would be useful... something that could substitute the evaluator's dispatch in that case.