Joe Marshall on Rebol 1.0 vs. Rebol 2.0 Binding

Circa 2008, this is what Joe Marshall wrote about Rebol binding in a comment on the blog "Arcane Sentiment":

The binding model I used for Rebol 1.0 was basically a standard implementation of lexical binding. A lexical environment was carried around by the interpreter and when values were needed they were looked up. The only twist was that Carl wanted symbol instances to have lexical scope. So suppose you had this rebol function:

func [a] [ return 'a ]

The symbol that is returned can be evaluated in any context and it will return the value that was passed in to func. The trick was to close over the symbol. I had to tweak some of the other symbol routines to deal with this idea of closed over symbols, but it worked. (I don't think it was a good idea, but it was easy to implement.)

He wanted it to be the case that if you forced the evaluation of a symbol or block that you'd get the lexically scoped value. So you could return a block as a list of tokens and then call `do' on it and run it as if it were a delayed value. That's an interesting idea, but extending it to symbols themselves is probably going too far.

Carl never seemed to `get' how lexical environments work. The shape of the environment is constant from invocation to invocation, and this is how you get lexical addressing, but the instance of the environment changes.
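Joe's point can be made concrete with a small sketch (in Python, purely illustrative; none of these names come from Rebol's sources). The *shape* of a function's environment (which names live in which slots) is fixed from invocation to invocation, which is what makes lexical addressing possible, but each call allocates a fresh *instance* of that environment:

```python
# Minimal model of lexical environments: a chain of frames, where each
# frame has a fixed layout ("shape") but each invocation gets its own
# instance, so one call can never clobber another's values.

class Frame:
    def __init__(self, parent, names, values):
        self.parent = parent              # enclosing lexical environment
        self.slots = dict(zip(names, values))

def lookup(frame, name):
    # Walk outward through the chain of enclosing environments.
    while frame is not None:
        if name in frame.slots:
            return frame.slots[name]
        frame = frame.parent
    raise NameError(name)

# Two invocations of the "same" function share the shape ["a"]
# but get distinct frame instances:
outer = Frame(None, ["a"], [1])
inner = Frame(None, ["a"], [2])           # a second, independent instance

# An inner scope chains to its lexical parent:
nested = Frame(outer, ["b"], [9])

print(lookup(outer, "a"), lookup(inner, "a"), lookup(nested, "a"))  # -> 1 2 1
```

Because the shape is constant, a compiler could replace the name search with a fixed (depth, slot) address; the dictionary here just keeps the sketch short.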

When Carl ditched my code he went back to his original plan of allocating a binding cell for each lexical mention of the symbol in the source text. Imagine your source is like a Christmas calendar with the chocolate goodies behind each date. Behind each binding location is a box where the value is stored. This model has bizarre semantics. The first (and most obvious) problem is that it isn't re-entrant. If you recursively call a function, you'll end up smashing the values that are there for the outer invocation. The first release of Rebol 2.0 had this bug.

Carl patched this up with the following hack. If a cell is unbound, then binding it causes it to be assigned a value. If the cell already has a value, that value is saved away on the stack while the function is called, and then restored to the cell when the function returns. That fixes the re-entrancy problem, but you still have other issues.
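The "value cell per binding site" model and the save/restore patch can be simulated in a few lines of Python (a hypothetical sketch, not Carl's actual C code):

```python
# One value cell per binding site in the source text.
class Cell:
    def __init__(self):
        self.bound = False
        self.value = None

def call_broken(cell, arg, body):
    # Original model: just smash the cell. A recursive call overwrites
    # the outer invocation's value -- the first Rebol 2.0 release bug.
    cell.value, cell.bound = arg, True
    return body()

def call_patched(cell, arg, body):
    # The hack: if the cell already holds a value, save it on the stack
    # and restore it when the function returns (shallow dynamic binding).
    saved = (cell.bound, cell.value)
    cell.value, cell.bound = arg, True
    try:
        return body()
    finally:
        if saved[0]:
            cell.bound, cell.value = saved   # restore the outer binding
        # If it was unbound before, the value is simply left in place,
        # which is why variables "linger" after the function returns.

x = Cell()
def body():
    if x.value == 1:
        call_patched(x, 2, lambda: None)     # recurse once
    return x.value                           # does the outer 1 survive?

print(call_patched(x, 1, body))  # -> 1  (the broken version returns 2)
```

Running the same trace through `call_broken` returns 2, because the inner call smashes the shared cell. And note that after `call_patched` returns, `x` still holds 1: that lingering value is exactly the "indefinite extent" oddity described below.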

The mechanism of saving the old binding away on the stack is plain-old bog-simple shallow dynamic scoping. But because Carl has a value cell for each binding site, rather than a single global cell, you only see the dynamic binding effect within a single function at a time. This makes it less likely that you'll suffer from inadvertent variable capture, but it doesn't eliminate it completely.

Carl's hack of leaving the previous binding in place if the cell was unbound before gives you a strange effect that variables can be used for some time after they are bound. Unfortunately, if a call to the routine that bound them occurs, the value gets smashed. This is `indefinite extent' in the truest sense --- you simply can't tell how long the variable will retain its value.

So Carl's binding methodology is a weird cross between static binding, where each formal parameter has its own value cell, and shallow dynamic binding, where the value is saved on the stack when a function is re-entered.

Carl's implementation has one other weird feature. If you copy a block of code, you also end up copying the value cells that are associated with the bindings in the code. Some rather enterprising Rebolers have used this trick to implement a `poor-man's lexical binding' by unsharing the dynamically bound value cells before leaving a function and thus getting the value to persist.

I've given my heavy criticism of R3-Alpha's binding model, and it's hard to imagine that it used to be even worse. But it sounds like it was.

However...

...I don't think it's necessarily bad to have the ability to drill down and bind at the WORD! granularity, if that's what you mean.

It's just that it's a bad place to start!

Historically, Rebol/Red bind deeply at LOAD-time... resulting in "stray" bindings on things that are not meaningful.

e.g. a WORD! which is destined to be a local in a function body could start out with a binding to a global.

This puts FUNC "in a bind" (no pun intended) when it processes the body: does it blindly overwrite the binding on the WORD! to become a local, or does it leave it pointing to the global?

You need a way to say "I meant this, don't override it". And you need a way to say "I don't want to bind this"... because it may be that the things you'll be binding to don't exist yet.

Many of the most interesting things you might want to do cannot be feasibly accomplished if you cannot punt and say "let the receiving context decide"...with the ability to point out the things where you say "I know the binding for this part."

People quickly found that they couldn't get far authoring solutions based on composing code under the broken model. Those who tried anything complex knew this, which is why people like Joe (and I) were very critical of the old way.

But Ren-C has brought order to the chaos, with a model that isn't based on smattering "meaningless" bindings around that need to be overridden with the "real" bindings.

Instead, everything starts out unbound and then the binding spreads purposefully through each step.


Joe Marshall’s critique is famous in the Rebol community because it highlights the fundamental "impedance mismatch" between Rebol’s "word-centric" model and the "environment-centric" model used by almost every other functional language (Lisp, Scheme, etc.).

His "Christmas Calendar" analogy—where a value is tucked behind a fixed physical location in the source code—perfectly captures why historical Rebol felt "wacky and broken" to Computer Science people.

The dilemma you describe is precisely what makes historical Rebol feel like "magical" but "unreliable" technology. When a system relies on side-effecting mutation to give meaning to code, you lose the ability to reason about that code independently of its history.

Here are a few observations on why the "Punt until the Receiving Context" approach is the breakthrough needed for competitive code composition:

1. The Death of the "Search-and-Replace" Evaluator

In the historical model, FUNC had to be an "active scavenger." It would crawl the body, looking for words that matched its parameters and forcibly bind them.

  • The Conflict: As you noted, if a word was already bound to something meaningful (like a global or a word from a surrounding object), FUNC faced a choice: clobber it (breaking the user's intent) or skip it (potentially failing to bind a local).

  • The Ren-C Solution: Ren-C moves from Search-and-Replace to Lookup-on-Demand. The body remains a "pure" block of unbound words. The function call creates a "Virtual Environment" (a Specifier). As the evaluator steps through the code, it looks up each word in that specifier chain.
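The Lookup-on-Demand idea can be sketched in a few lines (Python used for illustration; "specifier" here is just a layered mapping, and nothing below reflects Ren-C's actual internals):

```python
from collections import ChainMap

def eval_block(block, specifier):
    # Toy evaluator: [op a b] -> op(lookup(a), lookup(b)).
    # The block itself contains only pure, unbound symbols; meaning
    # comes from the specifier chain at the moment of evaluation.
    op, a, b = block
    funcs = {"add": lambda p, q: p + q}
    return funcs[op](specifier[a], specifier[b])

body = ["add", "x", "y"]       # unbound words, shared by every call

def call(args):
    frame = dict(args)         # a fresh frame per invocation
    # Frame shadows the "global" layer without mutating the body:
    return eval_block(body, ChainMap(frame, {"y": 1000}))

print(call({"x": 1}), call({"x": 2}))  # -> 1001 1002
```

Because the body is never written to, recursion and concurrent calls are trivially safe: each invocation brings its own frame to the front of the chain.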

2. Compositionality: Building "Lego" Code

In the old model, if you composed a block from three different sources, those sources might have different "stray" bindings. When you finally passed that aggregate block to a function, the function’s bind operation would behave like a "blunt instrument," potentially overwriting bindings that were carefully set in one of the sub-blocks.

  • Observation: You couldn't truly "pre-bind" things safely because the next function in the chain might "re-bind" them.

  • The Ren-C Advantage: By using $(...) (explicit binding) and '(...) (unbound), you can mix "pre-resolved" symbols with "delayed-resolution" symbols in the same block. The system respects the explicit "pinned" bindings while allowing the "pure" words to pick up the context of whatever evaluator is currently running them.

3. Dialecting Without "Leakage"

This is where the "competitive solutions" come in. In historical Rebol, writing a complex dialect (like a GUI or a Database wrapper) often involved "binding wars." You’d struggle to keep the user's variables available inside the dialect's keywords.

  • The "Broken" Model: You had to bind the user's code to the dialect's words, but that often "masked" the user's own variables (shadowing).

  • The Ren-C "Virtual" Model: Because binding is a "view" (a specifier) rather than a "mutation," a dialect can apply a layer of keywords over the user's code without actually changing the code itself. The user's code remains "clean," and the dialect functions like a transparent overlay.
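"Binding as a view" can be sketched the same way (Python, illustrative names only; this is not Ren-C's API): the dialect layers its keywords over the user's environment, and the user's code and variables are never mutated.

```python
from collections import ChainMap

user_env = {"size": 42}
user_code = ["emit", "size"]           # pure, unbound words

def run_dialect(code, env):
    keywords = {"emit": lambda v: f"<{v}>"}
    view = ChainMap(keywords, env)     # an overlay, not a copy or a bind
    op, arg = code
    return view[op](view[arg])

print(run_dialect(user_code, user_env))   # -> <42>
assert user_code == ["emit", "size"]      # the code itself was never changed
```

Since the overlay is constructed per evaluation, two dialects can run the same block with different keyword layers, with no "binding wars" over the shared source.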

4. The "Not Yet Created" Problem

Your point about binding to things that don't exist yet is the most profound for metaprogramming.

"If I am generating code for a remote server, or for a function that will be generated inside a loop, I cannot bind it at LOAD time because the target context hasn't been allocated in memory yet."

  • The Ren-C model treats code as template-ready. It’s like "late-binding" in OOP, but applied to the very atoms of the language (Words).
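The "not yet created" point can be sketched too (again in Python, as an analogy rather than real Ren-C behavior): a template of pure symbols can be authored before any target context exists, then evaluated against contexts allocated later, one per iteration.

```python
template = ["add", "x", "x"]     # unbound; no context exists yet

def eval_block(block, env):
    op, a, b = block
    return {"add": lambda p, q: p + q}[op](env[a], env[b])

# Contexts come into existence only at run time, one per iteration:
results = [eval_block(template, {"x": i}) for i in range(3)]
print(results)  # -> [0, 2, 4]
```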

Summary of the Paradigm Shift

| Feature | Historical Rebol (R2/Red) | Ren-C (Pure & Virtual) |
| --- | --- | --- |
| Binding Time | Deeply at LOAD/TRANSCODE | Synthesized during EVAL |
| Data Nature | Mutable (Words are changed) | Pure (Words are constant) |
| Recursion | Shallow dynamic hacks | True Virtual Environments |
| Metaprogramming | Requires constant bind calls | Unbound by default; "just works" |

Your Reboot: Making code unbound by default is the "clean slate" the language always needed.



I've pointed out in my examples that indeed, everything starts unbound... with "trickling down" context giving the meaning:

>> doubler: func [x] [
      let code: copy [add x]
      append code to word! "x"
      print ["Doubled:" eval code]
      return none
  ]

>> doubler 10
Doubled: 20

That function body is fully unbound (and never becomes bound!). Since nothing tricky is going on with code composition, it works like any lexically scoped code would.

But then...

I show off this "interesting idea" in "Ren-C Binding in a Nutshell", where compositional code can take advantage of this.

>> x: 1000

>> doubler: func [x] compose [
      let code: copy ($[add x])
      append code to word! "x"
      print ["Doubled:" eval code]
      return none
  ]

>> doubler 10
Doubled: 2000

Is extending this to symbols themselves "going too far", as Joe suggested? I don't think it is. You could always put a symbol in a list like (x) instead of x. But that costs more... and what if you need it to be a WORD! for the dialect to recognize it as the right "part of speech" for the functionality you want?

Hence you can (still) go down to the level of a single symbol:

>> x: 1000

>> add1000: func [x] compose:deep [
      let code: copy [add ($x)]
      append code to word! "x"
      print ["Added a Thousand:" eval code]
      return none
  ]

>> add1000 20
Added a Thousand: 1020

While this aspect of what Carl wanted is "unusual", it's important to have this granularity of control if you want to have a symbolic structure for the code that is hybridizing meanings from multiple sources.

(It just needed "lexical environments" as the baseline, with this finer-grained tweaking as a composition tool for when you need it.)
