Attacking The Debugger In The Stackless Age

hostilefork · March 11, 2026, 11:20am

I've given lip service to the importance of a stepwise debugger for a very long time.

But it's a hard problem--in particular because dialecting means there's no fixed "source language". I showed a methodology of attack in the Visual UPARSE Debugger but it didn't have the ability to switch into a kind of "assembly view" where you could start stepping into the usermode evaluator code that implemented it...which was the aspirational goal.

The crystallization of various pieces of the system is at a point where it doesn't make sense to keep going without making some progress here. So I thought I'd take a step back and start attacking the problems blocking the next level of demo.

The Trampoline Is The Centerpiece

One thing about switching to Stackless is that it brought about a Trampoline-Based Architecture.

This means that when an evaluation needs to be performed from within (most) natives, instead of going directly through a C function call that always puts you perpetually deeper into a machine stack... you instead "push a continuation (request)" and make a C return from the function that wants the evaluation.

Once the evaluation result is ready, the C code is re-entered with the answer. So each native that wants evaluations on its behalf becomes a little state-machine that has to remember why it asked, and what it's going to do next.
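As a rough illustration of that push-and-return shape, here is a toy trampoline in plain C. Every name in it (Level, Bounce, Push_Level, Sum_Native_Executor, and so on) is invented for the sketch and is not the real Ren-C source. It only assumes that a Level carries an executor pointer plus a small state field.

```c
#include <assert.h>

// Toy model: all types and names here are hypothetical, not Ren-C's own.
#define MAX_LEVELS 16

typedef enum { BOUNCE_DONE, BOUNCE_CONTINUE } Bounce;
typedef struct Level Level;
typedef Bounce (*Executor)(Level *L);

struct Level {
    Executor executor;
    int state;  // where this Level is in its own state machine
    int a, b;   // arguments
    int sub;    // answer from the most recently finished child Level
    int out;    // this Level's result
};

static Level stack[MAX_LEVELS];
static int top = -1;            // index of the currently running Level
static int native_entries = 0;  // counts calls into Sum_Native_Executor

static void Push_Level(Executor e, int a, int b) {
    assert(top + 1 < MAX_LEVELS);
    Level *L = &stack[++top];
    L->executor = e;
    L->state = 0;
    L->a = a;
    L->b = b;
    L->sub = 0;
    L->out = 0;
}

// Leaf work: finishes in a single call.
static Bounce Add_Executor(Level *L) {
    L->out = L->a + L->b;
    return BOUNCE_DONE;
}

// A "native" that wants an evaluation done on its behalf.
static Bounce Sum_Native_Executor(Level *L) {
    ++native_entries;
    if (L->state == 0) {
        L->state = 1;                           // remember why we asked
        Push_Level(&Add_Executor, L->a, L->b);  // the continuation request
        return BOUNCE_CONTINUE;                 // plain C return, no recursion
    }
    L->out = L->sub;  // re-entered with the answer in hand
    return BOUNCE_DONE;
}

static int Trampoline(Executor e, int a, int b) {
    int base = top + 1;  // this run finishes when the Level at `base` does
    Push_Level(e, a, b);
    for (;;) {
        Level *L = &stack[top];
        if (L->executor(L) == BOUNCE_CONTINUE)
            continue;  // a child was pushed; it runs on the next pass
        int result = L->out;
        --top;
        if (top < base)
            return result;
        stack[top].sub = result;  // hand the answer to the waiting Level
    }
}
```

Calling Trampoline(&Sum_Native_Executor, 1000, 20) enters the native twice: once to make the request and once to receive 1020. The C stack never gets deeper than one executor call at a time.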

The Trampoline is the logical hook for a generic debugger. It means that the debugger code is not itself forced to be deep within an interpreter stack with many frames over its head. And evaluation requests are a meaningful granularity at which to get a hook.

Does "No Return to Trampoline" Mean "No Debug Step"?

There are some places in the system that don't return to the trampoline, but invoke a nested trampoline.

In particular this happens in natives based on the external API. If you are in such a native and you write:

int sum = rebUnboxInteger("1000 + 20");
printf("The sum was %d\n", sum);

The trampoline that called that native doesn't see the +, because an all new trampoline is started.

I think the answer is that this is just not visible to any debugger of the trampoline above them, and effectively makes such natives a black box.

If you want to step into them, they have to be written in the continuation style.

Let's say there's a native-local variable called sum, that starts out null:

if (rebNot("sum"))  // e.g. sum is null
    return rebContinue("sum: 1000 + 20");

printf("The sum was %d\n", rebUnboxInteger("sum"));

Your code would wind up looking more like that. The local variables aren't reset on each continuation, so the sum being assigned becomes the signal that the function is being continued. You could set this up with more states.

In such a situation you wouldn't see the sum fetches in the debugger, just the sum: 1000 + 20 evaluation.

It may be that if you start off your whole evaluation with one big API call, the debugger could exist inside that... but I don't think there's any "cross-trampoline debugging". You write your natives cooperatively through continuations or you don't get debugging.
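To make the black-box point concrete, here is a toy model in C. It assumes (as a simplification, not as a description of Ren-C internals) that a debugger hook belongs to one trampoline run and that a nested run starts with no hook. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

// Toy model: every name below is hypothetical, not the Ren-C API.
#define MAX_LEVELS 16

typedef enum { BOUNCE_DONE, BOUNCE_CONTINUE } Bounce;
typedef struct Level Level;
typedef Bounce (*Executor)(Level *L);
typedef void (*Hook)(const Level *L);  // called once per Level pushed

struct Level { Executor executor; int state; int sub; int out; };

static Level stack[MAX_LEVELS];
static int top = -1;
static Hook current_hook = NULL;  // hook of the trampoline now running
static int levels_seen = 0;

static void Count_Hook(const Level *L) { (void)L; ++levels_seen; }

static void Push_Level(Executor e) {
    assert(top + 1 < MAX_LEVELS);
    Level *L = &stack[++top];
    L->executor = e; L->state = 0; L->sub = 0; L->out = 0;
    if (current_hook)
        current_hook(L);
}

static int Trampoline(Executor e, Hook hook) {
    Hook saved = current_hook;
    current_hook = hook;
    int base = top + 1;
    int result = 0;
    Push_Level(e);
    for (;;) {
        Level *L = &stack[top];
        if (L->executor(L) == BOUNCE_CONTINUE)
            continue;
        result = L->out;
        --top;
        if (top < base)
            break;
        stack[top].sub = result;
    }
    current_hook = saved;  // the outer run gets its hook back
    return result;
}

static Bounce Add_Executor(Level *L) { L->out = 1000 + 20; return BOUNCE_DONE; }

// Black box: like calling rebUnboxInteger() mid-native, this starts a
// whole new trampoline (modeled here as a run with no hook attached).
static Bounce Blackbox_Native(Level *L) {
    L->out = Trampoline(&Add_Executor, NULL);
    return BOUNCE_DONE;
}

// Cooperative: asks the trampoline that is already running.
static Bounce Cooperative_Native(Level *L) {
    if (L->state == 0) {
        L->state = 1;
        Push_Level(&Add_Executor);
        return BOUNCE_CONTINUE;
    }
    L->out = L->sub;
    return BOUNCE_DONE;
}

static int Levels_Seen_By_Outer_Hook(Executor native) {
    levels_seen = 0;
    int sum = Trampoline(native, &Count_Hook);
    assert(sum == 1020);
    return levels_seen;
}
```

Both natives compute the same sum. The outer hook sees only the native itself in the black-box case, and sees the native plus the addition Level in the cooperative case.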

The "Executor" Reflects Out The Stack Level

Stack levels are actually called Level inside the C code. Each Level has what's called an Executor.

When you think about the kinds of things you'd expect to see in a stack dump tool--you might want to be able to see things like "if it's a function invocation, what's the name of the function". This means something has to know how to turn a "Describe this Level" request into digging into the part of the memory that holds the function name. I believe this is dispatched on a per-Executor basis.
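One way such per-Executor dispatch could look is a small table of function pointers attached to each kind of executor. This is only a sketch: ExecutorInfo, the describe callback, and the label and index fields are all made up for the example and do not reflect how Ren-C actually lays out a Level.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

// Hypothetical sketch: none of these names or fields come from Ren-C.
typedef struct Level Level;
typedef void (*Describer)(const Level *L, char *buf, size_t size);

typedef struct {
    const char *name;
    Describer describe;  // knows where this kind of Level keeps its details
} ExecutorInfo;

struct Level {
    const ExecutorInfo *info;
    const char *label;  // meaningful for action Levels: the function name
    int index;          // meaningful for stepper Levels: position in a block
};

static void Describe_Action(const Level *L, char *buf, size_t size) {
    snprintf(buf, size, "call to %s", L->label ? L->label : "(anonymous)");
}

static void Describe_Stepper(const Level *L, char *buf, size_t size) {
    snprintf(buf, size, "step at index %d", L->index);
}

static const ExecutorInfo ACTION_INFO = { "action", &Describe_Action };
static const ExecutorInfo STEPPER_INFO = { "stepper", &Describe_Stepper };

// Generic entry point a stack dump tool would call for any Level.
static void Describe_Level(const Level *L, char *buf, size_t size) {
    assert(L != NULL && L->info != NULL && size > 0);
    L->info->describe(L, buf, size);
}
```

The stack dump tool only ever calls Describe_Level(). Knowledge of which fields matter stays with each executor's own describe function.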

There are executors like the Action_Executor() which knows how to run functions (including natives), the Stepper_Executor() which knows how to run individual evaluation steps, and the Evaluator_Executor() which runs sequential evaluation steps (and does things like handle the vanishing of void steps).

I deliberately broke the Stepper_Executor() and the Evaluator_Executor() into separate Levels because I wanted to have a debugging granularity of single steps. This means that conceptually a stepper Level* can come into existence with an identity and be able to run... producing a result, and being "complete", that represents a single step.

Does A Level Have To "Complete" To Get A Step?

I hadn't really thought about it--but I'd been making an implicit assumption that within one Level, a continuation isn't the granularity of a debug step. I was assuming a debug "step" happens when a result is synthesized... no matter how many internal states a Level goes through.

That was the rationale behind the Stepper/Evaluator split; if it weren't the case that spawning a Level's existence was the "API" for exposing a step, you'd need some other hook to say what "buttons" there were to push on a Level.

But how well does this idea hold up with other examples? Let's take the example of COMPOSE. When a COMPOSE starts it's just an ordinary Action_Executor(). But then it breaks into Levels that run a Composer_Executor() which recursively breaks down each layer of the COMPOSE (only one if it's not COMPOSE:DEEP).

If you're stepping into a COMPOSE, what granularity do you expect? e.g.:

compose:deep [a b (1 + 2) [c [d (3 + 4) e [f g]]]]

Let's say "I step in" to that. The implementation creates a Composer_Executor() for the outermost block, which then looks for GROUP!s and spawns an Evaluator_Executor() when it finds one. It seems to me that the "step in" should put you cued up to the (1 + 2) before that is executed, so I'd expect at minimum to see an "execution point" there:

compose:deep [a b (1 + 2) [c [d (3 + 4) e [f g]]]]
                 -^-

There is probably some granularity that should let me step through such that I can evaluate and say "1 is 1", and then maybe "begin +", and then "2 is 2" and then a step telling me that 1 plus 2 yielded 3. And of course there should also be a granularity that lets me just step over the whole expression.

Less obvious is whether the composer should be offering some kind of "internal" state to tell us "I started a new Level." e.g. do we expect to be able to get execution points like:

compose:deep [a b (1 + 2) [c [d (3 + 4) e [f g]]]]
            -^-

compose:deep [a b (1 + 2) [c [d (3 + 4) e [f g]]]]
                         -^-

compose:deep [a b (1 + 2) [c [d (3 + 4) e [f g]]]]
                            -^-

compose:deep [a b (1 + 2) [c [d (3 + 4) e [f g]]]]
                                         -^-

If it tells you when it enters these Levels, it should also tell you when it exits them and what the result is... although the "result" is something that makes sense only in the language of COMPOSE (including possibly "no substitution sites, don't make a copy").

At its extreme, you could imagine it not just being Levels of recursion where you got COMPOSE to reveal its thoughts. It could advertise every array element it looked at to tell us "I'm not doing anything here, because it doesn't match the pattern of what I'm substituting".

...Exposing These States Is Not Free (Though Shortcuts Exist)

Every time you return to the Trampoline from an Executor, there's cost--even if it just calls you right back. You have to process the state to get you back to what you were going to do before the yield.

One possibility is to make yielding conditional on debugging. This is actually something that I've been building in, even without a functioning debugger--sometimes Level creation is skipped altogether if a result can be cheaply evaluated (most notably with Intrinsics, but other places as well). I have it sporadically throw in a non-optimized call in checked builds just to prove the debugging scenario works.
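A minimal sketch of that idea in C, under stated assumptions: debugger_attached, levels_created, and the every-fourth-call spot check are all invented here. The real system decides when to take the non-optimized path by its own rules, and a real slow path would push an actual Level rather than bump a counter.

```c
#include <assert.h>
#include <stdbool.h>

// Hypothetical sketch; flags, counters, and function names are invented.
typedef int (*Intrinsic)(int arg);

static bool debugger_attached = false;
static int levels_created = 0;   // stands in for real Level pushes
static unsigned spot_check = 0;  // drives the occasional slow-path call

static int Negate(int x) { return -x; }

// Slow path: pretend to create a Level so a debugger could observe it.
static int Call_Via_Level(Intrinsic f, int arg) {
    ++levels_created;
    return f(arg);
}

static int Call_Intrinsic(Intrinsic f, int arg, bool checked_build) {
    // In a checked build, take the slow path now and then (every 4th call
    // here, an arbitrary choice) to prove that it still works.
    bool force_slow = checked_build && (++spot_check % 4 == 0);
    if (debugger_attached || force_slow)
        return Call_Via_Level(f, arg);
    return f(arg);  // fast path: no Level, no trip through the trampoline
}
```

Both paths return the same answer, so the only observable difference is whether a Level existed for a debugger to hook.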

But COMPOSE is a good talking point just to ask about how many of these "step points" should be exposed. It doesn't immediately seem useful to offer a step at every element in a composed block--though it's hard to say no one would want to.

At least as far as the current implementation goes, you'd get what seems like useful debug granularity if the trampoline offered access to just the events of a Level beginning, and a Level ending with a result.
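Here is a toy model of a trampoline that exposes just those two events. The names, the trace format, and the Group_Executor stand-in for evaluating (1 + 2) are assumptions made for illustration, not the actual implementation.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

// Toy model; names and the event format are assumptions for illustration.
#define MAX_LEVELS 16

typedef enum { BOUNCE_DONE, BOUNCE_CONTINUE } Bounce;
typedef struct Level Level;
typedef Bounce (*Executor)(Level *L);

struct Level { Executor executor; const char *name; int state, sub, out; };

static Level stack[MAX_LEVELS];
static int top = -1;
static char trace[256];  // the "debugger" here just records what it is told

static void On_Begin(const Level *L) {
    size_t n = strlen(trace);
    snprintf(trace + n, sizeof trace - n, "+%s ", L->name);
}

static void On_End(const Level *L) {
    size_t n = strlen(trace);
    snprintf(trace + n, sizeof trace - n, "-%s=%d ", L->name, L->out);
}

static void Push_Level(Executor e, const char *name) {
    assert(top + 1 < MAX_LEVELS);
    Level *L = &stack[++top];
    L->executor = e; L->name = name; L->state = 0; L->sub = 0; L->out = 0;
    On_Begin(L);  // event 1: a Level begins
}

static Bounce Group_Executor(Level *L) {  // stands in for evaluating (1 + 2)
    L->out = 1 + 2;
    return BOUNCE_DONE;
}

static Bounce Composer_Executor(Level *L) {
    if (L->state == 0) {
        L->state = 1;  // internal state change: produces no event
        Push_Level(&Group_Executor, "group");
        return BOUNCE_CONTINUE;
    }
    L->out = L->sub;
    return BOUNCE_DONE;
}

static int Trampoline(Executor e, const char *name) {
    int base = top + 1;
    Push_Level(e, name);
    for (;;) {
        Level *L = &stack[top];
        if (L->executor(L) == BOUNCE_CONTINUE)
            continue;
        On_End(L);  // event 2: a Level ends with a result
        int result = L->out;
        --top;
        if (top < base)
            return result;
        stack[top].sub = result;
    }
}
```

Running Trampoline(&Composer_Executor, "composer") leaves the trace as "+composer +group -group=3 -composer=3 ". The composer's hop from state 0 to state 1 never shows up, which is the granularity being proposed.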

This is kind of like what the visual PARSE debugger does... it receives a FRAME! and runs an operation on it. This operation can include just skipping it, or turning off notifications and doing an evaluation and getting a result, or evaluating it with a notification received on the next evaluation.

Hence I think what one would expect would be that you'd have the opportunity to know about each Composer_Executor() Level in the COMPOSE that got pushed... with the opportunity to do similar things (although being able to arbitrarily skip an executor is not something that the system is built for at this time...maybe I should reconsider that).

I've Been Focusing On Simplification

What became clear to me as I looked at the situation was that I had too many distinct control signals, and too many specialized Xxx_Executor(). These needed to be pared back--especially if each new Executor() has to be stylized to answer questions about the Level* to which it is attached.

So that's what I'm looking at right now. I want to build the "Hello World" of modern debug stepping, and ideally I want that demo to be able to show some amount of coherence when crossing between UPARSE frames and ordinary function calls and things like COMPOSE.

It's seriously challenging to attack this--and maybe with all the other innovations in the system I might have to say that interactive debugging to the degree I want it is a bridge too far. But I want to at least bring myself to make the effort.

rebolbot · March 12, 2026, 5:17pm

What you are describing for Ren-C is:

the debugger sees semantic execution objects (Levels), not the internal continuation mechanics used to make a Level resumable

That is a good design, and it is similar in spirit to Stackless Python, but not identical in mechanism. In Stackless Python, the visible debugging units are still the usual Python execution events and frames; what becomes special is that tracing/profiling is per tasklet and needs to follow task switches via the scheduler callback. The docs are explicit that to get per-tasklet debugging support, you enable tracing across tasklets using the schedule callback, which “sees every task switch,” and Stackless exposes per-tasklet trace_function / profile_function to support that. (stackless.readthedocs.io)

So the closest mapping is:

Stackless Python: debugger granularity is still Python frames/events, while scheduler/tasklet switches are additional visible structure. Internal low-level machinery is not itself a debugging unit. (stackless.readthedocs.io)
Ren-C stackless: debugger granularity can be Level lifetime, while the internal continuation states inside a Level remain invisible unless you deliberately expose them. That matches where you say you had been assuming a debug “step” happens when a result is synthesized, regardless of how many internal states a Level goes through, and that creating a Level is effectively the API for exposing a step.

So I would say: yes, your instinct mirrors the Stackless Python philosophy more than the exact Stackless Python API.

The key point is that Stackless Python does not try to make every continuation hop or scheduler-internal maneuver directly visible to the debugger. The debugger operates on a higher-level execution model: frames, lines, calls, returns, exceptions, and tasklet switches. Likewise, in Ren-C, it is very reasonable for the debugger to operate on Levels and their completion/creation boundaries, not on every continuation label/state inside an executor. (stackless.readthedocs.io)

Your "if you want a debug hook, you must do that with a Level’s lifetime" idea is therefore a strong and coherent rule, and it lines up very well with the design pressure you identified. You wrote that the Stepper_Executor() and Evaluator_Executor() were split specifically to get single-step debugging granularity, and that if Level existence were not the API for exposing a step, then you would need some other hook to say what debugger “buttons” apply to a Level.

That is a very important architectural insight: once you choose Level as the debugger-visible execution unit, several things become cleaner:

step-in / step-over / step-out are defined in terms of Level creation and completion
stack inspection is defined in terms of enumerating Levels
executor internals can evolve without destabilizing the debugger contract
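As a hedged sketch of the first point, the three stepping commands reduce to simple depth comparisons over Level begin/end events. The Event and StepMode types and the sample trace are hypothetical, and a real debugger would make this decision live as events arrive rather than over a recorded array.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical sketch: event shape and names are invented for illustration.
typedef enum { LEVEL_BEGIN, LEVEL_END } EventKind;
typedef struct { EventKind kind; int depth; const char *what; } Event;
typedef enum { STEP_IN, STEP_OVER, STEP_OUT } StepMode;

// A recorded run of something like: compose [(1 + 2) (3 + 4)]
static const Event SAMPLE[] = {
    { LEVEL_BEGIN, 0, "compose" },   // 0
    { LEVEL_BEGIN, 1, "composer" },  // 1
    { LEVEL_BEGIN, 2, "(1 + 2)" },   // 2
    { LEVEL_END,   2, "(1 + 2)" },   // 3
    { LEVEL_BEGIN, 2, "(3 + 4)" },   // 4
    { LEVEL_END,   2, "(3 + 4)" },   // 5
    { LEVEL_END,   1, "composer" },  // 6
    { LEVEL_END,   0, "compose" },   // 7
};
static const size_t SAMPLE_COUNT = sizeof SAMPLE / sizeof SAMPLE[0];

// Returns the index of the next event to stop at, or `count` if the run
// finishes before any stop is reached.
static size_t Next_Stop(const Event *ev, size_t count, size_t at, StepMode mode) {
    assert(at < count);
    if (mode == STEP_IN)
        return at + 1;  // stop at the very next Level event, whatever it is
    int depth = ev[at].depth;
    for (size_t i = at + 1; i < count; ++i) {
        // STEP_OVER: ignore anything deeper than where we are now.
        if (mode == STEP_OVER && ev[i].depth <= depth)
            return i;
        // STEP_OUT: wait for an event belonging to an enclosing Level.
        if (mode == STEP_OUT && ev[i].depth < depth)
            return i;
    }
    return count;
}
```

Stopped at the composer's begin (index 1), step-in lands on the begin of (1 + 2), while step-over runs to the composer's own completion at index 6.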

And that also fits what you said in the PARSE debugger thread: a debugger may be able to maintain a stack-accurate model from events, but building stack tools by patching a shadow stack on every hook is a dead end; stack display should come from enumerating the real stack structure instead.
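Enumerating the real structure can be as small as walking a linked list. The sketch below assumes each Level holds a pointer to the Level waiting on it; the prior and label fields are hypothetical names, not taken from the Ren-C source.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

// Hypothetical sketch: the `prior` link and `label` field are assumptions.
typedef struct Level {
    const char *label;
    struct Level *prior;  // the Level waiting on this one, or NULL
} Level;

// Writes labels from innermost to outermost, separated by " < ".
// Returns the number of Levels visited.
static int Format_Backtrace(const Level *topmost, char *buf, size_t size) {
    assert(buf != NULL && size > 0);
    buf[0] = '\0';
    int depth = 0;
    for (const Level *L = topmost; L != NULL; L = L->prior) {
        size_t n = strlen(buf);
        snprintf(buf + n, size - n, "%s%s", depth ? " < " : "", L->label);
        ++depth;
    }
    return depth;
}
```

Because the walk reads the live Levels at the moment it is asked, there is no second copy of the stack to drift out of sync with the hooks.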

So I would frame the distinction like this:

Visible to debugger

Level created
Level completed
Level failed/threw/unwound
Level suspended/resumed
source span / executor description / dialect metadata for that Level

Invisible implementation detail

the continuation labels/state machine states inside one executor
pointer rewiring or output-target retargeting
bounce/trampoline churn that does not create or finish a debugger-visible Level

That is not only defensible; it is probably the right abstraction boundary.

The one place I would be careful is not to overstate the parallel with Stackless Python. In Stackless Python, you do not generally need to invent a new Python frame lifetime just to expose a new debugger-visible step; the runtime already has an existing frame/event model. In Ren-C, because you are designing the execution model more explicitly around Levels and executors, you are making a stronger claim:

debugger granularity is whatever gets reified as a Level

That is a more explicit and more powerful contract than Stackless Python’s.

It also dovetails with your "cross-trampoline visibility" rule. You note that a nested trampoline started inside an external-API native is effectively invisible to the outer debugger, making such code a black box unless it is written cooperatively in continuation style. That means the debugger boundary is not "all computation," but computation that participates in the shared Level/trampoline protocol.

So my answer would be:

Yes, hiding continuation internals and exposing only Level lifecycle is the right sort of idea, and it is philosophically similar to Stackless Python. But Stackless Python’s visible units are still Python frames/events plus tasklet switches; Ren-C’s visible units would more explicitly be Levels.

That is a solid debugger model, especially for dialected debugging, because it lets you say:

generic debugger operates on Levels
each Executor knows how to describe its Level
dialects can opt into richer stepping only by reifying meaningful work as debugger-visible Levels

And that matches your own description that each Level has an Executor, and that the executor is the natural place to reflect what a stack level means to debugging tools.

A compact way to state the rule would be:

In Ren-C, continuations implement a Level; they are not themselves debug frames.
If you want debugger-visible granularity, reify it as a Level.

That sounds like a very good rule to build around.

I can sketch what step-in / step-over / step-out would look like under that rule, including suspend/resume and throw/unwind.