Pure and Refined: Simplifying Refinements to One or Zero Args

I'm always frustrated trying to name refinement arguments:

rebol2>> negate-maybe-add: func [value [integer!] /add add-arg [integer!]] [
             if add [return (negate value) + add-arg]
             return negate value
         ]   

If the function takes an /ADD, why can't the variable be called ADD, and just be null when the refinement is not used? What's this other ADD-ARG name for? Isn't that what NULL is for in the first place? To indicate the absence of a value?

Functions with refinements have historically been pretty confusing, and having a refinement that takes more than one argument is extremely rare. If you really need multiple arguments to a refinement for some reason, there are blocks and paths and such.

Having the refinement argument's value itself be a refinement has been an interesting experiment:

 >> foo: func [/bar] [probe bar]

 >> foo/bar
 /bar

And it has been useful for refinements that don't take any arguments, because with no argument there's nothing else that could carry the "present or not" status.

But it's not like it would be that hard to write something like this:

>> used: func ['refinement [path!]] [
    if not null? get refinement [
        refinement
    ]  
]

>> foo: func [/a /b] [print [used /a used /b]]

>> foo/a
/a

>> foo/a/b
/a /b

(I think @IngoHohmann made something of the sort a while ago.) In any case, my point is that I think we can live without a separate "status of the refinement" and value.

How it would look in practice

Imagine this function interpreted under the new understandings:

foo: function [
    arg1 [block!]
    /ref1
    arg2 [string!]
    /ref2 [integer!]
][...]

What this would actually be saying is that you have a /ref1 refinement whose only value is its use or disuse. This would be like any refinement without an argument today. It would be blank if not used, and for good measure we could make it hold /ref1 as its value if used (seems better than making something else up, and it actually has applications for 0-arg refinements).

But then, arg2 is just another normal argument that comes after it. And ref2 is a refinement with an integer! argument--but that integer argument would arrive in the ref2 variable itself, or it would be a blank.

So what this function actually is doing would be like the following in today's world:

foo: function [
    arg1 [block!]
    arg2 [string!]
    /ref1
    /ref2
    ref2arg [integer!]
][
    ref2: ref2arg
    unset 'ref2arg
    ...
]

Already you can see that it wouldn't be that different from today. And Ren-C already has tricks up its sleeve for doing legacy emulations...the old behavior of getting multiple arguments would be emulated one way or another, without doing too much extra work. (The simplest emulation would allow the same notation for single-argument refinements, and error with more than one argument--and that is likely sufficient.)

It would save space and speed the system up

Right now when you have a refinement with an argument, that's two frame cells to fulfill. Collapsing it to one is obviously more efficient.

But saving on storage is only part of it. There's a lot of evaluator complexity trying to keep the state and worrying about there being more than one argument...looping, checking. A ton of complexity just vanishes with this.
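
To make the slot-count point concrete, here is a small sketch in Python (not actual Ren-C internals--the names and data structures are invented for illustration). It contrasts the old model, where a refinement occupies one cell for its on/off state plus one cell per argument, with the proposed model of one cell per parameter:

```python
# Hypothetical sketch (Python, not Ren-C internals): compare frame-cell
# bookkeeping for a refinement under the old and new models.

def old_frame(spec):
    # Old model: a refinement uses one cell for its on/off flag, plus one
    # cell per argument it takes -- the evaluator must track both.
    slots = []
    for param in spec:
        if param["refinement"]:
            slots.append(param["name"])           # the on/off flag cell
            slots.extend(param.get("args", []))   # plus a cell per ref arg
        else:
            slots.append(param["name"])           # one cell, a normal arg
    return slots

def new_frame(spec):
    # New model: one cell per parameter, refinement or not.  The cell is
    # null when the refinement is unused, or holds its single argument.
    return [param["name"] for param in spec]

spec = [
    {"name": "value", "refinement": False},
    {"name": "dup", "refinement": True, "args": ["dup-arg"]},
]

print(old_frame(spec))  # ['value', 'dup', 'dup-arg'] -- three cells
print(new_frame(spec))  # ['value', 'dup'] -- two cells
```

One cell saved per refinement-with-argument, and no "is this slot a flag or an argument" tracking during fulfillment.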

The "refinement revocation" methods of today are more complex than they need to be as well. You can get in dicey situations where you've revoked one argument and not another. Specialization has to cover cases where you set a the refinement to false but the value to true. The fact that you can always make a parameter a block if you really want it to carry multiple arguments seems to solve a lot of problems.

You could put normal arguments after refinement arguments

I show in the example above putting an ordinary argument after a refinement argument. That may not look all that useful to you. But maybe it would help in grouping related parameters together without worrying about whether they were optional or not...kind of letting you express things in the flow of your thought.

But there's a really compelling reason to do this mechanically for deriving functions that add new arguments:

Because of the way frames work positionally, you can't derive one function from another in a way that reorders its arguments. This means that if you try to derive from a function that has two normal arguments and one refinement, you can't add a new normal argument today, because it's implied that everything after that point is a refinement. Once you've entered the "refinement zone", it's a point of no return.

This would correct that weirdness and permit extending functions with more parameters, either regular or refinement, without running the risk of a regular argument getting picked up as an argument to a refinement it wasn't intended for.

You could write your arguments in any order in APPLY

It helps make sense of the "The refinement names the argument you are about to give" situation. But why not let you put refinements anywhere in an APPLY?

>> block: [1 2 3]
>> apply :append [/dup 2 /value <x> block]
== [1 2 3 <x> <x>]

The current refinement processing mechanics would be much easier to rationalize and simplify under this model and likely make such reimaginations possible--as well as other forms of lightweight skinning that let you reorder function arguments on a whim.
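
As a rough illustration of why this gets easy, here's a Python sketch (invented names, not the actual APPLY dialect implementation) of filling a frame where refinement-style /name tokens label a slot directly and bare values fill the next unfilled normal slot:

```python
# Hypothetical sketch (Python): with "refinement = its argument", an
# APPLY-style call can fill frame slots by name, in any order.

def apply_call(params, items):
    # params: parameter names in frame order.
    # items: a mix of "/name" tokens (each followed by its value) and bare
    # positional values.  (A bare str starting with "/" would be ambiguous
    # here -- fine for a sketch.)
    frame = {p: None for p in params}
    i = 0
    while i < len(items):
        item = items[i]
        if isinstance(item, str) and item.startswith("/"):
            frame[item[1:]] = items[i + 1]   # named slot gets next value
            i += 2
        else:
            # a bare value fills the next unfilled parameter in frame order
            name = next(p for p in params if frame[p] is None)
            frame[name] = item
            i += 1
    return frame

params = ["series", "value", "dup"]
f = apply_call(params, ["/dup", 2, "/value", "<x>", [1, 2, 3]])
print(f)  # {'series': [1, 2, 3], 'value': '<x>', 'dup': 2}
```

Because every parameter is exactly one slot, "name labels slot" is all the bookkeeping required--no special casing for whether a refinement's arguments have been gathered yet.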

I haven't tried writing it yet, but...

When I think of all the various parts of the system that get bent out of shape over edge cases, I have to say I think this sounds like it may well be a winner.

For a while we could disable the ability to put normal parameters after a refinement, and just raise an error if you do that. So you'd know to convert /foo bar [integer!] to just /foo [integer!]. In the future, though, cases like bar would start working as normal parameters.

The only casualties I can think of are using blanks as refinement arguments, and being able to do partial refinement specialization inside of an APPLIQUE. So you couldn't specialize like this:

  specialize :append [part: true ...]

That would assume you wanted PART to be the value true. For a partial specialization (e.g. one that says you get the behavior as if you'd written APPEND/PART at the callsite, getting a refinement as a normal arg) you'd have to say:

 specialize :append/part [...]

I can think of some other mechanical complications, but nothing overwhelming off the top of my head.


My gut reaction: I don't like it. It's an important part of Rebol, though I hate having to come up with a name for a variable for a refinement.

Now I've checked the core and only found 2 functions using it. For whatever that's worth.

Things to consider when using blocks:

  • documentation (how many values, which types)
  • type checking
  • what's the impact of creating unnecessary blocks?

Why worry about the impacts of something that never happens? Doesn't seem too relevant when you've "checked the core and only found 2 functions using it" and "hate to have to come up with a name for a variable for a refinement"...

Take my word for it from writing the evaluator and things like SPECIALIZE. If you want to talk about impacts on the system from a performance and memory standpoint, this is a huge benefit. The cost of the blocks you'll never make pales in comparison. Shortens function specs, saves space, simplifies broadly.

I'll see if while implementing it I find any gotchas, but I am anticipating this wiping out huge amounts of complexity, and losing nothing of real value...while opening the door to important features like the ability to extend functions with normal arguments. So if you had a function that took two normal arguments and a refinement, you can make an adaptation that takes another normal argument and it won't be picked up as an argument to the refinement at the end of the existing spec.

I am confused. Refinement parameters always come after "normal" parameters. If a function specifies two arguments and two refinements (with or without refinement-arguments), the only reordering power you get at the call site is that the refinements can be expressed in either order. The two normal arguments must come first, in the order their parameters appear in the spec block, before any refinement parameters. Currently, if you wanted to add a new argument, you'd put it in the spec block before the first refinement--which is exactly as hard as inserting it before the <local>, which you'd have to do for your idea anyway.

In order to reuse an implementation, the frame you build must have the parameters in the integral order that underlying implementation expected by the time it gets to the point of running its code.

 foo: function [x <local> y z] [...]  ; spec transformed to [x:(normal) y:(local) z:(local)]

 foo: function [x y] [...]  ; spec transformed to [x:(normal) y:(normal)]

So <local> is collapsed to the idea of a property on the parameter itself, not "everything after some point". There's no "gear to shift"--the visitation of the local is the only moment in which it's in local mode, then it's on to the next one. So a derived function can add normal parameters after that without being confused.

Refinements, on the other hand, do effectively "shift gears" of the frame fulfillment. I'm talking about not having this gear-shift. The elimination of the shift is a very good and simplifying thing, saving memory cells, accelerating the system and removing a bunch of code. And nothing of significant value is lost.
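
A Python sketch of the gear-shift-free fulfillment loop (illustrative only--the names and structures here are invented, not Ren-C's actual evaluator): each parameter carries its own "refinement?" property, so the loop makes one local decision per slot and normal parameters can follow refinements without confusion.

```python
# Hypothetical sketch (Python): fulfillment where "refinement" is a property
# of each parameter rather than a mode the frame-filler shifts into.

def fulfill(spec, positionals, used_refinements):
    # spec: list of (name, is_refinement) pairs in frame order.
    # positionals: values for the normal parameters, in order.
    # used_refinements: {name: value} for refinements used at the callsite.
    frame = {}
    pos = iter(positionals)
    for name, is_refinement in spec:
        if is_refinement:
            frame[name] = used_refinements.get(name)  # value, or None (null)
        else:
            frame[name] = next(pos)  # normal params fill in order, even
                                     # when they come after a refinement
    return frame

spec = [("arg1", False), ("ref1", True), ("arg2", False), ("ref2", True)]
print(fulfill(spec, ["a", "b"], {"ref2": 10}))
# {'arg1': 'a', 'ref1': None, 'arg2': 'b', 'ref2': 10}
```

There is no moment where the loop "enters refinement mode"; each slot answers for itself, which is exactly what lets a derived function append normal parameters after a refinement.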

We'll get type checking of patterns inside of BLOCK!s etc. one way or another. Right now I'm attacking the bitset-of-64-bits limit for types.

Though Ingo didn't mention it here, on the commit itself he said:

I have to take your word about this simplifying and speeding up the evaluator.
I can't currently comment on the code quality, but I trust you there.
I obviously haven't tried this, yet, but I seem to have come to like the idea, so go for it.

You can still name the refinement variables if you like. The time it makes the most sense to do so is when your refinement has an active name: due to the shortage of short non-noun names available in Rebol, you usually want to recover those from LIB. The previous model didn't do this...so you used up two names.

For instance, in these patches to @rgchris's curl.reb, the /AS refinement is changed to /AGENT. I don't know if the right decision would be to call the refinement /AGENT (probably is), but I wanted to show the issue of when you want to get AS the operator back how you have to do it today...which you would have had to do using such a refinement name. I think there should be something in the spec dialect which does this shuffle for you...which probably means capturing the meaning of AS however it was when the function was defined (e.g. COMPOSE-ing it into the body) vs. blindly getting it from lib.

That's all open to suggestions, but this change is in, and the Redbol layer emulates the old interface. Let me know of any questions/comments/concerns.

(Note: While I may make it "look easy" to do these kinds of changes to the system, they are -a lot- of work. But the impressive thing is the resilience of the asserts and debug mechanics that allow the changes. It's not the kind of thing R3-Alpha/Red would be prepared to do. Which indicates just how much more potential the Ren-C codebase has in it.)

I came across an old comment on Redo of an action. This concept of "redoing" an action originated from the PORT! code, where there'd be a generic definition of an interface function on the port...which would then have to be redispatched to a usermode function. So the arguments were gathered for the archetype, and then had to be moved into the right slots to line them up in the implementation...which could be at entirely different positions in the frame.

(The same code was later used in HIJACK when functions were hijacked with functions that were not derived from the same base, e.g. via ADAPT/SPECIALIZE/ENCLOSE. Thus they can have incompatible frames, so some guesswork needs to be done in a similar way.)

The comment I wrote says this, and take note of what I say about the difficulty of doing the mapping "in the face of targets that are 'adversarial' to the archetype:"

// This code takes a running call frame that has been built for one action
// and then tries to map its parameters to invoke another action.  The new
// action may have different orders and names of parameters.
//
// R3-Alpha had a rather brittle implementation, that had no error checking
// and repetition of logic in Eval_Core.  Ren-C more simply builds a PATH! of
// the target function and refinements, passing args with EVAL_FLAG_EVAL_ONLY.
//
// !!! This could be done more efficiently now by pushing the refinements to
// the stack and using an APPLY-like technique.
//
// !!! This still isn't perfect and needs reworking, as it won't stand up in
// the face of targets that are "adversarial" to the archetype:
//
//     foo: func [a /b c] [...]  =>  bar: func [/b d e] [...]
//                    foo/b 1 2  =>  bar/b 1 2

This shows another angle of how having refinements with arguments inhibits a simple mapping of one named argument to another.

Of course this raises the question of whether you should have to name your arguments the same thing. If the original function takes [event] should you have to name it the same, or could you say [e] if it was a plain argument and not a refinement?

Either way, it gets a lot easier to set a policy when refinements are the arguments.
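
In the one-slot-per-parameter world, the remapping the REDO comment struggles with collapses to a name lookup. A hypothetical Python sketch (invented names, just illustrating the policy):

```python
# Hypothetical sketch (Python): remapping a filled frame from one action to
# another.  With one cell per parameter (refinement included), matching by
# name is a simple lookup; target parameters with no match start out null.

def remap(source_frame, target_params):
    return {p: source_frame.get(p) for p in target_params}

foo_frame = {"a": 1, "b": 2}    # foo: func [a /b [integer!]], called foo/b 1 2
bar_params = ["b", "d", "e"]    # bar: func [/b [integer!] d e]
print(remap(foo_frame, bar_params))
# {'b': 2, 'd': None, 'e': None} -- /b carries over by name; d, e unfilled
```

No "adversarial" cases remain, because there's no separate flag cell whose position could disagree between source and target.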

I'm doing a bit of maintenance on the bootstrap executable, and I'm quite dizzy looking at the hoops SPECIALIZE tried to jump through to work with the original model.

We still face some questions about how to do reordering of parameters. But this stuff was crazy.

Here's some comments about the convoluted logic:

// A specialization is an ACTION! which has some of its parameters fixed.
// e.g. `ap10: specialize 'append [value: 5 + 5]` makes ap10 have all the same
// refinements available as APPEND, but otherwise just takes one series arg,
// as it will always be appending 10.
//
// The method used is to store a FRAME! in the specialization's Action Body.
// It contains non-null values for any arguments that have been specialized.
// Eval_Core_Throws() heeds these when walking parameters (see `L->special`),
// and processes slots with nulls in them normally.
//
// Code is shared between the SPECIALIZE native and specialization of a
// GET-PATH! via refinements, such as `adp: :append/dup/part`.  However,
// specifying a refinement without all its arguments is made complicated
// because ordering matters:
//
//     foo: func [/ref1 arg1 /ref2 arg2 /ref3 arg3] [...]
//
//     foo23: :foo/ref2/ref3
//     foo32: :foo/ref3/ref2
//
//     foo23 A B ;-- should give A to arg2 and B to arg3
//     foo32 A B ;-- should give B to arg2 and A to arg3
//
// Merely filling in the slots for the refinements specified with TRUE will
// not provide enough information for a call to be able to tell the difference
// between the intents.  Also, a call to `foo23/ref1 A B C` does not want to
// make arg1 A, because it should act like `foo/ref2/ref3/ref1 A B C`.
//
// The current trick for solving this efficiently involves exploiting the
// fact that refinements in exemplar frames are nominally only unspecialized
// (null), in use (LOGIC! true) or disabled (LOGIC! false).  So a REFINEMENT!
// is put in refinement slots that aren't fully specialized, to give a partial
// that should be pushed to the top of the list of refinements in use.
//
// Mechanically it's "simple", but may look a little counterintuitive.  These
// words are appearing in refinement slots that they don't have any real
// correspondence to.  It's just that they want to be able to pre-empt those
// refinements from fulfillment, while pushing to the in-use-refinements stack
// in reverse order given in the specialization.
//
// More concretely, the exemplar frame slots for `foo23: :foo/ref2/ref3` are:
//
// * REF1's slot would contain the REFINEMENT! ref3.  As Eval_Core_Throws()
//   traverses arguments it pushes ref3 as the current first-in-line to take
//   arguments at the callsite.  Yet REF1 has not been "specialized out", so
//   a call like `foo23/ref1` is legal...it's just that pushing ref3 from the
//   ref1 slot means ref1 defers gathering arguments at the callsite.
//
// * REF2's slot would contain the REFINEMENT! ref2.  This will push ref2 to
//   now be first in line in fulfillment.
//
// * REF3's slot would hold a null, having the typical appearance of not
//   being specialized.
//
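
For the curious, the trick the comment describes can be simulated in a few lines of Python (invented names, a loose model of the described mechanic rather than the real C code): partial refinements get stashed in other refinements' exemplar slots in reverse path order, pushed to a stack during traversal, and popped to decide who gathers callsite arguments first.

```python
# Hypothetical sketch (Python) of the old partial-specialization trick:
# partials live in unrelated refinement slots and a stack recovers the order.

def make_exemplar(refinements, partials):
    # Stash the partials (in reverse path order) into the earliest
    # not-fully-specialized refinement slots, as the comment describes.
    slots = {r: None for r in refinements}
    for slot, partial in zip(refinements, reversed(partials)):
        slots[slot] = partial
    return slots

def call(refinements, exemplar, args):
    stack = []
    for r in refinements:              # traverse slots in frame order...
        if exemplar[r] is not None:
            stack.append(exemplar[r])  # ...pushing any stashed partials
    frame = {}
    while stack:
        frame[stack.pop()] = args.pop(0)  # last pushed gathers args first
    return frame

refs = ["ref1", "ref2", "ref3"]
foo23 = make_exemplar(refs, ["ref2", "ref3"])  # foo23: :foo/ref2/ref3
foo32 = make_exemplar(refs, ["ref3", "ref2"])  # foo32: :foo/ref3/ref2
print(call(refs, foo23, ["A", "B"]))  # {'ref2': 'A', 'ref3': 'B'}
print(call(refs, foo32, ["A", "B"]))  # {'ref3': 'A', 'ref2': 'B'}
```

It does produce the right argument routing for foo23 vs. foo32...by smuggling an ordering data structure through slots it has nothing to do with, which is the part that wasn't worth keeping.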

If You Don't Understand That, Don't Worry About Trying

It's building a data structure inside the refinement slots that jumbles the order and has to be unpacked during traversal... it's not worth it.

Anyway, I was testing the bootstrap executable and hit a bug in this stuff and said "Oh, screw it, I'm ripping this all out."

Refinements just being their arguments... with ~null~ antiforms as the state of being not in use... is so much easier to implement, and ergonomic to use.

The Feature We Don't Have Today Is Refinement Promotion

What the old code could do (sort of) which we can't do today is transform a refinement into an ordinary argument.

apd: append:dup/
[a b c d e d e] = apd [a b c] [d e] 2

It's a valid desire, and things are in a better position to do it correctly and clearly. But the old implementation was no good, even if it showed some decent results.

So here are some tests that aren't going to work any more, ever, in the bootstrap executable:

foo: func [/A aa /B bb /C cc] [  ; Note: was before refinements were null
    return compose [
        (maybe any [A]) (maybe aa)  ; ANY makes blanks into nulls
        (maybe any [B]) (maybe bb) 
        (maybe any [C]) (maybe cc)
    ]
]

fooBC: :foo/B/C
fooCB: :foo/C/B
    
did all [ 
    [/B 10 /C 20] = fooBC 10 20
    [/A 30 /B 10 /C 20] = fooBC/A 10 20 30

    [/B 20 /C 10] = fooCB 10 20
    [/A 30 /B 20 /C 10] = fooCB/A 10 20 30

    error? sys/util/rescue [fooBC/B 1 2 3 4 5 6]
    error? sys/util/rescue [fooBC/C 1 2 3 4 5 6]
    error? sys/util/rescue [fooCB/B 1 2 3 4 5 6]
    error? sys/util/rescue [fooCB/C 1 2 3 4 5 6]
]

Here is another one:

apd: specialize 'append/part [dup: true]
apd3: specialize 'apd [count: 3]
ap2d: specialize 'apd [limit: 2]

xy: [<X> #Y]
abc: [A B C]
r: [<X> #Y A B A B A B]

did all [
    r = apd copy xy abc 2 3
    r = applique 'apd [series: copy xy  value: abc  limit: 2  count: 3]

    r = apd3 copy xy abc 2
    r = applique 'apd3 [series: copy xy  value: abc  limit: 2]

    r = ap2d copy xy abc 3
    r = applique 'ap2d [series: copy xy  value: abc  count: 3]
]

And another:

ap10d: specialize 'append/dup [value: 10]
f: make frame! :ap10d
f/series: copy [a b c]
did all [
    [a b c 10] = eval copy f
    f/count: 2
    [a b c 10 10 10] = eval f
]