Pure Functional Programming in Ren-C

The elevator pitch for pure functional features is that a "PURE" function can:

  • Stop you from having side-effects your caller didn't intend

  • Keep you from depending on weird external state that can fluctuate in ways not related to the arguments you passed

Here's a pretty obvious "impure" function:

square-of: func [x [integer!]] [
    if now:weekday > 5 [  ; 6 = Saturday, 7 = Sunday
       print "I don't do math on weekends!"
       return 0
    ]
    return x * x
]

Consulting the clock and printing output are generally poor properties of a SQUARE-OF function. So it would be nice if we could mark SQUARE-OF as "PURE", and catch these calls.

It would be trivial to mark the functions NOW and PRINT as "IMPURE", and say that it is not legal to run an IMPURE function while a "PURE" function is running.

Any historical Redbol could have done that.

  • Just make a global variable pure_count of how many "PURE" functions you have entered.

  • Increment the count each time you call a "PURE" function, decrement it each time you exit one.

  • If pure_count is not zero, don't let you call an "IMPURE" function.

Simple, right?

...But Simple Checks Are Easily Subverted...

What if instead of directly calling an "IMPURE" function, you consult some global state?

weekday: 1

square-of: pure func [
    x [integer!]
][
    if weekday > 5 [  ; 6 = Saturday, 7 = Sunday
       print "I don't do math on weekends!"
       return 0
    ]
    return x * x
]

>> square-of 2
== 4

>> weekday: 6

>> square-of 2
I don't do math on weekends!
== 0

That throws a wet blanket on the idea.

Since functions are stored in variables, even a call to a PURE function would be an impure activity... so long as that function could change to another value--even a PURE one!

Immutability Is The Cornerstone Of Pure FP

In pure functional programming, you assign a variable a value and that is it.

So Rebol's first step toward pure functional programming has to be the ability to assign a value to a variable and say that's it for all time.

weekday: final 1  ; FINAL not the same as CONST [1]

/square-of: pure func [  ; leading / is syntax shorthand for FINAL [2]
    x [integer!]
][
    if weekday > 5 [  ; 6 = Saturday, 7 = Sunday
       print "I don't do math on weekends!"
       return 0
    ]
    return x * x
]

>> square-of 2
== 4

>> weekday: 6
** PANIC: Cannot change the value of final variable WEEKDAY
  1. Note that FINAL is distinct from CONST. CONST doesn't imply a variable can't change, it implies the value in the variable can't change.

    >> block: const [a b c]
    
    >> append block 'd
    ** PANIC: ...
    
    >> block: [d e f]  ; BLOCK was not PURE, overwriting is fine
    
    >> append block 'g
    == [d e f g]
    
  2. We need to make sure any functions we are going to call from a PURE context are FINAL. Rather than having to write foo: final pure func [...] this uses /foo: pure func [...] as a shorthand.

But Interesting Work Still Needs Mutation

If you have to consider every read and write of a variable to be an "impure" activity, then there's not a whole lot that a PURE function can actually do.

Getting nearly anything done in Rebol requires mutation (and observations of mutable state).

For example: this LENGTH-OF variant is effectively pure to its callers, but reads and writes local state:

length-of-slow: pure func [
    series [any-series?]
][
    let len: 0
    for-each 'item series [len: len + 1]
    return len
]
  • LEN is accruing a count--through mutation. We need to be able to read and write that state.

  • ITEM has to be adjusted as we go through the FOR-EACH

This gets to the crux of a problem: a PURE function can only read and write non-FINAL variables it knows nobody else can see. And it has to know not only that the state belongs to some PURE function, it has to know that state belongs to "it" (for some definition of "it").

But that would require some kind of... context information... that got threaded around, which told you where you were looking variables up.

If only we had that...

:thinking:

Waitaminute! Remember Ren-C Binding In A Nutshell ?

As it turns out, we do have this.

Consider this:

foo: pure does [
   let x: 10  ; this assign to X is ok (but how do we know it's okay?)
   let bar: pure does [x: 20]  ; assign to X not okay (how do we know?)
   bar
]

What we know when we hit x: 20 is that we're in "pure mode" (non-zero number of functions on the call stack). And we are looking up X in a binding which has BAR's frame in the call stack, which is pointing to a LET X, which is in turn pointing to FOO's frame.

We can basically say that we climb the chain until we hit our first PURE function frame. Any non-FINAL variables we see until then are ok. to read and write.

Are All Functions Either PURE or IMPURE?

No.

The default is neither. A function is agnostic until it references a non-FINAL variable outside of itself, or or calls an explicitly IMPURE function.

Consider something like IF. We don't want pure-if and impure-if and you have to use a different version depending on which kind of function you are in.

If you know a function will eventually turn out to be impure, it's good to label it so--so it can be seen on the interface, and you get your PANIC up-front. But the agnosticism is a deliberate feature to allow you to write abstractions that can be shared in pure and impure calling contexts.

I mentioned above that /foo: func [...] is a shorthand for foo: final func [...], and if you use this convention you're getting "agnostic" functions automatically. PURE functions will speculatively call a function so long as the function reference itself isn't in a variable that can change... and only panic once an impurity is called or a non-final variable that's not "local to the pure confinement" is read.

And if some library didn't happen to declare a function with /foo: or foo: final, you can gloss that by calling with /foo.

All that work enabling slashy functions wasn't for naught! It's a lot more tolerable when there's actual meaning behind the slash. People who don't care about the FP features can just not use it, ever (or the few times they use it, just use it to emphasize "this is a function", as was the prior purpose...)

So Are Functions Like APPEND PURE?

No, but they're not "impure" either. If they were pure then all their input arguments would be forced to be CONST.

Since they're not impure, they can be called from PURE functions.

So being in "pure mode" doesn't mean no mutating functions can't be called, but things that are "purely parameteric" can be called...APPEND does mutate, but only things you pass it directly.

So where the purity comes in is helping us understand what bits of state outside of the confinement are okay to examine.

CELL_FLAG_FINAL drops to CELL_FLAG_CONST

In the strategy I'm conceiving... when you read a FINAL variable, the value you get back is not itself FINAL (though it has likely been made immutable through deep locking...maybe a lighter guarantee of purity would just make it CONST, e.g. a "SEMIPURE" could do that.)

This means if you have to make some state FINAL in service of pure functions, you don't wind up in a situation where fetching those values and then returning them leak out values that cause your callee to lock up their variables by assigning them.

If you fetch a variable and want to store it final somewhere, you have to say FINAL again.

I Have Experiments On This But There's Work To Do...

What I'm seeing is promising. But this is far from trivial.

Updates as I progress...

1 Like

Purity Hits A Snag With Left-Looking :eyes: Literal Infix

Infix is an opportunistic thing. If you have (foo bar ...) it doesn't run FOO immediately, it first consults BAR to see "do you hold an infix operation". And if it does, then BAR may even be able to take FOO literally and run before it.

This capability is central e.g. to x -> [x + 1] being able to make a function... the arrow grabs the left thing.

But this means values are constantly saying "oh, no, I'm not infix... don't run me, you're not actually interested in me...I'm merely a WORD!"

That presents a challenge for a PURE function, when some stray arbitrary word is getting a vote... even if that vote is ultimately to not participate.

/slow-type-of: pure func [value] [
    value: type of value
    return value
]

What happens here is that infix-wise, the VALUE: set-word is "offered" to TYPE, asking it "do you want this as your literal left-hand-side?

But TYPE is A Random WORD!... it's not FINAL (like OF and RETURN are) and it's not confined to the pure function's scope (like VALUE is).

So giving it a vote in the evaluation--even if it votes no (by being undefined, etc.)--breaks the purity.

You can stop TYPE from voting by putting the right hand side of the assignment in a GROUP!

/slow-type-of: pure func [value] [
    value: (type of value)
    return value
]

Although that works, the reason why it works points out another problem.

It works because OF is given first dibs on saying if it wants to leftwardly consume TYPE or not. Since it says it does, TYPE is never consulted.

But what if you were taking something literally in the forward direction?

/give-me-x: pure does [
    literally x
]

Now it's the other problem. You offered LITERALLY to X.

Not Much Can Be Done About Right-Literal WORD!s

So long as there exists a concept of "we consult the right to see if it wants to take the left side literally", there is really no way to take a WORD! as a literal parameter in a PURE function...

...unless that word exists under a definition that the PURE function is allowed to consult (and say "I'm not a left-looking literal infix function")

So you can make dummy definitions :frowning:

/give-me-x: pure does [
    let x
    literally x
]

But that's about it. While other argument types will work fine literally (GROUP!, etc.) anything that can dispatch a literal infix function is just not legal to work literally.

This actually raises a good point about forward-leaning literal WORD!s: you should always have a way to escape it. That was already the case, otherwise you couldn't say:

for-each [->] [1 2 3] [probe ->]

So on that note, this rules out for-each x [1 2 3] in PURE functions, so long as left-literal infix exists (minus the dummy declaration trick). But pure functions probably couldn't use the binding in most cases anyway, so they'd use for-each 'x [1 2 3] most of the time.

This Makes A Good Case For Generic Escaping

We simply don't let you take the @ operator literally on the left.

/give-me-arrow: pure does [
    @ ->
]

There's already some tricks in the API that rely on this behavior. If we do this, then you've got a working escape mechanism for literal arguments, just in general.

for-each @ -> [1 2 3] [probe ->]

For Leftward-Literals... First Some Workarounds

First I'll point out that for leftward cases, OF is just a syntax convenience for TYPE-OF and you can use that:

/slow-type-of: pure func [value] [
    value: type-of value
    return value
]

Today's OF lets you put the property itself in a GROUP!, so there is an escaping mechanism:

/slow-type-of: pure func [value] [
    value: ('type) of value
    return value
]

And motivated by situations like this, it could accept quoted words as well:

/slow-type-of: pure func [value] [
    value: 'type of value
    return value
]

But Leftward-Literals May Have A Trick

So long as we've decided that literal x is just toast in pure functions and say "you have to escape it with @", we can take advantage of that.

The idea is: restrict leftward literalism to things that never had a chance to vote on their own evaluation.

e.g. when the evaluator sees something like value: type of value, you offer TYPE to OF first. So it acts like value: (type of value)

This kind of change is advantageous to performance as well... the fewer of these left-looks you do, the faster things will run.

Another consideration is what to do about SHADOWING, e.g. what happens when something in the binding chain gets added that wasn't there before.

/slow-negate: pure lambda [x] [
    negate x
]

y: slow-negate 1020

set (extend current-module 'negate) final lambda [x] [
    x + 304
]

z: slow-negate 1020

This SLOW-NEGATE is defined in some script/module, and that module has definitions that are searched before falling through to LIB.

It's clear enough that if you make a NEGATE that is in the module before the SLOW-NEGATE is defined, then if it's PURE that's fine... and if it's not that'd cause an error.

It takes a bit of circuitous code to create the effect at the moment (such that I don't really have a set-in-stone way to ask for the current module). But I don't feel like modules should be locked down for how many definitions they should have in the general case, even if you're using PURE functions inside them.

For it to be a problem you'd need the perfect storm of:

  • extending a module with a definition that a PURE function is already using

  • having that extended definition itself be FINAL (if it weren't the PURE function would notice and get an error)

  • have that extended definition itself be pure (at least emergently so in its instantiation)

Prohibiting you from EXTENDing with FINAL definitions could stop it from being a problem. :thinking:

This may be beyond the scope of what the "purity" is designed to protect against. You already have functions that can remove the FINAL if it's inconvenient and change the definition. EXTEND on a module to shadow an existing declaration fits in that same category of "reaching outside the normal flow of things to where we can't help you"

In any case, this certainly isn't a death-blow to the idea. Just something to bear in mind.