(1 + 2 * 3) is 9 and not 7... But Why?

rebol2>> 1 + 2 * 3
== 9

If we are going to measure complexity vs simplicity, 9 is a more complex answer than 7.

The reason it is more complex is that it implies there are distinct evaluation modes.

  • In "NOT running an infix function" mode, an evaluator step is willing to do lookahead. That's the mode 1 is evaluated in above... it isn't running infix yet, so it's willing to "see the +"

  • In "running an infix function" mode, it foregoes looking ahead. That's the mode 2 is evaluated in above, so it does NOT look ahead to "see the *"

This introduces a flag, or some "awareness" into EVAL:STEP (the internal version used to fulfill arguments) to request running an evaluation in the mode of "not looking ahead".

Good And Bad: Alternative Semantics

There's a good and a bad side to this, in that it means you get different results from the infix vs. the prefix forms:

rebol2>> 1 + 2 * 3
== 9

rebol2>> add 1 2 * 3
== 7

rebol2>> 1 + multiply 2 3
== 7

It's good if you like variety. It's not good if you think working the same under substitution is desirable.

There are cases in Ren-C where this has to be overruled, e.g.

10 = length of block

Although OF is infix, we don't want that interpreted as:

(10 = length) of block

The reason the current implementation doesn't get bitten by this is that literal-left arguments are handled at a different point in the evaluation. LENGTH is a WORD! taken literally and not evaluated, so the "lookahead suppression" flag isn't heeded in the code that does literal lookback. But while this works around the rule for this case, it makes the rule seem even more suspect.

If we always saw 10 = ... as meaning ... is evaluated as if there were nothing on its left, that is a simpler model--both mentally and in the evaluator--than having to worry about exceptions.

Theoretical Benefit: It Makes Stacks Shallower

The idea that it "folds" values on the left means that you have a shallower stack if you write something like:

1 + 2 + 3 + 4 + 5 + 6

This does 1 + 2 and gets 3, then does 3 + 3 and gets 6, and so on.

Whereas if you just allow it to recurse, then 1 + will stay on the stack waiting while 2 + is evaluated, and that will run 3 + etc.
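The stack-depth difference can be sketched in Python (the fold/recurse function names are invented for illustration; they are not Ren-C internals):

```python
# Two ways to evaluate a chain like 1 + 2 + 3 + 4 + 5 + 6.
# Folding keeps the stack flat; recursing keeps every pending "+" waiting.

def eval_fold(tokens):
    """Left-to-right fold: constant depth no matter how long the chain is."""
    acc = tokens[0]
    for i in range(1, len(tokens), 2):  # tokens alternate: number, op, number...
        assert tokens[i] == "+"
        acc = acc + tokens[i + 1]  # each prior result is consumed immediately
    return acc

def eval_recurse(tokens, i=0):
    """Right recursion: this frame's "+" waits on the stack while the rest
    of the chain evaluates, so depth grows linearly with chain length."""
    if i + 1 >= len(tokens):
        return tokens[i]
    return tokens[i] + eval_recurse(tokens, i + 2)

chain = [1, "+", 2, "+", 3, "+", 4, "+", 5, "+", 6]
print(eval_fold(chain), eval_recurse(chain))  # 21 21
```

Both give the same answer for this chain; the difference is only that the recursive form holds one stack frame per pending operator.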

But there's more than one way to deal with such things. I've proposed things like allowing + to switch modes when nothing is on the left, so you can write:

(+ 1 2 3 4 5 6)

In practice, I don't think long strings of infix gaining efficiency through shallower stacks is a good argument.

Ren-C's Infix Makes The Exception More Complicated

The idea of an "INFIX" function is simply any function that gets its first argument from the left. It doesn't have to take just one argument on the right. It can take any number (including zero, so being effectively postfix).

Hence if you have an arity-3 infix function called infix-three, what should it do if it sees:

1 infix-three 2 + 3 4 * 5

How should that be interpreted? Does the "don't look ahead" kick in immediately while gathering the first argument, causing an error?

(1 infix-three 2) + 3 4 * 5  ; too few arguments

Or should the "don't look ahead rule" only apply to the last argument?

(1 infix-three (2 + 3) 4) * 5

Or does having more than one argument mean the rule isn't applied at all?

(1 infix-three (2 + 3) (4 * 5))

There Are More Challenges With Reevaluation

I ran into an assertion failure involving the "don't look ahead" flag with a demo of an INLINER:

plus-two: infix inliner [left] [spread compose [(left) + 2]]
1 plus-two = 3

So the concept of this inliner is to react as infix, and rewrite the code so you get:

1 + 2 = 3

But rewriting after you've already dispatched an infix function leads to questions about "what mode are you in now". Are you in the looking ahead mode, or the non-looking ahead mode?

The answer to make this work is clearly "it should be in looking-ahead mode", but the assertion points to the question of "why are we doing this?"

And we have to write 3 = add 1 2 and not add 1 2 = 3. So what is the great value in being able to say 1 + 2 = 3 instead of 3 = 1 + 2?

Is It Worth It?

I feel like seeing examples like length of really does show where "being less irregular" helps the language.

Cognitively, I think if we could say that add 1 2 * 3 and 1 + 2 * 3 behaved identically, there is value in that.

Does anyone want to speak up in favor of "irregular infix"?

One of the strongest arguments against the "no lookahead" flag is that it's an internal state we have to manage... and if you were to try to simulate a function call yourself, you'd be hard-pressed to do so unless we exposed it as a parameter to EVAL:STEP.

If the whole goal of Rebol's infix model is the terra-firma of evaluator regularity vs. meeting some user concept of precedence, why doesn't it go all the way and be regular across infix and non-infix?


UPDATE: I Deleted The Lookahead Suppression Flag And The System Booted

That's a sign that no major feature depended on it (which I already suspected).

It is a question of what rule to follow. Most languages follow the standard precedence of multiplication over addition. Rebol decided to follow order of appearance, which is fine as well. For a lot of people that is a problem, because they were taught the first way and are not flexible. Rebol is flexible, so maybe it's not the best choice for those people.

Back to why multiplication is ordered before addition: the main reason is to be able to write things like 2a + 3b, where that almost obviously should not mean (2 * a + 3) * b.
In programming you need to write the multiplications with the * signs anyway, and with some parens it is easy to clarify to the next person who needs to read your code what exactly it is supposed to do.
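To spell out the arithmetic in that example (plain Python, with arbitrary made-up values for a and b): under pure order-of-appearance, the multiplication by b applies to everything folded so far, which is why parentheses are needed to recover the precedence reading.

```python
a, b = 5, 7  # arbitrary illustrative values

order_of_appearance = ((2 * a) + 3) * b  # how 2 * a + 3 * b folds left-to-right
school_precedence = (2 * a) + (3 * b)    # how 2a + 3b reads in math notation

print(order_of_appearance, school_precedence)  # 91 31
```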

Obeying operator precedence in a typical way is not something that's on the table.

(We've discussed doing that as a MATH dialect, though it has proven difficult to process "just the operators" unless function calls are in parentheses, because you have to predict the length of a function call...which may be feasible now with PURE functions--and maybe if you could use pure functions without parentheses but need to parenthesize other expressions that's a good compromise?)

I'm merely talking about the exception which makes:

a op1 b op2 c

Behave differently from:

prefix-op1 a b op2 c

a op1 prefix-op2 b c

prefix-op1 a prefix-op2 b c

...regardless of what those operators are.

I think the same reflex which drove the original simplification drives the desire for the simplification which says those should all be the same.

It reduces the complexity of the evaluator, and I think when people get over "what they're used to" they'll realize it's simpler for users as well.

With richer and more powerful infix mechanics to use in the system, the regularity is more important so that those features combine better. It's not the same climate as when the decision to "fold infix ops before proceeding" was originally made (and I think the decision could have been reasonably questioned even then).