CELL_FLAG_NEWLINE (_BEFORE) or (_AFTER)?

Historical Rebol uses what is effectively "CELL_FLAG_NEWLINE_BEFORE" on the Cells. So it can put a new-line marker on any Cell in the list, including the head.

But in a list with N cells, there are actually N + 1 positions where you might have a newline.

Historical Rebol doesn't have a way to say there is a newline at the tail (although Ladislav requested the feature in 2013).

Circa 2018, Ren-C added ARRAY_FLAG_NEWLINE_AT_TAIL, so that the list itself can carry knowledge of whether there's a newline at the tail.

But What About _NEWLINE_AFTER + _NEWLINE_AT_HEAD ?

In rendering code, you see loops like:

const Element* tail;
const Element* at = List_At(&tail, list);

for (; at != tail; ++at) {
    // render Element `at`
}

So... is there any advantage to one or the other "arbitrary" choice of which way to bias the flag?

Consider that one question you have to answer is "do I need to put a space or not after the current item". If you're going to be spitting out a newline character you don't want both a space and a newline.

This means that if you use CELL_FLAG_NEWLINE_BEFORE, then while you are processing at you have to examine at + 1 to see if it has the flag. If it does, no space.

...but to examine at + 1 you first have to know that at + 1 != tail. :frowning:

So if we shift to CELL_FLAG_NEWLINE_AFTER, is it possible to write the rendering loop without needing to consult the flag on at + 1?
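One way to explore that question is a small standalone sketch. The Element struct, the flag constants, and the render_before / render_after functions here are hypothetical stand-ins (not the actual Ren-C definitions), and the head and tail newline states are simply passed in as parameters:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the real Ren-C types. */
enum { FLAG_NEWLINE_BEFORE = 1, FLAG_NEWLINE_AFTER = 2 };

typedef struct {
    const char *text;   /* already-molded form of the element */
    unsigned flags;
} Element;

/* Append s to the NUL-terminated buffer out (capacity cap). */
static void emit(char *out, size_t cap, const char *s) {
    assert(strlen(out) + strlen(s) < cap);
    strcat(out, s);
}

/* NEWLINE_BEFORE bias: the separator after `at` depends on `at + 1`,
 * which may only be dereferenced once we know `at + 1 != tail`. */
void render_before(
    char *out, size_t cap,
    const Element *at, const Element *tail,
    int newline_at_tail
){
    out[0] = '\0';
    for (; at != tail; ++at) {
        if (at->flags & FLAG_NEWLINE_BEFORE)
            emit(out, cap, "\n");
        emit(out, cap, at->text);
        if (at + 1 != tail && !((at + 1)->flags & FLAG_NEWLINE_BEFORE))
            emit(out, cap, " ");
    }
    if (newline_at_tail)
        emit(out, cap, "\n");
}

/* NEWLINE_AFTER bias: each step looks only at `at`, plus one bit of
 * state remembering whether a space is owed before the next element. */
void render_after(
    char *out, size_t cap,
    const Element *at, const Element *tail,
    int newline_at_head
){
    int space_owed = 0;
    out[0] = '\0';
    if (newline_at_head)
        emit(out, cap, "\n");
    for (; at != tail; ++at) {
        if (space_owed)
            emit(out, cap, " ");
        emit(out, cap, at->text);
        if (at->flags & FLAG_NEWLINE_AFTER) {
            emit(out, cap, "\n");
            space_owed = 0;
        }
        else
            space_owed = 1;
    }
}
```

In this sketch both functions produce the same text for equivalent flaggings, and the NEWLINE_AFTER version never reads at + 1. The catch is that newline_at_head only describes the true head of the list, so a caller starting mid-list would need some other way to know whether a newline precedes at.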

Put A Pin In That... :pushpin: Does It Affect Userspace?

I was provoked by the implementation question.

But it brings to mind maybe a bigger one...about what you expect to happen with:

>> block: [
       a b c
       d e f
   ]

>> change skip block 2 '[x y]

>> block
== ???

Which do you expect:

     a b [x y]
     d e f


     a b [x y] d e f

I'd personally expect the line structure to stay unchanged with that replacement.

Question: Does Choice of Bit Matter? (And Should It?)

Whether the bit matters or not depends on whether you think:

  • (A) The bit belongs to the list
  • (B) The bit belongs to the value

Curiously, Rebol2 doesn't let you ask about the NEW-LINE? flag on an individual value cell...but only as a position in a block:

rebol2>> help new-line?
USAGE:
    NEW-LINE? block

DESCRIPTION:
    Returns the state of the new-line marker within a block.
    NEW-LINE? is a native value.

ARGUMENTS:
    block -- Position in block to check marker (Type: block)

That might lead you to think that the before/after is just an implementation detail, since you can only ask the question of a list position.

However, Rebol2 leaks the implementation detail because the bit travels with a particular Cell:

rebol2>> block: [a b
    c d]
== [a b
    c d
]

rebol2>> map-each item block [item]
== [a b
    c d
]

rebol2>> block2: []

rebol2>> foreach item block [append block2 item]
== [a b
    c d
]

In these cases, it appears to "give you what you want". But it's a very shallow illusion.

In any case, the stickiness of the bit means it makes a concrete difference whether you pick CELL_FLAG_NEWLINE_BEFORE or CELL_FLAG_NEWLINE_AFTER in the implementation:

rebol2>> test: [a]

rebol2>> append test block/3
== [a
    c
]

If the flag were manifest as CELL_FLAG_NEWLINE_AFTER on b then that wouldn't have put c on a new line.

:thinking:

Can We Go With: (A) The bit belongs to the list ?

All things being equal, it would be ideal if we did not say the bit "belonged" to the Cell.

One nice side-effect of that is the bit can be reclaimed for meanings in Cells that aren't resident in lists. (Cell flags are scarce, and it's nice to have as many as possible not get copied along with the Cell by default.) And this means the flag that indicates newlines can safely be used on variables to mean something else.
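As a rough illustration of a flag that does not "get copied along with the Cell by default", here is a minimal sketch. The Cell layout, the flag names, and COPY_MASK are all hypothetical and do not reflect Ren-C's actual bit assignments:

```c
#include <assert.h>

/* Hypothetical flag layout for illustration; not Ren-C's actual bits. */
enum {
    FLAG_NEWLINE_BEFORE = 1 << 0,   /* meaningful only inside a list */
    FLAG_OTHER          = 1 << 1    /* some flag that should be copied */
};

/* Bits that are allowed to travel when a cell is copied. */
#define COPY_MASK (~(unsigned)FLAG_NEWLINE_BEFORE)

typedef struct {
    int payload;
    unsigned flags;
} Cell;

/* "Sticky" copy: every bit travels with the value, which is the
 * behavior that leaks the BEFORE/AFTER choice to the user. */
Cell copy_cell_sticky(const Cell *src) {
    return *src;
}

/* Masked copy: the list-owned bit is dropped, so the destination
 * list decides its own line structure. */
Cell copy_cell_masked(const Cell *src) {
    Cell out = *src;
    out.flags &= COPY_MASK;
    assert(!(out.flags & FLAG_NEWLINE_BEFORE));
    return out;
}
```

With the masked copy, picking a value out of one block and appending it elsewhere would no longer drag a line break along, whichever way the flag is biased.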

Does the isotopic era give us tools to solve this problem? Random off-the-cuff thought: could splicing suggest not adding newlines to lists, but you get newlines otherwise?

 >> append [a b c] 'd
 == [a b c
     d]

 >> append [a b c] ~[d]~
 == [a b c d]

That seems like a pretty bad idea, especially if we are gearing up to a model where newlines aren't allowed in mid-operation.

Rejected Parallel Universe: Reified Newlines

We might imagine a world where you use backslash to say "there's no newline here", otherwise you get a reified newline in the block.

block: [ \
    a b c \
    d e f \
]

>> block
== [a b c d e f]

Otherwise there would be reified newline markers in the block, such that first block is not a, but a newline-like-BLANK!.

That seems awful. :nauseated_face: I do like the idea of actual, completely blank lines being reified in the block; that is actually useful, in the way that commas are useful.

But the newlines... no, that's too much.

I Think We Need Iterators That Show You NEW-LINE

If we're going to make the bit belong to the List and not the Cell, there needs to be a not-terribly-difficult way to "see" the newlines, otherwise it's too hard to heed newlines in your code.

One way the difficulty manifests is that it is hard to write your own newline-preserving COPY, but I'd argue that this is just the tip of the iceberg.

Let's say some kind of EACH-SPECIAL is able to iterate and give you those markers, perhaps as NULL or something of the sort:

block: [
   a b
   c d
]

for item (each-special block) [
    probe item
]

Imagine that giving you something like, uh:

\~[
   ]~\
a
b
\~[
   ]~\
c
d
\~[
   ]~\

That idea of an empty SPLICE! with a newline marker in it would allow you to copy things, because if you APPEND that to a block it gets the newline. (That needs a name... NONELINE?) :face_with_diagonal_mouth:

>> map item (each-special block) [
       item
   ]
== [
   a b
   c d
]

That's kind of stupid, probably, compared with giving you NULL... which you could test with IF, and then turn into a blank line if you wanted to, or triage other ways, easily:

>> map item (each-special block) [
       if not item [  ; "what do we want to do with newline markers?"
           continue noneline  ; just propagate them
       ]
       item
   ]
== [
   a b
   c d
]
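On the implementation side, an iterator like that could be fairly small. This is only a sketch under assumptions: it presumes NEWLINE_BEFORE storage plus a tail flag on the list, and the EachSpecial struct and each_special_next function are hypothetical names. A NULL item stands for a newline marker:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real Ren-C types. */
enum { FLAG_NEWLINE_BEFORE = 1 };

typedef struct {
    const char *text;
    unsigned flags;
} Element;

typedef struct {
    const Element *at;
    const Element *tail;
    int newline_at_tail;   /* the "N + 1" position, stored on the list */
    int marker_done;       /* marker for the current position was yielded */
} EachSpecial;

/* Yields the next item through *item and returns 1, or returns 0 when
 * the iteration is finished. *item == NULL means "newline marker". */
int each_special_next(EachSpecial *it, const Element **item) {
    assert(it != NULL && item != NULL);
    if (it->at == it->tail) {
        if (it->newline_at_tail && !it->marker_done) {
            it->marker_done = 1;
            *item = NULL;          /* newline before the closing bracket */
            return 1;
        }
        return 0;
    }
    if ((it->at->flags & FLAG_NEWLINE_BEFORE) && !it->marker_done) {
        it->marker_done = 1;
        *item = NULL;              /* newline before the current element */
        return 1;
    }
    *item = it->at++;              /* the element itself */
    it->marker_done = 0;
    return 1;
}
```

Because the markers are synthesized from the list's state at iteration time, this also surfaces the tail position that a cell-carried flag can never copy.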

That's Probably The Right Direction...

  • It should be easy to not care about the newline flag.

  • It should be nearly as easy to care about the newline flag.

Carrying the newline bit on the cell itself gives you what looks like a surface-level "feature" of making an "as-is" copy. However, it:

  • Exposes the implementation choices

  • Is helpless in almost any scenario besides just making a copy, and gives unpredictable effects elsewhere.

  • Doesn't copy the "N + 1" flag, in any case

So I think you pretty clearly want the line structure to stay consistent in the CHANGE example.

Here's A Tougher Question: Where To INSERT?

What do you expect from:

>> block: [
      a b c
      d e f
   ]

>> insert skip block 3 '[x y]

Your choices are:

  a b c [x y]
  d e f

  a b c
  [x y] d e f

Perhaps (skip block 3) Itself Informs This?

Let's look at what you get with a couple of different skips:

>> block: [
       a b c
       d e f
   ]

>> skip block 1
== [b c
    d e f
]

>> skip block 3
== [
    d e f
]

It seems pretty clear in the skip block 1 case that you're right up to the edge of b, and inserting there should not insert a NEW-LINE marker.

If that "newline or not in the bracket" state is to have meaning then we can assume it would mean that your skip block 3 insertion should still be before the line break:

a b c [x y]
d e f

But this perspective is biased to the CELL_FLAG_NEWLINE_BEFORE world.

It says that skipping 3 elements put us before a newline (the CELL_FLAG_NEWLINE_BEFORE on d), not after a newline (which is what we'd have had if c had CELL_FLAG_NEWLINE_AFTER).
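To make the bias concrete, here is a small sketch in which the same naive insertion (shift the cells, leave every existing flag where it is) is run against both storage choices. The Element struct, the flag names, and the insert_at / render helpers are hypothetical, and head and tail newlines are left out to keep it short:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the real Ren-C types. */
enum { FLAG_NEWLINE_BEFORE = 1, FLAG_NEWLINE_AFTER = 2 };

typedef struct {
    const char *text;
    unsigned flags;
} Element;

/* Insert an unflagged element at index, leaving existing flags alone.
 * The caller guarantees room for len + 1 cells. Returns the new length. */
size_t insert_at(Element *cells, size_t len, size_t index, const char *text) {
    assert(index <= len);
    memmove(cells + index + 1, cells + index, (len - index) * sizeof *cells);
    cells[index].text = text;
    cells[index].flags = 0;
    return len + 1;
}

/* Render interior line structure only, for whichever bias the cells use:
 * a break falls between two cells if the left one is flagged AFTER or
 * the right one is flagged BEFORE. */
void render(char *out, size_t cap, const Element *cells, size_t len) {
    size_t i;
    out[0] = '\0';
    for (i = 0; i < len; ++i) {
        const char *sep = " ";
        if (i == 0)
            sep = "";
        else if (cells[i].flags & FLAG_NEWLINE_BEFORE)
            sep = "\n";
        else if (cells[i - 1].flags & FLAG_NEWLINE_AFTER)
            sep = "\n";
        assert(strlen(out) + strlen(sep) + strlen(cells[i].text) < cap);
        strcat(out, sep);
        strcat(out, cells[i].text);
    }
}
```

With the flag stored as NEWLINE_BEFORE on d, inserting at index 3 leaves [x y] at the end of the first line. With it stored as NEWLINE_AFTER on c, the identical call puts [x y] at the start of the second line.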

Somewhat Uncomfortable: Changing Things Behind You

What I'd observe is that regardless of whether I see either of these:

>> skip block 3
== [
    d e f
]

>> skip block 3
== [d e f
]

I'm looking at d e f. And so if I say "insert data here" and it pushes it behind what I'm looking at, that feels a bit strange.

It starts to feel like INSERT:BEFORE is a slightly different intent. (This is starting to feel like writing a word processor and worrying over caret control.)

I have an inkling that the nature of "showing everything after" by default, as well as the implementation detail I mentioned about enumeration, favors the concept of modeling the newline as "living on" the cell from the previous line.

Giving INSERT, CHANGE, and APPEND a :BEFORE flag could help when you've cued up to a position and you want to make it clear that you want additional material to live on whatever line the last thing was.

The decisions I've described (making the bit a property of the List and not the Value) make the choice something the user doesn't need to be concerned with. When you pick a value out of a block, the flag doesn't get carried.

So what to pick is purely in service of the implementation.

The rendering loop example from earlier may seal the deal on why CELL_FLAG_NEWLINE_BEFORE is the right choice...

Notice that when we call List_At(...), it tells us what the tail is, and the address of the Cell at the list's current index...

But we don't know where the head is.

Of course we can find out. But the "simple" interface used by most iterations doesn't give it to you.

So under CELL_FLAG_NEWLINE_AFTER, if you ask a function like NEW-LINE? "is there a line marker at the current position", it would have to use something besides List_At(...) to find out whether it was allowed to peek backwards behind the current position.
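The asymmetry shows up in what each version of the query needs as input. This is a sketch with hypothetical names (Element, new_line_q_before, new_line_q_after), not the actual NEW-LINE? native:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real Ren-C types. */
enum { FLAG_NEWLINE_BEFORE = 1, FLAG_NEWLINE_AFTER = 2 };

typedef struct {
    const char *text;
    unsigned flags;
} Element;

/* NEWLINE_BEFORE: answerable from exactly what a List_At-style call
 * provides (the current position and the tail), plus the tail flag. */
int new_line_q_before(
    const Element *at, const Element *tail, int newline_at_tail
){
    assert(at <= tail);
    if (at == tail)
        return newline_at_tail;
    return (at->flags & FLAG_NEWLINE_BEFORE) != 0;
}

/* NEWLINE_AFTER: the answer lives on the previous cell, so the head is
 * needed to know whether looking at at[-1] is legal. */
int new_line_q_after(
    const Element *head, const Element *at, int newline_at_head
){
    assert(at >= head);
    if (at == head)
        return newline_at_head;
    return (at[-1].flags & FLAG_NEWLINE_AFTER) != 0;
}
```

The first function never looks behind the current position, which is why it fits the "simple" iteration interface; the second cannot be written without the head.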

That's fairly good news, in the sense that it would be work to change the flag, and I'm happy to find a solid reason not to.
