New Scanner Concept: "Spacecoding"

I had an idea based on the new lexical forms for space and newline, which can be of arbitrary length:

>> form _
== " "

>> form #
== "^/"

>> form ____
== "    "

>> form ###
== -[


]-

What I thought was: "what if there were a way to ask to load source code and keep the spaces, newlines, and comments in it, represented using these forms... with everything else quoted?"

So if your file looked like:

Rebol [Title: "My Sample"]

block [a  'b    ''c
    d _ ; Comment demo

(## e  f g)
  ]

You might be able to "spacecode" it and get something that looked a bit like:

['Rebol _ '['Title: _ '"My Sample"] #
#
'block _ '['a __ ''b ____ '''c #
____ 'd _ '_ _ -[; Comment demo]- #
#
'('## _ 'e __ 'f _ 'g) #
__ ] #]

(Notice the quoted '_ and '## representing literal tokens in source, and quote levels added onto the source elements, including additional levels on ones that were already quoted.)

That's what I would call "heavy" spacecoding. A "lite" form of spacecoding might presume a single space between elements, so explicit space items would only account for the spaces beyond that first one. You could likewise assume the newline markers on elements are sufficient to account for newlines, with explicit newline items appearing only when things are double-spaced, triple-spaced, etc.

That would look more like:

['Rebol '['Title: '"My Sample"]
#
'block '['a _ ''b ___ '''c
___ 'd '_ -[; Comment demo]-
#
'('## 'e _ 'f 'g)
__ ]]
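To make the "lite" rules concrete, here is a rough Python sketch that derives the lite form from the heavy one. It is only my reading of the two examples above, not anything from the actual scanner: items are modeled as plain strings, nested lists stand in for blocks and groups (losing the distinction between `[...]` and `(...)`), and `to_lite` is a hypothetical name. The assumed rules are: a space run loses one space unless it is the last item before a closing delimiter, and the first newline of any run is dropped because an element's newline marker would cover it.

```python
def is_run_of(item, char):
    """True if item is an unquoted run like `___` or `##` made only of char."""
    return isinstance(item, str) and item != "" and set(item) == {char}


def to_lite(items):
    """Convert heavy spacecoded items to the lite form (assumed rules)."""
    lite = []
    prev_was_newline = False
    for index, item in enumerate(items):
        if is_run_of(item, "#"):
            # The first newline of a run is implied by a newline marker.
            keep = len(item) if prev_was_newline else len(item) - 1
            if keep > 0:
                lite.append("#" * keep)
            prev_was_newline = True
            continue
        prev_was_newline = False
        if is_run_of(item, "_"):
            # One space is presumed before the next element, but nothing is
            # presumed when the run sits right before a closing delimiter.
            at_tail = index == len(items) - 1
            keep = len(item) if at_tail else len(item) - 1
            if keep > 0:
                lite.append("_" * keep)
        elif isinstance(item, list):
            lite.append(to_lite(item))
        else:
            lite.append(item)  # quoted token or -[comment]-, kept as-is
    return lite


# The heavy example from this post, written as Python data.
heavy = [
    "'Rebol", "_", ["'Title:", "_", "'\"My Sample\""], "#",
    "#",
    "'block", "_", [
        "'a", "__", "''b", "____", "'''c", "#",
        "____", "'d", "_", "'_", "_", "-[; Comment demo]-", "#",
        "#",
        ["'##", "_", "'e", "__", "'f", "_", "'g"], "#",
        "__",
    ], "#",
]

lite = to_lite(heavy)
```

Under those assumptions, `lite` comes out with the same items as the lite example above. Note the special case needed for the `__` before the closing `]`, which is the kind of irregularity that makes the lite form harder to process mechanically.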

Either way, all non-QUOTED! things represent stuff that would ordinarily be thrown out by TRANSCODE, with the goal of giving you a structure you could analyze and process while preserving all the whitespace.

Given that the results aren't exactly friendly to human reading either way, "Lite" spacecoding may be more of a hassle than it's worth since it would be irregular for machine processing.

It's an interesting idea for having a structure you could use to round-trip a file's spaces, newlines, and comments and still be able to work with the data in it (albeit awkwardly).
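As a sanity check on the round-trip claim, here is a small Python model of the heavy form. This is purely illustrative and says nothing about how the scanner would actually do it: the class names, `render`, and `plain_load` are all made up for the sketch, and tokens are kept as plain strings rather than real values. It also assumes the sample file ends with a newline, which is what the final `#` in the heavy example suggests.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Space:
    count: int  # unquoted `_` item: a run of `count` spaces


@dataclass
class Newline:
    count: int  # unquoted `#` item: a run of `count` newlines


@dataclass
class Comment:
    text: str  # unquoted -[...]- item: comment text, kept verbatim


@dataclass
class Quoted:
    text: str  # a source token (one quote level added); original spelling


@dataclass
class Group:
    opener: str  # "[" or "("
    closer: str  # "]" or ")"
    items: List[object] = field(default_factory=list)


def render(items):
    """Rebuild the exact source text from spacecoded items."""
    parts = []
    for item in items:
        if isinstance(item, Space):
            parts.append(" " * item.count)
        elif isinstance(item, Newline):
            parts.append("\n" * item.count)
        elif isinstance(item, (Comment, Quoted)):
            parts.append(item.text)
        else:  # Group: render the contents between its delimiters
            parts.append(item.opener + render(item.items) + item.closer)
    return "".join(parts)


def plain_load(items):
    """Keep only quoted items, roughly what an ordinary load would keep."""
    kept = []
    for item in items:
        if isinstance(item, Quoted):
            kept.append(item.text)
        elif isinstance(item, Group):
            kept.append(plain_load(item.items))
    return kept


# The heavy example from this post, item for item.
sample = [
    Quoted("Rebol"), Space(1),
    Group("[", "]", [Quoted("Title:"), Space(1), Quoted('"My Sample"')]),
    Newline(1), Newline(1),
    Quoted("block"), Space(1),
    Group("[", "]", [
        Quoted("a"), Space(2), Quoted("'b"), Space(4), Quoted("''c"), Newline(1),
        Space(4), Quoted("d"), Space(1), Quoted("_"), Space(1),
        Comment("; Comment demo"), Newline(1), Newline(1),
        Group("(", ")", [Quoted("##"), Space(1), Quoted("e"), Space(2),
                         Quoted("f"), Space(1), Quoted("g")]),
        Newline(1), Space(2),
    ]),
    Newline(1),
]

source = render(sample)      # the original file text, whitespace intact
values = plain_load(sample)  # only the items that were quoted
```

Here `render` walks every item and so reproduces the sample file character for character, while `plain_load` ignores everything unquoted, which is the split between "kept" and "thrown out" described above.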

It turned out to be a little harder to graft into the scanner than I thought, though. Given there are higher-priority things to think about, I decided to just post about it.
