@rgchris's Iterator Framework (in Oldes Rebol3)

Definitely useful, and it implements missing functionality. (People have reinvented the directory walk so many times, each version having its own bugs.)

One thing that jumps out at me is that your iterator namespace is not separate from the iterated-upon namespace. You have iterator functions and fields (like /NEXT) and then a per-iterator set of custom fields applicable to whatever the case calls for.

This might work when you're making a custom iterator class for every iteration, but to mesh well with more mundane generic collection iterators I think that would be too verbose (the generic iterator would have to say iter/value/field and couldn't just do iter(??)field for some short symbolic definition of ??). Also it might invite bugs when you can't easily null out the entirety of a state object in one go, but have to be sure you update some subset of the total fields without missing any.

C++ separates the spaces, and uses a syntax trick to make it not so terrible:

  • Anything you want to ask for on the iterator would be done with a dot access, like a normal member

  • Anything you want to do with the current element is done through a dereference step (*), or through an arrow (->) which folds the dereference and dot access together

Like this:

 iterator.act_on_iterator()  // use dot to call method on iterator itself

 Item item = *iterator;  // use dereference to get at current item

 String s1 = (*iterator).full;  // one way to extract property of current item
 String s2 = iterator->full;  // arrow is a shorthand for the line above

 iterator->delete_file();  // arrow can also call methods on current object

 for (Item i : collection) { ... }  // modern C++ has range-based for loops (over the collection, not the iterator)
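
To make the split concrete, here is a minimal self-contained sketch. The names `Item`, `DirWalker`, `done()`, `advance()` and `collect_full()` are invented for illustration and don't come from any real library. The only point is that `.` reaches the iterator's own members, while `*` and `->` reach the current item.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical item type: the thing being iterated over.
struct Item {
    std::string full;  // full path of the entry
    std::size_t size;  // size of the entry in bytes
};

// Hypothetical iterator. Its own members are reached with `.`;
// the current item is reached with `*` or `->`.
class DirWalker {
public:
    explicit DirWalker(std::vector<Item> items) : items_(std::move(items)) {}

    // Iterator-side API (dot access).
    bool done() const { return pos_ >= items_.size(); }
    void advance() { if (!done()) ++pos_; }
    std::size_t size() const { return items_.size() - pos_; }  // items left

    // Item-side access (dereference and arrow).
    const Item& operator*() const { assert(!done()); return items_[pos_]; }
    const Item* operator->() const { assert(!done()); return &items_[pos_]; }

private:
    std::vector<Item> items_;
    std::size_t pos_ = 0;
};

// Walk to the end, gathering the `full` field of every item.
std::vector<std::string> collect_full(DirWalker walker) {
    std::vector<std::string> out;
    while (!walker.done()) {          // dot: ask the iterator
        out.push_back(walker->full);  // arrow: ask the current item
        walker.advance();
    }
    return out;
}
```

Note that `walker.size()` (items left to visit) and `walker->size` (bytes in the current entry) coexist without clashing, because the iterator's names and the item's names live in separate spaces.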

(I've made a separate thread for any topic specific to discussing influences of C++ iterators.)

Could We Generically "Dereference" Iterators?

Interestingly, Boris has written some of his thoughts on iterators... and suggests using GET to access the current item:

red-hof/code-analysis/iterators.md at master · greggirwin/red-hof · GitHub

Thinking of it this way, the analogue in Ren-C would be:

 decoder/some-method-on-iterator
 decoder.some-data-member-on-iterator

 (get decoder).some-data-member-on-item
 (get decoder)/some-method-on-item

Could @ Be "Dereference"?

We could repurpose @word to mean get (or more specifically, get the @word, e.g. "iterator-get").

The current @xxx usages most useful for the system are the lone @ (especially for API purposes) and the @BLOCK! for asking for inert behavior, e.g.

>> join text! [1 + 1]
== "2"

>> join text! @[1 + 1]
== "1+1"

>> block: [1 + 1]

>> decorate block '@
== @[1 + 1]

>> pin block
== @[1 + 1]

>> join text! pin block
== "1+1"

But beyond that, the inertness of @a and @a.b hasn't been all it's cracked up to be, because anyone wanting to take advantage of that is dialecting, and probably wants $a and ^a to be looked at literally as well. So you're often creating a literal context anyway (either putting them in a block, or having a function take its arguments literally).

(I considered the concept of @foo being "dereference foo" with @foo.bar being interpreted as (@foo).bar, but this abuse is not viable, for reasons that are beyond the scope of this post.)

Grafting That In

Here's what it might look like in Ren-C:

import <r3:rgchris:html>

decoder: dom/walk load-html "<b>Foo"

neaten:pairs collect-while [
    decoder/next
][
    keep decoder.event  ; property of the iterator?
    keep any [
        (@decoder).name  ; property of the thing being iterated
        (@decoder).value
    ]
]

How's that look? :+1:

(Hopefully you're getting a sense of how nice it is to see when something is a refinement vs. a field vs. a function call...it's really hard for me now to suss out what historical Redbol code is doing, the dot-vs-colon-vs-slash really helps.)

The idea here would be that the iteration idea is standardized such that if you weren't interested in the iterator-specific properties (let's say EVENT doesn't matter), you could just speak in terms of the standard:

decoder: dom/walk load-html "<b>Foo"

collect [
     for-each 'node decoder [
         keep node.name
         keep node.value
     ]
]

So this is like the C++ concept, that the things that make it an iterator are about speaking GET and /NEXT. Your specific iterator here has a notion of events, but I don't think this fits all iteration scenarios...e.g. you may already have a defined object you're iterating.
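
Sketched in C++ terms (since that's the comparison being drawn), the idea is that generic code only needs two operations. Everything here is hypothetical: `Node`, `Step`, `Walker`, `for_each_item` and `collect_names_and_values` are names made up for the example, and `next()` / `get()` merely stand in for /NEXT and GET. It is an assumption-laden illustration, not how the framework is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node type: the thing being iterated.
struct Node {
    std::string name;
    std::string value;
};

// Hypothetical pairing of an iterator-specific event with a node.
struct Step {
    std::string event;
    Node node;
};

// Hypothetical walker: two standard operations plus one extra.
class Walker {
public:
    explicit Walker(std::vector<Step> steps) : steps_(std::move(steps)) {}

    // Standard: advance; true while a current item is available.
    bool next() {
        if (started_) { if (pos_ < steps_.size()) ++pos_; }
        else started_ = true;
        return pos_ < steps_.size();
    }

    // Standard: the current item (only valid after next() returned true).
    const Node& get() const {
        assert(started_ && pos_ < steps_.size());
        return steps_[pos_].node;
    }

    // Iterator-specific extra; generic code never touches it.
    const std::string& event() const {
        assert(started_ && pos_ < steps_.size());
        return steps_[pos_].event;
    }

private:
    std::vector<Step> steps_;
    std::size_t pos_ = 0;
    bool started_ = false;
};

// Generic loop: relies only on next() and get().
template <typename Iter, typename Fn>
void for_each_item(Iter& it, Fn fn) {
    while (it.next()) fn(it.get());
}

// Mirrors the COLLECT / FOR-EACH example: keep name, then value.
std::vector<std::string> collect_names_and_values(Walker walker) {
    std::vector<std::string> out;
    for_each_item(walker, [&out](const Node& node) {
        out.push_back(node.name);
        out.push_back(node.value);
    });
    return out;
}
```

Because `for_each_item` never mentions `event()`, the same loop would work for an iterator over an already-defined object that has no notion of events at all.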
