One interesting file format is that of Xara (XAR) vector images. A XAR file consists of an eight-byte header followed by a collection of records. Each record begins with two little-endian 32-bit integers giving the record type and size, followed by the record content. That sounds easy to traverse, but there is a wrinkle: one record type indicates that the subsequent records are encoded in a single Deflate stream. The compressed section terminates with both the end of the Deflate stream and a record type that marks the end of the compressed section.
This is a simplified way of looking at it (each tag representing a whole record):
---- regular byte stream ----
<8-byte header>
<title>
<metadata>
<start-compressed-section>
---- deflate compressed byte stream ----
<thing>
<thing>
...
<end-compressed-section>
---- regular byte stream ----
<thing>
<thing>
---- eof ----
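As a rough illustration of the record layout described above, here is a minimal Python sketch (not Xara's own code) that walks a run of uncompressed records. The function name `iter_records`, the placeholder header bytes, and the record type numbers are all made up for the example.

```python
import io
import struct


def iter_records(stream):
    """Yield (type, payload) pairs from a stream of uncompressed records."""
    while True:
        head = stream.read(8)
        if not head:
            return  # clean end of input
        if len(head) < 8:
            raise ValueError("truncated record header")
        # Two little-endian unsigned 32-bit integers: record type, then size.
        rec_type, size = struct.unpack("<II", head)
        payload = stream.read(size)
        if len(payload) < size:
            raise ValueError("truncated record payload")
        yield rec_type, payload


# Made-up sample: 8 placeholder header bytes, then two records.
sample = (
    b"HEADER00"
    + struct.pack("<II", 1, 3) + b"abc"
    + struct.pack("<II", 2, 0)
)
stream = io.BytesIO(sample)
stream.read(8)  # skip the eight-byte header
records = list(iter_records(stream))
```

A flat loop like this only works while every record comes from the same byte source, which is exactly what the compressed section breaks.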
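This kind of mid-stream switch could be handled with filters: the record loop reads from whichever source is current, and a Deflate filter over the file is swapped in for the duration of the compressed section.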
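; Wrap the raw file in a filter; this remains the underlying source throughout
file-content: make filter! [
    source: open/direct %image.xar
]
content: file-content
consume content #{... magic number ...}
until [
    type: consume content 'unsigned-32-le
    size: consume content 'unsigned-32-le
    switch type [
        types/start-compressed-section [
            ; From here on, read records through a Deflate filter over the file
            content: make filter! [
                type: 'deflate
                source: file-content
            ]
        ]
        types/end-compressed-section [
            ; The Deflate stream must be exhausted at this point
            assert [
                tail? content
            ]
            ; Switch back to reading directly from the file
            content: file-content
        ]
    ]
    ... record dispatcher ...
    type == types/end-of-file
]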
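For comparison, the same source-switching idea can be sketched in Python with the standard `zlib` module. This only approximates the filter concept and is not the proposed `filter!` API. It assumes the compressed section is a raw Deflate stream with no zlib header and that the underlying source is seekable. The record type codes and the names `DeflateReader`, `read_records`, and `record` are hypothetical.

```python
import io
import struct
import zlib

START_COMPRESSED, END_COMPRESSED, END_OF_FILE = 1, 2, 3  # made-up type codes


class DeflateReader:
    """Read decompressed bytes from a Deflate stream embedded in `source`."""

    def __init__(self, source, chunk_size=4096):
        self.source = source
        self.chunk_size = chunk_size
        self.inflater = zlib.decompressobj(-15)  # assumption: raw Deflate
        self.buffer = b""

    def _fill(self):
        """Decompress one more chunk; return False when no more can arrive."""
        if self.inflater.eof:
            return False
        chunk = self.source.read(self.chunk_size)
        if not chunk:
            return False
        self.buffer += self.inflater.decompress(chunk)
        if self.inflater.eof:
            # Hand back the bytes read past the end of the Deflate stream.
            self.source.seek(-len(self.inflater.unused_data), io.SEEK_CUR)
        return True

    def read(self, size):
        while len(self.buffer) < size and self._fill():
            pass
        data, self.buffer = self.buffer[:size], self.buffer[size:]
        return data

    def at_end(self):
        while not self.buffer and self._fill():
            pass
        return not self.buffer


def read_records(file_content):
    """Collect (type, payload) pairs, switching sources at section markers."""
    content = file_content
    records = []
    while True:
        head = content.read(8)
        if len(head) < 8:
            raise ValueError("truncated record header")
        rec_type, size = struct.unpack("<II", head)
        payload = content.read(size)
        if rec_type == START_COMPRESSED:
            content = DeflateReader(file_content)
        elif rec_type == END_COMPRESSED:
            if not isinstance(content, DeflateReader) or not content.at_end():
                raise ValueError("misplaced end of compressed section")
            content = file_content
        records.append((rec_type, payload))
        if rec_type == END_OF_FILE:
            return records


def record(rec_type, payload=b""):
    return struct.pack("<II", rec_type, len(payload)) + payload


# Build a made-up file: header, compressed section, one plain record, EOF.
deflater = zlib.compressobj(9, zlib.DEFLATED, -15)
inner = record(10, b"thing one") + record(10, b"thing two") + record(END_COMPRESSED)
sample = (
    b"HEADER00"
    + record(START_COMPRESSED)
    + deflater.compress(inner) + deflater.flush()
    + record(10, b"plain thing")
    + record(END_OF_FILE)
)
file_content = io.BytesIO(sample)
file_content.read(8)  # skip the eight-byte header
records = read_records(file_content)
```

The seek after the stream ends is what lets the plain record loop resume at the right byte. A non-seekable source would instead need the leftover bytes pushed back into a buffer in front of it.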