Reduce Bloat, Increase Wikification: Web Error Messages

hostilefork · April 5, 2025, 10:49am

We have a situation where giving descriptive errors with strings is often very possible. But each time you write a long string, that's taking up space:

if (attr == tail or not Is_Integer(attr))
    return Error_Bad_Value(attr);
if (*raw_addr != 0)
    return Error_User("FFI: duplicate raw memory");
*raw_addr = cast(uint_64_t, Cell_Int64(attr));
if (*raw_addr == 0)
    return Error_User("FFI: void pointer illegal for raw memory");

It's also hardcoding the message into the executable.

I've been leaning toward the idea that instead of strings embedded into the program or scripts, errors are just their symbol IDs.

if (attr == tail)
    return ERROR(REACHED_END);
if (not Is_Integer(attr))
    return ERROR(EXPECTED, CANON(INTEGER_X));
if (*raw_addr != 0)
    return ERROR_FFI(DUPLICATE_RAW_MEMORY);
*raw_addr = cast(uint_64_t, Cell_Int64(attr));
if (*raw_addr == 0)
    return ERROR_FFI(VOID_POINTER_RAW_MEMORY);

Some error IDs would be for built-ins, others would be extension specific.

But the idea here is that:

 ERROR(REACHED_END);
 => Make_Error(Canon_Symbol(SYM_REACHED_END))

The error creation can be variadic and take parameters. As it does today, debug builds would validate that the parameter count is right for the error.
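A minimal sketch of how a variadic ERROR() with a debug-only parameter count check could look. Everything here is an assumption for illustration: the Error struct, the SYM_* values, the expected_arg_count table, and Make_Error_Core are hypothetical, and plain strings stand in for the symbol cells that CANON() would produce.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

#define MAX_ERROR_ARGS 3

// Hypothetical symbol IDs; a real build would generate these.
enum { SYM_REACHED_END, SYM_EXPECTED, SYM_MAX };

typedef struct {
    int id;                            // one of the SYM_* values
    int num_args;                      // how many arguments were supplied
    const char *args[MAX_ERROR_ARGS];  // strings stand in for symbol cells
} Error;

// Hypothetical table: how many arguments each error ID expects.
static const int expected_arg_count[SYM_MAX] = {
    [SYM_REACHED_END] = 0,
    [SYM_EXPECTED] = 1,
};

// Takes the error ID followed by a NULL-terminated list of arguments.
static Error Make_Error_Core(int id, ...) {
    Error e = { id, 0, { NULL } };
    va_list va;
    va_start(va, id);
    const char *arg;
    // Stop at the sentinel, or when the fixed-size argument array is full.
    while (e.num_args < MAX_ERROR_ARGS
           && (arg = va_arg(va, const char *)) != NULL) {
        e.args[e.num_args++] = arg;
    }
    va_end(va);
    // Debug-only validation: assert() is compiled out when NDEBUG is defined.
    assert(id >= 0 && id < SYM_MAX && e.num_args == expected_arg_count[id]);
    return e;
}

// ERROR(NAME, args...) appends the NULL sentinel and prefixes SYM_ to NAME.
#define ERROR(...) ERROR_IMPL_(__VA_ARGS__, (const char *)NULL)
#define ERROR_IMPL_(name, ...) Make_Error_Core(SYM_##name, __VA_ARGS__)
```

With this, ERROR(REACHED_END) and ERROR(EXPECTED, "integer!") both compile, while ERROR(EXPECTED) would trip the assert in a debug build. The sentinel approach is just one way to count arguments in portable C; a preprocessor-based count would work too.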

So what you'd wind up getting from something like ERROR(EXPECTED, CANON(INTEGER_X)) would be:

== #[error! [
    id: 'expected
    arg1: 'integer!
    near: '[...code location...]
    where: '[...call stack...]
    file: %file-name.r
    line: 1020
]]

And if you didn't have an internet connection, the message would be something like:

** Error: [/expected integer!]
** Near: ...
** Where: ...
** File: %file-name.r
** Line: 1020
(i) Connect to the internet for more descriptive error messages
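As a rough sketch, the terse first line of that offline display can be produced from nothing but the ID and the arguments. Format_Terse_Error is a hypothetical helper (not an existing interpreter function), and it takes plain strings rather than real symbol cells.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

// Writes "** Error: [/id arg1 arg2 ...]" into buf.
// Returns the number of characters written, or -1 if buf is too small.
static int Format_Terse_Error(char *buf, size_t size, const char *id,
                              const char *const *args, int num_args) {
    assert(buf != NULL && size > 0 && id != NULL);
    int n = snprintf(buf, size, "** Error: [/%s", id);
    if (n < 0 || (size_t)n >= size)
        return -1;
    size_t used = (size_t)n;
    for (int i = 0; i < num_args; ++i) {
        // Each argument is separated from the previous item by one space.
        n = snprintf(buf + used, size - used, " %s", args[i]);
        if (n < 0 || (size_t)n >= size - used)
            return -1;
        used += (size_t)n;
    }
    n = snprintf(buf + used, size - used, "]");
    if (n < 0 || (size_t)n >= size - used)
        return -1;
    return (int)(used + (size_t)n);
}
```

Called with the ID "expected" and the single argument "integer!", this fills the buffer with the same `** Error: [/expected integer!]` line shown above.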

My Theory Is...

If you have an internet connection, then a message can be pulled from the network on demand for what should be displayed.
- Ideally you should also have an option to go to a wiki page where people write about things that may cause that error, what it means, and what to do about it.
If you don't have an internet connection, then you are likely operating in a restricted and spare environment of some kind, where the absence of a long and flowery message won't seem like that big a deal. Having the ID and the error arguments should be enough.

If you wanted to build an executable that pulled all the known error messages off the network (or from the local files that are used to produce answers on the network) and snapshotted them to ship in the EXE, you could do that. There'd be an option for it.
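One way the lookup order could work, sketched under assumptions: an optional snapshot compiled in behind a hypothetical EMBED_ERROR_MESSAGES build flag, then an on-demand fetch, then nothing (meaning the caller shows only the ID and arguments). The names Lookup_Error_Message, Message_Fetcher, and both fetchers are made up for illustration, and the message text is a placeholder rather than real wording.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *id;
    const char *message;
} Error_Message;

// Hypothetical build option: define EMBED_ERROR_MESSAGES to ship a snapshot.
#ifdef EMBED_ERROR_MESSAGES
static const Error_Message embedded_messages[] = {
    { "expected", "(placeholder text for the 'expected' error)" },
};
#define NUM_EMBEDDED \
    (sizeof(embedded_messages) / sizeof(embedded_messages[0]))
#endif

// A fetcher returns the message for an ID, or NULL if it can't be obtained.
typedef const char *(*Message_Fetcher)(const char *id);

// Stand-in for "no connection": never finds anything.
static const char *Offline_Fetcher(const char *id) {
    (void)id;
    return NULL;
}

// Stand-in for a network fetch: knows exactly one ID.
static const char *Stub_Fetcher(const char *id) {
    if (strcmp(id, "expected") == 0)
        return "(placeholder text fetched on demand)";
    return NULL;
}

// Tries the embedded snapshot first (if built in), then the fetcher.
// A NULL result means "display just the ID and arguments".
static const char *Lookup_Error_Message(const char *id, Message_Fetcher fetch) {
    assert(id != NULL);
#ifdef EMBED_ERROR_MESSAGES
    for (size_t i = 0; i < NUM_EMBEDDED; ++i) {
        if (strcmp(embedded_messages[i].id, id) == 0)
            return embedded_messages[i].message;
    }
#endif
    if (fetch != NULL)
        return fetch(id);
    return NULL;
}
```

Built without the flag, no message strings are compiled in at all, and the interpreter only carries the lookup logic.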

How To Turn Error IDs Into URLs?

This is something that's been nagging me for a while.

R3-Alpha's error table %errors.r has categories in it:

Throw: [...]
Note: [...]
Syntax: [...]
Script: [...]
Math: [...]
Access: [...]
Command: [...]
resv700: [...]
User: [...]
Internal: [...]

This was based on the idea that giving errors numbers was meaningful or useful. I don't think the numbering is meaningful, and the categories may not be either.

It seems to me that the only thing that would be useful would be some way of grouping errors together so that you'd be able to find them on a server.
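To make the question concrete, here is one possible shape for the mapping, purely as a sketch: a group name plus the error ID appended to a base URL, with anything outside the RFC 3986 unreserved set percent-encoded. The base URL, the idea of using the extension name as the group, and the Error_Id_To_Url helper are all assumptions; nothing above settles on a scheme.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

// RFC 3986 "unreserved" characters can appear in a URL path as-is.
static int Is_Unreserved(unsigned char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
        || (c >= '0' && c <= '9')
        || c == '-' || c == '.' || c == '_' || c == '~';
}

// Appends src at buf[*used], percent-encoding as needed.
// Returns 0 on success, -1 if there is not enough room.
static int Append_Encoded(char *buf, size_t size, size_t *used,
                          const char *src) {
    for (; *src != '\0'; ++src) {
        unsigned char c = (unsigned char)*src;
        size_t need = Is_Unreserved(c) ? 1 : 3;
        if (*used + need >= size)  // always leave room for the terminator
            return -1;
        if (need == 1) {
            buf[(*used)++] = (char)c;
        } else {
            snprintf(buf + *used, 4, "%%%02X", (unsigned)c);
            *used += 3;
        }
    }
    buf[*used] = '\0';
    return 0;
}

// Builds "<base>/<group>/<id>". Returns the length, or -1 if buf is too small.
static int Error_Id_To_Url(char *buf, size_t size, const char *base,
                           const char *group, const char *id) {
    assert(buf != NULL && size > 0 && base != NULL);
    assert(group != NULL && id != NULL);
    int n = snprintf(buf, size, "%s/", base);
    if (n < 0 || (size_t)n >= size)
        return -1;
    size_t used = (size_t)n;
    if (Append_Encoded(buf, size, &used, group) != 0)
        return -1;
    if (used + 1 >= size)
        return -1;
    buf[used++] = '/';
    buf[used] = '\0';
    if (Append_Encoded(buf, size, &used, id) != 0)
        return -1;
    return (int)used;
}
```

With a placeholder base of `https://example.org/errors`, the group `ffi`, and the ID `duplicate-raw-memory`, this yields `https://example.org/errors/ffi/duplicate-raw-memory`. An ID like `integer!` would come out as `integer%21`, which is one reason to think about which characters IDs are allowed to contain.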

This all requires some more thinking, but it's a direction I've been leaning in.

bradrn · April 5, 2025, 12:04pm

I strongly oppose this. It means that if the user ever loses their Internet connection, they would also lose their error messages. Except that’s exactly the time when error messages are most vital!

In particular, the assumption that having no connection means you are in a ‘restricted and spare environment’ is mistaken. An Internet connection can be lost for a variety of reasons, none of which necessarily has anything to do with a ‘restricted environment’.

hostilefork · April 5, 2025, 12:10pm

If people are concerned about this, then wherever you get the interpreter from should be able to give you the error messages as well, at that time.

Like I say, it can be a build option to pack things into the EXE (like picking which extensions to build in). I just think the core should be centered around not building it in.

In particular... the WebAssembly build on the web is generally going to be able to pull the error messages it needs on demand, so there's no reason for the core interpreter to come packed with a ton of strings embedded in it. Leaving them out means a faster download, and you're not going to encounter 99% of those errors in a given run.

The modularization I'm going after will be such that it won't have a console in it by default either--so you can embed the interpreter in a web page and use services from it, and then if you need to break into a console it can download the side module.

But again--with that--you can choose to build it in.

I'm just talking about decoupling things, and limiting what's actually required in the source to compile it. How people package it is up to them.