Rethinking the ODBC Dialect: Inline Parameters

Graham's PatientDB project uses a fair amount of SQL, and he decided he wanted to use this dialect feature.

Most obvious problem: Not all SQL is LOAD-able

There are syntaxes for types that involve a WORD! followed by a GROUP! with no space between, like VARCHAR(255). (for instance)

We're in this unhappy situation of being able to LOAD most of SQL, but missing just enough such that you have to fall back on using strings here and there.

Parameter Substitution Doesn't Work On Table Names

I did not know this--but you can't use the ? substitution technique on table names:

https://stackoverflow.com/q/1208442/

So that puts you back in the situation of having a name in a variable that you know is supposed to be a table name, but if spliced might be interpreted as SQL code...leading to bugs or a way to do an injection attack.

To deal with this I made it so that tied blocks ($[...]) have a purpose in splicing string content into the query. So you can say SELECT * FROM $[table-name]. But if table-name turned out to be "patients; DROP TABLE patients" you could lose a table...so this needs to be used with care when the strings are coming from random places.

Strings Are Conflicted In Purpose

On the one hand, SQL needs string literals. On the other hand, we need a way to represent patterns that aren't LOAD-able. So which of these two purposes do we apply strings for?

It's annoying to have to write:

SELECT * FROM patients WHERE medication = -["ASPIRIN"]-;

But if we render "ASPIRIN" as "ASPIRIN" in the SQL statement--with the quotes--how do you get things like VARCHAR(20), without the quotes?

We could pick another string type for SQL code splices, like TAG! for instance.

    CREATE TABLE vaccines (
        id INTEGER GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
        created TIMESTAMP DEFAULT current_timestamp,
        name <VARCHAR(128)>,  ; Weird, but...
        vaxdate DATE
    )

We can't use TUPLE! because table.field is a syntax that actually gets used, so we can't moosh together VARCHAR.(128) as VARCHAR(128). Maybe we could, but just in the cases of a GROUP! on the right? :frowning:

Or VARCHAR/128 could render as VARCHAR(128).

This really does make me struggle with whether to say we need the "FUSED!" proposal. But that's still not going to get us there.

No matter what, we won't get 100% SQL compatibility unless we use strings. We might be able to cook up some kind of interpolation, but that runs up against the problem of needing to be able to fully and correctly parse SQL in order to be sure that the escaping isn't done in places it shouldn't...a tougher task.

Would Like This Unified With QUERY

I've mentioned that if we're going to have some kind of SQL dialect thing, we should make it all consistent.

But here we've got a situation where we are blindly passing through strings. We don't know what's a keyword and what isn't. So we couldn't enforce a rule like saying "field names must be QUOTED!" if we wanted to.

...but at least we have COMMA!. The comma data type has really been a winner, and I can't think of anytime I've felt a regret about its addition.

Ultimately, The Good And The Bad

The GOOD is just how easy it is to make such a dialect.

The BAD is that the devil is in the details. When you can't use the dialect you're back to using strings or some method of putting strings into the code. And SQL already has a need for the behavior of TEXT! strings.

All this is just real-world examination of why the ideals of Rebol don't play out so well in a world where Rebol doesn't control everything.

Text files are too saturated to be able to represent such composite interests; a graph structure and editor are needed to really pull off this kind of language intermingling.

3 Likes