Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dialect: XTDB #521

Open
seancorfield opened this issue Jan 2, 2024 · 10 comments
Open

Dialect: XTDB #521

seancorfield opened this issue Jan 2, 2024 · 10 comments
Assignees
Labels
needs analysis I need to think about this!

Comments

@seancorfield
Copy link
Owner

Would it be worth having an :xtdb dialect that had :col-fn to turn / into $?

What about :foo/bar.quux-wibble? (or :foo.bar/quux-wibble) XTDB's docs suggest that becomes foo$bar$quux_wibble but HoneySQL supports namespace-qualifiers as tablenames, and also splits on dots so that would be "foo"."bar"."quux_wibble" (or "foo.bar"."quux_wibble" which would be an illegal SQL name) as things stand right now -- how would we tell the difference between something like table.the$key and table$the$key since the mapping (from / and . to $) is lossy?

user=> (sql/format :foo.bar/quux-wibble)
["\"foo.bar\".quux_wibble"] ; smart quoting
user=> (sql/format :foo.bar/quux-wibble {:quoted true})
["\"foo.bar\".\"quux-wibble\""] ; all quoting -> preserves -
user=> (sql/format :foo.bar/quux-wibble {:quoted true :quoted-snake true})
["\"foo.bar\".\"quux_wibble\""] ; all quoting -> maps - to _
user=> (sql/format :foo/foo.bar/quux-wibble {:quoted true :quoted-snake true})
["\"foo\".\"foo.bar/quux_wibble\""] ; namespace is up to first /
user=>
@seancorfield
Copy link
Owner Author

See also seancorfield/next.jdbc.xt#5 which is the reverse mapping.

@jarohen
Copy link

jarohen commented Jan 4, 2024

seancorfield added the needs analysis label

Heh, there's an understatement 😅 Yeah, table/column name normalisation has caused a fair bit of back-n-forth for us (and indeed, we're also still open to changes in XT). FWIW, our main aims/constraints were:

  • Complying with the SQL standard on identifier case-sensitivity and available punctuation.
  • Clojure users being able to use namespace-qualified table and column names as idiomatically as possible. Where possible, these are round-tripped correctly, although we were prepared to accept some lossiness in (at least, what we thought were) rarer cases.
  • SQL and non-Clojure users (incl. JSON) being able to access tables/columns inserted from Clojure (and vice versa) using casing/punctuation idiomatic to the individual platforms (i.e. preferably without needing to escape in common cases).

Aware that this has created quite a needle-threading exercise - would be interested to hear opinions on either the choice of constraints (which ones are more/less important, or any we've missed), or the approach we went with 🙂

@seancorfield
Copy link
Owner Author

@jarohen Thanks for the commentary/background. One of the big issues from my specific point of view (HoneySQL, next.jdbc) is that there's a lot of code out there now assuming :table/column -- because that's what next.jdbc produces by default and that's how HoneySQL behaves (by default, at least).

Being confronted with XTDB's SQL dialect where :some/thing is "just" a column name (mapped to some$thing) and with no real way to get table names to post-process column names (to match next.jdbc's default behavior), it's a bit of a puzzle as to what to do.

BTW, if you quote (strop) the names in SQL, XTDB accepts "some/thing" as a quoted column name and turns it into "some$thing" under the hood -- I was a bit surprised it accepted such column names at first: I hadn't quite made the connection with the mapping of / to $... and then was even more surprised when I saw that . also mapped to $ and that's when I realized it was a lossy mapping.

See also seancorfield/next-jdbc#269 which is essentially the same issue for the "friendly SQL functions" in next.jdbc which perform a subset of "HoneySQL-lite" work.

@jarohen
Copy link

jarohen commented Jan 9, 2024

Thanks Sean, noted 🙏 I don't have any immediate answers/suggestions for this, I'm afraid, will need some consideration. We'll keep this card on our issue board too, keep it fresh in our minds.

James

@seancorfield
Copy link
Owner Author

seancorfield/next.jdbc.xt#6 has additional commentary on a recent change in XTDB in this area but I wanted to highlight it in an open issue here: namely that SQL operations now return names containing / and - which don't round-trip in either HoneySQL or next.jdbc so the XDTB version of my usermanager example app is broken with no good way to fix it -- and data from XTDB SQL operations cannot be used as-is, even with forced quoting (because :table/column is a long-standing convention for both libraries and :xt/id breaks that).

@seancorfield
Copy link
Owner Author

Round-tripping has been restored in next.jdbc.xt so I'm going to stick with :xt$id and not attempt anything fancy in HoneySQL.

@seancorfield
Copy link
Owner Author

Tentative dialect definition:

               :xtdb      {:quote    #(strop \" % \")
                           :col-fn   #(if (keyword? %) (subs (str %) 1) (str %))
                           :parts-fn #(str/split % #"\.")}

If seancorfield/next-jdbc#269 goes ahead, HoneySQL should get an XTDB dialect.

@refset
Copy link

refset commented Feb 27, 2024

In case it helps/hinders, note that we are planning to soon support unqualified columns in XT2 also, see xtdb/xtdb#2524

@seancorfield
Copy link
Owner Author

For now, I'm working on the assumption that :foo/bar.quux/wibble would mean table foo/bar and column quux/wibble in this proposed XTDB dialect -- which isn't how other dialects work (although the NRQL dialect created a precedent for "non-standard" partitioning of names like this).

For an unqualified name, :quux/wibble would mean column quux/wibble preserving the / that XTQL allows -- but it would mean that you couldn't have dots (.) in table names or column names -- something has to be deemed a table.column delimiter and since xt/id etc exist as column names, it pretty much has to mean dot delimits table and column.

Note that :table.column is already the default in HoneySQL -- with :table/column existing as a convenience for working with next.jdbc and qualified column names. If you're working with next.jdbc.xt instead, right now it returns the $ form but if I tackle that next.jdbc issue linked above, it will be possible (and probably less confusing) to go back to the / form native to XTQL :)

@seancorfield
Copy link
Owner Author

I went ahead and implemented the next.jdbc.sql (builder) change to provide a way to allow qualified keywords to round-trip as table or column names with / in them -- or to provide your own mapping from foo/bar to a string such as foo$bar -- since that was the blocker for next.jdbc.xt in that situation.

I'll have to figure out what to do in HoneySQL to "match" that sort of behavior (so I'm still contemplating an XTDB dialect, esp. now you have a bunch of SQL extensions -- see #532 for some of that).

Accepting unqualified column names in SQL doesn't really make much difference, since that wasn't where the conflict was. Clojure users of XTDB are still going to assume :foo/bar is the same as foo$bar and will want to use it in HoneySQL when creating SQL for XTDB.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs analysis I need to think about this!
Projects
None yet
Development

No branches or pull requests

3 participants