Fix DateTime with zero microseconds encoding bug #138
Conversation
👋 @vlymar Thank you for the fix! I completely missed it! The tests are failing due to formatting checks. Elixir's `DateTime` params with zero microseconds get encoded as a value like "1.6095024e9", which ClickHouse is unable to parse as a DateTime.
```diff
  utc_with_zero_microsec = ~U[2021-01-01 12:00:00.000000Z]
- naive_with_zero_microsec = utc_with_zero_microsec |> DateTime.shift_zone!(Ch.Test.clickhouse_tz(conn)) |> DateTime.to_naive()
+
+ naive_with_zero_microsec =
+   utc_with_zero_microsec
+   |> DateTime.shift_zone!(Ch.Test.clickhouse_tz(conn))
+   |> DateTime.to_naive()
+
  assert Ch.query!(conn, "select {$0:DateTime64(6)} as d, toString(d)", [utc_with_zero_microsec]).rows ==
           [[~N[2021-01-01 12:00:00.000000], to_string(naive_with_zero_microsec)]]
```

I opened a PR to handle that (we don't need that long line) and to separate the new test: knocklabs#2
Maybe we can round when microseconds are zero? That is, just return seconds instead of doing the float encoding:

```elixir
defp encode_param(%DateTime{time_zone: "Etc/UTC", microsecond: microsecond} = dt) do
  case microsecond do
    # note `val > 0`
    {val, precision} when val > 0 and precision > 0 ->
      size = round(:math.pow(10, precision))
      unix = DateTime.to_unix(dt, size)
      # to make the encoding clear, let's just use integer arithmetic :)
      seconds = div(unix, size)
      fractional = rem(unix, size)

      IO.iodata_to_binary([
        Integer.to_string(seconds),
        ?.,
        # pad with "0" (pad_leading defaults to spaces)
        String.pad_leading(Integer.to_string(fractional), precision, "0")
      ])

    _ ->
      dt |> DateTime.to_unix(:second) |> Integer.to_string()
  end
end
```
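The integer arithmetic in the non-zero branch can be sanity-checked with a small standalone sketch. This is a hypothetical Python port of the idea (not part of the driver): given unix seconds, a microsecond value, and a precision, it emits either a plain integer string or a zero-padded fixed-point decimal, never a float.

```python
def encode_datetime(unix_seconds: int, microsecond: int, precision: int) -> str:
    """Hypothetical Python port of the Elixir encode_param sketch above."""
    if microsecond > 0 and precision > 0:
        size = 10 ** precision                    # e.g. 1_000_000 for precision 6
        # total ticks at the target precision
        unix = unix_seconds * size + (microsecond * size) // 1_000_000
        seconds, fractional = divmod(unix, size)  # div/rem in the Elixir version
        # zero-pad the fractional part so e.g. 1230 at precision 6 prints as "001230"
        return f"{seconds}.{str(fractional).rjust(precision, '0')}"
    # zero microseconds: fall back to plain integer seconds
    return str(unix_seconds)

print(encode_datetime(1_609_502_400, 0, 6))        # -> 1609502400
print(encode_datetime(1_609_502_400, 123_456, 6))  # -> 1609502400.123456
```

Because every branch goes through integer arithmetic and string padding, there is no path that could produce scientific notation.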
* some suggestions for tests
* changelog
Oops, the new test actually needs that long line. I can add it and merge tomorrow.
That works too. I actually started with that approach, then found… Your new encoding approach is pretty cool too, though it did take me a few minutes to wrap my head around what…
Seems like we'll need to round to seconds when microseconds are zero, as ClickHouse cannot parse…
The full test:

```elixir
test "DateTime" do
  q =
    from v in fragment(
           # unix(~N[2023-12-16 08:55:47]) = 1702716947
           "values ('d DateTime', ('1970-01-01 00:00:00'), (1702716947), ('2106-02-07 06:28:15'))"
         ),
         select: v.d

  assert TestRepo.one!(where(q, d: ^~N[1970-01-01 00:00:00])) == ~N[1970-01-01 00:00:00]
  assert TestRepo.one!(where(q, d: ^~N[1970-01-01 00:00:00.000])) == ~N[1970-01-01 00:00:00]
  assert TestRepo.one!(where(q, d: ^~U[1970-01-01 00:00:00Z])) == ~N[1970-01-01 00:00:00]
  assert TestRepo.one!(where(q, d: ^~U[1970-01-01 00:00:00.000Z])) == ~N[1970-01-01 00:00:00]

  assert TestRepo.one!(where(q, d: ^1_702_716_947)) == ~N[2023-12-16 08:55:47]
  assert TestRepo.one!(where(q, d: ^1_702_716_947.0000)) == ~N[2023-12-16 08:55:47]
  assert TestRepo.one!(where(q, d: ^~N[2023-12-16 08:55:47])) == ~N[2023-12-16 08:55:47]
  assert TestRepo.one!(where(q, d: ^~U[2023-12-16 08:55:47Z])) == ~N[1970-01-01 00:00:00]
  assert TestRepo.one!(where(q, d: ^~U[2023-12-16 08:55:47.000Z])) == ~N[1970-01-01 00:00:00]

  assert TestRepo.one!(where(q, d: ^~N[2106-02-07 06:28:15])) == ~N[2106-02-07 06:28:15]
  assert TestRepo.one!(where(q, d: ^~N[2106-02-07 06:28:15.000000])) == ~N[2106-02-07 06:28:15]
  assert TestRepo.one!(where(q, d: ^~U[2106-02-07 06:28:15Z])) == ~N[2106-02-07 06:28:15]
  assert TestRepo.one!(where(q, d: ^~U[2106-02-07 06:28:15.000000Z])) == ~N[2106-02-07 06:28:15]

  assert TestRepo.one!(where(q, [v], v.d < ^~N[1970-01-01 00:00:00.123456])) ==
           ~N[1970-01-01 00:00:00]

  assert TestRepo.one!(where(q, [v], v.d < ^~U[1970-01-01 00:00:00.123456Z])) ==
           ~N[1970-01-01 00:00:00]

  assert TestRepo.all(where(q, [v], v.d > ^~N[1970-01-01 00:00:00.123456])) ==
           [~N[2023-12-16 08:55:47], ~N[2106-02-07 06:28:15]]

  assert TestRepo.all(where(q, [v], v.d > ^~U[1970-01-01 00:00:00.123456Z])) ==
           [~N[2023-12-16 08:55:47], ~N[2106-02-07 06:28:15]]

  assert TestRepo.all(where(q, [v], v.d < ^1_702_716_947.123)) ==
           [~N[1970-01-01 00:00:00], ~N[2023-12-16 08:55:47]]
end
```

From this PR: plausible/ecto_ch#143. Although ClickHouse can turn…
Nice find @ruslandoga. I've updated this PR with your proposed implementation. It may be worth filing a bug report with ClickHouse?
Thank you! knocklabs#3 should fix the CI. I'll check if the latest ClickHouse version also fails to parse…
The later ClickHouse versions don't have that problem.

```
$ docker run -d -p 8123:8123 clickhouse/clickhouse-server
$ iex

iex(1)> {:ok, pid} = Ch.start_link
{:ok, #PID<0.210.0>}
iex(2)> Ch.query(pid, "select {dt:DateTime64}", %{"dt" => 0.123})
{:ok, %Ch.Result{command: :select, num_rows: 1, rows: [[~N[1970-01-01 00:00:00.123000]]]}}
iex(3)> Ch.query(pid, "select version()")
{:ok, %Ch.Result{command: :select, num_rows: 1, rows: [["23.11.2.11"]]}}
```

In the CI we use…
@ruslandoga any other feedback on this PR?

@vlymar no :) All's good! I just forgot to merge it...

I've published the fix in v0.2.2. Thank you!

Thanks @ruslandoga, happy new year!
Hi, we've been adopting plausible's ch and ecto_ch packages and are super grateful for your open source work!
I noticed a bug where queries with Elixir `DateTime` params that had microsecond precision with zero microseconds (e.g. `~U[2023-12-13 00:00:00.000000Z]`) would fail, because the ch driver would encode them as a value like `1.6095024e9`, which ClickHouse cannot parse. `Float.to_string` returns the "shortest representation" possible, which in some cases is scientific notation (see the docs). There are a few ways to solve this, but I chose to use `:erlang.float_to_binary` to prevent the scientific-notation encoding. Without the patch, the test case I added fails.

I'm not an expert in ClickHouse or Elixir, so please let me know if you have any feedback or concerns! I'm happy to make any changes. Or feel free to make any changes yourself.
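Elixir's `Float.to_string` picks the shortest round-trip representation, which for large timestamps can be scientific notation. Python's general float format shows the same class of pitfall; this is an analogue for illustration only, not the Elixir behavior:

```python
ts = 1_609_502_400.0  # ~U[2021-01-01 12:00:00Z] as unix seconds

# General formatting falls back to scientific notation for large values,
# the kind of string ClickHouse's DateTime parser rejects:
print(f"{ts:g}")    # -> 1.6095e+09

# Fixed-point formatting always yields a parseable decimal:
print(f"{ts:.6f}")  # -> 1609502400.000000
```

The fix in this PR takes the second route: force a fixed, non-scientific textual form for the timestamp before it reaches ClickHouse.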
Thanks!