Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CALCITE-6527] Add DATE_ADD function (enabled in Spark library) #3969

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

caicancai
Copy link
Member

@caicancai caicancai commented Sep 15, 2024

The following is my recent review of why the BC date is equal to 0000.
In calcite, the date function will go through a series of conversions, such as the bigquery date_add function will be converted to DATETIME_PLUS later,and the current adapted DATE_ADD is no exception.

After being converted to DATETIME_PLUS, it seems that it will be converted to PLUS operator operation later, and finally enter RexlmpTable.DatetimeArithmeticImplementor to perform parameter type conversion and partial calculation, return normalize(typeName, Expressions.add(trop0, trop1)). Of course, there may be higher levels of abstraction and conversion later, but unfortunately I haven't found it yet.

In this process, I found that calcite did not restrict the result set of the date function to the date BC date and 0000. I raised jira and emailed to seek answers, there is Julian's explanation in jira, and calcite has its own DATE data rules.

Why 0000 and BC dates are not allowed as parameters? This is because calcite has already made restrictions on dates when parsing.

@@ -2150,8 +2152,11 @@ private static class TimestampAddConvertlet implements SqlRexConvertlet {
switch (call.operandCount()) {
case 2:
// Oracle-style 'ADD_MONTHS(date, integer months)'
if (call.getOperator() == SqlLibraryOperators.ADD_MONTHS) {
qualifier = new SqlIntervalQualifier(TimeUnit.MONTH, null, SqlParserPos.ZERO);
if (call.getOperator() == SqlLibraryOperators.ADD_MONTHS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use another condition to do this. Don't need to merge it.

Copy link
Member Author

@caicancai caicancai Sep 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optimized the code to make it more readable. Do you think it's acceptable?

f.checkScalar("date_add(timestamp '2016-02-22 13:00:01', '-2.0')",
"2016-02-20",
"DATE NOT NULL");
f.checkScalar("date_add(timestamp '2016-02-22 13:00:01', -2)",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add test like:

spark-sql> select date_add('2016-02-22 13:00:01', -2);
2016-02-20
spark-sql> select date_add('13:00:01', -2);
NULL
Time taken: 0.195 seconds, Fetched 1 row(s)
spark-sql> select date_add('1', -2);
NULL
spark-sql> select date_add(1, -2);
Error in query: cannot resolve 'date_add(1, -2)' due to data type mismatch: argument 1 requires date type, however, '1' is of int type.; line 1 pos 7;
'Project [unresolvedalias(date_add(1, -2), None)]
+- OneRowRelation

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@NobiGo Calcite has its own date type rules, which are different from Spark. I am wondering whether it is necessary to return null in select date_add('13:00:01', -2)🤔.

The dataricks documentation states that only date expressions are accepted,https://docs.databricks.com/en/sql/language-manual/functions/date_add.html.

I think you can refer to the NVL2 function I adapted before, https://issues.apache.org/jira/browse/CALCITE-6397

@caicancai
Copy link
Member Author

@NobiGo no other comment. squashed

@@ -2804,6 +2804,7 @@ In the following:
| b s | DATE_FROM_UNIX_DATE(integer) | Returns the DATE that is *integer* days after 1970-01-01
| p r | DATE_PART(timeUnit, datetime) | Equivalent to `EXTRACT(timeUnit FROM datetime)`
| b | DATE_ADD(date, interval) | Returns the DATE value that occurs *interval* after *date*
| s | DATE_ADD(date, num_days) | Returns the DATE that is *num_days* after *date*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use camelCase for arguments, not snake_case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@caicancai
Copy link
Member Author

@julianhyde @NobiGo Thanks for your review

Copy link

sonarcloud bot commented Sep 20, 2024

@NobiGo NobiGo added the LGTM-will-merge-soon Overall PR looks OK. Only minor things left. label Sep 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
LGTM-will-merge-soon Overall PR looks OK. Only minor things left.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants