feat: add regex search operator #2458
let expr = match try_into_regex_function(expr, ctx)? {
    Ok(between) => return Ok(between),
    Err(expr) => expr,
};
Yes, unfortunately, this is checking each of the try_into_x one after another. It is actually my attempt at matching.
The good news is that it is necessary only for things that we specifically want to convert to sql_parser's AST or that need special handling. If it could be implemented with an s-string, it can go into std_impl.prql.
So this PR could be done by just implementing std.regex_search in std_impl.prql, if we had some mechanism for changing the impl depending on the dialect.
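For readers following along, here is a minimal, self-contained sketch of the try_into_x chaining described above. It is not the PRQL code: the Expr and Sql types, the function names "is_null" and "regex_search", and the error strings are all hypothetical stand-ins. The idea it illustrates is the nested Result: the outer layer carries hard errors, and the inner Err hands the untouched expression back so the next attempt can take ownership of it without cloning.

```rust
// Toy stand-ins for the compiler's types; every name here is hypothetical.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Call { name: String, args: Vec<Expr> },
    Ident(String),
}

#[derive(Debug, PartialEq)]
enum Sql {
    IsNull(String),
    Regex { target: String, pattern: String },
    Raw(String),
}

// Outer Result: a hard error. Inner Result: Ok(sql) when this attempt matched,
// Err(expr) when it did not and the expression is handed back unchanged.
type Attempt = Result<Result<Sql, Expr>, String>;

fn try_into_is_null(expr: Expr) -> Attempt {
    match expr {
        Expr::Call { name, args } if name == "is_null" => match args.as_slice() {
            [Expr::Ident(col)] => Ok(Ok(Sql::IsNull(col.clone()))),
            _ => Err("is_null expects exactly one column".to_string()),
        },
        other => Ok(Err(other)),
    }
}

fn try_into_regex(expr: Expr) -> Attempt {
    match expr {
        Expr::Call { name, args } if name == "regex_search" => match args.as_slice() {
            [Expr::Ident(target), Expr::Ident(pattern)] => Ok(Ok(Sql::Regex {
                target: target.clone(),
                pattern: pattern.clone(),
            })),
            _ => Err("regex_search expects exactly two arguments".to_string()),
        },
        other => Ok(Err(other)),
    }
}

// Each attempt either returns early or gives the expression back for the next one.
fn translate(expr: Expr) -> Result<Sql, String> {
    let expr = match try_into_is_null(expr)? {
        Ok(sql) => return Ok(sql),
        Err(expr) => expr,
    };
    let expr = match try_into_regex(expr)? {
        Ok(sql) => return Ok(sql),
        Err(expr) => expr,
    };
    // Fallback for anything no attempt claimed.
    Ok(Sql::Raw(format!("{:?}", expr)))
}
```

The cost of this shape is the one raised in the thread: every expression runs through each attempt in order until one claims it.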
OK, I wonder whether we've over-abstracted this, and there's simpler code, even if more verbose and a bit less performant.
e.g. looking at this for a few minutes (but this only does add, would need to be extended to other binary ops, and doesn't handle an incorrect number of args):
ExprKind::BuiltInFunction { name, .. } if name == "std.add" => {
    // Addition maps to the SQL `+` operator.
    let op = BinaryOperator::Plus;
    let strength = op.binding_strength();
    let [left, right] = unpack(expr, STD_ADD);
    let left = translate_operand(left, strength, !op.associates_left(), ctx)?;
    let right = translate_operand(right, strength, !op.associates_right(), ctx)?;
    sql_ast::Expr::BinaryOp { left, right, op }
}
I don't have much confidence here though — raising since I'm still trying to go through the code and see if the perspective of someone worse at writing code can help! (And I won't pursue further unless requested)
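To make the shape of that suggestion concrete, here is a small runnable toy of the same idea: one match arm resolves the operator, unpacks two operands, and parenthesizes them by binding strength. Everything in it is an assumption for illustration (the Expr, Op and Rendered types, the names "add", "sub" and "mul", and treating all three operators as left-associative); it does not use sql_parser's types or the PR's helpers.

```rust
// Hypothetical toy types; none of this is the PRQL or sql_parser API.
#[derive(Debug, Clone)]
enum Expr {
    Literal(i64),
    Call { name: String, args: Vec<Expr> },
}

#[derive(Debug, Clone, Copy)]
enum Op {
    Add,
    Sub,
    Mul,
}

impl Op {
    fn from_name(name: &str) -> Option<Op> {
        match name {
            "add" => Some(Op::Add),
            "sub" => Some(Op::Sub),
            "mul" => Some(Op::Mul),
            _ => None,
        }
    }
    fn symbol(self) -> &'static str {
        match self {
            Op::Add => "+",
            Op::Sub => "-",
            Op::Mul => "*",
        }
    }
    // Higher binds tighter.
    fn binding_strength(self) -> u8 {
        match self {
            Op::Add | Op::Sub => 1,
            Op::Mul => 2,
        }
    }
}

struct Rendered {
    text: String,
    strength: u8,
}

// Wrap an operand in parentheses when it binds more loosely than its parent,
// or equally loosely on the side where associativity would regroup it.
fn parenthesize(operand: Rendered, parent: u8, also_if_equal: bool) -> String {
    let needs = operand.strength < parent || (also_if_equal && operand.strength == parent);
    if needs { format!("({})", operand.text) } else { operand.text }
}

fn translate(expr: Expr) -> Result<Rendered, String> {
    match expr {
        Expr::Literal(n) => Ok(Rendered { text: n.to_string(), strength: u8::MAX }),
        Expr::Call { name, args } => {
            let op = Op::from_name(&name)
                .ok_or_else(|| format!("unknown function `{}`", name))?;
            // Return an error instead of panicking on a wrong argument count.
            let mut args = args.into_iter();
            let (left, right) = match (args.next(), args.next(), args.next()) {
                (Some(l), Some(r), None) => (l, r),
                _ => return Err(format!("`{}` expects exactly two arguments", name)),
            };
            let strength = op.binding_strength();
            // All toy operators are left-associative, so only the right side
            // is wrapped when strengths are equal.
            let left = parenthesize(translate(left)?, strength, false);
            let right = parenthesize(translate(right)?, strength, true);
            Ok(Rendered { text: format!("{} {} {}", left, op.symbol(), right), strength })
        }
    }
}

fn to_sql(expr: Expr) -> Result<String, String> {
    translate(expr).map(|r| r.text)
}

// Small constructors to keep call sites short.
fn call(name: &str, args: Vec<Expr>) -> Expr {
    Expr::Call { name: name.to_string(), args }
}

fn lit(n: i64) -> Expr {
    Expr::Literal(n)
}
```

For example, to_sql(call("sub", vec![lit(1), call("sub", vec![lit(2), lit(3)])])) keeps the grouping as 1 - (2 - 3). The trade-off against the helper-based version is that structural checks on the arguments have to be written inline in each arm.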
Yeah, that's more concise. My helper functions were intended for also checking the structure of arguments (see IS NULL), and doing arbitrary stuff without copying before deciding that the expr should be unpacked and consumed.
Handling the wrong number or type of args is nice to have, because there will be invalid RQs which we don't want to panic on.
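As an illustration of that point, here is one way an unpack helper could report a wrong argument count or type as an error rather than panicking. This is a sketch under assumptions, not the PR's helper: Arg, ArgError, unpack, expect_text and regex_args are all hypothetical names, and the only library feature relied on is the standard TryFrom conversion from Vec<T> to [T; N].

```rust
use std::convert::TryFrom;

// Hypothetical argument and error types for the sketch.
#[derive(Debug, Clone, PartialEq)]
enum Arg {
    Int(i64),
    Text(String),
}

#[derive(Debug, PartialEq)]
enum ArgError {
    WrongCount { func: String, expected: usize, found: usize },
    WrongType { func: String, position: usize },
}

// Convert the argument vector into a fixed-size array, or report the mismatch.
fn unpack<const N: usize>(func: &str, args: Vec<Arg>) -> Result<[Arg; N], ArgError> {
    let found = args.len();
    <[Arg; N]>::try_from(args).map_err(|_| ArgError::WrongCount {
        func: func.to_string(),
        expected: N,
        found,
    })
}

// Check that one argument is text, or report which position had the wrong type.
fn expect_text(func: &str, position: usize, arg: Arg) -> Result<String, ArgError> {
    match arg {
        Arg::Text(s) => Ok(s),
        _ => Err(ArgError::WrongType { func: func.to_string(), position }),
    }
}

// Example caller: two arguments, the second of which must be a text pattern.
fn regex_args(args: Vec<Arg>) -> Result<(Arg, String), ArgError> {
    let [target, pattern] = unpack::<2>("regex_search", args)?;
    let pattern = expect_text("regex_search", 1, pattern)?;
    Ok((target, pattern))
}
```

A caller can then surface ArgError as a normal compile error for an invalid RQ instead of aborting.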
I also wonder if it's more performant. Even if it's not allocating, it's calling many functions on each resolution.
(I realized this when I had https://github.com/PRQL/prql/pull/2458/files#diff-0a1f682019c767247e01027a9762cc4999669f393eb7971f095d81bea690163aR213-R219 before the let Some((decl, _)) = try_unpack(&expr, DECLS)? else { line, which panicked whenever mssql was running, even though it wasn't calling regex.)
That said, I'm often surprised how cheap everything is relative to allocating.
But then I'm also often reminded of how cheap perf is relative to code complexity :)
Yes, I like it.
Sensitive. There should be a way to make it insensitive, but I cannot think of one that I'd like.
Yes.
Postgres uses
Ready to merge! We can implement sqlite & mysql later; it requires a small refactoring since it's infix in those dialects.
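To show why an infix form needs different handling from a function-call form, here is a rough sketch of dialect-dependent rendering. It is not taken from the PR: the Dialect and RegexForm types and both functions are hypothetical, and the per-dialect spellings are assumptions based on each database's own documentation (a function call for BigQuery and DuckDB, an infix operator for Postgres, MySQL and SQLite). The pattern argument is assumed to be already-quoted SQL.

```rust
// Hypothetical dialect list for the sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Dialect {
    Postgres,
    MySql,
    SQLite,
    BigQuery,
    DuckDb,
}

// The two shapes a regex search can take in the generated SQL.
enum RegexForm {
    Function(&'static str),
    Infix(&'static str),
}

// Assumed spellings, taken from each database's documentation rather than the PR.
fn regex_form(dialect: Dialect) -> RegexForm {
    match dialect {
        Dialect::BigQuery => RegexForm::Function("REGEXP_CONTAINS"),
        Dialect::DuckDb => RegexForm::Function("REGEXP_MATCHES"),
        Dialect::Postgres => RegexForm::Infix("~"),
        Dialect::MySql | Dialect::SQLite => RegexForm::Infix("REGEXP"),
    }
}

// `target` and `pattern` are already-rendered SQL fragments.
fn render_regex_search(dialect: Dialect, target: &str, pattern: &str) -> String {
    match regex_form(dialect) {
        RegexForm::Function(name) => format!("{}({}, {})", name, target, pattern),
        RegexForm::Infix(op) => format!("{} {} {}", target, op, pattern),
    }
}
```

With these assumptions, render_regex_search(Dialect::MySql, "name", "'^a'") gives name REGEXP '^a', while the BigQuery variant gives REGEXP_CONTAINS(name, '^a'). A real implementation would also have to consider operator precedence for the infix forms.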
I have some reservations about the language design of this, but the implementation is solid, so let's merge it.
OK, we can revert as ever!
Todo before merge:
Is ~= a good operator?
=
Other questions:
"bob\smarley" doesn't need the extra \?
Lte throughout; some parts were v clear, some I somewhat blindly copied.
Err for each one that fails before moving onto the next (https://github.com/PRQL/prql/compare/main...max-sixty:regex-search?expand=1#diff-0a1f682019c767247e01027a9762cc4999669f393eb7971f095d81bea690163aR96)?