Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: query handler #951

Closed
wants to merge 11 commits into from

Conversation

waynexia
Copy link
Member

@waynexia waynexia commented Feb 7, 2023

I hereby agree to the terms of the GreptimeDB CLA

What's changed and what's your intention?

This PR refactors the SqlQueryHandler trait, and serval changes around it. In detail, it

  • Change SqlQueryHandler to QueryHandler. Because promql is also handled by it.
  • Add QueryLanguage enum, to represent both SQL and PromQL
  • use rustlint flag to config lints. It now contains clippy denied warnings and overrides a false positive rustc warning.
  • Change do_xxx in SqlQueryHandler to xxx, to align the naming rule with other places (do_xxx is often left for implementation's inner method).
  • statement_query now accepts QueryStatement rather than Statement, which is only for SQL.
  • SqlQueryInterceptor now accepts QueryLanguage/QueryStatement, for the same reason. cc @sunng87
  • Split the single query and multiple query methods. Like parse&parse_multiple, query/query_multiple. To reduce the redundant multiple statements handle logic, and only limit this behavior in the implementation of QueryHandler. cc @sunng87
  • Remove the type parameter Error in QueryHandler. Now it only returns servers::error, in which sub-crate the QueryHandler is defined. This place might not be appropriate, we may need to think about moving QueryHandler to another place. cc @MichaelScofield
  • Provide default implementations for methods query and query_multiple. Now the only two required methods for QueryHandler are statement_query and is_schema_valid. BTW, I doubt if QueryHandler is a good place for is_schema_valid?

The commits are organized by content. It might be easier to review by commits.

Checklist

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.

Refer to a related PR or issue link (optional)

Follows #924

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
and change its parameter from Statement to QueryStatement

Add parse_multiple method to QueryLanguageParser

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
new trait method query and default implementation

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
add query and query_multiple, with default implementations

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
@waynexia waynexia changed the title Refactor query handler refactor: query handler Feb 7, 2023
@codecov
Copy link

codecov bot commented Feb 7, 2023

Codecov Report

Merging #951 (648bba9) into develop (ecadbc1) will decrease coverage by 0.10%.
The diff coverage is 68.00%.

@@             Coverage Diff             @@
##           develop     #951      +/-   ##
===========================================
- Coverage    85.74%   85.65%   -0.10%     
===========================================
  Files          444      445       +1     
  Lines        59828    59953     +125     
===========================================
+ Hits         51301    51350      +49     
- Misses        8527     8603      +76     
Flag Coverage Δ
rust 85.65% <68.00%> (-0.10%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/datanode/src/error.rs 65.62% <0.00%> (-0.70%) ⬇️
src/datanode/src/server.rs 0.00% <0.00%> (ø)
src/frontend/src/instance/standalone.rs 30.00% <0.00%> (-31.71%) ⬇️
src/frontend/src/server.rs 0.00% <0.00%> (ø)
src/servers/src/postgres.rs 100.00% <ø> (ø)
src/servers/src/http/handler.rs 46.15% <37.50%> (-1.68%) ⬇️
src/frontend/src/error.rs 14.28% <40.00%> (+3.17%) ⬆️
src/servers/src/http.rs 78.45% <50.00%> (+1.85%) ⬆️
src/datanode/src/instance/sql.rs 76.06% <52.00%> (+2.10%) ⬆️
src/frontend/src/instance/distributed.rs 86.51% <52.94%> (-1.60%) ⬇️
... and 39 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
@waynexia waynexia requested review from MichaelScofield and sunng87 and removed request for MichaelScofield February 7, 2023 08:09
@MichaelScofield
Copy link
Collaborator

Why remove the associate type Error in query handler? Then you have to write map_err(BoxedError::new) everywhere in implementors, which is not good I think.

@waynexia
Copy link
Member Author

waynexia commented Feb 7, 2023

Why remove the associate type Error in query handler? Then you have to write map_err(BoxedError::new) everywhere in implementors, which is not good I think.

Glad you asked @MichaelScofield. A concrete trait like QueryHandler has clear input and output type, as well as its behavior. Thus the error type should also be decided rather than a parameter. It's the implementation's duty to return a clear error type, which is servers::Error here. An example in the standard library is Read trait. It requires the implementation to return std::io::Result rather than their own. This makes that trait generic and easy to use.

But I do think return servers::Error is not good enough. It mixes too many error variants.

@MichaelScofield
Copy link
Collaborator

@waynexia servers::Errors are better confined to errors related to server implementations only, not to be mixed with "query handler"s' results. There are too many servers::Errors now. I'd prefer not to write that many map_err(BoxedError::new)s.

@waynexia
Copy link
Member Author

waynexia commented Feb 7, 2023

servers::Errors are better confined to errors related to server implementations only

I agree, so I'm thinking of moving QueryHandler to another place, along with its error variants. We may think of this later, maybe the query sub-crate is a good place? I haven't included this in the PR because it already ships lots of changes.

I'd prefer not to write that many map_err(BoxedError::new)s.

I think the problem is not that there are many BoxedError -- as I said in #812 (comment), we write BoxedError because we use snafu, and want to retrieves backtrace/context from a dynamic error type. And after all, this PR even doesn't change the outer error type, it was and still is ExecuteQuery, with a BoxedError.

What makes you think this PR adds too many BoxedError is, the return error type changes from an unknown parameter to a concrete type. And I think that the key point of this problem: "A concrete trait like QueryHandler has clear input and output type, as well as its behavior. Thus the error type should also be decided rather than a parameter.". Actually the BoxedError conversions are never increased. It was done in SqlQueryHandlerAdaptor. And I think the trait was not well-defined: the public APIs don't have a fixed error type, and you cannot use it easily in other places (and that's why SqlQueryHandlerAdaptor is needed, for converting the parameterized error type to a fixed servers::Error). So from my perspective removing the error type parameter accomplishes this trait, implementations know what error type it should return, and users know how that method would fail.

@MichaelScofield
Copy link
Collaborator

servers::Errors are better confined to errors related to server implementations only

I agree, so I'm thinking of moving QueryHandler to another place, along with its error variants. We may think of this later, maybe the query sub-crate is a good place? I haven't included this in the PR because it already ships lots of changes.

"query" crate is good to me

I'd prefer not to write that many map_err(BoxedError::new)s.

I think the problem is not that there are many BoxedError -- as I said in #812 (comment), we write BoxedError because we use snafu, and want to retrieves backtrace/context from a dynamic error type. And after all, this PR even doesn't change the outer error type, it was and still is ExecuteQuery, with a BoxedError.

What makes you think this PR adds too many BoxedError is, the return error type changes from an unknown parameter to a concrete type. And I think that the key point of this problem: "A concrete trait like QueryHandler has clear input and output type, as well as its behavior. Thus the error type should also be decided rather than a parameter.". Actually the BoxedError conversions are never increased. It was done in SqlQueryHandlerAdaptor. And I think the trait was not well-defined: the public APIs don't have a fixed error type, and you cannot use it easily in other places (and that's why SqlQueryHandlerAdaptor is needed, for converting the parameterized error type to a fixed servers::Error). So from my perspective removing the error type parameter accomplishes this trait, implementations know what error type it should return, and users know how that method would fail.

QueryHandler is hard to say a "concrete" trait. "query" is too broad (or abstract) to be confined to something specific. From the view of implementors, like Datanode instance, it's perfectly legit for it to return the errors defined in it's own crate, and produced while doing its own jobs. In this point, QueryHandler doesn't have the necessity to must use some concrete error type. Getting rid of the associated error type only makes our codes full of map_err(BoxedError::new)s, and making a hard time to developers. Yes, the count of BoxedError conversions are not reduced, but I'm happy not to write them everywhere.

As to BoxedError, it's rooted in our choice of using snafu from day 1. I think it's the price we must pay for it when we enjoy the convenience it brings. Fair.

@waynexia
Copy link
Member Author

waynexia commented Feb 8, 2023

QueryHandler is hard to say a "concrete" trait. "query" is too broad (or abstract) to be confined to something specific.

Unless it's less abstract than Read, right? For those traits who don't have certain input/output like TryFrom, having their own error parameter is reasonable, but this is not the situation for QueryHandler IMHO. And another key point I've not mentioned is, you cannot provide a default implementation for error parameters. I would also like to not write unimplemented!() everywhere.

I think it's the price we must pay for it when we enjoy the convenience it brings. Fair.

Absolutely, and all those .map_err(BoxedError::new) also come from day 1 -- you cannot make use of the compiler's ? to convert your error. So that's what I repeated, "the problem is not that there are many BoxedError". If you think there are many BoxedError, why not write some utils like boxed_context or just give up snafu? The key here is, QueryHandler should have a decided error type, for user to know how the execution would fail and for implementation to know what error it should return.

@MichaelScofield
Copy link
Collaborator

QueryHandler is hard to say a "concrete" trait. "query" is too broad (or abstract) to be confined to something specific.

Unless it's less abstract than Read, right? For those traits who don't have certain input/output like TryFrom, having their own error parameter is reasonable, but this is not the situation for QueryHandler IMHO. And another key point I've not mentioned is, you cannot provide a default implementation for error parameters. I would also like to not write unimplemented!() everywhere.

Don't see that many unimplemented!()s now, and doubt it would after this change. However, the needs of mapping to BoxedError is indeed reduced before. All my point is, there's little necessity to change to concrete type for the error defined in QueryHandler trait. That only makes code less concise, while more tedious to write. Besides, some archaeology suggests that associated type was introduced in late pre 1.0 release of Rust, with some issues not resolved (see rust-lang/rust#17307). You can't rule out the possibility that Read not using associated type is because of this.

I think it's the price we must pay for it when we enjoy the convenience it brings. Fair.

Absolutely, and all those .map_err(BoxedError::new) also come from day 1 -- you cannot make use of the compiler's ? to convert your error. So that's what I repeated, "the problem is not that there are many BoxedError". If you think there are many BoxedError, why not write some utils like boxed_context or just give up snafu? The key here is, QueryHandler should have a decided error type, for user to know how the execution would fail and for implementation to know what error it should return.

So snafu is not replaced, at least not now, and not the near future. If we have to bear with the burden of not able to use ?, why not make our life easier not to write that many map_errs? Also, QueryHandler does not really need a dedicated error type, associated error type does make its own point (as I said "From the view of implementors, like Datanode instance, it's perfectly legit for it to return the errors defined in it's own crate, and produced while doing its own jobs.").

@waynexia
Copy link
Member Author

waynexia commented Feb 8, 2023

Don't see that many unimplemented!()s now

I've removed ~20 unimplemented!() in this PR, you can check the diff for where they come from.

If the Read example cannot convince you what about other handlers? You can see what is more "tedious":

pub trait FrontendInstance:
GrpcQueryHandler<Error = Error>
+ SqlQueryHandler<Error = Error>
+ OpentsdbProtocolHandler
+ InfluxdbLineProtocolHandler
+ PrometheusProtocolHandler
+ ScriptHandler
+ PromqlHandler
+ Send
+ Sync
+ 'static
{
async fn start(&mut self) -> Result<()>;
}

And actually, I also plan to remove the parameter from GrpcQueryHandler later. Think about you can not use a GrpcQueryHandler without specifying its error type, is this necessary?

@MichaelScofield
Copy link
Collaborator

Looks like we are in a dilemma. @killme2008 @sunng87 @fengjiachun @v0y4g3r @evenyag what do you think?

@evenyag
Copy link
Contributor

evenyag commented Feb 9, 2023

@evenyag what do you think?

It's fine for me to define a concrete error type for handlers instead of using an associated error type.

Using BoxedError is then something unavoidable unless you use some opaque error like snafu::Whatever or anyhow.

Absolutely, and all those .map_err(BoxedError::new) also come from day 1 -- you cannot make use of the compiler's ? to convert your error.

In fact, this is mostly related to using an enum as an error, not the snafu crate itself, or BoxedError. snafu just helps you define the enum type. You still need to do the same thing to wrap your error type into a boxed one and then construct the HandlerError::External.

enum HandlerError {
    External {
        source: BoxedError,
        // Or source: Box<dyn Error + Send>,
    }
}

But there are some ways to simplify our codes (suppose we use HandlerError as the type name).

How to avoid map_err(BoxedError::new).context()?

From

Surely, you could implement From<YourError> for HandlerError and use ? to convert your error, with the restriction that you could only map the error into the External variant.

Actually, snafu also provides snafu(context(false)) attribute to get rid of one context() call.

Constructor

We could define a external method to construct the HandlerError so you just need to call map_err(HandleError::external).

Do not use enum

In fact, if our handler's error doesn't need variants, we could also use BoxedError, struct or opaque error type for HandleError.

#[derive(Debug, Snafu)]
pub struct HandlerError(BoxedError);

We still have status_code() method to get the error code, but get rid of the redundant error message in the HandlerError.

Then we could implement From<YourError> for the HandlerError and use ?.

box_context or boxed

Provides methods like

err.boxed().context()

or

err.box_context()

@sunng87
Copy link
Member

sunng87 commented Feb 9, 2023

on Error type of SqlQueryHandler

From my experience of using SqlQueryHandler, I can understand why @MichaelScofield chose to use associate Error for different implementation. Not only to avoid boxed type trick, but also because SqlQueryHandler acts as a facade between protocol implementation and query engine. Its implementation does not actually run the query, which is query engine's task. It decides how to call the query engine, for example:

  • datanode internal: call query engine and other statement handlers directly
  • standalone mode: call datanode's api
  • distributed mode: gather information from meta and do distributed read/write/ddls
  • testing mode: mock everything

From this perspective, I think it's reasonable to have associated error types because these different implementations are doing totally different tasks.

@waynexia
Copy link
Member Author

waynexia commented Feb 9, 2023

acts as a facade between protocol implementation and query engine. Its implementation does not actually run the query, which is query engine's task. It decides how to call the query engine, for example:

That's the root of why I decide to refactor this trait. When I try to add a new method in #924, I found it verbose to implement this trait. You should care about how to collect many results into one return value, lots of methods are unnecessary for mock structs, what Error should I map to this time etc. It also takes me some time to find where is the actual handler I need to modify.

But, as you said, the implementation should only care about how to call the query engine. So I start to refactor it to only need to provide one statement_query method, which defines how to execute ONE query statement in the query engine. And provides query, query_multiple default implementations (which require the error type to be fixed). Now only the FrontendInstance need to implement its own query_multiple for interceptor logic, and all others can benefit from this style. ~20 unimplemented!() are removed, and DatanodeInstance, DistInstance, StandaloneHandler etc need not to repeat their do_query().

different implementations are doing totally different tasks.

This doesn't matter, different implementation can still have their own error context. Nothing is changed except you need to add one more BoxedError. And in fact, even this is the same as before (was done in SqlQueryHandlerAdaptor). IMO this is a free improvement to both trait implementor and trait user.

@sunng87
Copy link
Member

sunng87 commented Feb 10, 2023

On design of QueryHandler. As we discussed, I start to realize this QueryHandler is a dispatcher. It sits between protocol servers and query engine. I think an ideal design is refactoring QueryHandler to QueryDispatcher, and unifies all servers handlers (sql, grpc, ...) together.

What protocol servers need:

  • verify user/pass/schema/catalog for incoming connections and requests
  • given a query string, either promql or sql, response a vector of results (http, mysql text, pg simple)
  • given a sql statement with placeholders, parse and cache it as plan, replace parameters, response result column defnitions and results (mysql binary, pg extended)
  • given a query string, response a result (prometheus remote read)

What new QueryDispatcher can provide:

1.given a statement, check the statement type, and dispatch to different handlers (datafusion query engine, or other table/db/data handlers)

  1. based on its context, to call query engine directly, or make distributed read/write

What QueryEngine and CatalogManager jump in for protocol servers:

  1. provide parser to parse query string to statement to logical plan. Also we can make parser associated with protocol server to support sql dialect. as Make query handler accept stmt instead of raw sql string #766
  2. provide catalog/schema api for privilege check

@sunng87
Copy link
Member

sunng87 commented Feb 13, 2023

@waynexia please go ahead and leave uncertain parts for future refactoring

@waynexia
Copy link
Member Author

After offline discussion with @MichaelScofield and @fengjiachun, we've reached an agreement that (1) push forward this PR as-is, and (2) find a way to alleviate the flooded .map_err(BoxedError::new).context(). I'll resolve conflicts after the necessary 0.2 release tasks get done.

@waynexia
Copy link
Member Author

Necessary changes are shipped in #1340 and #1351, closing this

@waynexia waynexia closed this Apr 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants