Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

planner core pkg de-coupling and make it well organized #52714

Open
AilinKid opened this issue Apr 18, 2024 · 0 comments
Open

planner core pkg de-coupling and make it well organized #52714

AilinKid opened this issue Apr 18, 2024 · 0 comments
Assignees
Labels
sig/planner SIG: Planner type/enhancement The issue or PR belongs to an enhancement.

Comments

@AilinKid
Copy link
Contributor

AilinKid commented Apr 18, 2024

Enhancement

Refactoring

The current plan/core pkg is quite huge not as slim as we expected. The hybrid placement of logicalOp, physicalOp, property, task, logical-rewrite, build-phase, binder, cost, exhaustion, etc makes the hierarchy complicated. The boundary of them is also not as clear as we desired. I concluded that there is some reason for this phenomenon.

  • One is if Golang's structure wants to implement member functions with itself as a receiver, including the interface implementation, which can only be defined in the same pkg where the structure is defined. (here clearly the pkg is core) stackoverflow ref

    • eg: LogicalSort is defined in core pkg somewhere. If you want to implement (p *LogicalSort) buildKeyInfo for the sake of interface implementation or just member function of them, this implementation can only be done in the same core pkg, because Golang only allows local type as a receiver. Given buildKeyInfo is logical-rewrite-related code, it should be put into another part/component/pkg of core logic for clarity, that calls for the interface simplification or elimination of self-receiver's function.
  • Numerous utility and facility functions related to the core logic are defined and used in the core package. These include trace, hashcode, cost mode, selectivity, task, enumeration, and more. Moving or adding a new feature outside the core package may result in many internal structures or functions being exported out as well.

  • The biggest block is the import cycle problem. Say we create a new pkg called core2 and we want to add a new feature inside. The new feature couldn't work well without the current core context support, so by some means (exported original core structure or interface, using function pointer or something), we got a dependency from core2 -> core. However, this feature must also be called from core, cause core is a current unified portal for the entire optimization phase, so we got a dependency from core -> core2. As a result, hops, golang's import cycle is formed. The final solution to the problem is Back to integrate them all. Eventually, the core pkg becomes larger and larger.

current core pkg should be arranged as:

* /core
  * /base (base interface definition, physical, logical, task, etc)
  * /operator 
    * /baseimpl
    * /logicalimpl 
    * /physicalimpl 
  * /stats (the basic stats structure is defined in pkg/statistics, here is mainly about cross-est, row count derivation, row size, and selectivity, if possible, we can change the name from old cardinality to stats)
  * /cost (cost main contains the detail about how the cost model works, the formula inferred across different op, etc)
  * /dump (explain info related, plan replayer related, flatten plan related)
  * /rewrite (tidb currently does not have a clear part of this, but it should, like a constant prop, order eliminate, etc)
  * /rule (currently all the kinds of logical rules, all of them are treated as always-optimal)
  * /util (currently all the glue code should be extracted with effort to here)

Tips about resolving import cycle

  • if A depends on X, and B depends on X, while current X is defined in file of A or B.
    • you can extract X logic out as util usage
  • if pkg A depends on pkg B's function, and pkg B systematically depends on A. eg: executor depends on planner is reasonable, while if planner depends executor's function for some tricky handling. This is not will designed.
    • Anyway, you can resort the function pointer assignment in executor pkg's init function, init(){ planner.Fxxx = Fxx in executor}
    • If the dependency is only used as the field type for some new structure definition, you can use interface{} instead here.
  • for the others, you should rethink about how you strong dependency is come. And move those hard parts to the correct place.
  • Last but not least, if in one of your refactoring, the changes call for more changes endlessly, you should find the minimum set across Tips 1 & 2, and split them as the first part. Then we go back to Tips 3 if dependency still exists.

Process

@AilinKid AilinKid added the type/enhancement The issue or PR belongs to an enhancement. label Apr 18, 2024
@AilinKid AilinKid self-assigned this Apr 18, 2024
@AilinKid AilinKid added the sig/planner SIG: Planner label Apr 18, 2024
ti-chi-bot bot pushed a commit that referenced this issue May 8, 2024
ti-chi-bot bot pushed a commit that referenced this issue May 9, 2024
ti-chi-bot bot pushed a commit that referenced this issue May 13, 2024
ti-chi-bot bot pushed a commit that referenced this issue Sep 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/planner SIG: Planner type/enhancement The issue or PR belongs to an enhancement.
Projects
None yet
Development

No branches or pull requests

1 participant