Trino2Trino connector #21791

Open
mosabua opened this issue May 1, 2024 · 3 comments
Labels
roadmap Top level issues for major efforts in the project

Comments

@mosabua
Member

mosabua commented May 1, 2024

Multiple proprietary implementations of a Trino2Trino connector exist in the community. This project aims to pull the various stakeholders together as discussed in prior community meetings.

Following are various characteristics and details for the initial implementation as well as future improvements and consideration.

  • The connector is a simple JDBC connector that exposes a secondary Trino cluster as another data source.
  • The secondary Trino cluster could be exposed as a full cluster with numerous catalogs inside, or each catalog from the secondary cluster could be a catalog in the primary cluster.
  • All type mapping is from Trino data types to Trino types and hence simple to implement for all types.
  • The usual aspect that different connectors to different data sources support different data types still comes into play - only indirectly.
  • A typical query is potentially inefficient since individual table scans and such are all treated separately, and therefore each goes through JDBC.
  • JDBC is a bottleneck in terms of data transfer speed.
  • Implementation of pushdown will be very beneficial
  • A query pass-through table function could be a potential way to ensure the complete query runs in the secondary cluster and only the final results go through JDBC to the primary cluster. The problem is that this requires the user to write a different SQL query.
  • Maybe some magic hint could be implemented that causes complete pushdown.
  • Initial PR for the connector should be as simple as possible to ensure fast review and merge.
  • Only support read statements initially, more complex aspects can follow later.
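
To make the pass-through trade-off above concrete, here is a minimal sketch of how the two query shapes might differ. It assumes the connector would expose a `system.query` table function in the same way the existing JDBC-based Trino connectors do; that is an assumption, not a confirmed design. The catalog name `remote`, the catalog, schema, and table names, and the helper `passthrough_query` are all hypothetical.

```python
def passthrough_query(catalog: str, inner_sql: str) -> str:
    """Wrap inner_sql in a query pass-through table function call.

    Assumption: the Trino2Trino catalog exposes `system.query`, as other
    JDBC-based Trino connectors do. The catalog name is hypothetical.
    """
    # Single quotes inside a SQL string literal are escaped by doubling them.
    escaped = inner_sql.replace("'", "''")
    return f"SELECT * FROM TABLE({catalog}.system.query(query => '{escaped}'))"


# Federated form, written against the primary cluster: without pushdown,
# each table scan is fetched over JDBC and the join runs on the primary.
federated = (
    "SELECT c.name, count(*) AS orders "
    "FROM remote.sales.customers c "
    "JOIN remote.sales.orders o ON o.customer_id = c.id "
    "WHERE c.region = 'EU' "
    "GROUP BY c.name"
)

# Pass-through form: the inner statement uses names as the secondary
# cluster sees them (hypothetical catalog `hive`) and runs there in full,
# so only the aggregated result crosses JDBC.
inner = (
    "SELECT c.name, count(*) AS orders "
    "FROM hive.sales.customers c "
    "JOIN hive.sales.orders o ON o.customer_id = c.id "
    "WHERE c.region = 'EU' "
    "GROUP BY c.name"
)
wrapped = passthrough_query("remote", inner)
print(wrapped)
```

The sketch also shows the drawback noted above: the user has to rewrite the statement, including the catalog names and the quoting of string literals, instead of running the original SQL unchanged.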
@alee-x

alee-x commented May 2, 2024

I'm still determining if this will be helpful, but I've been given the go-ahead to open-source the basic trino-to-trino connector my team developed for a research project.

The code has been made available here. Apologies that it's a fresh repo with little history; we have an internal monorepo for custom Trino plugins, and this code was pulled out of there. It was written by DevOps, who mainly work in Python/Go/C++/Rust and have limited knowledge of Java, so if there's anything egregious, it's because we didn't know any better!

We're currently working on implementing pushdown and fixing the presently broken tests. I hope that the code will be a helpful starting point for a PR.

@mosabua
Member Author

mosabua commented May 2, 2024

@alee-x that's awesome. We are currently working on getting a more mature one from another group approved for sending a PR. The idea would be that we get the other version merged and then you and all others who have their own version can contribute specific improvements. That should result in a good connector for everyone to use, and it will save everyone the rebasing and all other related work of maintaining a fork.

@mosabua
Member Author

mosabua commented May 30, 2024

PR from https://github.com/sajjoseph/trino/tree/add-trino-to-trino-connector is coming soon

mosabua added the roadmap label Sep 12, 2024

2 participants