Introduction
We received a feature request for query execution on object stores in OpenSearch.
We investigated the possibility of building a new solution for OpenSearch that uses an object store as storage.
The challenges we found are:
- The OpenSearch aggregation framework is a simplified MPP framework and does not support a shuffle stage.
- The OpenSearch query framework is missing key features, e.g. JOIN and subqueries.
We found that these problems have already been solved by general-purpose data processing systems, e.g. Presto, Spark, and Trino, and that building such a platform takes years to mature.
Idea
The initial idea is to:
- Use SQL as the interface.
- Leverage Spark as the query/compute execution engine.
High level diagram:
User Experience
1. The user configures a Spark cluster as the computation resource, e.g. https://SPARK:7707.
2. The user submits SQL to the OpenSearch cluster using the _plugins/_sql REST API.
3. The SQL engine parses and analyzes the SQL query.
4. The SQL engine decides whether to route the query to the Spark cluster or run it locally.
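The routing step in this user flow (the SQL engine deciding between the Spark cluster and local execution) could look roughly like the sketch below. It is only an illustration: the proposal does not define the routing rule, so the sketch assumes a query goes to Spark when it uses one of the two features named as missing locally (JOIN, subquery). The names `Route` and `route_query` are hypothetical, and the text-matching heuristic stands in for whatever the analyzer would actually inspect.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumption: a query needs Spark when it uses a feature the local engine lacks.
# JOIN and subqueries are the gaps named in the proposal; the real rule is undecided.
_JOIN = re.compile(r"\bJOIN\b", re.IGNORECASE)
_SUBQUERY = re.compile(r"\(\s*SELECT\b", re.IGNORECASE)


@dataclass(frozen=True)
class Route:
    engine: str              # "spark" or "local"
    endpoint: Optional[str]  # Spark endpoint when engine == "spark", else None
    reason: str


def route_query(sql: str, spark_endpoint: str = "https://SPARK:7707") -> Route:
    """Naive text-based routing; a real engine would inspect the analyzed plan."""
    if _JOIN.search(sql):
        return Route("spark", spark_endpoint, "query contains JOIN")
    if _SUBQUERY.search(sql):
        return Route("spark", spark_endpoint, "query contains a subquery")
    return Route("local", None, "query is supported by the local engine")


if __name__ == "__main__":
    queries = [
        "SELECT a FROM t1 JOIN t2 ON t1.id = t2.id",
        "SELECT a FROM t WHERE a IN (SELECT b FROM u)",
        "SELECT count(*) FROM logs",
    ]
    for q in queries:
        route = route_query(q)
        print(f"{route.engine:5} <- {q} ({route.reason})")
```

From the user's side nothing changes: the query is still sent to the `_plugins/_sql` endpoint, and the routing decision stays internal to the SQL engine.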