docs: Update references #476

Merged 2 commits on Sep 21, 2023
39 changes: 20 additions & 19 deletions docs/modules/trino/pages/getting_started/installation.adoc
@@ -1,20 +1,22 @@
= Installation

On this page you will install the Stackable Operator for Trino as well as the commons and secret operator which are required by all Stackable Operators.
On this page you will install the Stackable Operator for Trino as well as the commons and secret operator which are
required by all Stackable Operators.

== Stackable Operators

There are two ways to install Stackable Operators:

1. Using xref:stackablectl::index.adoc[stackablectl]

2. Using Helm
. Using xref:management:stackablectl:index.adoc[stackablectl]
. Using Helm

=== stackablectl

The stackablectl command line tool is the recommended way to interact with operators and dependencies. Follow the xref:stackablectl::installation.adoc[installation steps] for your platform if you choose to work with stackablectl.
The `stackablectl` command line tool is the recommended way to interact with operators and dependencies. Follow the
xref:management:stackablectl:installation.adoc[installation steps] for your platform if you choose to work with
`stackablectl`.

After you have installed stackablectl, run the following command to install all operators necessary for Trino:
After you have installed `stackablectl`, run the following command to install all operators necessary for Trino:

[source,bash]
----
@@ -28,7 +30,7 @@ The tool will show
include::example$getting_started/code/install-operator-output.txt[tag=stackablectl-install-operators-output]
----

TIP: Consult the xref:stackablectl::quickstart.adoc[] to learn more about how to use stackablectl.
TIP: Consult the xref:management:stackablectl:quickstart.adoc[] to learn more about how to use `stackablectl`.

=== Helm

@@ -46,37 +48,36 @@ Then install the Stackable Operators:
include::example$getting_started/code/getting_started.sh[tag=helm-install-operators]
----

Helm will deploy the operators in a Kubernetes Deployment and apply the CRDs for the Trino service (as well as the CRDs for the required operators). You are now ready to deploy Trino in Kubernetes.
Helm will deploy the operators in a Kubernetes Deployment and apply the CRDs for the Trino service (as well as the CRDs
for the required operators). You are now ready to deploy Trino in Kubernetes.

== Optional installation steps

Some Trino connectors like `hive` or `iceberg` work together with the Apache Hive metastore and S3 buckets.
For these components extra steps are required.
Some Trino connectors like `hive` or `iceberg` work together with the Apache Hive metastore and S3 buckets. For these
components extra steps are required.

* a Stackable Hive metastore
* an accessible S3 bucket
** an end-point, and access- and secret-keys
** data in the bucket (we use the https://archive.ics.uci.edu/ml/datasets/iris[Iris] dataset here)
* the following are optional
** a Stackable xref:secret-operator::index.adoc[Secret Operator] for certificates when deploying for TLS
** a Stackable xref:commons-operator::index.adoc[Commons Operator] for certificates when deploying for TLS authentication
** (for authorization): a Stackable xref:opa::index.adoc[OPA Operator]
** the https://repo.stackable.tech/#browse/browse:packages:trino-cli%2Ftrino-cli-363-executable.jar[Trino CLI] to test SQL queries
** a Stackable xref:secret-operator:index.adoc[Secret Operator] for certificates when deploying for TLS
** a Stackable xref:commons-operator:index.adoc[Commons Operator] for certificates when deploying for TLS authentication
** (for authorization): a Stackable xref:opa:index.adoc[OPA Operator]
** the https://repo.stackable.tech/#browse/browse:packages:trino-cli%2Ftrino-cli-363-executable.jar[Trino CLI] to test
SQL queries

=== S3 bucket

Please refer to the S3 provider.

=== Hive operator

Please refer to the xref:hive::index.adoc[Hive Operator] docs.

Both Hive and Trino need the same S3 authentication.
Please refer to the xref:hive:index.adoc[Hive Operator] docs. Both Hive and Trino need the same S3 authentication.

=== OPA operator

Please refer to the xref:opa::index.adoc[OPA Operator] docs.

Please refer to the xref:opa:index.adoc[OPA Operator] docs.

== What's next

33 changes: 22 additions & 11 deletions docs/modules/trino/pages/index.adoc
@@ -2,22 +2,30 @@
:description: The Stackable Operator for Trino is a Kubernetes operator that can manage Trino clusters. Learn about its features, resources, dependencies and demos, and see the list of supported Trino versions.
:keywords: Stackable Operator, Trino, Kubernetes, k8s, operator, data science, data exploration, SQL, engineer, big data, CRD, StatefulSet, ConfigMap, Service, Druid, Trino, S3, Superset

This is an operator for Kubernetes that can manage https://trino.io/[Trino] clusters.
Trino is an open-source distributed SQL query engine that enables high-speed analytics of large datasets from multiple data sources using SQL queries. This operator enables you to manage your Trino instances on Kubernetes efficiently.
:k8s-crs: https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/

This is an operator for Kubernetes that can manage https://trino.io/[Trino] clusters. Trino is an open-source
distributed SQL query engine that enables high-speed analytics of large datasets from multiple data sources using SQL
queries. This operator enables you to manage your Trino instances on Kubernetes efficiently.

== Getting started

Follow the xref:getting_started/index.adoc[Getting started guide] to start using the Stackable Operator for Trino on your Kubernetes cluster. It will guide you through the installation process and help you run your first Trino queries on Kubernetes.
Follow the xref:getting_started/index.adoc[Getting started guide] to start using the Stackable Operator for Trino on
your Kubernetes cluster. It will guide you through the installation process and help you run your first Trino queries on
Kubernetes.

== Operator model

The Operator manages Kubernetes resources in sync with https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/[custom resources] defined by you, the user.
The Operator manages Kubernetes resources in sync with {k8s-crs}[custom resources] defined by you, the user.

=== Custom resources

The Trino Operator manages two custom resources: The _TrinoCluster_ and xref:concepts.adoc#catalogs[_TrinoCatalogs_]. The TrinoCluster resource allows for the specification of a Trino cluster. Two xref:concepts:roles-and-role-groups.adoc[roles] are defined: `coordinators` and `workers`.
The Trino Operator manages two custom resources: The _TrinoCluster_ and xref:concepts.adoc#catalogs[_TrinoCatalogs_].
The TrinoCluster resource allows for the specification of a Trino cluster. Two
xref:concepts:roles-and-role-groups.adoc[roles] are defined: `coordinators` and `workers`.

To connect to data sources the TrinoCatalogs are used. Have a look at the xref:usage_guide/catalogs/index.adoc[catalog overview] to find out which types of data sources are supported by the Stackable platform.
To connect to data sources the TrinoCatalogs are used. Have a look at the xref:usage_guide/catalogs/index.adoc[catalog
overview] to find out which types of data sources are supported by the Stackable platform.
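
As a rough illustration of the two custom resources described above, here is a minimal sketch that is not part of this
change. The resource names (`simple-trino`, `tpch`), the `trino: simple-trino` label, the product version and the
replica counts are assumptions chosen for illustration, and the exact schema may differ between operator releases, so
check the CRD reference for your version before applying anything like this.

[source,yaml]
----
# Hypothetical example: a TrinoCluster with the two roles, coordinators and workers.
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCluster
metadata:
  name: simple-trino            # assumed name
spec:
  image:
    productVersion: "414"       # assumed version; use one your operator release supports
  clusterConfig:
    # The cluster picks up every TrinoCatalog that carries this label.
    catalogLabelSelector:
      matchLabels:
        trino: simple-trino     # assumed label
  coordinators:
    roleGroups:
      default:
        replicas: 1
  workers:
    roleGroups:
      default:
        replicas: 1
---
# Hypothetical example: a TrinoCatalog that the cluster above would select via its label.
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCatalog
metadata:
  name: tpch                    # assumed name
  labels:
    trino: simple-trino
spec:
  connector:
    tpch: {}                    # built-in TPC-H generator connector, needs no external data source
----

The label selector is what ties the two resources together: a catalog is attached to a cluster by labelling it, not by
naming it in the cluster definition.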

=== Resources

@@ -27,13 +35,16 @@ image::trino_overview.drawio.svg[A diagram depicting the Kubernetes resources cr

== Demos

The xref:stackablectl::demos/trino-taxi-data.adoc[] demo uses Trino together with xref:hive:index.adoc[Apache Hive] to access the prominent New York Taxi dataset. xref:superset:index.adoc[Apache Superset] is then used to read the data from the Trino instance via SQL and visualize it.

The xref:stackablectl::demos/data-lakehouse-iceberg-trino-spark.adoc[] demo showcases a data Lakehouse with multiple datasets. Again Trino is used to enable SQL acces to the data. The xref:stackablectl::demos/trino-iceberg.adoc[] demo is a subset of the Lakehouse demo, focusing just on Apache Iceberg integration.

The xref:stackablectl::demos/spark-k8s-anomaly-detection-taxi-data.adoc[] also uses Trino to enable SQL access to data but also shows xref:opa:index.adoc[OpenPolicyAgent] integration for xref:usage_guide/security.adoc#authorization[authorization].
The xref:demos:trino-taxi-data.adoc[] demo uses Trino together with xref:hive:index.adoc[Apache Hive] to access the
prominent New York Taxi dataset. xref:superset:index.adoc[Apache Superset] is then used to read the data from the Trino
instance via SQL and visualize it.

The xref:demos:data-lakehouse-iceberg-trino-spark.adoc[] demo showcases a data Lakehouse with multiple datasets. Again
Trino is used to enable SQL access to the data. The xref:demos:trino-iceberg.adoc[] demo is a subset of the Lakehouse
demo, focusing just on Apache Iceberg integration.

The xref:demos:spark-k8s-anomaly-detection-taxi-data.adoc[] also uses Trino to enable SQL access to data but also shows
xref:opa:index.adoc[OpenPolicyAgent] integration for xref:usage_guide/security.adoc#authorization[authorization].

== Supported Versions
