
Kafka Consumer as Exasol IMPORT UDF #39

Merged: 4 commits, Oct 4, 2019

@morazow morazow commented Oct 2, 2019

Kafka consumer as Exasol IMPORT UDF

Implements a Kafka consumer application as an Exasol IMPORT user-defined function (UDF).

This allows users to put SQL import statements into a cron job to regularly import data from Kafka topics into an Exasol table.

Maintaining Kafka Topic Partition Offsets

The Kafka partition offsets are maintained inside the Exasol table as additional table columns. This ensures that each Kafka topic record is consumed only once, when it is committed to the Exasol table. Similarly, each time a consumer starts, it queries the table for the maximum offset of each Kafka topic partition and requests records after that offset.
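
The resume logic described above can be sketched roughly as follows. This is an illustration in Python, not the UDF's actual code: the `(partition, offset)` row shape and the function name `max_offsets` are assumptions standing in for the bookkeeping columns of the target table.

```python
from typing import Dict, Iterable, Tuple


def max_offsets(rows: Iterable[Tuple[int, int]]) -> Dict[int, int]:
    """Return the highest stored offset for each partition.

    Each row is a hypothetical (partition, offset) pair read from the
    additional columns of the target table.
    """
    latest: Dict[int, int] = {}
    for partition, offset in rows:
        # Keep only the largest offset seen so far for this partition.
        if partition not in latest or offset > latest[partition]:
            latest[partition] = offset
    return latest


# Rows already committed to the table, in arbitrary order.
rows = [(0, 10), (0, 11), (1, 4), (0, 12), (1, 5)]
resume_after = max_offsets(rows)
```

A consumer would then request records after `resume_after[partition]` for each partition, so rows that are already in the table are not requested again.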

Initial fix for #40.

This is the initial implementation for importing data from a Kafka cluster.
Currently, it supports only simple string data using String serializers
and deserializers.

Adds support for Schema Registry and for importing Avro data using Kafka
Avro deserializers.

Additionally, fixes a bug with the initial offset value. It should start
from `-1` if no data has been imported before, so that the first import
starts consuming from offset zero.
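
A small sketch of why `-1` is the right starting value, assuming the consumer always requests the record after the last imported offset. The names `NO_DATA_OFFSET` and `start_offset` are hypothetical and only illustrate the arithmetic.

```python
from typing import Optional

# Value used when the table holds no rows yet for a partition.
NO_DATA_OFFSET = -1


def start_offset(last_imported: Optional[int]) -> int:
    """Return the first offset to request for a partition.

    `last_imported` is the maximum offset found in the table, or None
    if nothing has been imported for this partition yet.
    """
    last = NO_DATA_OFFSET if last_imported is None else last_imported
    # Consumption resumes one position after the last imported record.
    return last + 1
```

With no prior data, `start_offset(None)` yields `0`, the first record of the partition. A default of `0` instead of `-1` would skip that record.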

When the Kafka consumer keystore and truststore (JKS) files are stored
in an Exasol BucketFS bucket, the import UDF can use them to establish
secure communication with the Kafka cluster.

Currently, only the SSL protocol is supported.
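
For illustration, the consumer settings involved in such a setup might look like the sketch below. The property keys are standard Kafka client configuration names; the helper `ssl_consumer_properties`, the BucketFS paths, and the passwords are assumptions made up for this example, and the UDF's own parameter names are not shown in this pull request.

```python
from typing import Dict


def ssl_consumer_properties(
    keystore_path: str,
    keystore_password: str,
    truststore_path: str,
    truststore_password: str,
) -> Dict[str, str]:
    """Build Kafka consumer properties for an SSL connection using JKS files."""
    return {
        "security.protocol": "SSL",
        "ssl.keystore.type": "JKS",
        "ssl.keystore.location": keystore_path,
        "ssl.keystore.password": keystore_password,
        "ssl.truststore.type": "JKS",
        "ssl.truststore.location": truststore_path,
        "ssl.truststore.password": truststore_password,
    }


# Hypothetical BucketFS locations and placeholder passwords.
props = ssl_consumer_properties(
    keystore_path="/buckets/bfsdefault/kafka/consumer.keystore.jks",
    keystore_password="keystore-secret",
    truststore_path="/buckets/bfsdefault/kafka/consumer.truststore.jks",
    truststore_password="truststore-secret",
)
```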

Previously, the Avro Enum type was treated as a complex type and could
not be imported into an Exasol table. This commit adds support for
importing Avro Enum values, which are stored as string (VARCHAR) values
in the Exasol table.
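
The Enum handling can be pictured with the sketch below. It models an Avro field schema as the plain JSON structure from the Avro specification and does not use an Avro library. The function `to_column_value` is hypothetical, and raising an error for the remaining complex types is an assumption about behavior, not something this pull request states.

```python
from typing import Any

# Primitive type names defined by the Avro specification.
PRIMITIVE_TYPES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}


def to_column_value(field_schema: Any, value: Any) -> Any:
    """Map an Avro field value to a value that fits a table column."""
    # A schema is either a bare type name or a dict with a "type" key.
    type_name = field_schema["type"] if isinstance(field_schema, dict) else field_schema
    if isinstance(type_name, str) and type_name in PRIMITIVE_TYPES:
        return value
    if type_name == "enum":
        # Enum symbols are imported as plain strings (VARCHAR).
        return str(value)
    # Assumption: other complex types (record, array, map, ...) are rejected.
    raise ValueError(f"unsupported Avro type: {type_name}")


suit_schema = {"type": "enum", "name": "Suit", "symbols": ["HEARTS", "SPADES"]}
imported = to_column_value(suit_schema, "HEARTS")
```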
@morazow morazow merged commit a9fd757 into develop Oct 4, 2019
@morazow morazow deleted the feature/kafka-import branch October 28, 2019 10:02