---
title: Spark Connectors for Pravega
sidebar_label: Overview
---
This documentation describes the connector API and shows how to read and write Pravega streams with Apache Spark.
Build end-to-end stream processing and batch pipelines that use Pravega as the stream storage and message bus, and Apache Spark for computation over the streams.
- Getting Started
- Samples
- Configuration
- Compatibility Matrix
- Building the Connector
- Features & Highlights
- Limitations
- Releases
- Pre-Built Artifacts
- Learn More
- Support
- About
- Exactly-once processing guarantees for both Reader and Writer, supporting end-to-end exactly-once processing pipelines
- A Spark micro-batch reader connector allows Spark streaming applications to read Pravega Streams. Pravega stream cuts (i.e. offsets) are used to reliably recover from failures and provide exactly-once semantics.
- A Spark batch reader connector allows Spark batch applications to read Pravega Streams.
- A Spark writer allows Spark batch and streaming applications to write to Pravega Streams. Writes are optionally contained within Pravega transactions, providing exactly-once semantics.
- Seamless integration with Spark's checkpoints.
- Parallel readers and writers supporting high-throughput, low-latency processing.
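The reader and writer features above can be sketched with Spark's DataFrame API. This is a minimal, illustrative example, not a definitive reference: it assumes the connector is on the classpath and exposes the `pravega` source format, and the option names (`controller`, `scope`, `stream`) and the example controller URI, scope, and stream names are assumptions to verify against the connector version you are using.

```scala
import org.apache.spark.sql.SparkSession

object PravegaSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName("pravega-sketch")
      .getOrCreate()

    // Batch read: load an entire Pravega stream as a DataFrame.
    // Option names are assumptions; check your connector version.
    val batchDf = spark.read
      .format("pravega")
      .option("controller", "tcp://127.0.0.1:9090")
      .option("scope", "my-scope")
      .option("stream", "my-stream")
      .load()

    // Micro-batch streaming read: stream cuts are recorded in the Spark
    // checkpoint, so a restarted query resumes without reprocessing.
    val events = spark.readStream
      .format("pravega")
      .option("controller", "tcp://127.0.0.1:9090")
      .option("scope", "my-scope")
      .option("stream", "my-stream")
      .load()

    // Streaming write back to another Pravega stream; the checkpoint
    // location ties reads and writes together for exactly-once pipelines.
    val query = events.writeStream
      .format("pravega")
      .option("controller", "tcp://127.0.0.1:9090")
      .option("scope", "my-scope")
      .option("stream", "my-output-stream")
      .option("checkpointLocation", "/tmp/spark-checkpoints/pravega-sketch")
      .start()

    query.awaitTermination()
  }
}
```

Running this requires a reachable Pravega controller and existing scope and streams, so it is a sketch of the API shape rather than a self-contained program to execute as-is.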
The latest releases can be found on the project's GitHub Releases page.
Releases are published to Maven Central. Spark and Gradle will automatically download the required artifacts. However, if you wish, you may download the artifacts manually using the links below.
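As an illustration, a Gradle dependency declaration might look like the following. The artifact coordinates below are an assumption: the exact artifact name and version depend on your Spark and Scala versions, so confirm them against Maven Central before using them.

```groovy
dependencies {
    // Illustrative coordinates; pick the variant matching your
    // Spark and Scala versions from Maven Central.
    implementation "io.pravega:pravega-connectors-spark-3.1_2.12:0.10.1"
}
```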
The pre-built artifacts are available in the following locations:
- Maven Central (releases)
- GitHub Packages (snapshots)
Don't hesitate to ask! Contact the developers and community on Slack (signup) if you need any help. If you find a bug, please open an issue on GitHub Issues.
Spark Connectors for Pravega is 100% open source and community-driven. All components are available under the Apache 2.0 License on GitHub.