Sparkly

Helpers & syntax sugar for PySpark. There are several features to make your life easier:

Definition of Spark packages, external jars, UDFs and Spark options within your code;
Simplified reader/writer API for Cassandra, Elastic, MySQL, Kafka;
Testing framework for Spark applications.
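
As a rough illustration of the first feature, declaring packages, jars and options in code amounts to assembling spark-submit style arguments before the JVM starts. The sketch below is plain Python and is not Sparkly's actual implementation; `build_submit_args` and the jar path are hypothetical names used only for illustration.

```python
def build_submit_args(packages=(), jars=(), options=None):
    """Assemble a spark-submit style argument string (illustrative only)."""
    args = []
    if packages:
        # Maven coordinates are passed as one comma-separated value.
        args.append('--packages ' + ','.join(packages))
    if jars:
        # Local jar paths are also comma-separated.
        args.append('--jars ' + ','.join(jars))
    # Each Spark option becomes its own --conf key=value pair.
    for key, value in sorted((options or {}).items()):
        args.append('--conf "{}={}"'.format(key, value))
    # When passed through the PYSPARK_SUBMIT_ARGS environment variable,
    # the string has to end with the "pyspark-shell" token.
    args.append('pyspark-shell')
    return ' '.join(args)


submit_args = build_submit_args(
    packages=['org.elasticsearch:elasticsearch-spark-20_2.11:6.5.4'],
    jars=['/path/to/hypothetical.jar'],  # hypothetical path
    options={'spark.sql.shuffle.partitions': 10},
)
print(submit_args)
```

Keeping these settings next to the application code means the job does not depend on a separately maintained launch command.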

More details can be found in the official documentation.

Installation

Sparkly itself is easy to install:

pip install pyspark  # pick your version
pip install sparkly  # compatible with Spark >= 2.4

Getting Started

Here is a small code snippet that reads a Cassandra table and writes its contents to an Elasticsearch index:

from sparkly import SparklySession


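class MySession(SparklySession):
    # Maven coordinates of the connectors, resolved when the session starts.
    packages = [
        'datastax:spark-cassandra-connector:2.0.0-M2-s_2.11',
        'org.elasticsearch:elasticsearch-spark-20_2.11:6.5.4',
    ]


if __name__ == '__main__':
    spark = MySession()
    # Arguments: Cassandra host, keyspace, table.
    df = spark.read_ext.cassandra('localhost', 'my_keyspace', 'my_table')
    # Arguments: Elasticsearch host, index, document type.
    df.write_ext.elastic('localhost', 'my_index', 'my_type')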

See the online documentation for more details.

Testing

To run the tests, you need docker and docker-compose installed on your system. If you are working on macOS, we highly recommend using docker-machine. Once these tools are installed, run:

make test

Supported Spark Versions

At the moment we support:

sparkly >= 2.7 | Spark 2.4.x
sparkly 2.x    | Spark 2.0.x, 2.1.x and 2.2.x
sparkly 1.x    | Spark 1.6.x
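
The compatibility table can also be expressed as a small lookup, for example to fail fast when an environment pairs Sparkly with a Spark version that is not listed. This is a sketch built only from the rows above; `sparkly_line_for` is a hypothetical helper, not part of Sparkly.

```python
# Rows copied from the table above: Spark (major, minor) -> Sparkly release line.
SUPPORTED = {
    (2, 4): 'sparkly >= 2.7',
    (2, 0): 'sparkly 2.x',
    (2, 1): 'sparkly 2.x',
    (2, 2): 'sparkly 2.x',
    (1, 6): 'sparkly 1.x',
}


def sparkly_line_for(spark_version):
    """Return the Sparkly release line for a Spark version string, or None."""
    # Only the major and minor components matter; the patch level is ignored.
    major, minor = (int(part) for part in spark_version.split('.')[:2])
    return SUPPORTED.get((major, minor))


print(sparkly_line_for('2.4.5'))
print(sparkly_line_for('2.3.0'))  # not listed in the table
```

Versions missing from the table return `None` rather than a guess, so the caller decides how to handle them.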
