A list of data analysis, data engineering, and data science tooling and infrastructure.
Imported from a gist.
- Mortar Data
Acquired by Datadog in February 2015
- Sense.io
Acquired by Cloudera in March 2016
- Databricks Cloud
- H2O.ai "Sparkling Water"
- Domino Data Lab
- Mode Analytics
- Periscope Data
- Hosted Jupyter
- Azure
- Google Cloud
- FloydHub
- Stitch
- Panoply
- Yhat ScienceOps
- Yhat Bandit
- dbt: data build tool by Fishtown Analytics
- Kensu Adalog
- DataScience.com
- Neptune Machine Learning Lab by deepsense.ai
- Dataiku DSS
"A collaborative data science software platform for teams of data scientists, data analysts, and engineers to explore, prototype, build, and deliver their own data products more efficiently."
- AWS Data Pipeline
- AWS Batch
- Google Cloud Dataflow
- AWS Glue
"A Fully Managed ETL Service"
- Spotify's Luigi
- Airbnb's Airflow
- Coursera's Dataduct
- Pinball by Pinterest
- Apache Crunch
"Simple and Efficient MapReduce Pipelines"
- Styx by Spotify
"A batch job scheduler for Kubernetes"
- Alooma
"Your Data Pipeline as a Service"