feat: Add ManagedTableDataset for managed Delta Lake tables in Databricks #127
Conversation
Hey @noklam, I just saw your comment in the other PR. I did see those two datasets; this one will be more focused on Databricks Unity Catalog tables. The SparkDataSet and DeltaTableDataSet are for interfacing with files directly. Both can be used on Databricks but are intended for different purposes.
7121847 to 9b43324
…atasets allows users to interface with Unity catalog tables in Databricks to both read and write. Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
…org#99) * Add non-spark related test changes Replace kedro.pipeline.Pipeline with kedro.pipeline.modular_pipeline.pipeline factory. This is for symmetry with changes made to the main kedro library. Signed-off-by: Adam Farley <adamfrly@gmail.com> Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
* fix links * fix dill links Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
* Fix docs formatting and phrasing for some datasets Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * Manually fix files not resolved with patch command Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * Apply fix from kedro-org#98 Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> --------- Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
* bump version and update release notes * fix pylint errors Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
* Prefix Docker plugin name with "Kedro-" in usage message Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
Signed-off-by: wmoreiraa <walber3@gmail.com> Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
…dro-org#54) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
…ro-org#118) Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
* [kedro-docker] Layers size optimization (kedro-org#92) * [kedro-docker] Layers size optimization Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com> * Adjust test requirements Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com> * Skip coverage check on tests dir (some do not execute on Windows) Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com> * Update .coveragerc with the setup Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com> * Fix bandit so it does not scan kedro-datasets Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com> * Fixed existence test Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com> * Check why dir is not created Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com> * Kedro starters are fixed now Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com> * Increased no-output-timeout for long spark image build Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com> * Spark image optimized Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com> * Linting Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com> * Switch to slim image always Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com> * Trigger build Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com> * Use textwrap.dedent for nicer indentation Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com> * Revert "Use textwrap.dedent for nicer indentation" This reverts commit 3a1e3f8. Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com> * Revert "Revert "Use textwrap.dedent for nicer indentation"" This reverts commit d322d35. 
Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com> * Make tests read more lines (to skip all deprecation warnings) Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com> Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com> Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com> Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Release Kedro-Docker 0.3.1 (kedro-org#94) * Add release notes for kedro-docker 0.3.1 Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Update version in kedro_docker module Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Bump version and update release notes (kedro-org#96) Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Make the SQLQueryDataSet compatible with mssql. Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Add one test + update RELEASE.md. Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Add missing pyodbc for tests. Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Mock connection as well. Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Add more dates parsing for mssql backend (thanks to fgaudindelrieu@idmog.com) Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Fix an error in docstring of MetricsDataSet (kedro-org#98) Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Bump relax pyarrow version to work the same way as Pandas (kedro-org#100) * Bump relax pyarrow version to work the same way as Pandas We only use PyArrow for `pandas.ParquetDataSet` as such I suggest we keep our versions pinned to the same range as [Pandas does](https://github.com/pandas-dev/pandas/blob/96fc51f5ec678394373e2c779ccff37ddb966e75/pyproject.toml#L100) for the same reason. 
As such I suggest we remove the upper bound as we have users requesting later versions in [support channels](https://kedro-org.slack.com/archives/C03RKP2LW64/p1674040509133529) * Updated release notes Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Add missing type in catalog example. Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Add one more unit tests for adapt_mssql. Signed-off-by: Yassine Alouini <yalouini@idmog.com> * [FIX] Add missing mocker from date test. Signed-off-by: Yassine Alouini <yalouini@idmog.com> * [TEST] Add a wrong input test. Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Add pyodbc dependency. Signed-off-by: Yassine Alouini <yalouini@idmog.com> * [FIX] Remove dict() in tests. Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Change check to check on plugin name (kedro-org#103) Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Set coverage in pyproject.toml (kedro-org#105) Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Move coverage settings to pyproject.toml (kedro-org#106) Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Replace kedro.pipeline with modular_pipeline.pipeline factory (kedro-org#99) * Add non-spark related test changes Replace kedro.pipeline.Pipeline with kedro.pipeline.modular_pipeline.pipeline factory. This is for symmetry with changes made to the main kedro library. 
Signed-off-by: Adam Farley <adamfrly@gmail.com> Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Fix outdated links in Kedro Datasets (kedro-org#111) * fix links * fix dill links Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Fix docs formatting and phrasing for some datasets (kedro-org#107) * Fix docs formatting and phrasing for some datasets Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * Manually fix files not resolved with patch command Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * Apply fix from kedro-org#98 Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> --------- Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Release `kedro-datasets` `version 1.0.2` (kedro-org#112) * bump version and update release notes * fix pylint errors Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Bump pytest to 7.2 (kedro-org#113) Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Prefix Docker plugin name with "Kedro-" in usage message (kedro-org#57) * Prefix Docker plugin name with "Kedro-" in usage message Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Keep Kedro-Docker plugin docstring from appearing in `kedro -h` (kedro-org#56) * Keep Kedro-Docker plugin docstring from appearing in `kedro -h` Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: Yassine Alouini <yalouini@idmog.com> * [kedro-datasets ] Add `Polars.CSVDataSet` (kedro-org#95) Signed-off-by: wmoreiraa <walber3@gmail.com> Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Remove deprecated `test_requires` from `setup.py` in Kedro-Docker (kedro-org#54) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: Yassine Alouini <yalouini@idmog.com> * [FIX] Fix ds to data_set. 
Signed-off-by: Yassine Alouini <yalouini@idmog.com> --------- Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com> Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com> Signed-off-by: Yassine Alouini <yalouini@idmog.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Co-authored-by: Mariusz Strzelecki <szczeles@gmail.com> Co-authored-by: Jannic <37243923+jmholzer@users.noreply.github.com> Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Co-authored-by: OKA Naoya <pn11@users.noreply.github.com> Co-authored-by: Joel <35801847+datajoely@users.noreply.github.com> Co-authored-by: adamfrly <45516720+adamfrly@users.noreply.github.com> Co-authored-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com> Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Co-authored-by: Walber Moreira <58264877+wmoreiraa@users.noreply.github.com> Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
… file path (kedro-org#114) * Add databricks deployment check and automatic DBFS path addition Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Add newline at end of file Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Remove spurious 'not' Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Move dbfs utility functions from SparkDataSet Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Add edge case logic to _build_dbfs_path Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Add test for dbfs path construction Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Linting Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Remove spurious print statement :) Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Add pylint disable too-many-public-methods Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Move tests into single method to appease linter Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Modify prefix check to /dbfs/ Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Modify prefix check to /dbfs/ Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Make warning message clearer Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Add release note Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Fix linting Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Update warning message Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Modify log warning level to error Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Modify message back to warning, refer to undefined behaviour Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Modify required prefix to /dbfs/ Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Modify doc string Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Modify warning message 
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Split tests and add filepath to warning Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Modify f string in logging call Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Fix tests Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Lint Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> --------- Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
* Add Snowpark datasets Signed-off-by: Vladimir Filimonov <vladimir_filimonov@mckinsey.com> Signed-off-by: heber-urdaneta <heber_urdaneta@mckinsey.com> Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
* bump version and update release notes * fix pylint errors Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
* Migrate kedro-airflow to static metadata See kedro-org/kedro#2334. Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> * Add explicit PEP 518 build requirements for kedro-datasets Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> * Typos Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> * Remove dangling reference to requirements.txt Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> * Add release notes Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> --------- Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
* Migrate kedro-telemetry to static metadata See kedro-org/kedro#2334. Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> * Add release notes Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> --------- Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
* Add unit test + lint test on GA * trigger GA - will revert Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Fix lint Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Add end to end tests * Add cache key Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Add cache action Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Rename workflow files Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Lint + add comment + default bash Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Add windows test Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Update workflow name + revert changes to READMEs Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Add kedro-telemetry/RELEASE.md to trufflehog ignore Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Add pytables to test_requirements remove from workflow Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Revert "Add pytables to test_requirements remove from workflow" This reverts commit 8203daa. * Separate pip freeze step Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> --------- Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
* Migrate kedro-docker to static metadata See kedro-org/kedro#2334. Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> * Address packaging warning Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> * Fix tests Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> * Actually install current plugin with dependencies Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> * Add release notes Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> --------- Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
Currently, opening Gitpod installs Python 3.11, which breaks everything because we don't support it yet. This PR introduces a simple .gitpod.yml to get it started. Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
* Update APIDataSet Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * Sync ParquetDataSet Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * Sync Test Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * Linting Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * Revert Unnecessary ParquetDataSet Changes Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * Sync release notes Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> --------- Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
…edro-org#182) * bump tables version and remove step in workflow Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * revert version for linux Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * change version to 3.7 Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * remove extra line Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> --------- Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
* Create validate-pr-title.yaml * ci: add `ready_for_review` to the PR type triggers * Update validate-pr-title.yaml * revert: drop the `ready_for_review` type from list * ci: restrict the set of scopes to the plugin names Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
) * refactor TensorFlowModelDataset to Set matching consistency of all other kedro-datasets, DataSet should be camelcase. will be reverted in 0.19.0 Signed-off-by: BrianCechmanek <brian@hazy.com> * Introdcuing .gitpod.yml to kedro-plugins (kedro-org#185) Currently opening gitpod will installed a Python 3.11 which breaks everything because we don't support it set. This PR introduce a simple .gitpod.yml to get it started. Signed-off-by: BrianCechmanek <brian@hazy.com> * sync APIDataSet from kedro's `develop` (kedro-org#184) * Update APIDataSet Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * Sync ParquetDataSet Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * Sync Test Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * Linting Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * Revert Unnecessary ParquetDataSet Changes Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * Sync release notes Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> --------- Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> Signed-off-by: BrianCechmanek <brian@hazy.com> * [kedro-datasets] Bump version of `tables` in `test_requirements.txt` (kedro-org#182) * bump tables version and remove step in workflow Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * revert version for linux Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * change version to 3.7 Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * remove extra line Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> --------- Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> Signed-off-by: BrianCechmanek <brian@hazy.com> * refactor tensorflowModelDataset casing in datasets setup.py Signed-off-by: BrianCechmanek <brian@hazy.com> * add tensorflowmodeldataset bugfix to release.md Signed-off-by: BrianCechmanek <brian@hazy.com> * Update all the doc reference with TensorFlowModelDataSet Signed-off-by: Nok <nok.lam.chan@quantumblack.com> --------- 
Signed-off-by: BrianCechmanek <brian@hazy.com> Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> Signed-off-by: Nok <nok.lam.chan@quantumblack.com> Co-authored-by: Nok Lam Chan <mediumnok@gmail.com> Co-authored-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com> Co-authored-by: Nok <nok.lam.chan@quantumblack.com> Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
Co-authored-by: Jannic <37243923+jmholzer@users.noreply.github.com> Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
I made a few changes:
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Closing this in favour of #206, which has a clean commit history, has signed commits and is based on the latest commit in |
Description
This is the first of a few PRs adding Databricks functionality to Kedro datasets. It includes the ManagedTableDataset, which lets users interface with managed Delta tables in Databricks or locally in PySpark.
Development notes
Changes include one new dataset, databricks.ManagedTableDataSet, which allows users to read from and write to managed Delta tables inside Databricks.
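To make the distinction from file-based datasets concrete, here is a minimal sketch of how a managed table is addressed by name rather than by file path. `ManagedTableRef`, `load_table`, and `save_table` are hypothetical helpers written for illustration only; they are not the class or API added in this PR, and the assumption that the table is identified as `catalog.database.table` reflects the Unity Catalog three-level namespace. The `spark` and `df` arguments are assumed to be a PySpark `SparkSession` and `DataFrame`.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical stand-in for illustration; NOT the dataset class from this PR.
@dataclass(frozen=True)
class ManagedTableRef:
    table: str
    database: str = "default"
    catalog: Optional[str] = None  # only needed when Unity Catalog is in use

    def full_name(self) -> str:
        # Join the parts that are set: catalog.database.table or database.table.
        parts = [self.catalog, self.database, self.table]
        return ".".join(part for part in parts if part)


def load_table(spark, ref: ManagedTableRef):
    # Read by table name; no file path is involved.
    return spark.table(ref.full_name())


def save_table(df, ref: ManagedTableRef, mode: str = "overwrite") -> None:
    # Without an explicit path, saveAsTable creates a managed table.
    df.write.format("delta").mode(mode).saveAsTable(ref.full_name())


if __name__ == "__main__":
    ref = ManagedTableRef(table="orders", database="sales", catalog="main")
    print(ref.full_name())
```

A file-based dataset such as SparkDataSet would instead take a `filepath`, which is the difference in purpose discussed in the conversation above.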
Checklist
RELEASE.md file