[kedro-datasets ] Add Polars.CSVDataSet #95

Merged
merged 13 commits into from
Feb 9, 2023

Conversation

wmoreiraa
Contributor

@wmoreiraa wmoreiraa commented Jan 7, 2023

Signed-off-by: wmoreiraa walber3@gmail.com

Description

Introduce python-polars to Kedro Datasets.
https://www.pola.rs/benchmarks.html

Development notes

  • Add polars.CSVDataSet
  • Same tests as the pandas test case (copy-pasted with minor tweaks)

TODO on this PR:

  • CSVDataSet

All of these use only eager I/O.

Checklist

  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added a description of this change in the relevant RELEASE.md file
  • Added tests to cover my changes

@wmoreiraa wmoreiraa changed the title polars csvdataset [kedro-datasets ] Polars csvdataset Jan 7, 2023
@wmoreiraa
Contributor Author

I'm figuring out how to fix the failing unit test.

@merelcht
Member

I'm figuring out how to fix the failing unit test.

Right now all unit tests are passing, it's just the DCO check that's failing. You can find instructions here https://github.com/kedro-org/kedro-plugins/pull/95/checks?check_run_id=10658463390 on how to solve it.

@wmoreiraa
Contributor Author

Thanks, @merelcht! My sign-off e-mail was wrong; I've committed the fix now. Do you think it's valid to finish this PR with only CSVDataSet?

@wmoreiraa
Contributor Author

@merelcht, I've followed the instructions and rewrote my sign-off email, but the DCO check is still failing for the same reason. I don't know what's wrong now.

@merelcht
Member

@merelcht, I've followed the instructions and rewrote my sign-off email, but the DCO check is still failing for the same reason. I don't know what's wrong now.

Hmm odd.. sometimes it's very hard to get it working 😅 I can manually approve it when the PR is ready to merge. So don't worry about it.
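
For context on the failing check: a DCO check looks for a `Signed-off-by:` trailer in each commit message that matches the commit author. The sketch below approximates that idea in Python. It is an illustration only, not the DCO app's actual implementation; the function name, the email-only comparison, and the example addresses are all assumptions.

```python
import re

# Matches a trailer line such as "Signed-off-by: Jane Doe <jane@example.com>".
# The hyphenated spelling "Signed-off-by" is required; "Signed-off by" does not match.
SIGNOFF_RE = re.compile(
    r"^Signed-off-by:[ \t]*(?P<name>[^<>\n]+?)[ \t]*<(?P<email>[^<>\s]+)>[ \t]*$",
    re.MULTILINE,
)


def has_matching_signoff(message: str, author_email: str) -> bool:
    """Return True if any Signed-off-by trailer carries the author's email.

    Hypothetical helper: it compares emails only (case-insensitively),
    which is a simplification of what a real DCO check does.
    """
    wanted = author_email.strip().lower()
    return any(
        match.group("email").lower() == wanted
        for match in SIGNOFF_RE.finditer(message)
    )


if __name__ == "__main__":
    good = "Add dataset\n\nSigned-off-by: Dev Example <dev@example.com>\n"
    # Author email differs from the one in the trailer, so the check fails.
    print(has_matching_signoff(good, "dev@example.com"))
    print(has_matching_signoff(good, "other@example.com"))
```

A mismatch like the second call is usually repaired by fixing `user.email` in the git configuration and re-creating the trailer, for example with `git commit --amend --signoff`.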

@merelcht
Member

Do you think it's valid to finish this PR with only CSVDataSet?

Yes I think that's fine! Any additional dataset is valuable 🙂 In the future you (or another contributor) can then build more Polars datasets.

@wmoreiraa wmoreiraa marked this pull request as ready for review January 18, 2023 19:17
@merelcht merelcht changed the title [kedro-datasets ] Polars csvdataset [kedro-datasets ] Add Polars.CSVDataSet Jan 20, 2023
Member

@merelcht merelcht left a comment

Thanks for the addition @wmoreiraa !

I've added some comments mainly around updating of the docstring. And could you also add this change to the release notes? 🙂

kedro-datasets/kedro_datasets/polars/csv_dataset.py (outdated, resolved)
kedro-datasets/kedro_datasets/polars/ipc_dataset.py (outdated, resolved)
kedro-datasets/kedro_datasets/polars/csv_dataset.py (outdated, resolved)
kedro-datasets/kedro_datasets/polars/csv_dataset.py (outdated, resolved)
wmoreiraa and others added 8 commits January 23, 2023 07:36
Signed-off-by: wmoreiraa <walber3@gmail.com>
Signed-off-by: wmoreiraa <walber3@gmail.com>
Signed-off-by: wmoreiraa <walber3@gmail.com>
Signed-off-by: wmoreiraa <walber3@gmail.com>
Signed-off-by: wmoreiraa <walber3@gmail.com>
Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: wmoreiraa <walber3@gmail.com>
Signed-off-by: wmoreiraa <walber3@gmail.com>
* Bump relax pyarrow version to work the same way as Pandas

We only use PyArrow for `pandas.ParquetDataSet` as such I suggest we keep our versions pinned to the same range as [Pandas does](https://github.com/pandas-dev/pandas/blob/96fc51f5ec678394373e2c779ccff37ddb966e75/pyproject.toml#L100) for the same reason.

As such I suggest we remove the upper bound as we have users requesting later versions in [support channels](https://kedro-org.slack.com/archives/C03RKP2LW64/p1674040509133529)

* Updated release notes

Signed-off-by: wmoreiraa <walber3@gmail.com>
@wmoreiraa
Contributor Author

I think everything's fine now? @merelcht
Thx for the comments!

Member

@merelcht merelcht left a comment

Thanks for the updates @wmoreiraa. I left a couple more questions, but most importantly you shouldn't bump the version of kedro-datasets in this PR. We'll do that when we do a release.

kedro-datasets/RELEASE.md (outdated, resolved)
kedro-datasets/kedro_datasets/__init__.py (outdated, resolved)
kedro-datasets/kedro_datasets/polars/csv_dataset.py (outdated, resolved)
kedro-datasets/kedro_datasets/polars/csv_dataset.py (outdated, resolved)
Signed-off-by: wmoreiraa <walber3@gmail.com>
Member

@merelcht merelcht left a comment

Thanks for this contribution @wmoreiraa 😄 🎉

Add dataset headers
Signed-off by: wmoreiraa <walber3@gmail.com>

Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
@wmoreiraa
Contributor Author

Just the first one! I've been impacted by layoffs, and now I'm full time job hunting / available, might as well do some LinkedIn post showcasing some polars + kedro.

@merelcht
Member

Just the first one! I've been impacted by layoffs, and now I'm full time job hunting / available, might as well do some LinkedIn post showcasing some polars + kedro.

I'm so sorry to hear that! Good luck with the job hunting 🍀 Your contributions on Kedro are very much appreciated ❤️

@datajoely
Contributor

Good luck @wmoreiraa

@merelcht merelcht removed the request for review from AhdraMeraliQB February 6, 2023 15:18
Contributor

@AhdraMeraliQB AhdraMeraliQB left a comment

Thanks for the contribution @wmoreiraa! 🌟 Happy to approve with just the minor change noted.

kedro-datasets/kedro_datasets/polars/csv_dataset.py (outdated, resolved)
@datajoely
Contributor

I love this and want to see it in Kedro as soon as possible.

My only question is whether, rather than approaching this on a file-type-by-file-type basis, we should approach it the same way we do spark.SparkDataSet and pandas.GenericDataSet?

At the time of writing the following read_* methods can be imported from polars.io:

  • read_avro
  • read_csv
  • read_csv_batched
  • read_delta
  • read_excel
  • read_ipc
  • read_json
  • read_ndjson
  • read_parquet
  • read_sql

In terms of write targets the DataFrame class only supports a subset of these:

  • DataFrame.write_ipc
  • DataFrame.write_parquet
  • DataFrame.write_json
  • DataFrame.write_ndjson
  • DataFrame.write_avro

Is there any merit in abstracting this into a generic approach?
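
To make the question concrete, here is a minimal sketch of what name-based dispatch could look like: one dataset class that resolves `read_<format>` on a namespace and `write_<format>` on the frame. `GenericFrameDataSet`, `resolve_reader`, `resolve_writer` and the stand-in `FakeFrame` are hypothetical names for illustration, not code from this PR or from Kedro. The stand-ins exist only so the sketch runs without polars installed.

```python
from types import SimpleNamespace


def resolve_reader(namespace, file_format):
    """Look up namespace.read_<file_format>, e.g. read_csv or read_parquet."""
    reader = getattr(namespace, f"read_{file_format}", None)
    if not callable(reader):
        raise ValueError(f"unsupported file_format for loading: {file_format!r}")
    return reader


def resolve_writer(frame, file_format):
    """Look up frame.write_<file_format>, e.g. write_parquet or write_json."""
    writer = getattr(frame, f"write_{file_format}", None)
    if not callable(writer):
        raise ValueError(f"unsupported file_format for saving: {file_format!r}")
    return writer


class GenericFrameDataSet:
    """Hypothetical generic dataset: one class, format chosen by name."""

    def __init__(self, namespace, filepath, file_format, load_args=None, save_args=None):
        self._namespace = namespace  # e.g. the imported polars module
        self._filepath = filepath
        self._file_format = file_format
        self._load_args = load_args or {}
        self._save_args = save_args or {}

    def load(self):
        reader = resolve_reader(self._namespace, self._file_format)
        return reader(self._filepath, **self._load_args)

    def save(self, frame):
        writer = resolve_writer(frame, self._file_format)
        writer(self._filepath, **self._save_args)


# Stand-in backend: an in-memory "file system" plus a frame with one writer.
STORE = {}


class FakeFrame:
    def __init__(self, rows):
        self.rows = rows

    def write_json(self, path):
        STORE[path] = list(self.rows)


fake_backend = SimpleNamespace(read_json=lambda path: FakeFrame(STORE[path]))

dataset = GenericFrameDataSet(fake_backend, "cars.json", "json")
dataset.save(FakeFrame([{"model": "a"}]))
print(dataset.load().rows)
```

With polars installed, `GenericFrameDataSet(polars, "data.parquet", "parquet")` would resolve `polars.read_parquet` on load and `DataFrame.write_parquet` on save. Formats that have a reader but no matching writer fail at save time with the `ValueError` above, which is the asymmetry the two lists point at. Note that `DataFrame.write_csv` also exists, so CSV is among the formats that round-trip.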

Changing equals to frame_equal

Co-authored-by: Ahdra Merali <90615669+AhdraMeraliQB@users.noreply.github.com>
@wmoreiraa
Contributor Author

Thank you for the fix, @AhdraMeraliQB!
@datajoely To be honest, I didn't know these datasets existed until now. I'll take a look at the code, but the direction seems about right.

@merelcht
Member

merelcht commented Feb 9, 2023

I'd suggest getting this merged in first since it's ready now and then look at a generic approach later.

@wmoreiraa
Contributor Author

@merelcht @datajoely, I've finished the test cases on the Generic Approach. Should I open a second PR then?

@datajoely
Contributor

As merel said let's get this one in and then do a follow-up?

@wmoreiraa
Contributor Author

Fine by me, I'll just wait and then fork again.

@merelcht
Member

merelcht commented Feb 9, 2023

Fine by me, I'll just wait and then fork again.

Thank you! I've just resolved some minor merge conflicts and will merge this PR as soon as the builds finish successfully 🙂

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
@merelcht merelcht merged commit 61e0f08 into kedro-org:main Feb 9, 2023
@wmoreiraa wmoreiraa mentioned this pull request Feb 13, 2023
3 tasks
yassineAlouini pushed a commit to yassineAlouini/kedro-plugins that referenced this pull request Feb 24, 2023
Signed-off-by: wmoreiraa <walber3@gmail.com>

Signed-off-by: Yassine Alouini <yalouini@idmog.com>
AhdraMeraliQB pushed a commit that referenced this pull request Feb 27, 2023
* [kedro-docker] Layers size optimization (#92)

* [kedro-docker] Layers size optimization

Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com>

* Adjust test requirements

Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com>

* Skip coverage check on tests dir (some do not execute on Windows)

Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com>

* Update .coveragerc with the setup

Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com>

* Fix bandit so it does not scan kedro-datasets

Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com>

* Fixed existence test

Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com>

* Check why dir is not created

Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com>

* Kedro starters are fixed now

Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com>

* Increased no-output-timeout for long spark image build

Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com>

* Spark image optimized

Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com>

* Linting

Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com>

* Switch to slim image always

Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com>

* Trigger build

Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com>

* Use textwrap.dedent for nicer indentation

Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com>

* Revert "Use textwrap.dedent for nicer indentation"

This reverts commit 3a1e3f8.

Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com>

* Revert "Revert "Use textwrap.dedent for nicer indentation""

This reverts commit d322d35.

Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com>

* Make tests read more lines (to skip all deprecation warnings)

Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com>

Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com>
Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com>
Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Release Kedro-Docker 0.3.1 (#94)

* Add release notes for kedro-docker 0.3.1

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Update version in kedro_docker module

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Bump version and update release notes (#96)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Make the SQLQueryDataSet compatible with mssql.

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Add one test + update RELEASE.md.

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Add missing pyodbc for tests.

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Mock connection as well.

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Add more dates parsing for mssql backend (thanks to fgaudindelrieu@idmog.com)

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Fix an error in docstring of MetricsDataSet (#98)

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Bump relax pyarrow version to work the same way as Pandas (#100)

* Bump relax pyarrow version to work the same way as Pandas

We only use PyArrow for `pandas.ParquetDataSet` as such I suggest we keep our versions pinned to the same range as [Pandas does](https://github.com/pandas-dev/pandas/blob/96fc51f5ec678394373e2c779ccff37ddb966e75/pyproject.toml#L100) for the same reason.

As such I suggest we remove the upper bound as we have users requesting later versions in [support channels](https://kedro-org.slack.com/archives/C03RKP2LW64/p1674040509133529)

* Updated release notes

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Add missing type in catalog example.

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Add one more unit tests for adapt_mssql.

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* [FIX] Add missing mocker from date test.

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* [TEST] Add a wrong input test.

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Add pyodbc dependency.

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* [FIX] Remove dict() in tests.

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Change check to check on plugin name (#103)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Set coverage in pyproject.toml (#105)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Move coverage settings to pyproject.toml (#106)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Replace kedro.pipeline with modular_pipeline.pipeline factory (#99)

* Add non-spark related test changes
Replace kedro.pipeline.Pipeline with
kedro.pipeline.modular_pipeline.pipeline factory.
This is for symmetry with changes made to the main kedro library.

Signed-off-by: Adam Farley <adamfrly@gmail.com>

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Fix outdated links in Kedro Datasets (#111)

* fix links

* fix dill links

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Fix docs formatting and phrasing for some datasets (#107)

* Fix docs formatting and phrasing for some datasets

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* Manually fix files not resolved with patch command

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* Apply fix from #98

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

---------

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Release `kedro-datasets` `version 1.0.2` (#112)

* bump version and update release notes

* fix pylint errors

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Bump pytest to 7.2 (#113)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Prefix Docker plugin name with "Kedro-" in usage message (#57)

* Prefix Docker plugin name with "Kedro-" in usage message

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Keep Kedro-Docker plugin docstring from appearing in `kedro -h` (#56)

* Keep Kedro-Docker plugin docstring from appearing in `kedro -h`

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* [kedro-datasets ] Add `Polars.CSVDataSet` (#95)

Signed-off-by: wmoreiraa <walber3@gmail.com>

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Remove deprecated `test_requires` from `setup.py` in Kedro-Docker (#54)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* [FIX] Fix ds to data_set.

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

---------

Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com>
Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com>
Signed-off-by: Yassine Alouini <yalouini@idmog.com>
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Co-authored-by: Mariusz Strzelecki <szczeles@gmail.com>
Co-authored-by: Jannic <37243923+jmholzer@users.noreply.github.com>
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Co-authored-by: OKA Naoya <pn11@users.noreply.github.com>
Co-authored-by: Joel <35801847+datajoely@users.noreply.github.com>
Co-authored-by: adamfrly <45516720+adamfrly@users.noreply.github.com>
Co-authored-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com>
Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Co-authored-by: Walber Moreira <58264877+wmoreiraa@users.noreply.github.com>
dannyrfar pushed a commit to dannyrfar/kedro-plugins that referenced this pull request Mar 13, 2023
Signed-off-by: wmoreiraa <walber3@gmail.com>

Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
dannyrfar pushed a commit to dannyrfar/kedro-plugins that referenced this pull request Mar 13, 2023
Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
dannyrfar pushed a commit to dannyrfar/kedro-plugins that referenced this pull request Mar 21, 2023
Signed-off-by: wmoreiraa <walber3@gmail.com>

Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
dannyrfar pushed a commit to dannyrfar/kedro-plugins that referenced this pull request Mar 21, 2023
Co-authored-by: Joel <35801847+datajoely@users.noreply.github.com>
Co-authored-by: adamfrly <45516720+adamfrly@users.noreply.github.com>
Co-authored-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com>
Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Co-authored-by: Walber Moreira <58264877+wmoreiraa@users.noreply.github.com>
Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
dannyrfar pushed a commit to dannyrfar/kedro-plugins that referenced this pull request Mar 21, 2023
Signed-off-by: wmoreiraa <walber3@gmail.com>

Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
dannyrfar pushed a commit to dannyrfar/kedro-plugins that referenced this pull request Mar 21, 2023
* [kedro-docker] Layers size optimization (kedro-org#92)

* [kedro-docker] Layers size optimization

Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com>

* Adjust test requirements

Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com>

* Skip coverage check on tests dir (some do not execute on Windows)

Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com>

* Update .coveragerc with the setup

Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com>

* Fix bandit so it does not scan kedro-datasets

Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com>

* Fixed existence test

Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com>

* Check why dir is not created

Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com>

* Kedro starters are fixed now

Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com>

* Increased no-output-timeout for long spark image build

Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com>

* Spark image optimized

Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com>

* Linting

Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com>

* Switch to slim image always

Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com>

* Trigger build

Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com>

* Use textwrap.dedent for nicer indentation

Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com>

* Revert "Use textwrap.dedent for nicer indentation"

This reverts commit 3a1e3f8.

Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com>

* Revert "Revert "Use textwrap.dedent for nicer indentation""

This reverts commit d322d35.

Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com>

* Make tests read more lines (to skip all deprecation warnings)

Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com>

Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com>
Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com>
Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Release Kedro-Docker 0.3.1 (kedro-org#94)

* Add release notes for kedro-docker 0.3.1

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Update version in kedro_docker module

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Bump version and update release notes (kedro-org#96)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Make the SQLQueryDataSet compatible with mssql.

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Add one test + update RELEASE.md.

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Add missing pyodbc for tests.

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Mock connection as well.

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Add more dates parsing for mssql backend (thanks to fgaudindelrieu@idmog.com)

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Fix an error in docstring of MetricsDataSet (kedro-org#98)

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Bump relax pyarrow version to work the same way as Pandas (kedro-org#100)

* Bump relax pyarrow version to work the same way as Pandas

We only use PyArrow for `pandas.ParquetDataSet`; as such, I suggest we keep our versions pinned to the same range as [Pandas does](https://github.com/pandas-dev/pandas/blob/96fc51f5ec678394373e2c779ccff37ddb966e75/pyproject.toml#L100) for the same reason.

I also suggest we remove the upper bound, as we have users requesting later versions in [support channels](https://kedro-org.slack.com/archives/C03RKP2LW64/p1674040509133529).

* Updated release notes

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Add missing type in catalog example.

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Add one more unit tests for adapt_mssql.

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* [FIX] Add missing mocker from date test.

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* [TEST] Add a wrong input test.

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Add pyodbc dependency.

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* [FIX] Remove dict() in tests.

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Change check to check on plugin name (kedro-org#103)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Set coverage in pyproject.toml (kedro-org#105)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Move coverage settings to pyproject.toml (kedro-org#106)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Replace kedro.pipeline with modular_pipeline.pipeline factory (kedro-org#99)

* Add non-spark related test changes
Replace kedro.pipeline.Pipeline with
kedro.pipeline.modular_pipeline.pipeline factory.
This is for symmetry with changes made to the main kedro library.

Signed-off-by: Adam Farley <adamfrly@gmail.com>

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Fix outdated links in Kedro Datasets (kedro-org#111)

* fix links

* fix dill links

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Fix docs formatting and phrasing for some datasets (kedro-org#107)

* Fix docs formatting and phrasing for some datasets

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* Manually fix files not resolved with patch command

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* Apply fix from kedro-org#98

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

---------

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Release `kedro-datasets` `version 1.0.2` (kedro-org#112)

* bump version and update release notes

* fix pylint errors

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Bump pytest to 7.2 (kedro-org#113)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Prefix Docker plugin name with "Kedro-" in usage message (kedro-org#57)

* Prefix Docker plugin name with "Kedro-" in usage message

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Keep Kedro-Docker plugin docstring from appearing in `kedro -h` (kedro-org#56)

* Keep Kedro-Docker plugin docstring from appearing in `kedro -h`

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* [kedro-datasets ] Add `Polars.CSVDataSet` (kedro-org#95)

Signed-off-by: wmoreiraa <walber3@gmail.com>

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* Remove deprecated `test_requires` from `setup.py` in Kedro-Docker (kedro-org#54)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: Yassine Alouini <yalouini@idmog.com>

* [FIX] Fix ds to data_set.

Signed-off-by: Yassine Alouini <yalouini@idmog.com>

---------

Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com>
Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com>
Signed-off-by: Yassine Alouini <yalouini@idmog.com>
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Co-authored-by: Mariusz Strzelecki <szczeles@gmail.com>
Co-authored-by: Jannic <37243923+jmholzer@users.noreply.github.com>
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Co-authored-by: OKA Naoya <pn11@users.noreply.github.com>
Co-authored-by: Joel <35801847+datajoely@users.noreply.github.com>
Co-authored-by: adamfrly <45516720+adamfrly@users.noreply.github.com>
Co-authored-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com>
Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Co-authored-by: Walber Moreira <58264877+wmoreiraa@users.noreply.github.com>
Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
@wmoreiraa wmoreiraa mentioned this pull request Apr 1, 2023
3 tasks
@astrojuanlu
Member

Most up-to-date effort: gh-170.
