Add _SINGLE_PROCESS property to CachedDataSet #1905
Thank you for your contribution! 😄 I've fixed the linter in another branch, so it should pass now.
I've left one minor suggestion and then it can be merged!
kedro/io/cached_dataset.py
# for parallelism within a Spark pipeline please consider
# ``ThreadRunner`` instead
We can remove these two sentences, because for the CachedDataSet
this is not related to Spark in any way.
Sure! Thanks for the suggestion
@MerelTheisenQB maybe we could keep the suggestion to use ThreadRunner?
# for parallelism please consider ``ThreadRunner`` instead
Congratulations on your first PR! 🎉 Great work.
For future reference, working on a branch in the kedro-org/kedro repo is totally fine and is the way we normally do it. It simplifies the workflow quite a bit 🙂.
Thanks @jmholzer! I've followed the process in the contribution guidelines. Next time, I'll create a branch direct on
Ohh I see 😃 I did the same for my first PR. Thanks for reminding us, let me see about creating an issue to update our contributor guidelines.
Signed-off-by: Carla Vieira <carlaprv@hotmail.com> Signed-off-by: carlaprv <carlaprv@hotmail.com>
Signed-off-by: carlaprv <carlaprv@hotmail.com>
Thanks again for the contribution! ⭐ ⭐ ⭐
Signed-off-by: Carla Vieira <carlaprv@hotmail.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Carla Vieira <carlaprv@hotmail.com> Signed-off-by: nickolasrm <nickolasrochamachado@gmail.com>
* Release/0.18.3 (#1856) * Update release version and release notes Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * Update missing release notes Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * update vresion Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * update release notes Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Remove comment from code example Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Remove more comments Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Add YAML formatting Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Add missing import Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Remove even more comments Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Remove more even more comments Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Add pickle requirement to extras_require Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Try fix YAML docs Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Try fix YAML docs pt 2 Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Fix code snippets in docs (#1876) * Fix code snippets Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Separate code blocks Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Lint Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Fix issue with specifying format for SparkHiveDataSet (#1857) Signed-off-by: jstammers <jimmy.stammers@cgastrategy.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Update RELEASE.md (#1883) * Update RELEASE.md * fix broken link * Update RELEASE.md Co-authored-by: Merel Theisen <49397448+MerelTheisenQB@users.noreply.github.com> Co-authored-by: Merel Theisen 
<49397448+MerelTheisenQB@users.noreply.github.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Deprecate `kedro test` and `kedro lint` (#1873) * Deprecating `kedro test` and `kedro lint` Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * Deprecate commands Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * Make kedro looks prettier * Update Linting Signed-off-by: Nok <nok_lam_chan@mckinsey.com> Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> Signed-off-by: Nok <nok_lam_chan@mckinsey.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Fix micro package pull from PyPI (#1848) Signed-off-by: Florian Gaudin-Delrieu <florian.gaudindelrieu@gmail.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Update Error message for `VersionNotFoundError` to handle Permission related issues better (#1881) * Update message for VersionNotFoundError Signed-off-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com> * Add test for VersionNotFoundError for cloud protocols * Update test_data_catalog.py Update NoVersionFoundError test * minor linting update * update docs link + styling changes * Revert "update docs link + styling changes" This reverts commit 6088e00. 
* Update test with styling changes * Update RELEASE.md Signed-off-by: ankatiyar <ankitakatiyar2401@gmail.com> Signed-off-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com> Signed-off-by: ankatiyar <ankitakatiyar2401@gmail.com> Co-authored-by: Ahdra Merali <90615669+AhdraMeraliQB@users.noreply.github.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Update experiment tracking documentation with working examples (#1893) Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Add NHS AI Lab and ReSpo.Vision to companies list (#1878) Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Document how users can use pytest instead of kedro test (#1879) * Add best_practices.md with introductory sections Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Add pytest and pytest-cov sections Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Add pytest-cov coverage report Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Add sections on pytest-cov Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Add automated_testing to index.rst Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Reformat third-party library names and clean grammar. 
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Add link to virtual environment docs Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Add example of good test naming Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Improve link accessibility Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Improve pytest docs link accessibility Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Add reminder link to virtual environment docs Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Fix formatting in link to coverage docs Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Remove reference to /src under 'Run your tests' Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Modify references to <project_name> to <package_name> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Fix sentence structure Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Fix broken databricks doc link Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Capitalise Kedro-Viz in the "Visualize layers" section (#1899) * Capitalised kedro-viz Signed-off-by: yash6318 <yash.agrawal.cse21@iitbhu.ac.in> * capitalised Kedro viz Signed-off-by: yash6318 <yash.agrawal.cse21@iitbhu.ac.in> * Updated set_up_experiment_tracking.md Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: yash6318 <yash.agrawal.cse21@iitbhu.ac.in> Signed-off-by: yash6318 <yash.agrawal.cse21@iitbhu.ac.in> Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Fix linting on autmated test page (#1906) Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Add _SINGLE_PROCESS property to CachedDataSet (#1905) 
Signed-off-by: Carla Vieira <carlaprv@hotmail.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Update the tutorial of "Visualise pipelines" (#1913) * Change a file extention to match the previous article Signed-off-by: dinotuku <kuan.tung@epfl.ch> * Add a missing import Signed-off-by: dinotuku <kuan.tung@epfl.ch> * Change both preprocessed datasets to parquet files Signed-off-by: dinotuku <kuan.tung@epfl.ch> * Change data type to ParquetDataSet for parquet files Signed-off-by: dinotuku <kuan.tung@epfl.ch> * Add a note for installing seaborn if it is not installed Signed-off-by: dinotuku <kuan.tung@epfl.ch> Signed-off-by: dinotuku <kuan.tung@epfl.ch> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Document how users can use linting tools instead of `kedro lint` (#1904) * Add documentation for linting tools Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Revert changes to commands_reference.md Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Update linting docs with suggestions Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Update linting doc Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Make core config accessible in dict get way (#1870) Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Create dependabot.yml configuration file for version updates (#1862) * Create dependabot.yml configuration file * Update dependabot.yml Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> * add target-branch Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> * Update dependabot.yml Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> * limit dependabot to just dependency folder Signed-off-by: SajidAlamQB 
<90610031+SajidAlamQB@users.noreply.github.com> * Update test_requirements.txt Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> * Update MANIFEST.in Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> * fix e2e Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> * Update continue_config.yml Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> * Update requirements.txt Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> * Update requirements.txt Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> * fix link Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> * revert Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> * Delete requirements.txt Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Update dependabot config (#1928) Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Update robots.txt (#1929) Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * fix broken link (#1950) Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Update dependabot.yml config (#1938) * Update dependabot.yml Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> * pin jupyterlab_services to requirments Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> * lint Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Update setup.py Jinja2 dependencies (#1954) Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Update pip-tools requirement from ~=6.5 to ~=6.9 in /dependency (#1957) Updates the requirements on 
[pip-tools](https://github.com/jazzband/pip-tools) to permit the latest version. - [Release notes](https://github.com/jazzband/pip-tools/releases) - [Changelog](https://github.com/jazzband/pip-tools/blob/master/CHANGELOG.md) - [Commits](jazzband/pip-tools@6.5.0...6.9.0) --- updated-dependencies: - dependency-name: pip-tools dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Update toposort requirement from ~=1.5 to ~=1.7 in /dependency (#1956) Updates the requirements on [toposort]() to permit the latest version. --- updated-dependencies: - dependency-name: toposort dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Add deprecation warning to package_name argument in session create() (#1953) Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Remove redundant `resolve_load_version` call (#1911) * remove a redundant function call Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * Remove redundant resolove_load_version & fix test Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * Fix HoloviewWriter tests with more specific error message pattern & Lint Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * Rename tests Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Make docstring in test starter match real starters 
(#1916) Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Try to fix formatting error Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> * Specify pickle import Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> Signed-off-by: jstammers <jimmy.stammers@cgastrategy.com> Signed-off-by: Nok <nok_lam_chan@mckinsey.com> Signed-off-by: Florian Gaudin-Delrieu <florian.gaudindelrieu@gmail.com> Signed-off-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com> Signed-off-by: ankatiyar <ankitakatiyar2401@gmail.com> Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> Signed-off-by: yash6318 <yash.agrawal.cse21@iitbhu.ac.in> Signed-off-by: Carla Vieira <carlaprv@hotmail.com> Signed-off-by: dinotuku <kuan.tung@epfl.ch> Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: Nok <mediumnok@gmail.com> Co-authored-by: Jimmy Stammers <jimmy.stammers@gmail.com> Co-authored-by: Merel Theisen <49397448+MerelTheisenQB@users.noreply.github.com> Co-authored-by: Florian Gaudin-Delrieu <9217921+FlorianGD@users.noreply.github.com> Co-authored-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com> Co-authored-by: Yetunde Dada <43755008+yetudada@users.noreply.github.com> Co-authored-by: Jannic <37243923+jmholzer@users.noreply.github.com> Co-authored-by: Yash Agrawal <96697569+yash6318@users.noreply.github.com> Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Co-authored-by: Carla Vieira <carlaprv@hotmail.com> Co-authored-by: Kuan Tung <kuan.tung@epfl.ch> Co-authored-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Merel Theisen 
<49397448+merelcht@users.noreply.github.com> Co-authored-by: Merel Theisen <merel.theisen@quantumblack.com>
Description
Solves #1888
Development notes
The CachedDataSet cannot be used with the ParallelRunner. This PR adds the `_SINGLE_PROCESS` property to it, just like in DeltaTableDataSet.
Before this PR, trying to use CachedDataSet and ParallelRunner together was failing.
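As a rough illustration of the idea, here is a minimal, self-contained sketch of how a class-level `_SINGLE_PROCESS` flag can let a multiprocessing runner reject a dataset up front. It does not import Kedro: the classes `PlainDataSet` and `CachedLikeDataSet` and the helpers `find_single_process_datasets` and `validate_for_multiprocessing` are hypothetical stand-ins, and the `getattr` check is an assumption about how such a flag would be read, not a copy of the runner's actual code.

```python
class PlainDataSet:
    """Hypothetical dataset with no multiprocessing restriction."""


class CachedLikeDataSet:
    """Hypothetical stand-in for a dataset that keeps an in-memory cache."""

    # The flag is a plain class attribute; a runner only needs to read it.
    # for parallelism please consider ``ThreadRunner`` instead
    _SINGLE_PROCESS = True


def find_single_process_datasets(catalog):
    """Return the sorted names of datasets flagged as single-process only.

    Datasets without the attribute are treated as safe (default False).
    """
    return sorted(
        name
        for name, data_set in catalog.items()
        if getattr(data_set, "_SINGLE_PROCESS", False)
    )


def validate_for_multiprocessing(catalog):
    """Raise early, with a readable message, if any dataset is flagged."""
    blocked = find_single_process_datasets(catalog)
    if blocked:
        raise AttributeError(
            f"These datasets cannot be used with multiprocessing: {blocked}. "
            "Consider a thread-based runner instead."
        )


catalog = {"raw": PlainDataSet(), "cached": CachedLikeDataSet()}
flagged = find_single_process_datasets(catalog)

try:
    validate_for_multiprocessing(catalog)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
```

With a check like this, the failure surfaces before any worker process starts and names the offending dataset, rather than appearing later as an opaque error inside a worker.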
Checklist
RELEASE.md
file