Add PySpark to the add-ons flow #3169
Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com>
The general approach looks good, but instead of adding the Spark-specific files to the basic template, these should be fetched from the new starter: https://github.com/kedro-org/kedro-starters/tree/main/spaceflights-pyspark. You might also have to create a `post_gen_project` file there.
Signed-off-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com>
Signed-off-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>
I see all tests are passing except linting, so I took the liberty of fixing it.
lol we did it at the same time.
If you combine Not related to just pyspark: I also noticed that
Yeah part of the stuff was stripped from the
We could just add checks to make sure that the entries we're adding are not already present.
But these optional requirements should only be added if the user wants them. If they haven't selected
Our tests should have caught this. I will make a separate PR to fix the test. Essentially, I think we need to parse the pyproject.toml properly instead of reading it as a text file.
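To illustrate why parsing matters here, below is a minimal sketch, not the actual test from the repository. It assumes the failure mode is a section that gets appended twice; the sample file content and the helper name `is_valid_toml` are hypothetical. A plain substring check passes on both inputs, while a real TOML parser rejects the duplicated table.

```python
try:
    import tomllib  # standard library in Python 3.11+
except ModuleNotFoundError:
    # Assumption: on older interpreters the `tomli` backport is installed.
    import tomli as tomllib

# Hypothetical generated pyproject.toml content, for illustration only.
SAMPLE_PYPROJECT = """
[project]
name = "example-project"
version = "0.1.0"

[tool.pytest.ini_options]
addopts = "--cov"
"""

# Same file with the pytest table appended a second time, as could happen
# if two options each write their own copy of the section.
DUPLICATED = SAMPLE_PYPROJECT + '\n[tool.pytest.ini_options]\naddopts = "--cov"\n'


def is_valid_toml(text: str) -> bool:
    """Return True if the text parses as TOML, False otherwise."""
    try:
        tomllib.loads(text)
    except tomllib.TOMLDecodeError:
        return False
    return True


# A text-based check cannot tell the two files apart...
print("[tool.pytest.ini_options]" in SAMPLE_PYPROJECT)
print("[tool.pytest.ini_options]" in DUPLICATED)

# ...but parsing does, because TOML forbids declaring a table twice.
print(is_valid_toml(SAMPLE_PYPROJECT))
print(is_valid_toml(DUPLICATED))
```

A test built this way fails as soon as any combination of options produces a file that is not valid TOML, regardless of which section is at fault.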
I created the test in #3230; it catches an invalid combination that we need to fix.
I've left some minor comments but I think this is almost ready to be merged. I suggest fixing the other issues I found in a separate PR, because they aren't caused by the changes here and that shouldn't block the work for adding the viz option.
Oh, I've already added some changes that attempt to fix this, but I can revert them and open a new PR instead? @merelcht
Ah no, that's great!! I'll have a look 😄 I just didn't want to block this PR with my earlier comments.
This reverts commit 343e805.
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com>
Left one more comment about the pyspark check. `add_ons` is a list there, so `==` will always fail. But otherwise this looks good to be merged! 👍
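The point about the list comparison can be sketched as follows. The function names and the `"Pyspark"` value are hypothetical and only stand in for whatever the actual check uses; the behaviour shown is plain Python semantics.

```python
def wants_pyspark_buggy(add_ons):
    # Compares the whole list to a string; a list never equals a string,
    # so this is False even when "Pyspark" was selected.
    return add_ons == "Pyspark"


def wants_pyspark(add_ons):
    # Membership test: True when "Pyspark" is one of the selected options.
    return "Pyspark" in add_ons


selected = ["Pyspark", "Viz"]  # hypothetical selection
print(wants_pyspark_buggy(selected))
print(wants_pyspark(selected))
```

Using `in` keeps the check correct whether the user selects one option or several.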
Description
PySpark should be added to the add-ons options, allowing users to get set up for PySpark (see #2506 (comment)). This option should make use of the new spaceflights-pyspark starter: #2984
When a user selects PySpark, they should get the config needed to run a Kedro project with PySpark. This shouldn't include the example pipelines; those will be handled in a follow-up ticket.
Development notes
Blocked by:
kedro-org/kedro-starters#170
Checklist
RELEASE.md file