Bringing Scala DataFrames into PySpark #209
Comments
Got DataFrames working in PySpark!
Loading various things:
Actually running analyses:
One more thing you need to do in order to get the above to work:
@ianmilligan1 Check out branch Start PySpark as follows:
Then you should be able to do the following:
I've tried to make it as simple as possible... give it a whirl and let me know!
Do you like the experience? I have a few more minor tweaks based on more standard Python conventions, but if it looks good, then I'll send a PR and let's get this merged in.
Yeah, this is great. Sounds perfect!
See #214 for PR. The PR includes proper refactoring into Python modules and integration into Maven to create the "deploy" zip. You'll need to run Maven to build the deploy zip:
Then start up PySpark with AUT as follows:
Then the following should work:
Note that |
@greebie
How to connect the existing DF code in Scala to PySpark:
https://stackoverflow.com/questions/36023860/how-to-use-a-scala-class-inside-pyspark
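For anyone following along, here is a rough sketch of the general pattern for calling a Scala class from PySpark: look the class up through the py4j gateway (`sc._jvm`), call a method that returns a JVM DataFrame, and wrap the result in a Python `DataFrame`. The class name `com.example.Foo` and both helper names are hypothetical placeholders, not the actual AUT classes or the code from this thread, and the wrapper assumes the Scala method returns a DataFrame.

```python
from functools import reduce
from types import SimpleNamespace


def resolve_jvm_class(jvm, dotted_name):
    """Walk a dotted JVM name (e.g. 'com.example.Foo') one attribute at a time."""
    return reduce(getattr, dotted_name.split("."), jvm)


def scala_df_to_pyspark(sc, sql_context, dotted_name, method, *args):
    """Call a Scala method that returns a JVM DataFrame and wrap it for Python.

    Hypothetical helper: assumes `method` exists on the Scala object and
    returns a DataFrame. Arguments must already be JVM-compatible, e.g.
    pass sc._jsc.sc() rather than the Python SparkContext itself.
    """
    from pyspark.sql import DataFrame  # imported lazily; needs PySpark installed

    scala_obj = resolve_jvm_class(sc._jvm, dotted_name)
    jdf = getattr(scala_obj, method)(*args)
    return DataFrame(jdf, sql_context)


# Stand-in for sc._jvm so the lookup can be exercised without a running JVM.
fake_jvm = SimpleNamespace(
    com=SimpleNamespace(example=SimpleNamespace(Foo="Foo class"))
)
resolved = resolve_jvm_class(fake_jvm, "com.example.Foo")
```

In a real session, the jar containing the Scala class has to be on the driver classpath (for example via `--jars` or `--driver-class-path` when starting PySpark), otherwise the lookup through `sc._jvm` will not find it.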