serializeToBundle object issue #8
Comments
@drkmd8 Can you give us version information for both the Python and MLeap JVM packages, as well as the version of Spark that you are using?
I used Python 3.6.1, MLeap 0.8.1 (pip install mleap), and PySpark 2.1.1+hadoop2.7. I think this problem is caused by the Python mleap library: Scala seems to work fine, but Python requires running with external jar files that include the MLeap classes.
I have the same issue at the moment as well.
Yes, I have the same issue. My solution is to add the jar file to the PySpark jars directory inside the Python package path: site-packages/pyspark/jars/. I hope it is helpful for you.
I also solved it by adding the jars manually to /usr/lib/spark/jars.
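The jars directory the two comments above refer to can be located programmatically rather than hard-coding a site-packages path. A small sketch (the helper name is mine; it assumes pyspark is installed as a regular package):

```python
import importlib.util
import os

def pyspark_jars_dir():
    """Return the jars/ folder of the installed pyspark package, or None."""
    spec = importlib.util.find_spec("pyspark")
    if spec is None or not spec.submodule_search_locations:
        return None
    return os.path.join(list(spec.submodule_search_locations)[0], "jars")

# Copy the MLeap Spark jars (e.g. mleap-spark_2.11-<version>.jar) into this
# directory, or into the cluster's Spark jars folder such as /usr/lib/spark/jars.
print(pyspark_jars_dir())
```

This avoids guessing which Python environment's site-packages PySpark actually lives in.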
Hi @alexkayal & @tianhongjie, I have tried your solution and it fixed the issue. I am not sure how it can happen, since my dataframe only has primitive types.
@alexkayal @tianhongjie @Khiem-Tran,
I fixed the issue.
@elgalu, thank you so much for your prompt response.
Make sure it is in the CLASSPATH. Note that Py4J has its own jars/ folder, and if you install pyspark separately it also comes with its own jars/ folder. What I do is remove all those jars directories and symlink them to one /jars directory where I put together the whole set of working versions. You can find all my working jars at: Pending: to build an sbt or pom.xml project (instead of a bunch of jars).
I have the same problem as you; have you found an answer?
@elgalu @siyouhe666, I have been using Spark 2.2.1 and the config. Btw, I also followed this [blog](https://medium.com/@bogdan.cojocar/pyspark-and-xgboost-integration-tested-on-the-kaggle-titanic-dataset-4e75a568bdb) to fix the xgboost dependency, because somehow my mleap-xgboost does not work properly.
Thanks, I solved this problem by changing my Spark version to 2.4.0.
I am using Python 3.6 and PySpark 2.3.1. When calling:
@yairdata I have the same problem as you; have you found an answer? Thanks a lot.
@yairdata I solved this problem by adjusting the version of MLeap. Originally I used 0.13.0; now I use 0.11.0, but it raises another problem:
@SoloBean - I solved this problem with 0.13.0 by specifying spark.jars.packages to point to ml.combust.mleap:mleap-spark-base_2.11:0.13.0,ml.combust.mleap:mleap-spark_2.11:0.13.0.
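For reference, `spark.jars.packages` takes a single comma-separated string of Maven coordinates. A sketch of assembling it as described above (the 0.13.0 / Scala 2.11 coordinates come from the comment; adjust them to match your Spark and Scala versions):

```python
# Comma-separated Maven coordinates for spark.jars.packages.
MLEAP_PACKAGES = ",".join([
    "ml.combust.mleap:mleap-spark-base_2.11:0.13.0",
    "ml.combust.mleap:mleap-spark_2.11:0.13.0",
])

# Pass it when building the session (requires Spark and a JVM):
#   from pyspark.sql import SparkSession
#   spark = (SparkSession.builder
#            .config("spark.jars.packages", MLEAP_PACKAGES)
#            .getOrCreate())
print(MLEAP_PACKAGES)
```

Spark resolves these coordinates (and their transitive dependencies) from Maven Central at session startup, which is why this tends to be more reliable than copying individual jars by hand.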
@yairdata - I also solved this problem by adding jars to /jars, but after adding all the jars I know of, another problem came up that I don't know how to solve by adding a jar:
@SoloBean - I think there is an open issue about that dependency conflict, but I'm not sure.
weird jar dependency issue: |
Took some time to figure it out, and hence putting the steps to resolve it below.
The most straightforward way to circumvent both of the above issues is to invoke pyspark via the below: This does not seem to be a real issue and can be closed by the admins. But I wonder why newer versions of mleap are not published to PyPI.
@SoloBean Running into the same issue.
I have the same issue and added the jars. My versions:
Hello @hollinwilkins. I have the same issue: I added the jars and am still getting the same error: ERROR:root:Exception while sending command. During handling of the above exception, another exception occurred: Traceback (most recent call last): I am on Python 3. Please help me out of this. Thanks.
Works with the mleap 0.13.0 version.
Hi @yairdata, how did you manage to find out which version of the jar file is the compatible one?
@y-tee - A lot of trial & error... I wish it was documented somewhere. Since it wasn't, I pasted it here to help others.
@yairdata Did you try all the versions 😱
@y-tee Not all versions. There are compatible jar versions, but not all of them are listed as dependencies, so it is trial & error.
I've released the Python mleap version 0.15.0 just today, FYI: https://pypi.org/project/mleap/#history. Please let me know if you see any issues.
My mleap is 0.15.0 and Spark is 2.4.4, and I'm having this issue again. Error:
I also have a problem with mleap 0.15.0 and Spark 2.4.4. The error from the notebook:
@felixgao have you fixed this problem? I am using the same versions and facing the same problem.
I agree with others that this is a tricky dependency problem, not a problem with MLeap per se. Here is how I solved it on my MacBook:
My PySpark version is 2.4.5 (see the MLeap GitHub page for which version of MLeap works with which version of Spark). When I first ran spark-submit, I got a further error that Spark could not download some additional dependencies. Then I used Maven from the command line to download the dependencies; here are the three I needed:
If you need different jars, you can find the coordinates by searching mvnrepository.com in your browser.
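An alternative to downloading the jars by hand is to let Spark resolve them itself by setting `PYSPARK_SUBMIT_ARGS` before pyspark is imported. A sketch (the coordinate below is an example and must match your Spark/Scala versions):

```python
import os

# --packages makes Spark pull the jars (and their transitive dependencies)
# from Maven Central at startup; the trailing 'pyspark-shell' token is required.
os.environ["PYSPARK_SUBMIT_ARGS"] = (
    "--packages ml.combust.mleap:mleap-spark_2.11:0.15.0 pyspark-shell"
)

# Import pyspark only AFTER the environment variable is set:
# import pyspark
```

Setting the variable after pyspark has already started a JVM has no effect, which is a common reason this approach appears not to work.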
Hi, I am trying to build an AWS SageMaker model which includes a Spark PipelineModel for feature transformation. When I use mleap inside my Docker container to serialize the PipelineModel, I get a similar exception. I am not sure how to include all these MLeap jars in my Docker container. Can anyone help me get around this?
Same issue here running PySpark 2.4.3 and mleap 0.17.0. I tried two things. Adding all jar files manually to the jars folder in PySpark:
And running with spark-submit:
Neither method worked.
Got the same issue when running the code from the tutorial.
Any suggestions, please? Also, I tried to install mleap from source and followed the instructions, but got this error: [error] (mleap-core/compile:compileIncremental) javac returned nonzero exit code
If you are using mleap 0.21.1, should serializeToBundle work? I am getting an error as below. Is the only option to downgrade? PySpark is 3.1.3. This is after resolving several other issues.
I create a Spark session like this: `def gen_spark_session(): spark = gen_spark_session()` UPDATE: I was on Java 8, and apparently 0.21.1 is no good there; it needs Java 11. I moved to 0.20.0, but I still get this issue. I'm on Scala 2.12.
If running the code as above, there's an issue with featurePipeline.serializeToBundle("jar:file:/tmp/pyspark.example.zip"): AttributeError: 'Pipeline' object has no attribute 'serializeToBundle'.
If I use the following code:
featurePipeline2 = featurePipeline.fit(df2)
featurePipeline2.serializeToBundle("jar:file:/tmp/pyspark.example.zip")
there is an error at self._java_obj = _jvm().ml.combust.mleap.spark.SimpleSparkSerializer(), saying "TypeError: 'JavaPackage' object is not callable".
How can I solve it?
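The recurring "'JavaPackage' object is not callable" error throughout this thread generally means the MLeap Scala classes are not on the classpath of the running JVM, so py4j exposes the name only as an empty JavaPackage placeholder. A hedged diagnostic sketch (the helper name is mine, not part of MLeap's API; it needs a live SparkSession):

```python
def mleap_class_on_classpath(spark):
    """Return True if the MLeap serializer class is loadable in Spark's JVM.

    If this returns False, the jars were not picked up and calling
    _jvm().ml.combust.mleap.spark.SimpleSparkSerializer() will raise
    "TypeError: 'JavaPackage' object is not callable".
    """
    jvm = spark.sparkContext._jvm
    try:
        jvm.java.lang.Class.forName("ml.combust.mleap.spark.SimpleSparkSerializer")
        return True
    except Exception:
        return False
```

If this returns False, the fixes discussed above apply: add the mleap-spark jars to Spark's jars folder, or set spark.jars.packages before the session is created.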