onnxmltools convert_sparkml bug: AttributeError: 'SparkSession' object has no attribute 'util' when converting Spark tree-based models to ONNX. #547
Comments
Do you have code that reproduces this? If so, I can help take a look.
@xadupre please close this issue.
@memoryz, thanks for your response; I was not notified of it. However, this issue is not solved on my side. Can you first help me understand two things?
1) Regarding spark._jvm.org.apache.spark.util.Utils.createTempDir(local_dir, "onnx").getAbsolutePath(): in which version of Spark does SparkSession have a util attribute?
2) The code that raised this issue is:

    def serialize_sparkml_to_onnx(
        spark,
        sparkml_model,
        model_name,
        onnx_file_path=None,
        batch_size=None
    ):
        # Declare a single float input whose width matches the model's feature count.
        initial_types = [
            ("features", FloatTensorType([batch_size, sparkml_model.numFeatures]))
        ]
        model_name = model_name
        # The AttributeError is raised inside this call.
        onnx_model = convert_sparkml(
            sparkml_model,
            name=model_name,
            initial_types=initial_types,
            spark_session=spark
        )
        onnx_bytes = onnx_model.SerializeToString()
        # Optionally persist the serialized model to disk.
        if onnx_file_path:
            with open(onnx_file_path, "wb") as f:
                f.write(onnx_bytes)
        return onnx_bytes

the above code has
I installed ONNX 1.10.0, ONNX Runtime 1.10.0 and ONNXMLTools 1.10.0
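Since the behavior may depend on the installed versions, a minimal sketch for collecting them is given here. The helper name `installed_versions` is hypothetical, the distribution names are the usual PyPI ones, and `importlib.metadata` assumes Python 3.8 or later (on Python 3.7 the `importlib_metadata` backport would be needed instead).

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for pkg, ver in installed_versions(
        ["onnx", "onnxruntime", "onnxmltools", "pyspark"]
    ).items():
        print(f"{pkg}: {ver}")
```

Including the pyspark version in the report would help answer which Spark release is involved.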
I tried to convert a Spark tree-based classifier (such as GBTClassifier, trained in Python 3.7) to an ONNX model with onnxmltools.convert.convert_sparkml(), which raises the following error:

    File "/usr/local/lib/python3.7/site-packages/onnxmltools/convert/main.py", line 167, in convert_sparkml
      custom_conversion_functions, custom_shape_calculators, spark_session)
    File "/usr/local/lib/python3.7/site-packages/onnxmltools/convert/sparkml/convert.py", line 71, in convert
      onnx_model = convert_topology(topology, name, doc_string, target_opset, targeted_onnx)
    File "/usr/local/lib/python3.7/site-packages/onnxconverter_common/topology.py", line 776, in convert_topology
      get_converter(operator.type)(scope, operator, container)
    File "/usr/local/lib/python3.7/site-packages/onnxmltools/convert/sparkml/operator_converters/gbt_classifier.py", line 26, in convert_gbt_classifier
      convert_decision_tree_regressor(scope, regressor_op, container)
    File "/usr/local/lib/python3.7/site-packages/onnxmltools/convert/sparkml/operator_converters/decision_tree_regressor.py", line 20, in convert_decision_tree_regressor
      tree_df = save_read_sparkml_model_data(operator.raw_params['SparkSession'], op)
    File "/usr/local/lib/python3.7/site-packages/onnxmltools/convert/sparkml/operator_converters/tree_ensemble_common.py", line 37, in save_read_sparkml_model_data
      tdir = spark.util.Utils.createTempDir().getAbsolutePath()
    AttributeError: 'SparkSession' object has no attribute 'util'
I traced the code execution flow and noted that in
`def _parse_sparkml_simple_model(spark, scope, model, global_inputs, output_dict):`
the assignment this_operator.raw_params = {'SparkSession': spark} stores a SparkSession instance,
so when execution reaches save_read_sparkml_model_data, the object it receives is a SparkSession instance, which does not have a util attribute.
Meanwhile, line 44 of https://github.com/onnx/onnxmltools/blob/main/onnxmltools/convert/sparkml/operator_converters/tree_ensemble_common.py indicates that the object is a SparkSession instance, because it is used to create a spark.sql.DataFrame. But at line 44, the SparkSession instance does not have a util attribute or method.
Does anyone know what is happening here, or whether this is a bug? Any help is much appreciated.
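To illustrate the distinction, here is a minimal sketch, not the onnxmltools implementation. It assumes the session exposes the JVM gateway as `spark._jvm` (the form quoted in the comment in this thread) and falls back to Python's `tempfile` when it does not. The helper name `make_temp_dir` and its parameters are hypothetical.

```python
import tempfile


def make_temp_dir(spark, local_dir=None, prefix="onnx"):
    """Return the path of a fresh temporary directory.

    Prefers the JVM-side helper reached through spark._jvm (an assumption
    taken from this thread) and otherwise falls back to the standard library.
    """
    jvm = getattr(spark, "_jvm", None)
    if jvm is not None:
        # Both arguments are passed explicitly to the JVM method.
        root = local_dir or tempfile.gettempdir()
        utils = jvm.org.apache.spark.util.Utils
        return utils.createTempDir(root, prefix).getAbsolutePath()
    # No JVM gateway available: create the directory on the Python side.
    return tempfile.mkdtemp(prefix=prefix, dir=local_dir)
```

The point of the sketch is that `spark.util` is looked up as a Python attribute of the SparkSession object, whereas `org.apache.spark.util.Utils` lives on the JVM side and is only reachable through the gateway object.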