Hi, when attempting to convert a TfidfVectorizer as part of an sklearn pipeline, the following error is raised:
Traceback (most recent call last):
  File "/app/test.py", line 47, in <module>
    onnxModelPipeline = convert_sklearn(modelPipeline, "tfidf", initial_types=[("input", StringTensorType([None, 1]))], target_opset=12)
  File "/usr/local/lib/python3.7/site-packages/skl2onnx/convert.py", line 190, in convert_sklearn
    remove_identity=model_optim and not intermediate, verbose=verbose)
  File "/usr/local/lib/python3.7/site-packages/skl2onnx/common/_topology.py", line 1420, in convert_topology
    topology.convert_operators(container=container, verbose=verbose)
  File "/usr/local/lib/python3.7/site-packages/skl2onnx/common/_topology.py", line 1255, in convert_operators
    self.call_converter(operator, container, verbose=verbose)
  File "/usr/local/lib/python3.7/site-packages/skl2onnx/common/_topology.py", line 1061, in call_converter
    conv(self.scopes[0], operator, container)
  File "/usr/local/lib/python3.7/site-packages/skl2onnx/common/_registration.py", line 26, in __call__
    return self._fct(*args)
  File "/usr/local/lib/python3.7/site-packages/skl2onnx/operator_converters/tfidf_transformer.py", line 48, in convert_sklearn_tfidf_transformer
    "ONNX does not support sparse tensors before opset < 11, "
RuntimeError: ONNX does not support sparse tensors before opset < 11, sublinear_tf must be False.
The TfidfVectorizer is a pickled object created with sklearn 0.20.2, and it does have sublinear_tf set to True. However, I've explicitly set the opset to 12 in the code that triggers the above error:
import gzip
import pickle
from sklearn.pipeline import Pipeline
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import StringTensorType

with open("app/tidf_supervised_lemmatized_model/model.pkl.gzip", "rb") as f:
    model = pickle.loads(gzip.decompress(f.read()))  # loads(): decompress() returns bytes, not a file object
with open("app/tidf_supervised_lemmatized_model/tfidf_vector.pkl.gzip", "rb") as f:
    tfidf = pickle.loads(gzip.decompress(f.read()))
modelPipeline = Pipeline([('tfidfVectorizer', tfidf), ('model', model)])
onnxModelPipeline = convert_sklearn(modelPipeline, "tfidf", initial_types=[("input", StringTensorType([None, 2]))], target_opset=12)
Looking at the relevant source code (quoted below), the sublinear_tf branch only runs when target_opset is less than 11; any opset of 11 or above falls through to the RuntimeError. In other words, sublinear_tf seems to require an opset below 11, not above it. If that is the intended behaviour, could the RuntimeError message please be changed from "ONNX does not support sparse tensors before opset < 11, sublinear_tf must be False." to something like "ONNX does not support sparse tensors for opset >= 11, sublinear_tf must be False."?
if operator.target_opset < 11:
    plus1 = scope.get_unique_variable_name("plus1")
    C = operator.inputs[0].type.shape[1]
    ones = scope.get_unique_variable_name("ones")
    cst = np.ones((C,), dtype=float_type)
    container.add_initializer(ones, proto_dtype, [C], cst.flatten())
    apply_add(scope, data + [ones], plus1, container, broadcast=1)
    plus1logged = scope.get_unique_variable_name("plus1logged")
    apply_log(scope, plus1, plus1logged, container)
    data = [plus1logged]
else:
    # sparse containers have not yet been implemented.
    raise RuntimeError(
        "ONNX does not support sparse tensors before opset < 11, "
        "sublinear_tf must be False.")
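For anyone hitting the same error, the gating in the snippet above can be mirrored in a small pre-flight check that runs before calling convert_sklearn. This is only a sketch of my reading of that branch, not part of skl2onnx: the helper names (`sublinear_steps`, `preflight`) and the constant are hypothetical, the threshold of 11 is taken from the quoted code and may differ in other skl2onnx versions, and the SimpleNamespace objects are stand-ins for the unpickled estimators. It makes no claim that the rest of the pipeline converts at a lower opset.

```python
from types import SimpleNamespace

# Assumption: mirrors the `target_opset < 11` check quoted above; the real
# converter may behave differently in other skl2onnx versions.
FIRST_UNSUPPORTED_OPSET = 11


def sublinear_steps(steps):
    """Return names of pipeline steps whose estimator has sublinear_tf=True."""
    return [name for name, est in steps if getattr(est, "sublinear_tf", False)]


def preflight(steps, target_opset):
    """List the problems the quoted branch would raise for this target opset."""
    if target_opset < FIRST_UNSUPPORTED_OPSET:
        return []
    return [
        f"step '{name}' has sublinear_tf=True, which the quoted converter "
        f"rejects for target_opset={target_opset} "
        f"(it needs < {FIRST_UNSUPPORTED_OPSET})"
        for name in sublinear_steps(steps)
    ]


# Stand-ins for the unpickled objects; with scikit-learn, pass Pipeline.steps.
steps = [
    ("tfidfVectorizer", SimpleNamespace(sublinear_tf=True)),
    ("model", SimpleNamespace()),
]
print(preflight(steps, target_opset=12))  # one problem reported
print(preflight(steps, target_opset=10))  # no problems reported
```

With a real pipeline, `preflight(modelPipeline.steps, 12)` should flag the tfidfVectorizer step, since Pipeline.steps is a list of (name, estimator) pairs.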
Thanks