Generating PMML for a text classification pipeline in Python

I am trying to generate PMML (using jpmml-sklearn) for a text classification pipeline. The last line of the code, sklearn2pmml(Textpipeline, "TextMiningClassifier.pmml", with_repr = True), crashes. It looks like sklearn2pmml() cannot accept Textpipeline as input. The same code works fine for other pipelines (examples here: https://github.com/jpmml/sklearn2pmml), but not for the text classification pipeline above. So my question is: how can I generate PMML for a text classification problem?

The error I get:
Jun 15, 2017 12:48:00 PM org.jpmml.sklearn.Main run
INFO: Parsing PKL..
Jun 15, 2017 12:48:01 PM org.jpmml.sklearn.Main run
INFO: Parsed PKL in 489 ms.
Jun 15, 2017 12:48:01 PM org.jpmml.sklearn.Main run
INFO: Converting..
Jun 15, 2017 12:48:01 PM sklearn2pmml.PMMLPipeline encodePMML
WARNING: The 'target_field' attribute is not set. Assuming y as the name of the target field
Jun 15, 2017 12:48:01 PM sklearn2pmml.PMMLPipeline initFeatures
WARNING: The 'active_fields' attribute is not set. Assuming [x1] as the names of active fields
Jun 15, 2017 12:48:01 PM org.jpmml.sklearn.Main run
SEVERE: Failed to convert
java.lang.IllegalArgumentException: The tokenizer object (null) is not Splitter
at sklearn.feature_extraction.text.CountVectorizer.getTokenizer(CountVectorizer.java:263)
at sklearn.feature_extraction.text.CountVectorizer.encodeDefineFunction(CountVectorizer.java:164)
at sklearn.feature_extraction.text.CountVectorizer.encodeFeatures(CountVectorizer.java:124)
at sklearn.pipeline.Pipeline.encodeFeatures(Pipeline.java:93)
at sklearn2pmml.PMMLPipeline.encodePMML(PMMLPipeline.java:122)
at org.jpmml.sklearn.Main.run(Main.java:144)
at org.jpmml.sklearn.Main.main(Main.java:93)
Exception in thread "main" java.lang.IllegalArgumentException: The tokenizer object (null) is not Splitter
at sklearn.feature_extraction.text.CountVectorizer.getTokenizer(CountVectorizer.java:263)
at sklearn.feature_extraction.text.CountVectorizer.encodeDefineFunction(CountVectorizer.java:164)
at sklearn.feature_extraction.text.CountVectorizer.encodeFeatures(CountVectorizer.java:124)
at sklearn.pipeline.Pipeline.encodeFeatures(Pipeline.java:93)
at sklearn2pmml.PMMLPipeline.encodePMML(PMMLPipeline.java:122)
at org.jpmml.sklearn.Main.run(Main.java:144)
at org.jpmml.sklearn.Main.main(Main.java:93)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Data\Anaconda2\lib\site-packages\sklearn2pmml\__init__.py", line 142, in sklearn2pmml
raise RuntimeError("The JPMML-SkLearn conversion application has failed. The Java process should have printed more information about the failure into its standard output and/or error streams")
RuntimeError: The JPMML-SkLearn conversion application has failed. The Java process should have printed more information about the failure into its standard output and/or error streams
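
The root cause in the Java output is the line "The tokenizer object (null) is not Splitter", raised from CountVectorizer.getTokenizer. A minimal check of what that probably means on the Python side is sketched below. It assumes the vectorizer inside Textpipeline was built without a tokenizer argument, which this post does not confirm. The commented-out lines show a possible fix that also rests on an assumption: that the installed sklearn2pmml version ships a Splitter class under sklearn2pmml.feature_extraction.text, as the error message suggests.

```python
from sklearn.feature_extraction.text import CountVectorizer

# A CountVectorizer created with default arguments has no custom tokenizer;
# scikit-learn falls back to its built-in token_pattern regex at fit time.
vectorizer = CountVectorizer()

# None on the Python side is what the Java converter reports as "(null)".
tokenizer_is_unset = vectorizer.tokenizer is None
print(tokenizer_is_unset)

# Possible fix (assumption: this import path exists in the installed
# sklearn2pmml version; check its README before relying on it):
# from sklearn2pmml.feature_extraction.text import Splitter
# vectorizer = CountVectorizer(tokenizer=Splitter())
```

If the check prints True for the vectorizer used in Textpipeline, the converter is most likely rejecting the pipeline because it expects an explicit tokenizer object rather than scikit-learn's default regex-based tokenization.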