When using Detect (from synapse.ml.cognitive import Detect) to detect the language of a field in a PySpark DataFrame with the Azure AI Services Language service, calling .transform throws a Py4JJavaError.
Code to reproduce issue
from synapse.ml.cognitive import Detect
from pyspark.sql.functions import col, flatten

# Create a DataFrame to run language detection against
df_sentences = spark.createDataFrame([
    ("ヒョンデ", "ja")
], ["text", "expected_lang"])

# cognitive_services_key and cognitive_services_region are defined earlier in the notebook
detect = (Detect()
    .setSubscriptionKey(cognitive_services_key)
    .setLocation(cognitive_services_region)
    .setTextCol("text")
    .setOutputCol("result"))

# The error is raised by the .transform call below
display(detect
    .transform(df_sentences)
    .withColumn("language", col("result.language"))
    .select("language"))
Other info / logs
Py4JJavaError: An error occurred while calling o435.transform.
: java.lang.NoSuchMethodError: org.apache.spark.sql.catalyst.encoders.RowEncoder$.apply(Lorg/apache/spark/sql/types/StructType;)Lorg/apache/spark/sql/catalyst/encoders/ExpressionEncoder;
at com.microsoft.azure.synapse.ml.core.schema.SparkBindings.rowEnc$lzycompute(SparkBindings.scala:17)
at com.microsoft.azure.synapse.ml.core.schema.SparkBindings.rowEnc(SparkBindings.scala:17)
at com.microsoft.azure.synapse.ml.core.schema.SparkBindings.makeFromRowConverter(SparkBindings.scala:26)
at com.microsoft.azure.synapse.ml.io.http.ErrorUtils$.addErrorUDF(SimpleHTTPTransformer.scala:57)
at com.microsoft.azure.synapse.ml.io.http.SimpleHTTPTransformer.$anonfun$makePipeline$1(SimpleHTTPTransformer.scala:135)
at org.apache.spark.injections.UDFUtils$$anon$1.call(UDFUtils.scala:23)
at org.apache.spark.sql.functions$.$anonfun$udf$91(functions.scala:8103)
at com.microsoft.azure.synapse.ml.stages.Lambda.$anonfun$transform$1(Lambda.scala:55)
at com.microsoft.azure.synapse.ml.logging.BasicLogging.logVerb(BasicLogging.scala:62)
at com.microsoft.azure.synapse.ml.logging.BasicLogging.logVerb$(BasicLogging.scala:59)
at com.microsoft.azure.synapse.ml.stages.Lambda.logVerb(Lambda.scala:24)
at com.microsoft.azure.synapse.ml.logging.BasicLogging.logTransform(BasicLogging.scala:52)
at com.microsoft.azure.synapse.ml.logging.BasicLogging.logTransform$(BasicLogging.scala:51)
at com.microsoft.azure.synapse.ml.stages.Lambda.logTransform(Lambda.scala:24)
at com.microsoft.azure.synapse.ml.stages.Lambda.transform(Lambda.scala:55)
at com.microsoft.azure.synapse.ml.stages.Lambda.transformSchema(Lambda.scala:63)
at org.apache.spark.ml.PipelineModel.$anonfun$transformSchema$5(Pipeline.scala:317)
at scala.collection.IndexedSeqOptimized.foldLeft(IndexedSeqOptimized.scala:60)
at scala.collection.IndexedSeqOptimized.foldLeft$(IndexedSeqOptimized.scala:68)
at scala.collection.mutable.ArrayOps$ofRef.foldLeft(ArrayOps.scala:198)
at org.apache.spark.ml.PipelineModel.transformSchema(Pipeline.scala:317)
at com.microsoft.azure.synapse.ml.io.http.SimpleHTTPTransformer.transformSchema(SimpleHTTPTransformer.scala:169)
at org.apache.spark.ml.PipelineModel.$anonfun$transformSchema$5(Pipeline.scala:317)
at scala.collection.IndexedSeqOptimized.foldLeft(IndexedSeqOptimized.scala:60)
at scala.collection.IndexedSeqOptimized.foldLeft$(IndexedSeqOptimized.scala:68)
at scala.collection.mutable.ArrayOps$ofRef.foldLeft(ArrayOps.scala:198)
at org.apache.spark.ml.PipelineModel.transformSchema(Pipeline.scala:317)
at org.apache.spark.ml.PipelineStage.transformSchema(Pipeline.scala:72)
at org.apache.spark.ml.PipelineModel.$anonfun$transform$2(Pipeline.scala:310)
at org.apache.spark.ml.MLEvents.withTransformEvent(events.scala:148)
at org.apache.spark.ml.MLEvents.withTransformEvent$(events.scala:141)
at org.apache.spark.ml.util.Instrumentation.withTransformEvent(Instrumentation.scala:45)
at org.apache.spark.ml.PipelineModel.$anonfun$transform$1(Pipeline.scala:309)
at org.apache.spark.ml.util.Instrumentation$.$anonfun$instrumented$1(Instrumentation.scala:289)
at scala.util.Try$.apply(Try.scala:213)
at org.apache.spark.ml.util.Instrumentation$.instrumented(Instrumentation.scala:289)
at org.apache.spark.ml.PipelineModel.transform(Pipeline.scala:308)
at com.microsoft.azure.synapse.ml.cognitive.CognitiveServicesBaseNoHandler.$anonfun$transform$1(CognitiveServiceBase.scala:358)
at com.microsoft.azure.synapse.ml.logging.BasicLogging.logVerb(BasicLogging.scala:62)
at com.microsoft.azure.synapse.ml.logging.BasicLogging.logVerb$(BasicLogging.scala:59)
at com.microsoft.azure.synapse.ml.cognitive.CognitiveServicesBaseNoHandler.logVerb(CognitiveServiceBase.scala:306)
at com.microsoft.azure.synapse.ml.logging.BasicLogging.logTransform(BasicLogging.scala:52)
at com.microsoft.azure.synapse.ml.logging.BasicLogging.logTransform$(BasicLogging.scala:51)
at com.microsoft.azure.synapse.ml.cognitive.CognitiveServicesBaseNoHandler.logTransform(CognitiveServiceBase.scala:306)
at com.microsoft.azure.synapse.ml.cognitive.CognitiveServicesBaseNoHandler.transform(CognitiveServiceBase.scala:358)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:397)
at py4j.Gateway.invoke(Gateway.java:306)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:195)
at py4j.ClientServerConnection.run(ClientServerConnection.java:115)
at java.lang.Thread.run(Thread.java:750)
What component(s) does this bug affect?
area/cognitive: Cognitive project
area/core: Core project
area/deep-learning: DeepLearning project
area/lightgbm: Lightgbm project
area/opencv: Opencv project
area/vw: VW project
area/website: Website
area/build: Project build system
area/notebooks: Samples under notebooks folder
area/docker: Docker usage
area/models: models related issue
What language(s) does this bug affect?
language/scala: Scala source code
language/python: Pyspark APIs
language/r: R APIs
language/csharp: .NET APIs
language/new: Proposals for new client languages
What integration(s) does this bug affect?
integrations/synapse: Azure Synapse integrations
integrations/azureml: Azure ML integrations
integrations/databricks: Databricks integrations
SynapseML version
synapseml_2.12:0.10.0