Hey there,

One of the primary recommendations for interfacing with Databricks is dbplyr translation to Spark SQL through the Databricks ODBC driver (using it much like any other backend). However, Spark SQL apparently doesn't support temporary tables, only temporary views, which breaks every `compute()` call in my code (and possibly some joins and other things I've not yet run into as well).
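For context, the failing pattern is roughly the following (a minimal sketch, not a reprex; the DSN name and the column selection are stand-ins for my actual pipeline):

```r
library(DBI)
library(dplyr)
library(dbplyr)

# DSN name is hypothetical; any Databricks ODBC connection should behave the same
con <- dbConnect(odbc::odbc(), dsn = "Databricks")

tbl(con, in_catalog("predictive_rewrite", "upmc", "claimline")) |>
  distinct(member_id, claim_id, claimline_nbr) |>
  compute()  # dbplyr emits CREATE TEMPORARY TABLE ... AS, which Spark SQL rejects
```

which fails with: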
```
Error in `db_save_query.DBIConnection()`:
! Can't save query to "dbplyr_001".
Caused by error in `new_result()`:
! nanodbc/nanodbc.cpp:1691: 00000: [Simba][Hardy] (80) Syntax or semantic analysis error thrown in server while executing query. Error message from server: org.apache.hive.service.cli.HiveSQLException: Error running query: [_LEGACY_ERROR_TEMP_0035] org.apache.spark.sql.catalyst.parser.ParseException:
Operation not allowed: CREATE TEMPORARY TABLE ... AS ..., use CREATE TEMPORARY VIEW instead.(line 1, pos 0)
== SQL ==
CREATE TEMPORARY TABLE
^^^
`dbplyr_001` AS
SELECT DISTINCT
`member_id`,
`claim_id`,
`claimline_nbr`,
`claimline_status`,
`place_of_service_cd`,
`hcpcs_cd`,
`revenue_cd`,
CAST(`allowed_amt` AS DOUBLE) AS `allowed_amt`,
CAST(`paid_amt` AS DOUBLE) AS `paid_amt`,
CAST(`claim_start_dt` AS DATE) AS `claim_start_dt`,
CAST(`claim_end_dt` AS DATE) AS `claim_end_dt`
FROM `predictive_rewrite`.`upmc`.`claimline`
at org.apache.spark.sql.hive.thriftserver.HiveThriftServerErrors$.runningQueryError(HiveThriftServerErrors.scala:48)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.$anonfun$execute$1(SparkExecuteStatementOperation.scala:693)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at com.databricks.unity.UCSEphemeralState$Handle.runWith(UCSEphemeralState.scala:41)
at com.databricks.unity.HandleImpl.runWith(UCSHandle.scala:99)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:571)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:422)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:420)
at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:418)
at com.databricks.spark.util.PublicDBLogging.withAttributionContext(DatabricksSparkUsageLogger.scala:25)
at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:470)
at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:455)
at com.databricks.spark.util.PublicDBLogging.withAttributionTags(DatabricksSparkUsageLogger.scala:25)
at com.databricks.spark.util.PublicDBLogging.withAttributionTags0(DatabricksSparkUsageLogger.scala:70)
at com.databricks.spark.util.DatabricksSparkUsageLogger.withAttributionTags(DatabricksSparkUsageLogger.scala:170)
at com.databricks.spark.util.UsageLogging.$anonfun$withAttributionTags$1(UsageLogger.scala:495)
at com.databricks.spark.util.UsageLogging$.withAttributionTags(UsageLogger.scala:607)
at com.databricks.spark.util.UsageLogging$.withAttributionTags(UsageLogger.scala:616)
at com.databricks.spark.util.UsageLogging.withAttributionTags(UsageLogger.scala:495)
at com.databricks.spark.util.UsageLogging.withAttributionTags$(UsageLogger.scala:493)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withAttributionTags(SparkExecuteStatementOperation.scala:64)
at org.apache.spark.sql.hive.thriftserver.ThriftLocalProperties.$anonfun$withLocalProperties$8(ThriftLocalProperties.scala:161)
at com.databricks.spark.util.IdentityClaim$.withClaim(IdentityClaim.scala:48)
at org.apache.spark.sql.hive.thriftserver.ThriftLocalProperties.withLocalProperties(ThriftLocalProperties.scala:160)
at org.apache.spark.sql.hive.thriftserver.ThriftLocalProperties.withLocalProperties$(ThriftLocalProperties.scala:51)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:64)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:400)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:385)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:434)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.spark.sql.catalyst.parser.ParseException:
Operation not allowed: CREATE TEMPORARY TABLE ... AS ..., use CREATE TEMPORARY VIEW instead.(line 1, pos 0)
```
There's no great way to make this a reprex, but I wanted to understand what workarounds exist, and/or to request an argument or package-level option to switch between temporary views and temporary tables so that `compute()` keeps working. The naive workaround at the moment is to mask `dbplyr::compute()` in my code with a user-defined function that takes the lazy query and returns it unchanged, allowing the `compute()` calls to remain in place without actually computing anything (sketched below).
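The no-op mask is just:

```r
# Naive workaround: shadow compute() with an identity function so pipelines
# keep their structure; the full query is simply re-run downstream instead
# of being materialized.
compute <- function(x, ...) x
```

A slightly less naive sketch (untested; `compute_view` is a hypothetical name) would materialize a temporary view from the SQL dbplyr has already generated:

```r
library(DBI)
library(dplyr)
library(dbplyr)

compute_view <- function(x, name = "dbplyr_view_001") {
  con <- remote_con(x)
  # Spark SQL accepts CREATE OR REPLACE TEMPORARY VIEW, unlike temporary CTAS
  DBI::dbExecute(
    con,
    paste0(
      "CREATE OR REPLACE TEMPORARY VIEW ",
      DBI::dbQuoteIdentifier(con, name),
      " AS ",
      sql_render(x)
    )
  )
  tbl(con, name)
}
```

If permanent tables are acceptable, `compute(temporary = FALSE, name = ...)` should also sidestep the error, since plain CTAS is allowed, at the cost of cleaning the table up afterwards.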
The other route for connecting to Databricks remotely, via sparklyr and Databricks Connect, has its own fair share of issues at the moment; otherwise that's the workaround I'd adopt.