
Spark Atlas Connector not Capturing DDL Operations From spark-sql #290

Open
har5havardhan opened this issue Feb 17, 2020 · 1 comment

@har5havardhan

Hi,

I've set up a basic installation of Atlas, and it works perfectly with Hive: all DDL operations and lineage are captured by Atlas.

This is the change made in hive-site.xml:

<property>
  <name>hive.exec.post.hooks</name>
  <value>org.apache.atlas.hive.hook.HiveHook</value>
</property>

But when I create a table using spark-sql or spark-shell, the DDL operations and lineage are not captured by Atlas.

Please help me figure out what I am doing wrong.

I launch spark-sql using the command below:

spark-sql \
  --jars /home/hadoop/harsha/spark-atlas-connector/spark-atlas-connector-assembly/target/spark-atlas-connector-assembly-0.1.0-SNAPSHOT.jar \
  --conf spark.extraListeners=com.hortonworks.spark.atlas.SparkAtlasEventTracker \
  --conf spark.sql.queryExecutionListeners=com.hortonworks.spark.atlas.SparkAtlasEventTracker \
  --conf spark.sql.streaming.streamingQueryListeners=com.hortonworks.spark.atlas.SparkAtlasStreamingQueryEventTracker
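
For example, DDL along these lines (the table names are just placeholders for testing) runs fine in the spark-sql session, but no corresponding entities or lineage show up in Atlas:

-- placeholder tables; any simple DDL shows the same behaviour
CREATE TABLE sac_test_source (id INT, name STRING);
-- CTAS, which I would also expect to produce a lineage edge in Atlas
CREATE TABLE sac_test_target AS SELECT id, name FROM sac_test_source;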

wForget commented Jun 9, 2020

I have made some modifications to capture the lineage of Spark SQL operations on Hive, which can be used as a reference, but I am not sure whether they introduce other problems.
https://github.com/wForget/spark-atlas-connector/tree/dev-hive
