Update the reference paper for the SAC #268

Open
wants to merge 3 commits into
base: master
Changes from 2 commits
4 changes: 4 additions & 0 deletions README.md
@@ -111,6 +111,10 @@ When running on cluster node, you will also need to distribute this keytab, belo

When Spark application is started, it will transparently track the execution plan of submitted SQL/DF transformations, parse the plan and create related entities in Atlas.

Reference
===
- Mingjie Tang, Saisai Shao, Weiqing Yang, Yanbo Liang, Yongyang Yu, Bikas Saha, Dongjoon Hyun. [SAC: A System for Big Data Lineage Tracking](http://merlintang.github.io/paper/sac_icde.pdf). In IEEE 35th International Conference on Data Engineering (ICDE), 2019
Contributor

Rather than referring to an individual GitHub link, why not just have the PDF as part of the SAC repo itself?

Instead of a separate section we could just have it listed under the "Spark Atlas Connector" section in the README. Just a line as below with the link.
SAC: A System for Big Data Lineage Tracking

Collaborator

@HeartSaVioR, Jun 26, 2019

I wouldn't include anything unless it's safe to claim that it's under the copyright of Cloudera, or the license of the paper is clearly compatible with Apache License V2. Even if it is compatible, we need to explicitly mention it in LICENSE. So why not just link to it to avoid dealing with any license issue?

Contributor Author

it is removed

Collaborator

@HeartSaVioR, Jun 26, 2019

No, I meant that linking to an external copy would be OK unless restricted. It's a different story if we "include" the paper in the repo as part of its content, which is what I was pointing out.


License
=======
