Spark 3.5: Display write metrics on SQL UI #11340

Draft: 1 commit, base branch main

manuzhang
Collaborator

No description provided.

@manuzhang manuzhang force-pushed the spark-write-metrics branch 3 times, most recently from 8a561bd to 316dc40 Compare October 17, 2024 15:03
@manuzhang
Collaborator Author

Add number of total data files to write command AppendData

[Screenshot: CleanShot 2024-10-18 at 11 31 55@2x]

@manuzhang manuzhang force-pushed the spark-write-metrics branch from 316dc40 to 99d4b65 Compare October 18, 2024 04:44

object MetricsUtils {

def postDriverMetrics(sparkContext: SparkContext, metricValues: java.util.Map[CustomMetric, Long]): Unit = {
Collaborator Author

This is not needed if we can support it at Spark side.

Contributor

I see that one got in :)

Collaborator Author

Yes, but it will only be available in Spark 4+


github-actions bot commented Dec 9, 2024

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions.

@github-actions github-actions bot added the stale label Dec 9, 2024
@github-actions github-actions bot removed the stale label Dec 13, 2024
@wypoon
Contributor

wypoon commented Jan 15, 2025

@manuzhang I am happy to see that someone is working on adding write-side Iceberg metrics to the Spark SQL UI!
I realize that this is still in a draft state, but I have some questions/suggestions.
Do you plan to add metrics only to append operations? It would be good to see them for other operations, such as delete and overwrite.
Would added data files be more useful than total data files? Or could we show both?
For delete and overwrite operations, I think it would be useful to see removed data files, added delete files and removed delete files.
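
To make the suggestion above concrete, here is a minimal sketch of how the five counts (total, added, and removed data files; added and removed delete files) could be pulled out of a per-commit summary. Everything in it is an assumption for illustration: the class `WriteMetricsSketch`, the method `fromSummary`, and the key strings are hypothetical and are not taken from this PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the PR: turns a string-valued commit
// summary into numeric values that could back per-operation UI metrics.
class WriteMetricsSketch {

  // Assumed key names, used here only for illustration.
  static final String[] KEYS = {
    "total-data-files",
    "added-data-files",
    "deleted-data-files",
    "added-delete-files",
    "removed-delete-files"
  };

  // Missing keys default to 0, so an append-only commit still reports
  // every metric (with zero removed files and zero delete files).
  static Map<String, Long> fromSummary(Map<String, String> summary) {
    Map<String, Long> metrics = new LinkedHashMap<>();
    for (String key : KEYS) {
      String value = summary.get(key);
      metrics.put(key, value == null ? 0L : Long.parseLong(value));
    }
    return metrics;
  }
}
```

With this shape, append, delete, and overwrite could all share one set of metrics, and the UI would show zeros for counts that do not apply to a given operation.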


@Override
public String description() {
return "number of total data files";
Contributor

This doesn't sound right.
"total" implies a number, so I think the description can just be "total data files".
If you really want to include "number", then "total number of data files".
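
For reference, a minimal sketch of a metric carrying the suggested wording. The `MetricSketch` interface is a hypothetical stand-in that mirrors the general shape of Spark's `CustomMetric` (a name, a description, and an aggregation over per-task values) so that the example compiles without Spark on the classpath. The class name, the metric name, and the choice to sum the values are assumptions, not the PR's code.

```java
import java.util.Arrays;

// Hypothetical stand-in for the metric interface, so the sketch is self-contained.
interface MetricSketch {
  String name();

  String description();

  String aggregateTaskMetrics(long[] taskMetrics);
}

// Hypothetical metric; the identifiers are illustrative only.
class TotalDataFilesSketch implements MetricSketch {

  @Override
  public String name() {
    return "totalDataFiles";
  }

  @Override
  public String description() {
    // Wording suggested in the review: "total" already implies a number.
    return "total data files";
  }

  @Override
  public String aggregateTaskMetrics(long[] taskMetrics) {
    // Assumes the reported values are additive; a driver-side metric
    // posted once would pass through unchanged.
    return String.valueOf(Arrays.stream(taskMetrics).sum());
  }
}
```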

@manuzhang
Collaborator Author

@wypoon I plan to add metrics for all write operations, but I'd like to get the interfaces right first. I'm not sure whether this is the best way to propagate a metricsReporter. Any thoughts?

    if (this.table instanceof BaseTable) {
      this.metricsReporter = new InMemoryMetricsReporter();
      ((BaseTable) this.table).combineMetricsReporter(metricsReporter);
    }
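
One possible reading of the snippet above is that combining reporters means fanning each report out to both the table's existing reporter and the new in-memory one. The sketch below illustrates that reading only. `ReporterSketch`, `InMemoryReporterSketch`, and `CombinedReporterSketch` are hypothetical stand-ins with a simplified string payload; they are not Iceberg's interfaces and not the PR's implementation of `combineMetricsReporter`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a metrics reporter with a simplified payload.
interface ReporterSketch {
  void report(String report);
}

// Keeps reports in memory so the writer can read them back after a commit.
class InMemoryReporterSketch implements ReporterSketch {
  private final List<String> reports = new ArrayList<>();

  @Override
  public void report(String report) {
    reports.add(report);
  }

  List<String> reports() {
    return reports;
  }
}

// Forwards every report to both delegates, so the table's original
// reporter keeps working while the writer also captures the report.
class CombinedReporterSketch implements ReporterSketch {
  private final ReporterSketch first;
  private final ReporterSketch second;

  private CombinedReporterSketch(ReporterSketch first, ReporterSketch second) {
    this.first = first;
    this.second = second;
  }

  // Returns the non-null reporter when only one is set, avoiding a wrapper.
  static ReporterSketch combine(ReporterSketch first, ReporterSketch second) {
    if (first == null) {
      return second;
    }
    if (second == null) {
      return first;
    }
    return new CombinedReporterSketch(first, second);
  }

  @Override
  public void report(String report) {
    first.report(report);
    second.report(report);
  }
}
```

A design question this raises is whether the combination should mutate the shared table object, as in the snippet, or be scoped to a single write so that concurrent writers do not see each other's reports.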
