[CORE] Add Gluten Project Improvement Proposals (GPIP) doc #8133

Merged: 3 commits, Dec 5, 2024

Conversation

yikf
Contributor

@yikf yikf commented Dec 3, 2024

What changes were proposed in this pull request?

This PR aims to add Gluten Project Improvement Proposals doc.

Gluten is growing rapidly, and many major optimizations are expected for it. To follow the Apache way, we should have a specification for major optimizations, and this documentation refers to Spark's SPIP.

How was this patch tested?

doc only

@github-actions github-actions bot added the DOCS label Dec 3, 2024

github-actions bot commented Dec 3, 2024

Thanks for opening a pull request!

Could you open an issue for this pull request on GitHub Issues?

https://github.com/apache/incubator-gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

@yikf
Contributor Author

yikf commented Dec 3, 2024

@zhztheplayer @jackylee-ch While working on #7750, we found that the Gluten project has no specification for significant changes, so we want to add this document. cc @zjuwangg

@zhztheplayer
Member

zhztheplayer commented Dec 3, 2024

Thank you for your proposal. Glad to see our development process becoming more mature over time thanks to efforts like this. +1 for the idea overall.

Additionally, as Gluten is still at a comparatively early stage of its lifetime[1], I'd suggest we slow down a little on making something like GPIP a hard requirement for large contributions. I worry that contributors may hesitate to contribute critical features because of the workload of writing a comprehensive proposal. That said, filing a good document before starting a contribution should be highly encouraged and appreciated.

My key point is that we as a community could take a humbler posture and be thankful for all kinds of contributions. This doesn't mean lowering the bar for code review at all, since committers must take responsibility for keeping the code healthy. I am just wondering whether it's more feasible to let contributors decide for themselves whether to provide a GPIP or just brief information before opening a feature PR. Indeed, providing more information helps a lot in getting the PR merged, since reviewers will be clearer on the overall design and purpose of the change.

So at the moment I am wondering whether we can remove or rephrase the following words (or similar words in other places), or make a clearer statement about the flexibility of whether to file the document.

When in doubt, if a committer thinks a change needs a GPIP, it does.

Thanks.

[1] This is my personal perspective. For example, Velox is also still under incubation.

@yikf
Contributor Author

yikf commented Dec 3, 2024

@zhztheplayer Thanks for your suggestion. I totally agree with you. Let's rephrase the documentation to make the community healthier and more sustainable.

@yikf
Contributor Author

yikf commented Dec 3, 2024

Hi @zhztheplayer, please take a look when you have time. Thanks.

@PHILO-HE PHILO-HE changed the title from "[CORE] Add GPIP(Gluten Project Improvement Proposals) doc" to "[CORE] Add Gluten Project Improvement Proposals (GPIP) doc" Dec 3, 2024
@PHILO-HE
Contributor

PHILO-HE commented Dec 3, 2024

cc @zhouyuan @FelixYBW @baibaichen

Member

@zhztheplayer zhztheplayer left a comment

Would you also note the source in the doc, perhaps in the first few lines? I assume it originated from Apache Spark or another project? Thanks!

@zhouyuan
Contributor

zhouyuan commented Dec 4, 2024

@yikf Thanks for adding this, I like the idea. +1 from me.
I happened to read some Flink designs/features recently, and I personally think FLIP is also a good example to follow:
https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals

thanks, -yuan

@yikf
Contributor Author

yikf commented Dec 4, 2024

> Would you also note the source in the doc, perhaps in the first few lines? I assume it originated from Apache Spark or another project? Thanks!

Do you think it should go in a comment or in the first few lines of the documentation?

@yikf
Contributor Author

yikf commented Dec 4, 2024

> @yikf Thanks for adding this, I like the idea. +1 from me. I happened to read some Flink designs/features recently, and I personally think FLIP is also a good example to follow: https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals
>
> thanks, -yuan

Thank you for your advice. I am open to both options.

@zhztheplayer
Member

> > Would you also note the source in the doc, perhaps in the first few lines? I assume it originated from Apache Spark or another project? Thanks!
>
> Do you think it should go in a comment or in the first few lines of the documentation?

I think it's better to add it to the documentation, namely improvement-proposals.md. Thanks.

Member

@zhztheplayer zhztheplayer left a comment

👍

Contributor

@PHILO-HE PHILO-HE left a comment

Looks good. Just some trivial comments. Thanks!


The purpose of a GPIP is to inform and involve the user community in major improvements to the Gluten codebase throughout the development process to increase the likelihood that user needs are met. GPIPs should be used for significant user-facing or cross-cutting changes, not small incremental improvements.

If your proposal meets the definition of GPIP, we recommend you to create a GPIP, which will facilitate the advancement and discussion of the proposal, but it is not mandatory, and we welcome any contribute and community participation.
Contributor

contribute -> contribution?


## What is a GPIP?

An GPIP is similar to a product requirement document commonly used in product management.
Contributor

A GPIP


An GPIP is similar to a product requirement document commonly used in product management.

An GPIP:
Contributor

ditto. Please check all other places.


The Gluten Project Improvement Proposals doc references [the Spark SPIP documentation](https://spark.apache.org/improvement-proposals.html).

The purpose of a GPIP is to inform and involve the user community in major improvements to the Gluten codebase throughout the development process to increase the likelihood that user needs are met. GPIPs should be used for significant user-facing or cross-cutting changes, not small incremental improvements.
Contributor

what does cross-cutting mean? Use cutting-edge instead?

Contributor

@PHILO-HE PHILO-HE Dec 4, 2024

Nit: please break this long line into multiple lines, which is easier to read. Please also check other places.

Contributor Author

Thanks, addressed, PTAL.

Contributor

@PHILO-HE PHILO-HE left a comment

Thanks!

@FelixYBW
Contributor

FelixYBW commented Dec 5, 2024

Thank you for your proposal! Yes, we definitely need better design docs and discussion for big features like stage-level resource management. We need to map the features to the roadmap.

Also, I think it's important to have a single page that tracks all the proposals, like Flink's: under discussion, accepted, released, discarded. It looks pretty clear.

Meanwhile, let's see if there is a similar way to organize the issues. Currently we use several trackers like #4652, but only 3 issues can be pinned to the top. Are there any good suggestions for this? Is the cwiki a good option?

@FelixYBW FelixYBW merged commit 57a713a into apache:main Dec 5, 2024
4 checks passed
@yikf
Contributor Author

yikf commented Dec 5, 2024

Thanks all for the help on the PR. @FelixYBW @zhouyuan @PHILO-HE @jackylee-ch

Filing and categorizing GPIPs does look very clear. A wiki is also a good choice.

My personal opinion, for reference: it would be best to tie the archiving and categorization to the GPIP process itself, so developers don't need to update the archived content as GPIPs move through their lifecycle.

If a GPIP is initiated, a GitHub issue will definitely be created to track it. We can record links to the issue categories in the GPIP documentation, such as "under discussion," which would link to a list that includes both GPIP and "under discussion."

@yikf yikf deleted the GPIP branch December 5, 2024 02:24