Continue training in CLI if one iteration produces a single-leaf tree #5699
base: master
Conversation
As per the discussion in #5051 and #5193, the Python package does not stop training when a single-leaf tree (stump) is found and instead relies on early stopping to end training. This commit removes the finish condition based on the result of `TrainOneIter()` and sets the `is_finished` flag on early stopping alone.
Thanks very much for returning to this!
I've updated the PR title to more closely match the intent of this PR. Please let me know if I've misunderstood.
I think this PR matches the outcome of the discussion in #5051, but I'm not qualified to review this. Hopefully @shiyu1994 or @guolinke can help with a review.
@shiyu1994 can you help with a review please? This has been open for several months now.
```diff
@@ -421,6 +419,8 @@ bool GBDT::TrainOneIter(const score_t* gradients, const score_t* hessians) {
     models_.push_back(std::move(new_tree));
   }

+  ++iter_;
+
   if (!should_continue) {
```
Will the one-leaf trees be popped?
Tested the result: stopping or keeping the tree works the same (assuming no other tree is built later on). Incrementing the `iter_` variable is needed for the boosting to count the iteration as finished (even if the tree itself is of no use); otherwise training would continue forever.
Sorry for the late response. I think we should make the behavior the same as the python-package's. I remember that the one-leaf tree is kept in the python-package, but the current implementation pops the one-leaf trees.
@Samsagax what do you think about my comment?
It's been almost a year since the last activity on this PR. @guolinke what should we do with this?
"I think we should make the behavior the same as the python-package."