
[question] Treating high-cardinality categorical features as numeric #4278

Closed
sflender opened this issue May 11, 2021 · 2 comments

Hi LightGBM team,

In the documentation you write:

"For a categorical feature with high cardinality (#category is large), it often works best to treat the feature as numeric [...]"

This matches what I find empirically for cardinalities above 1K-10K. Do you have any insight into why this is the case? Why does the Fisher algorithm not work well for such features?
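
For concreteness, here is a minimal sketch of the two treatments being compared. It assumes the column has already been mapped to integer codes; `integer_encode` and `train_both_ways` are hypothetical helper names for illustration, and the parameter values are placeholders rather than recommendations. With a NumPy array and the default `categorical_feature="auto"`, LightGBM treats the column as numeric, so the only difference between the two models is the `categorical_feature=[0]` argument.

```python
def integer_encode(values):
    """Map each distinct category to an integer code, in order of first appearance."""
    codes = {}
    encoded = []
    for v in values:
        if v not in codes:
            codes[v] = len(codes)
        encoded.append(codes[v])
    return encoded, codes


def train_both_ways(encoded, labels, num_boost_round=50):
    """Train one model per treatment of the same integer-coded column.

    Hypothetical helper: imports are local so the encoding step above
    can be used without LightGBM installed.
    """
    import numpy as np
    import lightgbm as lgb

    X = np.asarray(encoded, dtype=np.float64).reshape(-1, 1)
    y = np.asarray(labels)
    params = {"objective": "binary", "verbose": -1}  # placeholder settings

    # Categorical treatment: column 0 is declared categorical.
    as_categorical = lgb.train(
        params,
        lgb.Dataset(X, label=y, categorical_feature=[0]),
        num_boost_round=num_boost_round,
    )

    # Numeric treatment: same integer codes, split with ordinary thresholds.
    # The order of the codes is arbitrary here (order of first appearance).
    as_numeric = lgb.train(
        params,
        lgb.Dataset(X, label=y),
        num_boost_round=num_boost_round,
    )
    return as_categorical, as_numeric
```

Comparing the two returned boosters on a held-out set is one way to reproduce the observation above for a given cardinality.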


sflender commented Jun 1, 2021

Resolving as a duplicate of #1934.

sflender closed this as completed Jun 1, 2021
@github-actions

This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023