Although I understand that Boruta is, by design, an all-relevant feature selection method, it would be nice to have the option to select a specified number of features.
As of right now, BorutaPy presents rankings 1 through 3 (relevant, tentative, rejected).
I am thinking of looking through the statistical tests and returning the ranking by p-value. If you like this issue and have a clear idea of how to implement it, let me know.
I am trying to work on it on my fork.
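For context, Boruta's underlying test is binomial: across `n_iter` iterations a feature scores a "hit" whenever its importance beats the best shadow feature, and under the null hypothesis hits follow Binomial(n_iter, 0.5). A minimal sketch of ranking features by the resulting upper-tail p-value and keeping the top k could look like the following (the `rank_by_pvalue` helper and its arguments are hypothetical, not part of the BorutaPy API):

```python
from math import comb

def binom_sf(hits, n, p=0.5):
    # Upper-tail probability P(X >= hits) for X ~ Binomial(n, p).
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(hits, n + 1))

def rank_by_pvalue(hit_counts, n_iter, n_features_to_select):
    # hit_counts: per-feature count of iterations in which the feature's
    # importance beat the best shadow feature (Boruta's "hits").
    # Smaller p-value => stronger evidence of relevance => better rank.
    pvals = [binom_sf(h, n_iter) for h in hit_counts]
    order = sorted(range(len(hit_counts)), key=lambda i: pvals[i])
    return order[:n_features_to_select]

# e.g. with 20 iterations, features with 18 and 19 hits rank ahead of the rest
selected = rank_by_pvalue([18, 5, 12, 19], n_iter=20, n_features_to_select=2)
```

BorutaPy itself computes these p-values internally (with multiple-testing correction) to decide accept/reject, so exposing a sorted version of them would be one way to support `n_features_to_select`.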
I know this doesn't directly answer your question, but when I want to minimize the features, I often do a feature reduction step after the all-relevant feature selection step: forward or backward stepwise feature elimination, depending on whether you want to choose very few features or only drop a few, respectively. I have also found that some simulated annealing helps a lot in practice.
This might help because highly correlated features will all have high p-values, so you might throw out features that are less statistically relevant but have more orthogonal value.
Sorry for the tangent, but I thought it might help.
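The backward variant of the reduction step described above can be sketched as a greedy loop that repeatedly drops the feature whose removal hurts the score least. This is a generic sketch, not code from BorutaPy; the `score` callback stands in for whatever cross-validated model metric you use (scikit-learn's `RFE` or `SequentialFeatureSelector` provide ready-made versions of this idea):

```python
def backward_eliminate(features, score, n_target):
    # Greedy backward stepwise elimination: while more than n_target
    # features remain, drop the one whose removal costs the least score.
    # `score` maps a tuple of feature names to a model quality estimate
    # (higher is better), e.g. a cross-validated accuracy.
    selected = list(features)
    while len(selected) > n_target:
        best_subset, best_score = None, float("-inf")
        for f in selected:
            subset = [x for x in selected if x != f]
            s = score(tuple(subset))
            if s > best_score:
                best_subset, best_score = subset, s
        selected = best_subset
    return selected

# toy usage: score a subset by summed (made-up) feature weights
weights = {"a": 3, "b": 1, "c": 2}
keep = backward_eliminate(["a", "b", "c"], lambda s: sum(weights[f] for f in s), 1)
```

Because it retrains on every candidate subset, the loop is O(d^2) model fits for d features; that is usually acceptable after Boruta has already pruned the feature set.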
bgalvao changed the title from "No n_features parameter" to "No n_features_to_select parameter" on Jan 26, 2021.