Is your feature request related to a problem? Please describe.
This is not a problem per se, but AgglomerativeClustering and SpectralClustering in sklearn.cluster are not always favorable, especially for large datasets, because of how they scale computationally (see the benchmark in the HDBSCAN docs). For example, I usually use genieclust and would like to use it instead of the sklearn clusterers, which is not possible in the current implementation.
Describe the solution you'd like
A Clusterer base class that interfaces both sklearn and other types of clusterers through inheritance could be implemented, and an instance of it (or the class itself) could be passed as an argument when splitting. Alternatively, some if-else statements could be added in datasail.cluster.clustering.additional_clustering(), but that might be less elegant.
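To make the base-class idea concrete, here is a rough sketch of what I have in mind. Everything in it is hypothetical: `Clusterer`, `EstimatorClusterer`, `ThresholdClusterer`, and `run_clustering` are names I made up for illustration and are not part of DataSAIL, and I am assuming the clusterer receives a pairwise distance matrix and returns one integer label per row.

```python
from abc import ABC, abstractmethod
from typing import Any, List, Sequence

Matrix = Sequence[Sequence[float]]


class Clusterer(ABC):
    """Hypothetical common interface; not part of DataSAIL's current API."""

    @abstractmethod
    def cluster(self, matrix: Matrix) -> List[int]:
        """Return one integer cluster label per row of the input matrix."""


class EstimatorClusterer(Clusterer):
    """Adapter for any object exposing a scikit-learn style fit_predict(X)."""

    def __init__(self, estimator: Any) -> None:
        if not hasattr(estimator, "fit_predict"):
            raise TypeError("estimator must provide fit_predict(X)")
        self.estimator = estimator

    def cluster(self, matrix: Matrix) -> List[int]:
        # Delegate to the wrapped estimator and normalise labels to plain ints.
        return [int(label) for label in self.estimator.fit_predict(matrix)]


class ThresholdClusterer(Clusterer):
    """Toy custom clusterer: connected components of a thresholded distance matrix."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold

    def cluster(self, matrix: Matrix) -> List[int]:
        n = len(matrix)
        labels = [-1] * n  # -1 marks rows not yet assigned to a cluster
        current = 0
        for start in range(n):
            if labels[start] != -1:
                continue
            labels[start] = current
            stack = [start]
            # Depth-first walk over all rows reachable within the threshold.
            while stack:
                i = stack.pop()
                for j in range(n):
                    if labels[j] == -1 and matrix[i][j] <= self.threshold:
                        labels[j] = current
                        stack.append(j)
            current += 1
        return labels


def run_clustering(matrix: Matrix, clusterer: Clusterer) -> List[int]:
    """Hypothetical stand-in for the place where splitting would call the clusterer."""
    if not isinstance(clusterer, Clusterer):
        raise TypeError("clusterer must inherit from Clusterer")
    return clusterer.cluster(matrix)
```

With something like this, the existing sklearn clusterers could be wrapped as `EstimatorClusterer(AgglomerativeClustering(...))`, and any third-party clusterer that follows the same `fit_predict` convention could be passed the same way. Clusterers with a different API would only need a small subclass like `ThresholdClusterer`.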
Describe alternatives you've considered
Alternatively, the sklearn clusterers could be replaced with ones from the fastclust package.
Thank you for your feedback and suggestions. We will definitely consider these for future versions and improvements of DataSAIL. Customized clustering is indeed something we haven't thought about and implemented yet.