Can i do SOM for multiple variables? #189

Open
jiwon-j opened this issue Jun 20, 2024 · 4 comments

@jiwon-j

jiwon-j commented Jun 20, 2024

I have two datasets, geopotential height (GPH) and Precipitation. Each variable has the same time dimension (year).
I want to do clustering for each year, and I was wondering if and how I can run SOM considering "both" variables (not doing SOM for each variable individually).

I tried np.hstack, but this simply joins the two variable arrays side by side, so I'm not sure whether it's the right approach.
(If GPH has shape (year: 20, flatten_values: 500) and Precipitation has shape (year: 20, flatten_values: 500), np.hstack turns them into (year: 20, flatten_values: 1000); it just attaches one to the other.)
I was wondering if this is even possible.

@JustGlowing
Owner

hi @jiwon-j, SOM is a multivariate model and you can build your input as a matrix where each row corresponds to a year and contains values from all the variables that you have.

These numpy functions can help you reshape your original data:

@jiwon-j
Author

jiwon-j commented Jun 20, 2024

Thank you! I made a combined array, but do the two variables in it have to have the same shape?
I'm trying to run a SOM and getting broadcast errors.

@JustGlowing
Owner

The input matrix needs to have exactly 2 dimensions (one row per sample), which means that you have to concatenate your data along the appropriate axis.
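
As a rough illustration of the point above, here is a minimal NumPy sketch. The array names, grid sizes, and random values are placeholders I made up, not data from this thread. It flattens each variable to (year, features) and joins the two along the feature axis; in this sketch the two variables deliberately have different spatial shapes, and only the number of years has to match.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical gridded fields shaped (year, lat, lon). The grids differ on purpose.
gph = rng.normal(size=(20, 20, 25))     # 20 years, 500 grid points per year
precip = rng.normal(size=(20, 15, 20))  # 20 years, 300 grid points per year

# Flatten the spatial dimensions so each variable becomes (year, features).
gph_2d = gph.reshape(gph.shape[0], -1)
precip_2d = precip.reshape(precip.shape[0], -1)

# Join along the feature axis (axis=1); the year axis (axis=0) must agree.
data = np.concatenate([gph_2d, precip_2d], axis=1)

print(data.shape)  # (20, 800)
```

With two 2D inputs, `np.hstack` gives the same result as `np.concatenate(..., axis=1)`, so the earlier `np.hstack` attempt was already producing a matrix of this form.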

@vwgeiser

vwgeiser commented Jun 21, 2024

@jiwon-j I ran into a similar problem, and flattening (vectorizing) each sample into a 1D vector is how I got mine to work. Then set input_len to the length of one sample. You can see the later parts of #187, where I show how I do this.
Unless you've found a way around it, I would imagine that the inputs need to be the same length (MiniSom requires a rectangular 2D matrix, so every row must have the same number of values). Imputation may help with this?

Flattening adds quite a bit of dimensionality, but MiniSom is able to handle this sort of data.
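
For anyone landing here later, a minimal sketch of the approach described above, assuming the `minisom` package is installed. The helper names `build_input` and `train_som`, the 3x3 grid, the iteration count, and the random placeholder data are all my own assumptions, not something taken from #187.

```python
import numpy as np


def build_input(*variables):
    """Flatten each (year, ...) array and join them into one (year, features) matrix."""
    flat = [np.asarray(v).reshape(len(v), -1) for v in variables]
    return np.concatenate(flat, axis=1)


def train_som(data, rows=3, cols=3, num_iteration=500, seed=0):
    """Train a small SOM; the grid size and iteration count are illustrative guesses."""
    # Imported here so build_input can be used without minisom installed.
    from minisom import MiniSom

    # input_len is the length of one sample, i.e. the number of columns in data.
    som = MiniSom(rows, cols, input_len=data.shape[1], random_seed=seed)
    som.train(data, num_iteration)
    # One (row, col) winning node per year.
    return som, [som.winner(sample) for sample in data]


rng = np.random.default_rng(0)
gph = rng.normal(size=(20, 20, 25))     # placeholder data, 500 values per year
precip = rng.normal(size=(20, 20, 25))  # placeholder data, 500 values per year
data = build_input(gph, precip)         # shape (20, 1000)
```

Calling `som, winners = train_som(data)` would then return one winning node per year, which can serve as that year's cluster label.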
