stm/prevalence issue #272

yuanyuan0105 · 2022-05-19T04:54:25Z

I tried to run the stm function as shown below, but got an error message:
"Error in stm(documents = out$documents, vocab = out$vocab, K = 0, data = out$meta, : number of observations in content covariate (1) prevalence covariate (20263) and documents (20263) are not all equal."

The code I have is:

stmfit <- stm(documents = out$documents, vocab = out$vocab,
K = 0 ,data = out$meta, prevalence =~ timenum,
max.em.its = 75,seed=24601,
init.type = "Spectral", verbose = FALSE,
control <- list(tSNE_init.dims=80))

I did not specify a "content =" argument in my code, since some examples I have seen use only "prevalence".
What causes this error, and how can I solve it?

Many thanks

santoroma · 2022-05-19T08:47:47Z

Hello yuanyuan0105, can you post a reproducible example of your code and data?


yuanyuan0105 · 2022-05-20T02:19:29Z

Hi @santoroma,

I have attached the dataset and my code below:

https://docs.google.com/spreadsheets/d/1eStIhewnnMxmYG0MEDgYz3euJThRsjPV4YELpldatlk/edit?usp=sharing

library(stm)
processed <- textProcessor(data_english$text, metadata = data_english)
out <- prepDocuments(processed$documents, processed$vocab, processed$meta)
First_STM <- stm(documents = out$documents, vocab = out$vocab,
K = 0,data = out$meta, prevalence =~ s(timenum),
init.type = "Spectral", verbose = FALSE,
control <- list(tSNE_init.dims=80))

Thanks very much in advance for your help!

bfisseler · 2022-08-25T13:35:01Z

It's very likely that you have missing values in your covariates. STM currently cannot handle missing values: "Note that the model does not permit estimation when there are variables used in the model that have missing values. As such, it can be helpful to subset data to observations that do not have missing values for metadata that will be used in the STM model."

Roberts, M. E., Stewart, B. M. & Tingley, D. (2019). stm: An R Package for Structural Topic Models. Journal of Statistical Software, 91, 1–40. https://doi.org/10.18637/jss.v091.i02
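
The subsetting advice quoted above could be sketched roughly as follows. This is only an illustration in base R: the data frame is an invented stand-in for the poster's data_english (only the column names text and timenum come from the thread), and the filtered result is what would then be passed on to textProcessor and prepDocuments.

```r
# Toy stand-in for the poster's data_english; the values are invented.
data_english <- data.frame(
  text = c("first document", "second document", "third document"),
  timenum = c(1, NA, 3),
  stringsAsFactors = FALSE
)

# Count missing values in the covariate used in the prevalence formula.
n_missing <- sum(is.na(data_english$timenum))

# Keep only the rows where that covariate is observed.
data_complete <- data_english[!is.na(data_english$timenum), , drop = FALSE]

print(n_missing)
print(nrow(data_complete))
```

If n_missing is zero, missing values in that covariate are not the cause and the filter changes nothing.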

JvH13 · 2023-06-06T13:30:02Z

I am having the same issue. I followed the instructions in #144, but I don't have missing values. What puzzles me is why it throws an error about the content covariate (1), while I do not have a content covariate in my model. The prevalence covariate and document covariate have equal lengths and no missing values.
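
One possible explanation for the "content covariate (1)" part of the message, offered as an assumption rather than a confirmed diagnosis: both calls posted above end with control <- list(...) instead of control = list(...). In R, <- inside a call assigns a variable and passes the value as an unnamed argument, which is then matched by position to the first formal not already named. Assuming content is that formal in stm(), a one-element list would arrive as the content covariate. The sketch below uses toy_stm, a hypothetical stand-in and not the real stm(), to show the matching behaviour in isolation.

```r
# Hypothetical stand-in, NOT the real stm(): only the order of the formals
# (content before control) is meant to mirror the package.
toy_stm <- function(documents, vocab, K, prevalence = NULL, content = NULL,
                    data = NULL, control = list()) {
  list(content_length = length(content), control_length = length(control))
}

# `<-` inside a call passes the value unnamed, so it fills the first formal
# not matched by name, which here is `content`.
wrong <- toy_stm(documents = 1:3, vocab = letters, K = 0, data = NULL,
                 prevalence = ~ x, control <- list(tSNE_init.dims = 80))

# `=` names the argument, so the list reaches `control` as intended.
right <- toy_stm(documents = 1:3, vocab = letters, K = 0, data = NULL,
                 prevalence = ~ x, control = list(tSNE_init.dims = 80))

print(wrong)
print(right)
```

If this is what happens in the real call, writing control = list(tSNE_init.dims = 80) would be the change to try first.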

vandytripp · 2024-01-31T18:03:11Z

I am having the same issue.
