
LDA - how to obtain top terms per topic? #830

Answered by raphaelsty
mdscruggs-onna asked this question in Q&A

Hi @mdscruggs-onna 😀,

Unfortunately, LDA does not currently provide a method that returns the top tokens per topic.

Here's a workaround I have in mind that gives interesting results. Rather than using nu_1 and nu_2 directly, I would use the _compute_weights method, which combines nu_1 and nu_2 into one weight per token and topic, following the LDA algorithm. Once you have the weights, I think it may be worth applying the softmax function to them.

import numpy as np
from river import compose, feature_extraction, preprocessing

X = [
    "weather cold",
    "weather hot dry",
    "weather cold rainy",
    "weather hot",
    "weather cold humid",
]

lda = compose.Pipeline(
    feature_extraction
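
The last step described above (softmax over the per-topic weights, then keeping the highest-scoring tokens) can be sketched without touching river's internals. This is only a minimal illustration: the weight layout (one list of floats per topic), the `index_to_word` mapping, the function names, and the numbers are all assumptions made up for the example, not river's API or real LDA output.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    highest = max(values)
    exps = [math.exp(v - highest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def top_terms_per_topic(weights, index_to_word, k=3):
    """Return the k highest-scoring (token, score) pairs for each topic.

    `weights` is assumed to map topic id -> list of per-token weights,
    and `index_to_word` to map token index -> token (hypothetical layout).
    """
    result = {}
    for topic, row in weights.items():
        probs = softmax(row)
        # Rank token indices by descending softmax score.
        ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
        result[topic] = [(index_to_word[i], probs[i]) for i in ranked[:k]]
    return result


# Made-up weights for illustration only; not produced by river.
index_to_word = {0: "weather", 1: "cold", 2: "hot", 3: "dry"}
weights = {
    0: [2.0, 1.5, 0.1, 0.0],
    1: [2.0, 0.2, 1.8, 1.0],
}

top = top_terms_per_topic(weights, index_to_word, k=2)
for topic, terms in top.items():
    print(topic, [token for token, _ in terms])
```

Softmax preserves the ordering of the weights, so the ranking is the same as sorting the raw weights; what it adds is that the scores within a topic sum to 1 and can be read as relative importance.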

Replies: 1 comment 3 replies

Answer selected by mdscruggs-onna