Added ggplot2 transformation support for plot_nns_ratio #30

Open: xiliny wants to merge 1 commit into base: updates

@xiliny commented Aug 14, 2024

Dear authors,

In submitting this pull request, I added a new argument called scale_transform to plot_nns_ratio, which specifies the ggplot2 transformation of the x axis. I tested the changes with the following code (largely based on example(plot_nns_ratio)):

Example:
library(ggplot2)
library(quanteda)
library(conText)

# tokenize corpus
toks <- tokens(cr_sample_corpus)

# build a tokenized corpus of contexts surrounding a target term
immig_toks <- tokens_context(x = toks, pattern = "immigration", window = 6L)

# sample 100 instances of the target term, stratifying by party (only for example purposes)
set.seed(2022L)
immig_toks <- tokens_sample(immig_toks, size = 100, by = docvars(immig_toks, 'party'))

# we limit candidates to features in our corpus
feats <- featnames(dfm(immig_toks))

# compute ratio
set.seed(2022L)
immig_nns_ratio <- get_nns_ratio(x = immig_toks,
                                 N = 10,
                                 groups = docvars(immig_toks, 'party'),
                                 numerator = "R",
                                 candidates = feats,
                                 pre_trained = cr_glove_subset,
                                 transform = TRUE,
                                 transform_matrix = cr_transform,
                                 bootstrap = TRUE,
                                 # num_bootstraps should be at least 100;
                                 # the packaged example uses 10 due to CRAN-imposed
                                 # constraints on example execution time
                                 num_bootstraps = 100,
                                 permute = FALSE,
                                 num_permutations = 10,
                                 verbose = FALSE)

plot_nns_ratio(x = immig_nns_ratio, alpha = 0.01, horizontal = FALSE, scale_transform = "identity")
ggsave("vertical-identity.png")
plot_nns_ratio(x = immig_nns_ratio, alpha = 0.01, horizontal = TRUE, scale_transform = "identity")
ggsave("horizontal-identity.png")
plot_nns_ratio(x = immig_nns_ratio, alpha = 0.01, horizontal = FALSE, scale_transform = "log")
ggsave("vertical-log.png")
plot_nns_ratio(x = immig_nns_ratio, alpha = 0.01, horizontal = TRUE, scale_transform = "log")
ggsave("horizontal-log.png")

Here are the resulting plots:

| scale_transform | horizontal = TRUE | horizontal = FALSE |
| --- | --- | --- |
| "identity" | horizontal-identity.png | vertical-identity.png |
| "log" | horizontal-log.png | vertical-log.png |

This is just a basic implementation. Let me know if you want me to make any further changes!
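
For readers who have not seen the diff, here is a minimal sketch of how a transformation name such as "identity" or "log" can be forwarded to a ggplot2 scale. It is not the code in this pull request: `plot_ratio_sketch` and `toy_ratio` are hypothetical names, the values are made up, and whether `plot_nns_ratio` wires the argument up the same way is an assumption.

```r
library(ggplot2)

# Hypothetical stand-in for the data a ratio plot receives: one row per
# feature with a ratio value (made-up numbers, for illustration only).
toy_ratio <- data.frame(
  feature = c("illegal", "reform", "border", "families"),
  value   = c(2.0, 1.0, 1.5, 0.5)
)

# Hypothetical helper, NOT the conText implementation: it forwards the name
# of a transformation ("identity", "log", "log10", "sqrt", ...) to the x scale.
plot_ratio_sketch <- function(x, scale_transform = "identity") {
  ggplot(x, aes(x = value, y = feature)) +
    geom_point() +
    # reference line at a ratio of 1 (no difference between groups)
    geom_vline(xintercept = 1, linetype = "dashed") +
    # `trans` is the older argument name; ggplot2 >= 3.5.0 calls it
    # `transform` and treats `trans` as deprecated
    scale_x_continuous(trans = scale_transform)
}

p_identity <- plot_ratio_sketch(toy_ratio, scale_transform = "identity")
p_log      <- plot_ratio_sketch(toy_ratio, scale_transform = "log")
```

Printing `p_identity` and `p_log` (or passing them to `ggsave()`) shows the same points on a linear and a natural-log x axis, respectively.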

@prodriguezsosa (Owner)

Thanks @xiliny, code looks good. Question: what's the motivation for using a log transform here?
