levels2 error in nodemix when levels are named #563

Open
CarterButts opened this issue Jun 12, 2024 · 3 comments


@CarterButts

The levels2 argument to nodemix seems to be doing some odd things. Here's a cleaned-up example from a case my students encountered in an exam:

Data file (sorry about the zip archive, GitHub won't take Rdata files): example.zip

library(ergm)
load("example.Rdata")

mixingmatrix(fgf,"gender")                                       #Get the mixing matrix by gender
summary(fgf ~ nodemix("gender", levels2 = c(2:3)))               #This is correct
summary(fgf ~ nodemix("gender", levels2 = c("1.0","0.1")))       #This is not

summary(ergm(fgf~nodemix("gender", levels2 = c(2:3))))           #This one makes some sense
summary(ergm(fgf~nodemix("gender", levels2 = c("1.0","0.1"))))   #This one doesn't

summary(ergm(fgf~edges+nodemix("gender", levels2 = c(2:3))))           #Works as it should
summary(ergm(fgf~edges+nodemix("gender", levels2 = c("1.0","0.1"))))   #Odd collinearity

The mixing matrix we get is like so:

     To
From    0   1 Sum
  0    28   2  30
  1     7  66  73
  Sum  35  68 103

and this is consistent with the first use of levels2:

mix.gender.1.0 mix.gender.0.1 
             7              2 

but the second case gives us odd things:

mix.gender.0.1 mix.gender.1.0 
            73             30 

As one would expect, this spills over into ergm, to the point where you get odd stuff like an edges term being redundant with the mixing terms. It's puzzling that levels2 isn't just grabbing the wrong entry from the mixing matrix - it looks like it is taking the row sums of the matrix, instead. (Which shouldn't be possible for nodemix, but there it is.) That wouldn't perhaps be so bad if the results were then labeled as such, but they are labeled with the original arguments.

The levels2 argument is very powerful, but I confess that I myself get confused in some cases about how it is supposed to work! Still, this can't be intended behavior. I presume that there's a parsing step that's gone wrong somewhere....

@mbojan
Member

mbojan commented Jun 12, 2024

The documentation could be improved, and perhaps supplemented with examples.

load("~/Downloads/example.Rdata")

(mm <- mixingmatrix(fgf, "gender"))
#>      To
#> From    0   1 Sum
#>   0    28   2  30
#>   1     7  66  73
#>   Sum  35  68 103

A numeric vector is interpreted as an index into the cells of the mixing matrix (in the usual column-major order):

summary(fgf ~ nodemix("gender", levels2=1)) # (1,1)
#> mix.gender.0.0 
#>             28
summary(fgf ~ nodemix("gender", levels2=3)) # (1,2)
#> mix.gender.0.1 
#>              2
summary(fgf ~ nodemix("gender", levels2=2:3)) # (2,1) and (1,2)
#> mix.gender.1.0 mix.gender.0.1 
#>              7              2
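
To make the "usual column order" concrete, here is a minimal base-R sketch that rebuilds the mixing matrix from the counts printed above (without the Sum margins) and indexes it the way a numeric levels2 appears to. It only mimics the indexing and does not call ergm; `mm_counts` and `cell_index` are names made up for the illustration.

```r
# Mixing matrix counts copied from the output above (Sum margins dropped).
# Rows are "From", columns are "To"; matrix() fills column by column.
mm_counts <- matrix(c(28, 7, 2, 66), nrow = 2,
                    dimnames = list(From = c("0", "1"), To = c("0", "1")))

# Column-major position of each cell:
# 1 = (From 0, To 0), 2 = (From 1, To 0), 3 = (From 0, To 1), 4 = (From 1, To 1)
cell_index <- matrix(seq_along(mm_counts), nrow = 2,
                     dimnames = dimnames(mm_counts))
print(cell_index)

# The cells a numeric index of 2:3 picks out: 7 (1 -> 0) and 2 (0 -> 1)
picked <- mm_counts[2:3]
print(picked)
```

Positions 2 and 3 are the two off-diagonal cells, which is why levels2=2:3 returns mix.gender.1.0 and mix.gender.0.1 in that order.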

A character vector is interpreted as a matrix of labels: cells assigned the same label are collapsed (summed) into a single statistic:

summary(fgf ~ nodemix("gender", levels2=c("foo", "foo", "bar", "bar")))
#> mix.gender.bar mix.gender.foo 
#>             68             35
colSums(mm)
#>  0  1 
#> 35 68

because

matrix(c("foo", "foo", "bar", "bar"), 2, 2)
#>      [,1]  [,2] 
#> [1,] "foo" "bar"
#> [2,] "foo" "bar"

and this is interpreted as matrix("foobar", 2, 2), so it sums the whole matrix:

summary(fgf ~ nodemix("gender", levels2="foobar"))
#> mix.gender.foobar 
#>               103
sum(mm)
#> [1] 103

I know nothing about parsing character vectors of the form “x.y”.
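
The same recycling seems to account for the numbers in the original report. Below is a minimal base-R sketch, assuming the character vector is recycled into a 2 x 2 matrix of labels and that cells sharing a label are summed. It mimics the behaviour shown above rather than reproducing ergm's actual code; `mm_counts`, `labels`, and `agg` are illustrative names.

```r
# Mixing matrix counts copied from the output above (Sum margins dropped).
mm_counts <- matrix(c(28, 7, 2, 66), nrow = 2,
                    dimnames = list(From = c("0", "1"), To = c("0", "1")))

# c("1.0", "0.1") recycled column by column into the shape of the matrix:
# row 1 ends up labelled "1.0" throughout, row 2 "0.1" throughout.
labels <- matrix(c("1.0", "0.1"), nrow = 2, ncol = 2)
print(labels)

# Sum the cells that share a label (assumed aggregation rule).
agg <- tapply(as.vector(mm_counts), as.vector(labels), sum)
print(agg)
```

Under that assumption the result is 73 for "0.1" and 30 for "1.0", i.e. the row sums of the mixing matrix, matching the values reported in the issue.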

@krivit
Member

krivit commented Jun 13, 2024

The documentation could, indeed, be much better. If you want to select level pairs explicitly, you need to pass something like

summary(fgf ~ nodemix("gender", levels2 = I(list(list(row=1,col=0),list(row=0,col=1)))))
mix.gender.1.0 mix.gender.0.1 
             7              2 

I'll try to reply in more detail later.
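
In plain matrix terms, each list(row=, col=) pair in the call above names one cell of the mixing matrix by its level labels. A base-R sketch of that lookup follows; it is illustrative only and does not go through ergm. `mm_counts`, `pairs`, and `vals` are made-up names, and the "mix.gender." prefix is copied from the output shown above.

```r
# Mixing matrix counts copied from the thread (Sum margins dropped).
mm_counts <- matrix(c(28, 7, 2, 66), nrow = 2,
                    dimnames = list(From = c("0", "1"), To = c("0", "1")))

# The same row/col pairs as in the levels2 = I(list(...)) call.
pairs <- list(list(row = 1, col = 0), list(row = 0, col = 1))

# Look each cell up by level name rather than by position.
vals <- vapply(pairs,
               function(p) mm_counts[as.character(p$row), as.character(p$col)],
               numeric(1))
names(vals) <- vapply(pairs,
                      function(p) paste0("mix.gender.", p$row, ".", p$col),
                      character(1))
print(vals)
```

This selects 7 (from 1 to 0) and 2 (from 0 to 1), the same values as the summary output above.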

krivit added a commit that referenced this issue Sep 29, 2024
@krivit
Member

krivit commented Sep 29, 2024

That'll do for the upcoming release, but we really should augment the vignette with diagrams, mixing matrices, etc.
