Use proper normalization in Hamming #250

Merged: sdmccabe merged 4 commits into netsiphd:master from hamming-tweak on Sep 23, 2019
Conversation

sdmccabe (Collaborator):

Harrison pointed out in a comment on our paper that our Hamming implementation
has an implicit $N^2$ instead of $N(N-1)$ normalization, so it's wrong for
graphs without self-loops. This corrects that, similar to #242.

A couple of notes:

1. I think this could be a little cleaner; the fact that `np.triu_indices()` et
al return 2-tuples cramped my style a bit.
2. The fact that this and #242 both exist raises the concern that this normalization issue
may be present elsewhere. Perhaps we should open a checklist issue, like the one we have
for #245?
3. I have not applied the same correction to `HammingIpsenMikhailov`, on the
grounds that: (i) it's sufficiently different from regular `Hamming` to consider
separately, and (ii) it probably deserves a more thorough cleanup.
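For readers skimming the diff, the gist of the fix can be sketched as follows. This is a hedged toy version operating directly on adjacency matrices, not the actual netrd code: the Hamming distance counts differing off-diagonal entries and divides by the $N(N-1)$ possible self-loop-free edges rather than by all $N^2$ matrix entries.

```python
import numpy as np

def hamming_distance(A1, A2):
    """Fraction of differing off-diagonal entries between two N x N
    adjacency matrices, normalized by N*(N-1) (self-loops excluded),
    not by N^2. Illustrative only; not the netrd implementation."""
    N = A1.shape[0]
    mask = ~np.eye(N, dtype=bool)  # every entry except the diagonal
    return np.sum(A1[mask] != A2[mask]) / (N * (N - 1))

# 3-node path graph vs. empty graph: 4 of the 6 off-diagonal entries differ
A_path = np.array([[0, 1, 0],
                   [1, 0, 1],
                   [0, 1, 0]])
A_empty = np.zeros((3, 3))
print(hamming_distance(A_path, A_empty))  # 4/6 = 0.666...
```

With the old implicit $N^2$ normalization the same pair would score $4/9$, which is why the distance was systematically deflated for graphs without self-loops.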
@leotrs (Collaborator) left a comment:

I'm approving in spite of my comments, feel free to ignore them.

Also, I was going to say we could think of normalization in terms of #174, but since this was truly implicit I'm not sure how...

```python
# directed case: consider all but the diagonal
if nx.is_directed(G1) or nx.is_directed(G2):
    new_mask = np.tril_indices(N, k=-1)
    mask = (np.append(mask[0], new_mask[0]), np.append(mask[1], new_mask[1]))
```
leotrs (Collaborator):
This looks like it could be achieved in a single append or hstack call? No need to change, just thinking out loud here...

sdmccabe (Collaborator, author):
Can it be? It's a tuple of np.arrays, so (i) they're immutable, and (ii) they're two separate objects. I think what I wrote is pretty ugly and I don't like it, but I wasn't sure about a better way to do it.

leotrs (Collaborator):
Ooooh, you're right, it returns a tuple because it's stupid. What about

```python
mask = np.array(np.triu_indices(N, k=1))
```

?

sdmccabe (Collaborator, author):
That doesn't seem to work.

```python
In [12]: m = np.triu_indices(10, k=1)

In [13]: M = np.array(m)

In [14]: A = np.zeros((10, 10))

In [15]: len(A[M])
Out[15]: 2

In [16]: len(A[m])
Out[16]: 45

In [17]: A[M].shape
Out[17]: (2, 45, 10)
```

leotrs (Collaborator):
Yeah, ok. There must be a way to do it nicely, but I don't care enough right now...

sdmccabe (Collaborator, author):
Agreed. There's some repetition here, so I thought about wrapping it in a function, but the idea of needing a function just for that bothered me...
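For what it's worth, one way to sidestep the index-tuple juggling entirely is to build a boolean mask rather than arrays of indices. This is an untested sketch outside the PR, not the netrd implementation; `N` and the directed/undirected split are stand-ins for the variables in `hamming.py`:

```python
import numpy as np

N = 5  # stand-in for the number of nodes

# Upper-triangle mask: the N*(N-1)/2 entries used for undirected graphs...
undirected_mask = np.triu(np.ones((N, N), dtype=bool), k=1)

# ...and everything off the diagonal, N*(N-1) entries, for directed graphs.
directed_mask = undirected_mask | np.tril(np.ones((N, N), dtype=bool), k=-1)

A = np.arange(N * N).reshape(N, N)
print(A[undirected_mask].size)  # 10 = N*(N-1)/2
print(A[directed_mask].size)    # 20 = N*(N-1)
```

Because `A[mask]` with a boolean mask of the same shape returns a flat array of the selected entries, the directed case becomes a single `|=` update instead of two `np.append` calls on the tuple members.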

(An outdated review thread on netrd/distance/hamming.py was marked resolved.)
sdmccabe merged commit 0a2eeb0 into netsiphd:master on Sep 23, 2019
sdmccabe deleted the hamming-tweak branch on September 23, 2019 at 20:58