-
Notifications
You must be signed in to change notification settings - Fork 43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
dk-series distance #243
dk-series distance #243
Conversation
First pass at the dk-series (2k-series) distance. **Do not merge without discussion.** This is essentially the `DegreeDivergence`, but instead of the degree distribution it's the distribution of edges between degree-labelled nodes. Some outstanding questions and concerns: 1. This is not memory-efficient because it uses NxN dense matrices. 2. Have I understood the dk-series correctly? That is, does the `dk2_series` function return something meaningful? 3. I'm not sure if the dk-series is defined for directed graphs. For simplicity I have coerced to undirected graphs.
netrd/distance/dk2_distance.py
Outdated
|
||
""" | ||
|
||
def dk2_series(G): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Being pedantic: can we put this outside the distance class? This is a prime candidate for refactoring if/when #174 ever becomes a thing.
netrd/distance/dk2_distance.py
Outdated
D1 = np.zeros((N, N)) | ||
D2 = np.zeros((N, N)) | ||
|
||
for (i, j), k in G1_dk.items(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth it to look into making all of these use sparse matrices? Pretty sure COO matrices would speed this up by a bunch. Even with small N, COO matrices would avoid this loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I knew you would call me out on this! I'll look into it. How would it avoid the loop, though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uuuuh I guess it wouldn't avoid it, but at least we would be off-loading it to scipy, and I trust them to profile their loops.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, that's fine. I think it should be pretty straightforward to change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wound up using DOK matrices instead because it seemed most convenient. I'm open to reconsidering this if people have strong opinions about sparse matrix formats.
Awesome! Just curious: did you find a paper that uses this? If so, please add the reference at the top. If not, please add some general reference to dk series at the top. (Curious how you decided to use JSD?) |
There's no paper (beside the dk-series papers), it's part of the graphend project. |
Gonna tag @jkbren in on this. He might also have thoughts on names. |
I'm cool with merging. The outstanding issue is the name. I see a couple possibilities here:
The second is fine with me; the third might be better? I don't like the first. |
Other names raised:
Another possibility is to make it a general |
Let's do the last thing! |
Should we merge |
Yes just call it.
…On Mon, Aug 12, 2019 at 6:06 PM Stefan McCabe ***@***.***> wrote:
Should we merge DegreeDivergence into this, or just call it? My instinct
is the latter, since people unfamiliar with the dk series might be confused.
—
You are receiving this because your review was requested.
Reply to this email directly, view it on GitHub
<#243?email_source=notifications&email_token=AAILYAAMI34ONTCSESE5DATQEHNFTA5CNFSM4II7GIX2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD4D62MI#issuecomment-520613169>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAILYAAPTN325R3KHTZD5ETQEHNFTANCNFSM4II7GIXQ>
.
--
www.leotrs.com | [email protected]
PhD student at the Network Science Institute, Northeastern University
|
Can someone double-check the docs to make sure it all still reads right under the newly expanded goal of the module? Also, I ran the tests with |
If someone other than me ends up merging this, please squash and merge, this PR wound up being a ton of commits. |
Leaving for my own reference: I will try to add some doc tweaks to clarify that |
First pass at the dk-series (2k-series) distance. Do not merge without
discussion. This is essentially the
DegreeDivergence
, but instead of thedegree distribution it's the distribution of edges between degree-labelled
nodes. Some outstanding questions and concerns:
dk2_series
function return something meaningful?
I have coerced to undirected graphs.