Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible to get WCC results as a file? #4

Open
erikmata opened this issue Jun 5, 2020 · 4 comments
Open

Possible to get WCC results as a file? #4

erikmata opened this issue Jun 5, 2020 · 4 comments

Comments

@erikmata
Copy link

erikmata commented Jun 5, 2020

Hi! I'm trying out G-Store to find weakly connected components and I'm wondering if it's possible to get the results from the WCC algorithm as a file.

As results, I mean: for each vertex ID -> component ID
Eg.
Vertex ID, Component ID
1, 1
2, 1
3, 1
4, 4
5, 4

Looking into the code, I can see that component ID:s are uint32, which makes me wonder if the algorithm can be used for graphs with more than 5 billion vertices?

@pradeep-k
Copy link
Member

Printing vertex ID and is component ID: You can write code similar to following in wcc.cpp around line number 338.

for (vertex_t j = 0; j < vert_count; ++j) {
   cout << j << ":" << vert_cid[j] << endl;
}

Yes, the current algorithm can be used for over 5 billion vertices, but upper limit is unknown. Our initialization code is slightly different, so we don't initialize vertex with its its ID as component ID. Nonetheless, using 64 bits will not have any such issues.

Let me know, if you need any other information.

@erikmata
Copy link
Author

erikmata commented Jun 5, 2020

Thank you for your quick reply! Looking into the code, I see that vert_cid is defined as a uint32 array (cid_t), so from your reply I gather that I could just replace all cid_t references with vertex_t instead? In order to be able to have more than 5 billion components. My graph has many small connected components, in the order of more than 5 billion of them. Not a regular social media graph where one single WCC normally spans the whole graph.

@pradeep-k
Copy link
Member

I think it should work if you change cid_t to vertex_t at line 39 of wcc.h.

@erikmata
Copy link
Author

erikmata commented Jun 5, 2020

Yes, I did a search-replace and it seems to work fine.

Thanks for the help!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants