Normalized Mutual Information #2

Open
zjming opened this issue Dec 7, 2018 · 4 comments

zjming commented Dec 7, 2018

The extended NMI should lie between 0 and 1, but the example in overlap_nmi.py returns 2.60794966304. Could you explain why?

Joey-Hu commented Jul 21, 2020

I have the same question. Does anybody know why?

@matrixfang

The code is slightly wrong: math.log(2, num) should be math.log(num, 2).
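For reference, Python's `math.log(x, base)` takes the value first and the base second, so the two spellings compute different quantities. A minimal check follows. The value 8 is an arbitrary illustration and is not taken from overlap_nmi.py:

```python
import math

num = 8  # arbitrary illustrative value, not from overlap_nmi.py

log2_of_num = math.log(num, 2)   # logarithm of num in base 2 (about 3)
log_num_of_2 = math.log(2, num)  # logarithm of 2 in base num (about 1/3)

print(log2_of_num)
print(log_num_of_2)
```

The two results are reciprocals of each other. Using `math.log2(num)` avoids the argument-order question entirely.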

x3y1 commented Nov 15, 2020

If math.log(2, num) is changed to math.log(num, 2), overlap_nmi.py returns -0.2354300373718371, which is also incorrect.

Joey-Hu commented Nov 18, 2020

Here is my code; it may work for you:

import math


def calc_overlap_nmi(num_vertices, result_comm_list, ground_truth_comm_list):
    """Convenience wrapper around OverlapNMI."""
    return OverlapNMI(num_vertices, result_comm_list, ground_truth_comm_list).calculate_overlap_nmi()


class OverlapNMI:

    def __init__(self, num_vertices, result_comm_list, ground_truth_comm_list):
        self.x_comm_list = result_comm_list
        self.y_comm_list = ground_truth_comm_list
        self.num_vertices = num_vertices

    def calculate_overlap_nmi(self):

        # h(p) = -p * log2(p), with h(p) = 0 for p <= 0
        def h(num):
            if num > 0:
                return -1 * num * math.log2(num)
            else:
                return 0

        # H(X_i): entropy of the membership indicator of one community
        def H_comm(comm):
            prob1 = float(len(comm)) / self.num_vertices
            prob2 = 1 - prob1
            return h(prob1) + h(prob2)

        # H(X): sum of H(X_i) over all communities in the cover
        def H_cap(cap):
            res = 0.0
            for comm in cap:
                res += H_comm(comm)
            return res

        # H(X_i, Y_j)
        def H_Xi_joint_Yj(comm_x, comm_y):
            intersect_size = float(len(set(comm_x) & set(comm_y)))
            # Add-one smoothing: one pseudo-count in each of the four cells
            cap_n = self.num_vertices + 4
            prob11 = (intersect_size + 1) / cap_n
            prob10 = (len(comm_x) - intersect_size + 1) / cap_n
            prob01 = (len(comm_y) - intersect_size + 1) / cap_n
            # Complement of the other three cells, so the four probabilities sum to 1
            prob00 = 1 - prob11 - prob10 - prob01

            if (h(prob11) + h(prob00)) >= (h(prob01) + h(prob10)):
                return h(prob11) + h(prob10) + h(prob01) + h(prob00)
            else:
                # Fall back to independence, which gives H(X_i|Y_j) = H(X_i)
                return H_comm(comm_x) + H_comm(comm_y)

        # H(X_i|Y_j)
        def H_Xi_given_Yj(comm_x, comm_y):
            return float(H_Xi_joint_Yj(comm_x, comm_y) - H_comm(comm_y))

        # H(X_i|Y): min over all j of H(X_i|Y_j)
        def H_Xi_given_Y(comm_x, cap_y):
            tmp_H_Xi_given_Yj = []
            for comm_y in cap_y:
                tmp_H_Xi_given_Yj.append(H_Xi_given_Yj(comm_x, comm_y))
            return float(min(tmp_H_Xi_given_Yj))

        # H(X_i|Y)_norm (undefined when H(X_i) is 0, i.e. an empty or all-vertex community)
        def H_Xi_given_Y_norm(comm_x, cap_y):
            return float(H_Xi_given_Y(comm_x, cap_y) / H_comm(comm_x))

        # # H(X|Y)
        # def H_X_given_Y(cap_x, cap_y):
        #     res = 0.0
        #     for comm_x in cap_x:
        #         res += H_Xi_given_Y(comm_x, cap_y)
        #
        #     return res

        # H(X|Y)_norm: average of H(X_i|Y)_norm over the communities in X
        def H_X_given_Y_norm(cap_x, cap_y):
            res = 0.0
            for comm_x in cap_x:
                res += H_Xi_given_Y_norm(comm_x, cap_y)

            return res / len(cap_x)

        def NMI(cap_x, cap_y):
            if len(cap_x) == 0 or len(cap_y) == 0:
                return 0
            return 1 - 0.5 * (H_X_given_Y_norm(cap_x, cap_y) + H_X_given_Y_norm(cap_y, cap_x))

        return NMI(self.x_comm_list, self.y_comm_list)
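
The `+ 1` and `+ 4` terms in `H_Xi_joint_Yj` read as add-one smoothing over the four membership cells (in both, only in X, only in Y, in neither). A minimal sketch under that reading checks that the four cells form a probability distribution when `prob00` is taken as the complement of the other three. The helper name `smoothed_joint_probs` and the example communities are hypothetical:

```python
def smoothed_joint_probs(num_vertices, comm_x, comm_y):
    """Return (p11, p10, p01, p00) with one pseudo-count added per cell.

    Hypothetical helper for illustration; assumes communities are
    collections of vertex ids without duplicates.
    """
    set_x, set_y = set(comm_x), set(comm_y)
    inter = len(set_x & set_y)
    cap_n = num_vertices + 4  # four cells, one pseudo-count each
    p11 = (inter + 1) / cap_n
    p10 = (len(set_x) - inter + 1) / cap_n
    p01 = (len(set_y) - inter + 1) / cap_n
    p00 = 1.0 - p11 - p10 - p01  # complement of the other three cells
    return p11, p10, p01, p00


# Made-up example: 10 vertices, two communities sharing vertices 2 and 3.
probs = smoothed_joint_probs(10, [0, 1, 2, 3], [2, 3, 4, 5, 6])
print(probs, sum(probs))
```

With this choice `p00` equals (number of vertices in neither community + 1) / (num_vertices + 4), which matches the smoothing applied to the other three cells.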
