Log loss (#500)
* feat : log_loss_calc function added

* fix : minor edit in log_loss_calc function

* fix : brier score error messages renamed

* fix : tests updated

* fix : tests updated

* fix : tests link updated

* fix : autopep8

* doc : CHANGELOG updated

* doc : Document updated

* doc : wikipedia link added to log_loss

* fix : log_loss_calc tests updated

* doc : minor edit in log_loss_calc function docstring
sepandhaghighi authored Apr 27, 2023
1 parent dfbaa17 commit 3e4d09a
Showing 8 changed files with 206 additions and 5 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -10,6 +10,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
- `__imbalancement_handler__` function
- `vector_serializer` function
- NPV micro/macro
- `log_loss` method
- 23 new distance/similarity
1. Dennis
2. Digby
100 changes: 100 additions & 0 deletions Document/Document.ipynb
@@ -231,6 +231,7 @@
" <li><a href=\"#Weighted-alpha\">Weighted Alpha</a></li>\n",
" <li><a href=\"#Aickin's-alpha\">Aickin's Alpha</a></li>\n",
" <li><a href=\"#Brier-score\">Brier Score</a></li>\n",
" <li><a href=\"#Log-loss\">Log Loss</a></li>\n",
" </ol>\n",
" &nbsp;\n",
" <li><a href=\"#Print\">Print</a></li>\n",
@@ -12251,6 +12252,105 @@
"</ul>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Log loss"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In information theory, the cross-entropy between two probability distributions \n",
"$p$ and $q$ over the same underlying set of events measures the average number of bits needed to identify an event drawn from the set if a coding scheme used for the set is optimized for an estimated probability distribution $q$, rather than the true distribution $p$.\n",
"This is also known as the log loss (logarithmic loss or logistic loss); the terms \"log loss\" and \"cross-entropy loss\" are used interchangeably [[30]](#ref30).\n",
"\n",
"<a href=\"https://en.wikipedia.org/wiki/Cross_entropy\">Wikipedia Page</a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"$$L_{\\log}(y, p) = -(y \\log (p) + (1 - y) \\log (1 - p))$$"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"cm_test.log_loss()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"cm_test.log_loss(pos_class=0)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Parameters "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"1. `pos_class` : positive class name (type : `int/str`, default : `None`)\n",
"2. `normalize` : normalization flag (type : `bool`, default : `True`)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Output"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`Log loss`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<ul>\n",
" <li><span style=\"color:red;\">Notice </span> : new in <span style=\"color:red;\">version 3.9</span> </li>\n",
"</ul>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<ul>\n",
" <li><span style=\"color:red;\">Notice </span> : This option only works in binary probability mode</li>\n",
"</ul>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<ul>\n",
"    <li><span style=\"color:red;\">Notice </span> : `pos_class` always defaults to the greater class name (i.e. `max(classes)`), unless the `actual_vector` contains strings; in that case `pos_class` has no default value and must be specified explicitly, or an error is raised.</li>\n",
"</ul>"
]
},
{
"cell_type": "markdown",
"metadata": {},
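The binary log-loss formula documented above maps directly to a few lines of Python. The sketch below is an illustrative re-implementation, not pycm's own code: the name `binary_log_loss` is invented for the example, and the input vectors are the ones used in the `log_loss_calc` doctest in `Test/function_test.py`.

```python
import math

def binary_log_loss(actual, probs, pos_class, sample_weight=None, normalize=True):
    """Weighted mean of -(y*log(p) + (1-y)*log(1-p)) over the samples."""
    if sample_weight is None:
        sample_weight = [1] * len(actual)
    weight_sum = sum(sample_weight)
    total = 0.0
    for a, p, w in zip(actual, probs, sample_weight):
        y = 1 if a == pos_class else 0  # encode the positive class as 1
        total += -(w / weight_sum) * (y * math.log(p) + (1 - y) * math.log(1 - p))
    return total if normalize else total * weight_sum

# Same vectors as the log_loss_calc doctest in Test/function_test.py:
print(binary_log_loss([1, 1, 0, 1], [0.8, 0.3, 0.2, 0.4], pos_class=1))
# ≈ 0.6416376597071276
```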
8 changes: 8 additions & 0 deletions Test/error_test.py
@@ -148,11 +148,19 @@
Traceback (most recent call last):
...
pycm.pycm_error.pycmVectorError: This option only works in binary probability mode
>>> cm.log_loss()
Traceback (most recent call last):
...
pycm.pycm_error.pycmVectorError: This option only works in binary probability mode
>>> cm = ConfusionMatrix(["ham", "spam", "ham", "ham"], [0.1, 0.4, 0.25, 1], threshold=lambda x : "ham")
>>> cm.brier_score()
Traceback (most recent call last):
...
pycm.pycm_error.pycmVectorError: Actual vector contains string so pos_class should be explicitly specified
>>> cm.log_loss()
Traceback (most recent call last):
...
pycm.pycm_error.pycmVectorError: Actual vector contains string so pos_class should be explicitly specified
>>> matrix = [[1, 2, 3], [4, 6, 1], [1, 2, 3]]
>>> cm = ConfusionMatrix(matrix=matrix, classes=["L1", "L1", "L3", "L2"])
Traceback (most recent call last):
4 changes: 4 additions & 0 deletions Test/function_test.py
@@ -300,8 +300,12 @@
'None'
>>> brier_score_calc([1, 0], [0.8, 0.3, 0.2, 0.4], [1, 1, 0, 1], sample_weight=None, pos_class=None)
0.23249999999999998
>>> log_loss_calc([1, 0], [0.8, 0.3, 0.2, 0.4], [1, 1, 0, 1], sample_weight=None, pos_class=None)
0.6416376597071276
>>> brier_score_calc([1, "0"], [0.8, 0.3, 0.2, 0.4], [1, 1, 0, 1], sample_weight=None, pos_class=None)
'None'
>>> log_loss_calc([1, "0"], [0.8, 0.3, 0.2, 0.4], [1, 1, 0, 1], sample_weight=None, pos_class=None)
'None'
>>> vector_check([1, 2, 3, 0.4])
False
>>> vector_check([1, 2, 3,-2])
21 changes: 21 additions & 0 deletions Test/verified_test.py
@@ -378,6 +378,27 @@
>>> cm5 = ConfusionMatrix(y_true, np.array(y_prob) > 0.5, threshold=lambda x: 1) # Verified Case -- (https://bit.ly/3n8Uo7R)
>>> cm5.brier_score()
0.0
>>> y_true = np.array([0, 1, 1, 0])
>>> y_true_categorical = np.array(["spam", "ham", "ham", "spam"])
>>> y_prob = np.array([0.1, 0.9, 0.8, 0.35])
>>> cm1 = ConfusionMatrix(y_true, y_prob, threshold=lambda x: 1) # Verified Case -- (https://bit.ly/420uyVW)
>>> cm1.log_loss()
0.21616187468057912
>>> cm1.log_loss(pos_class=1)
0.21616187468057912
>>> cm2 = ConfusionMatrix(y_true, 1-y_prob, threshold=lambda x: 1) # Verified Case -- (https://bit.ly/420uyVW)
>>> cm2.log_loss(pos_class=0)
0.21616187468057912
>>> cm3 = ConfusionMatrix(y_true_categorical, y_prob, threshold=lambda x: "ham") # Verified Case -- (https://bit.ly/420uyVW)
>>> cm3.log_loss(pos_class="ham")
0.21616187468057912
>>> cm3.log_loss(pos_class="ham", normalize=False)
0.8646474987223165
>>> cm4 = ConfusionMatrix(y_true, y_prob, sample_weight=[2, 2, 3, 3], threshold=lambda x: 1) # Verified Case -- (https://bit.ly/420uyVW)
>>> cm4.log_loss()
0.2383221464851297
>>> cm4.log_loss(normalize=False)
2.383221464851297
>>> y1 = [1, 1, 0, 0, 0, 1]
>>> y2 = [1, 0, 1, 1, 0, 1]
>>> cm1 = ConfusionMatrix(y1, y2) # Verified Case -- (https://bit.ly/3OWrZ00)
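The `sample_weight` and `normalize` cases verified above (the `cm4` doctests) can be reproduced by hand from the formula; since `y_true` here is already 0/1 with positive class 1, no class mapping is needed. This is a sketch for illustration, not the pycm implementation:

```python
import math

y_true = [0, 1, 1, 0]
y_prob = [0.1, 0.9, 0.8, 0.35]   # estimated probability of the positive class (1)
weights = [2, 2, 3, 3]
w_sum = sum(weights)

# Weighted mean of the per-sample losses (normalize=True, the default):
mean_loss = sum(
    -(w / w_sum) * (y * math.log(p) + (1 - y) * math.log(1 - p))
    for y, p, w in zip(y_true, y_prob, weights)
)
print(mean_loss)           # ≈ 0.2383221464851297  (cm4.log_loss())

# normalize=False scales the weighted mean back up by the total weight:
print(mean_loss * w_sum)   # ≈ 2.383221464851297   (cm4.log_loss(normalize=False))
```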
28 changes: 25 additions & 3 deletions pycm/pycm_obj.py
@@ -6,7 +6,7 @@
from .pycm_handler import __obj_assign_handler__, __obj_file_handler__, __obj_matrix_handler__, __obj_vector_handler__, __obj_array_handler__
from .pycm_handler import __imbalancement_handler__
from .pycm_class_func import F_calc, IBA_calc, TI_calc, NB_calc, sensitivity_index_calc
from .pycm_overall_func import weighted_kappa_calc, weighted_alpha_calc, alpha2_calc, brier_score_calc
from .pycm_overall_func import weighted_kappa_calc, weighted_alpha_calc, alpha2_calc, brier_score_calc, log_loss_calc
from .pycm_distance import DistanceType, DISTANCE_MAPPER
from .pycm_output import *
from .pycm_util import *
@@ -912,16 +912,38 @@ def brier_score(self, pos_class=None):
:return: Brier score as float
"""
if self.prob_vector is None or not self.binary:
raise pycmVectorError(BRIER_SCORE_PROB_ERROR)
raise pycmVectorError(BRIER_LOG_LOSS_PROB_ERROR)
if pos_class is None and isinstance(self.classes[0], str):
raise pycmVectorError(BRIER_SCORE_CLASS_ERROR)
raise pycmVectorError(BRIER_LOG_LOSS_CLASS_ERROR)
return brier_score_calc(
self.classes,
self.prob_vector,
self.actual_vector,
self.weights,
pos_class)

def log_loss(self, normalize=True, pos_class=None):
"""
Calculate Log loss.
:param normalize: normalization flag
:type normalize: bool
:param pos_class: positive class name
:type pos_class: int/str
:return: Log loss as float
"""
if self.prob_vector is None or not self.binary:
raise pycmVectorError(BRIER_LOG_LOSS_PROB_ERROR)
if pos_class is None and isinstance(self.classes[0], str):
raise pycmVectorError(BRIER_LOG_LOSS_CLASS_ERROR)
return log_loss_calc(
self.classes,
self.prob_vector,
self.actual_vector,
normalize,
self.weights,
pos_class)

def position(self):
"""
Return indices of TP, FP, TN and FN in the predict_vector.
45 changes: 45 additions & 0 deletions pycm/pycm_overall_func.py
@@ -9,6 +9,51 @@
from .pycm_util import complement


def log_loss_calc(
classes,
prob_vector,
actual_vector,
normalize=True,
sample_weight=None,
pos_class=None):
"""
Calculate Log loss.
:param classes: confusion matrix classes
:type classes: list
:param prob_vector: probability vector
:type prob_vector: python list or numpy array
:param actual_vector: actual vector
:type actual_vector: python list or numpy array
:param normalize: normalization flag
:type normalize: bool
:param sample_weight: sample weights list
:type sample_weight: list
:param pos_class: positive class name
:type pos_class: int/str
:return: Log loss as float
"""
try:
vector_length = len(actual_vector)
if sample_weight is None:
sample_weight = [1] * vector_length
weight_sum = sum(sample_weight)
if pos_class is None:
pos_class = max(classes)
result = 0
for index, item in enumerate(actual_vector):
filtered_item = 0
if item == pos_class:
filtered_item = 1
result += -1 * (sample_weight[index] / weight_sum) * ((filtered_item * math.log(
prob_vector[index])) + (1 - filtered_item) * math.log(1 - prob_vector[index]))
if not normalize:
result = result * weight_sum
return result
except Exception:
return "None"


def brier_score_calc(
classes,
prob_vector,
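Note that `log_loss_calc` wraps the whole computation in a broad `try/except` and returns the string `"None"` on any failure. One concrete failure mode: a probability of exactly 0 or 1 on the wrong side of the positive class drives `math.log` to zero and raises. A minimal illustration (the helper `log_loss_term` is invented for the example):

```python
import math

def log_loss_term(y, p):
    # One per-sample term of the formula used by log_loss_calc.
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# A negative sample (y=0) predicted with probability exactly 1.0
# evaluates log(1 - 1.0) == log(0), which raises ValueError --
# this is what the broad `except` in log_loss_calc turns into "None".
try:
    log_loss_term(0, 1.0)
except ValueError as err:
    print("caught:", err)
```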
4 changes: 2 additions & 2 deletions pycm/pycm_param.py
@@ -106,9 +106,9 @@

AVERAGE_INVALID_ERROR = "Invalid parameter!"

BRIER_SCORE_CLASS_ERROR = "Actual vector contains string so pos_class should be explicitly specified"
BRIER_LOG_LOSS_CLASS_ERROR = "Actual vector contains string so pos_class should be explicitly specified"

BRIER_SCORE_PROB_ERROR = "This option only works in binary probability mode"
BRIER_LOG_LOSS_PROB_ERROR = "This option only works in binary probability mode"

CLASS_NUMBER_WARNING = "The confusion matrix is a high dimension matrix and won't be demonstrated properly.\n" \
"If confusion matrix has too many zeros (sparse matrix) you can set `sparse` flag to True in printing functions "\
