Naïve Bayes

Denote the set of features by $X = \{x_1, \dots, x_d\}$, and the set of classes by $C$.

Given the full training data of size $N$, let $N_c$ be the number of samples that belong to class $c$. We estimate the frequency of each class by

$$\hat{P}(c) = \frac{N_c}{N}.$$
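As a minimal sketch of this estimate, the class frequencies can be computed by counting labels. The function name `class_priors` and the toy labels are illustrative assumptions, not part of the software.

```python
from collections import Counter


def class_priors(labels):
    """Estimate the frequency of each class as N_c / N."""
    n = len(labels)            # N: size of the training data
    counts = Counter(labels)   # N_c for each class c
    return {c: n_c / n for c, n_c in counts.items()}


# Toy labels (made up for illustration): one "spam" and three "ham" samples.
priors = class_priors(["spam", "ham", "ham", "ham"])
print(priors)
```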

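For a discrete feature $x_j$,

$$\hat{P}(x_j = v \mid c) = \frac{N_{c,j,v} + \alpha}{N_c + \alpha\, n_j},$$

where $N_{c,j,v}$ is the number of samples in class $c$ for which $x_j = v$, $N_c$ is the number of samples in class $c$, $\alpha$ is the smoothing prior and $n_j$ is the total number of values of $x_j$.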
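The smoothed estimate for a discrete feature can be sketched as follows. This is an illustration under assumptions: the helper name `discrete_likelihoods` is hypothetical, the total number of values of the feature is taken to be the number of distinct values seen in the training data, and the smoothing prior defaults to 1.0 only for the example.

```python
from collections import Counter


def discrete_likelihoods(values, labels, target_class, alpha=1.0):
    """Smoothed estimate of P(feature = v | class) for one discrete feature.

    values: observed value of the feature for each sample
    labels: class label of each sample
    alpha:  smoothing prior (1.0 here is an illustrative choice)
    """
    categories = sorted(set(values))   # assumed set of possible values
    in_class = [v for v, y in zip(values, labels) if y == target_class]
    counts = Counter(in_class)         # per-value counts within the class
    n_c = len(in_class)                # samples in the class
    n_j = len(categories)              # total number of values of the feature
    return {v: (counts[v] + alpha) / (n_c + alpha * n_j) for v in categories}


# Toy data: class "a" contains red, red, blue; "green" never occurs in "a".
values = ["red", "red", "blue", "green"]
labels = ["a", "a", "a", "b"]
probs = discrete_likelihoods(values, labels, "a", alpha=1.0)
print(probs)
```

Because of the smoothing term, a value never seen with a class (here "green" for class "a") still receives a small nonzero probability, and the estimates over all values sum to one.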

For a continuous feature $x_j$, the parameters of the class-conditional distribution $P(x_j \mid c)$ are estimated by maximum likelihood estimation.
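The text does not name the distribution used for continuous features. As a sketch only, assuming a Gaussian class-conditional distribution, the maximum likelihood estimates are the sample mean and the variance with divisor $N$ (not $N - 1$), computed over the samples of one class. The function names below are hypothetical.

```python
import math


def gaussian_mle(xs):
    """Maximum likelihood mean and variance of a Gaussian (assumed model)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n  # MLE divides by n, not n - 1
    return mean, var


def gaussian_log_pdf(x, mean, var):
    """Log density of a Gaussian with the given mean and variance at x."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


# Toy values of one continuous feature for the samples of a single class.
mean, var = gaussian_mle([1.0, 2.0, 3.0, 4.0])
print(mean, var, gaussian_log_pdf(2.0, mean, var))
```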

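For numerical stability, rather than multiplying the probabilities $P(c) \prod_j P(x_j \mid c)$ directly, you may instead compute the logarithm

$$\log P(c) + \sum_j \log P(x_j \mid c).$$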
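The following sketch illustrates why the log form helps: a product of many small probabilities underflows to zero in floating point, while the sum of their logarithms stays finite and still ranks the classes. The function `predict` and all numbers are made up for illustration.

```python
import math


def predict(log_priors, log_likelihoods_by_class):
    """Return the class with the largest log prior plus summed log likelihoods."""
    scores = {
        c: log_priors[c] + sum(log_likelihoods_by_class[c])
        for c in log_priors
    }
    return max(scores, key=scores.get), scores


# 400 features, each with a per-feature likelihood of 0.01 or 0.02.
lik_a = [0.01] * 400
lik_b = [0.02] * 400

# The direct products underflow to 0.0, so the classes cannot be compared.
product_a = math.prod(lik_a)
product_b = math.prod(lik_b)

# The log scores remain finite and comparable.
log_priors = {"a": math.log(0.5), "b": math.log(0.5)}
log_liks = {
    "a": [math.log(p) for p in lik_a],
    "b": [math.log(p) for p in lik_b],
}
label, scores = predict(log_priors, log_liks)
print(product_a, product_b, label)
```

Since the logarithm is monotonic, the class that maximizes the log score is the same class that would maximize the original product.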

Check out the documentation listed below to view the attributes that are available in sklearn but not exposed to the user in the software.

Further reading

sklearn tutorial on Naïve Bayes.

sklearn LogisticRegression documentation.
