Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementig Ad predictor #1642

Closed
wants to merge 18 commits into from
Closed

Implementig Ad predictor #1642

wants to merge 18 commits into from

Conversation

slach31
Copy link

@slach31 slach31 commented Nov 17, 2024

No description provided.

Comment on lines 20 to 33
beta (float, default=0.1):
A smoothing parameter that regulates the weight updates. Smaller values allow for finer updates,
while larger values can accelerate convergence but may risk instability.
prior_probability (float, default=0.5):
The initial estimate rate. This value sets the bias weight, influencing the model's predictions
before observing any data.

epsilon (float, default=0.1):
A variance dynamics parameter that controls how the model balances prior knowledge and learned information.
Larger values prioritize prior knowledge, while smaller values favor data-driven updates.

num_features (int, default=10):
The maximum number of features the model can handle. This parameter affects scalability and efficiency,
especially for high-dimensional data.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to follow the docstring syntax we use everywhere else in River. Take a look at the source code of another model for examples :)

Comment on lines 49 to 64
adpredictor = AdPredictor(beta=0.1, prior_probability=0.5, epsilon=0.1, num_features=5)
data = [({"feature1": 1, "feature2": 1}, 1),({"feature1": 1, "feature3": 1}, 0),({"feature2": 1, "feature4": 1}, 1),({"feature1": 1, "feature2": 1, "feature3": 1}, 0),({"feature4": 1, "feature5": 1}, 1),]
def train_and_test(model, data):
for x, y in data:
pred_before = model.predict_one(x)
model.learn_one(x, y)
pred_after = model.predict_one(x)
print(f"Features: {x} | True label: {y} | Prediction before training: {pred_before:.4f} | Prediction after training: {pred_after:.4f}")

train_and_test(adpredictor, data)

Features: {'feature1': 1, 'feature2': 1} | True label: 1 | Prediction before training: 0.5000 | Prediction after training: 0.7230
Features: {'feature1': 1, 'feature3': 1} | True label: 0 | Prediction before training: 0.6065 | Prediction after training: 0.3650
Features: {'feature2': 1, 'feature4': 1} | True label: 1 | Prediction before training: 0.6065 | Prediction after training: 0.7761
Features: {'feature1': 1, 'feature2': 1, 'feature3': 1} | True label: 0 | Prediction before training: 0.5455 | Prediction after training: 0.3197
Features: {'feature4': 1, 'feature5': 1} | True label: 1 | Prediction before training: 0.5888 | Prediction after training: 0.7699
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same: take a look at another model source code for an example. You should be able to run the docstring test with pytest river/base/adpredictor.py


"""

config = namedtuple("config", ["beta", "prior_probability", "epsilon", "num_features"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this for?

def prior_bias_weight(self):
# Calculate initial bias weight using prior probability

return np.log(self.prior_probability / (1 - self.prior_probability)) / self.beta
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We prefer using Python's standard library. So here you'll have to use math.log. This is also the case for other parts of the code

Comment on lines 119 to 120
self.weights[feature]["mean"] = mean + mean_delta
self.weights[feature]["variance"] = variance * variance_multiplier
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's cleaner to have two dicts: one to hold the means and how to hold the variances

Comment on lines 152 to 154
def __str__(self):
# String representation of the model for easy identification
return "AdPredictor"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need for this

return {"mean": 0.0, "variance": 1.0}


class AdPredictor(Classifier):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this model can live in the linear_model module!

@@ -0,0 +1,154 @@
from __future__ import annotations

from collections import defaultdict, namedtuple
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We prefer to import the package, and then access its properties, instead of importing what we need

Suggested change
from collections import defaultdict, namedtuple
import collections

@slach31 slach31 closed this Nov 26, 2024
@slach31
Copy link
Author

slach31 commented Nov 26, 2024

Hello!
This pull request was closed so that we can open a fresh pull request containing the proposed fixes to AdPredictor. See the latest Adpredictor pull request for the new code

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants