Implementing Ad predictor #1642

slach31 · 2024-11-17T00:55:15Z

No description provided.

implemented the rls, no tests yet

comments

MaxHalford · 2024-11-17T16:01:22Z

river/base/adpredictor.py

+    beta (float, default=0.1):
+    A smoothing parameter that regulates the weight updates. Smaller values allow for finer updates,
+    while larger values can accelerate convergence but may risk instability.
+    prior_probability (float, default=0.5):
+    The initial estimate rate. This value sets the bias weight, influencing the model's predictions
+    before observing any data.
+
+    epsilon (float, default=0.1):
+    A variance dynamics parameter that controls how the model balances prior knowledge and learned information.
+    Larger values prioritize prior knowledge, while smaller values favor data-driven updates.
+
+    num_features (int, default=10):
+    The maximum number of features the model can handle. This parameter affects scalability and efficiency,
+    especially for high-dimensional data.


You need to follow the docstring syntax we use everywhere else in River. Take a look at the source code of another model for examples :)
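For illustration, here is a minimal sketch of how the parameters above might look in the NumPy-style layout, assuming that is the convention meant here (a `Parameters` section, names without inline types or defaults, indented descriptions). The class is a bare stand-in rather than the actual River implementation, and the descriptions are shortened from the diff.

```python
class AdPredictor:
    """Stand-in class that only illustrates the docstring layout.

    Parameters
    ----------
    beta
        Smoothing parameter that regulates the weight updates.
    prior_probability
        Initial estimate of the positive rate. It sets the bias weight.
    epsilon
        Variance dynamics parameter that balances prior knowledge and
        learned information.
    num_features
        Maximum number of features the model can handle.

    """

    def __init__(
        self,
        beta: float = 0.1,
        prior_probability: float = 0.5,
        epsilon: float = 0.1,
        num_features: int = 10,
    ):
        # Types and defaults live in the signature, not in the docstring.
        self.beta = beta
        self.prior_probability = prior_probability
        self.epsilon = epsilon
        self.num_features = num_features
```

Keeping types and defaults in the signature means they are stated only once.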

MaxHalford · 2024-11-17T16:02:13Z

river/base/adpredictor.py

+    adpredictor = AdPredictor(beta=0.1, prior_probability=0.5, epsilon=0.1, num_features=5)
+    data = [({"feature1": 1, "feature2": 1}, 1),({"feature1": 1, "feature3": 1}, 0),({"feature2": 1, "feature4": 1}, 1),({"feature1": 1, "feature2": 1, "feature3": 1}, 0),({"feature4": 1, "feature5": 1}, 1),]
+    def train_and_test(model, data):
+    for x, y in data:
+    pred_before = model.predict_one(x)
+        model.learn_one(x, y)
+        pred_after = model.predict_one(x)
+        print(f"Features: {x} | True label: {y} | Prediction before training: {pred_before:.4f} | Prediction after training: {pred_after:.4f}")
+
+    train_and_test(adpredictor, data)
+
+    Features: {'feature1': 1, 'feature2': 1} | True label: 1 | Prediction before training: 0.5000 | Prediction after training: 0.7230
+    Features: {'feature1': 1, 'feature3': 1} | True label: 0 | Prediction before training: 0.6065 | Prediction after training: 0.3650
+    Features: {'feature2': 1, 'feature4': 1} | True label: 1 | Prediction before training: 0.6065 | Prediction after training: 0.7761
+    Features: {'feature1': 1, 'feature2': 1, 'feature3': 1} | True label: 0 | Prediction before training: 0.5455 | Prediction after training: 0.3197
+    Features: {'feature4': 1, 'feature5': 1} | True label: 1 | Prediction before training: 0.5888 | Prediction after training: 0.7699


Same here: take a look at another model's source code for an example. You should be able to run the docstring test with pytest river/base/adpredictor.py
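As a rough sketch of what a testable docstring example looks like, the snippet below uses the standard `>>>` doctest layout on a toy model. `TinyRateModel` is hypothetical and is not AdPredictor; it only counts labels so that the expected outputs are easy to verify by hand. The examples are run directly with the standard library `doctest` module.

```python
import doctest


class TinyRateModel:
    """Hypothetical stand-in model, used only to illustrate the doctest layout.

    Examples
    --------
    >>> model = TinyRateModel()
    >>> model.predict_one({"feature1": 1})
    0.5
    >>> model.learn_one({"feature1": 1}, 1)
    >>> model.predict_one({"feature1": 1})
    1.0

    """

    def __init__(self):
        self.n_seen = 0
        self.n_positive = 0

    def learn_one(self, x, y):
        # Count labels; the features are ignored in this toy model.
        self.n_seen += 1
        self.n_positive += y

    def predict_one(self, x):
        # Fall back to 0.5 before any label has been observed.
        return self.n_positive / self.n_seen if self.n_seen else 0.5


# Parse the examples embedded in the docstring and run them.
test = doctest.DocTestParser().get_doctest(
    TinyRateModel.__doc__, {"TinyRateModel": TinyRateModel}, "TinyRateModel", None, 0
)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results)
```

Each `>>>` line is executed and the line below it is compared with the actual output, so the printed results in the diff above would become checked expectations instead of free text.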

MaxHalford · 2024-11-17T16:02:22Z

river/base/adpredictor.py

+
+    """
+
+    config = namedtuple("config", ["beta", "prior_probability", "epsilon", "num_features"])


What is this for?

MaxHalford · 2024-11-17T16:02:48Z

river/base/adpredictor.py

+    def prior_bias_weight(self):
+        # Calculate initial bias weight using prior probability
+
+        return np.log(self.prior_probability / (1 - self.prior_probability)) / self.beta


We prefer using Python's standard library. So here you'll have to use math.log. This is also the case for other parts of the code
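A minimal sketch of the same computation with `math.log`, written as a standalone function for illustration (in the diff it is a method reading `self.prior_probability` and `self.beta`):

```python
import math


def prior_bias_weight(prior_probability: float, beta: float) -> float:
    """Initial bias weight: the log-odds of the prior, scaled by 1 / beta."""
    return math.log(prior_probability / (1 - prior_probability)) / beta


# With the defaults from the diff (prior 0.5, beta 0.1) the log-odds are zero.
print(prior_bias_weight(0.5, 0.1))
```

Since the inputs are plain Python floats, `math.log` is a drop-in replacement for `np.log` here.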

MaxHalford · 2024-11-17T16:04:13Z

river/base/adpredictor.py

+            self.weights[feature]["mean"] = mean + mean_delta
+            self.weights[feature]["variance"] = variance * variance_multiplier


I think it's cleaner to have two dicts: one to hold the means and one to hold the variances
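One possible shape for this, as a sketch only: two `collections.defaultdict` instances whose defaults mirror the initial weight in the diff (mean 0.0, variance 1.0). The `apply_update` helper and the numbers passed to it are placeholders for illustration, not the actual AdPredictor update rule.

```python
import collections

# Defaults mirror the initial weight from the diff: mean 0.0, variance 1.0.
means = collections.defaultdict(float)
variances = collections.defaultdict(lambda: 1.0)


def apply_update(feature, mean_delta, variance_multiplier):
    """Apply one weight update to both dicts (placeholder inputs)."""
    means[feature] += mean_delta
    variances[feature] *= variance_multiplier


apply_update("feature1", mean_delta=0.25, variance_multiplier=0.5)
print(means["feature1"], variances["feature1"])
```

With separate dicts each lookup returns a plain float, and unseen features get their defaults without a nested `{"mean": ..., "variance": ...}` record.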

MaxHalford · 2024-11-17T16:04:30Z

river/base/adpredictor.py

+    def __str__(self):
+        # String representation of the model for easy identification
+        return "AdPredictor"


There is no need for this

MaxHalford · 2024-11-17T16:04:51Z

river/base/adpredictor.py

+    return {"mean": 0.0, "variance": 1.0}
+
+
+class AdPredictor(Classifier):


I guess this model can live in the linear_model module!

MaxHalford · 2024-11-17T16:05:26Z

river/base/adpredictor.py

@@ -0,0 +1,154 @@
+from __future__ import annotations
+
+from collections import defaultdict, namedtuple


We prefer to import the package, and then access its properties, instead of importing what we need

Suggested change

-from collections import defaultdict, namedtuple
+import collections


slach31 · 2024-11-26T19:00:18Z

Hello!
This pull request was closed so that we can open a fresh pull request containing the proposed fixes to AdPredictor. See the latest AdPredictor pull request for the new code.

W0lfgunbl00d and others added 14 commits November 5, 2024 14:49

v1test

6bf81df

implemented the rls, no tests yet

Update rls.py

6aa4869

comments

Added an v0 adpredictor

814fd8a

Added an v0 adpredictor

b309747

Merge branch 'online-ml:main' into main

069cee9

adpredictor algorithm

6a229d8

add adpredictor

e0e6c75

added an adpredictor function

ff8c617

remooved adpredictor here

67e7e14

fixed bugs

6f43ec8

fixed bugs

89cd67e

removed rls

54a94e3

Fix test pre commit

4a7bc49

Fixed imports

7311788

slach31 requested review from MaxHalford and smastelini as code owners November 17, 2024 00:55

MaxHalford reviewed Nov 17, 2024


Mo3ad-S and others added 4 commits November 26, 2024 19:47

adjusted the adpredictor algorithm

648b1a4

Merge branch 'ad_predict' of https://github.com/slach31/riverIDLIB in…

322b924

…to ad_predict

updated the rest of the project

dcb0f98

Merge branch 'online-ml:main' into ad_predict

2855d11

slach31 closed this Nov 26, 2024

Mo3ad-S mentioned this pull request Nov 26, 2024

Implementing ad predictor #1652

Open
