Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

score_one method modifies anomaly.LocalOutlierFactor internal state unintentionally? #1331

Closed
MarekWadinger opened this issue Oct 3, 2023 · 4 comments

Comments

@MarekWadinger
Copy link
Contributor

Versions

river version: development
Python version: Python 3.10.1
Operating system: macOS Sonoma 14.0 (23A344)

Describe the bug

Hello 👋

I found out that score_one method of anomaly.LocalOutlierFactor changes internal state of the detector.

If this is not intentional I'd like to assist in resolving this issue. I tried to trace back the error and found out that the problem is connected to expand_objects function, which creates only a references to original variables. Following modification to lines 349-369 in anomaly.LocalOutlierFactor HERE, helped me to preserve the state over score_one calls.

        (
            nm,
            x_list_copy,
            neighborhoods,
            rev_neighborhoods,
            k_dist,
            reach_dist,
            dist_dict,
            local_reach,
            lof,
        ) = expand_objects(
            self.x_scores,
            x_list_copy,
            self.neighborhoods,
            self.rev_neighborhoods,
            self.k_dist,
            self.reach_dist,
            self.dist_dict,
            self.local_reach,
            self.lof,
        )

to

        import copy
        (
            nm,
            x_list_copy,
            neighborhoods,
            rev_neighborhoods,
            k_dist,
            reach_dist,
            dist_dict,
            local_reach,
            lof,
        ) = expand_objects(
            self.x_scores,
            x_list_copy,
            self.neighborhoods.copy(),
            self.rev_neighborhoods.copy(),
            self.k_dist.copy(),
            copy.deepcopy(self.reach_dist),
            copy.deepcopy(self.dist_dict),
            self.local_reach.copy(),
            self.lof.copy(),
        )

Please, let me know if that makes sense, and I'd be happy to elaborate on any related issues or comments.

Thank you 🙏

Steps/code to reproduce

from river import anomaly

lof = anomaly.LocalOutlierFactor()

X = [{"a": 1, "b": 1}, {"a": 1, "b": 1}]
for x in X:
    lof.learn_one(x)

print(
    lof.x_list,
    lof.neighborhoods,
    lof.rev_neighborhoods,
    lof.k_dist,
    lof.reach_dist,
    lof.dist_dict,
    lof.local_reach,
    lof.lof, sep="\n"
    )

lof.score_one({"a": 0.5, "b": 1})

print(
    lof.x_list,
    lof.neighborhoods,
    lof.rev_neighborhoods,
    lof.k_dist,
    lof.reach_dist,
    lof.dist_dict,
    lof.local_reach,
    lof.lof, sep="\n"
    )

returns

[{'a': 1, 'b': 1}, {'a': 1, 'b': 1}]
{0: [1], 1: [0]}
{0: [1], 1: [0]}
{0: 0.0, 1: 0.0}
{0: {1: 0.0}, 1: {0: 0.0}}
{0: {1: 0.0}, 1: {0: 0.0}}
{0: 0, 1: 0}
{0: 0, 1: 0}

[{'a': 1, 'b': 1}, {'a': 1, 'b': 1}]
{0: [1, 2], 1: [0, 2], 2: [0, 1]}
{0: [1, 2], 1: [0, 2], 2: [0, 1]}
{0: 0.5, 1: 0.5, 2: 0.5}
{0: {1: 0.5, 2: 0.5}, 1: {0: 0.5, 2: 0.5}, 2: {0: 0.5, 1: 0.5}}
{0: {1: 0.0, 2: 0.5}, 1: {0: 0.0, 2: 0.5}, 2: {0: 0.5, 1: 0.5}}
{0: 2.0, 1: 2.0, 2: 2.0}
{0: 1.0, 1: 1.0, 2: 1.0}
@MaxHalford
Copy link
Member

Good job spotting this! Indeed score_one should be stateless.

@smastelini
Copy link
Member

Can we close this?

@MaxHalford
Copy link
Member

Yep go for it :)

@smastelini
Copy link
Member

Closed via #1330.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants