[Bug]: UDF_CACHING persistence mode persists input if persistent_id is set. #59

Open
KamilPiechowiak opened this issue Jun 12, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@KamilPiechowiak
Contributor

Steps to reproduce

This code persists the input, and I am not sure whether it should. Note that persistence_mode is set to UDF_CACHING:

import pathway as pw


class InSchema(pw.Schema):
    a: int
    b: int


t = pw.io.csv.read("a.csv", persistent_id="abc", schema=InSchema, mode="static")

persistence_backend = pw.persistence.Backend.filesystem("./xyz")
persistence_config = pw.persistence.Config.simple_config(
    persistence_backend,
    persistence_mode=pw.PersistenceMode.UDF_CACHING,
)
pw.debug.compute_and_print_update_stream(t, persistence_config=persistence_config)

If you run the code twice, you'll see that the values are read from persistence on the second run.

Relevant log output

First run:
            | a | b | __time__      | __diff__
^31NXFBM... | 1 | 3 | 1718180081298 | 1
^TC3B0CF... | 2 | 4 | 1718180081298 | 1
^VH8R9JC... | 3 | 5 | 1718180081298 | 1


Second run:
            | a | b | __time__ | __diff__
^31NXFBM... | 1 | 3 | 0        | 1
^TC3B0CF... | 2 | 4 | 0        | 1
^VH8R9JC... | 3 | 5 | 0        | 1
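
One way to compare the two runs mechanically is to look at the __time__ column: in the logs above it holds a large timestamp on the first run and 0 on the second. The following is a minimal sketch, not part of Pathway, that parses the printed table and checks whether every row has __time__ equal to 0. It assumes that __time__ == 0 is what marks rows restored from persistence, which is an inference from these two logs only. The helper names `parse_update_stream` and `all_rows_at_time_zero` are made up for illustration.

```python
# Hypothetical helpers (not Pathway API) for comparing the two printed runs.

def parse_update_stream(text):
    """Parse the pipe-separated table printed in the logs into a list of dicts."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = [cell.strip() for cell in lines[0].split("|")]
    header[0] = "id"  # the first column (row id) has an empty header
    return [
        dict(zip(header, (cell.strip() for cell in ln.split("|"))))
        for ln in lines[1:]
    ]


def all_rows_at_time_zero(text):
    """Return True if every row has __time__ == 0.

    Assumption: __time__ == 0 indicates rows replayed from persistence.
    """
    return all(int(row["__time__"]) == 0 for row in parse_update_stream(text))


# The two outputs copied from the logs above.
first_run = """
            | a | b | __time__      | __diff__
^31NXFBM... | 1 | 3 | 1718180081298 | 1
^TC3B0CF... | 2 | 4 | 1718180081298 | 1
^VH8R9JC... | 3 | 5 | 1718180081298 | 1
"""

second_run = """
            | a | b | __time__ | __diff__
^31NXFBM... | 1 | 3 | 0        | 1
^TC3B0CF... | 2 | 4 | 0        | 1
^VH8R9JC... | 3 | 5 | 0        | 1
"""

print(all_rows_at_time_zero(first_run))
print(all_rows_at_time_zero(second_run))
```

The same check could be dropped into a regression test once the intended behavior of UDF_CACHING is decided.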

What did you expect to happen?

Either UDF_CACHING mode should not persist the input even if persistent_id is set, or an error should be raised saying that persistent_id is set in UDF_CACHING mode.
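
To make the second option concrete, here is a minimal sketch of such a check. It is not Pathway code: `Mode` is a hypothetical stand-in for pw.PersistenceMode with only two members, and `check_persistent_ids` is an invented helper. Where such a validation would actually live inside Pathway is not something this issue establishes.

```python
from enum import Enum, auto


class Mode(Enum):
    """Hypothetical stand-in for pw.PersistenceMode (only two members shown)."""
    PERSISTING = auto()
    UDF_CACHING = auto()


def check_persistent_ids(mode, persistent_ids):
    """Raise if any connector sets persistent_id while the mode is UDF_CACHING.

    `persistent_ids` holds one entry per input connector; None means
    the connector did not set persistent_id.
    """
    offending = sorted(pid for pid in persistent_ids if pid is not None)
    if mode is Mode.UDF_CACHING and offending:
        raise ValueError(
            "persistent_id is set on input connector(s) "
            f"{offending} but persistence_mode is UDF_CACHING"
        )


# The configuration from the reproduction above would be rejected:
try:
    check_persistent_ids(Mode.UDF_CACHING, ["abc"])
except ValueError as exc:
    print(exc)
```

Raising an error makes the conflict visible at startup. The first option (silently ignoring persistent_id) would instead keep existing scripts running unchanged.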

Version

0.12.0

Docker Versions (if used)

No response

OS

Linux

On which CPU architecture did you run Pathway?

None

@KamilPiechowiak KamilPiechowiak added the bug Something isn't working label Jun 12, 2024
@embe-pw
Member

embe-pw commented Jun 13, 2024

In general, persistence_mode is not documented well enough.
I agree that it is confusing that enabling UDF caching also enables the rest of the persistence mechanisms.
