[Bug]: vectorizer is not restarted after error #335

Open
dberardo-com opened this issue Jan 2, 2025 · 4 comments
Labels
bug (Something isn't working), community, pgai

Comments

@dberardo-com

What happened?

The vectorizer failed to find the Ollama container on its first attempt and raised an error; after that, the job was never retried.

pgai extension affected

No response

pgai library affected

No response

PostgreSQL version used

see readme

What operating system did you use?

docker

What installation method did you use?

Docker

What platform did you run on?

Other

Relevant log output and stack trace

No response

How can we reproduce the bug?

see desc

Are you going to work on the bugfix?

None

@dberardo-com added the bug, community, and pgai labels on Jan 2, 2025
@dberardo-com
Author

I have figured out that the vectorizer worker does not retry failed jobs immediately; it waits for its regular 5-minute interval before retrying them.

I assume this is the desired behavior?

A different question: can a single worker pick up multiple jobs in parallel? Right now I have 2 very long-running jobs that a single worker never executes in parallel (it was started with the default -c 4 argument from the Docker image CMD), so I have to start 2 worker containers to get parallel execution.

@alejandrodnm
Contributor

The worker only executes concurrently for the same vectorizer. The -c 4 argument will spawn 4 concurrent tasks to work on a vectorizer queue. Vectorizer queues are processed sequentially.

If you want to work on both vectorizers at the same time, then the solution is what you did: run multiple workers, and use the -i flag to specify which vectorizers each worker should process:

https://github.com/timescale/pgai/blob/main/docs/vectorizer.md#monitor-a-vectorizer
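To make the concurrency model above concrete, here is a minimal asyncio sketch. It is not pgai's actual code: `process_queue`, `run_worker`, and the sample data are hypothetical names invented for illustration. It only models the described behaviour, where the concurrency setting fans out tasks inside one vectorizer's queue while the vectorizers themselves are handled one after another.

```python
import asyncio


async def process_queue(name, items, concurrency):
    """Drain one vectorizer's queue using `concurrency` tasks (hypothetical model)."""
    queue = asyncio.Queue()
    for item in items:
        queue.put_nowait(item)
    done = []

    async def task():
        # Each task keeps pulling items until the queue is empty.
        while True:
            try:
                item = queue.get_nowait()
            except asyncio.QueueEmpty:
                return
            await asyncio.sleep(0)  # stand-in for the real embedding work
            done.append((name, item))

    await asyncio.gather(*(task() for _ in range(concurrency)))
    return done


async def run_worker(vectorizers, concurrency=4):
    """Handle vectorizers sequentially; concurrency applies only within each one."""
    processed = []
    for name, items in vectorizers.items():
        processed.extend(await process_queue(name, items, concurrency))
    return processed


def run_example():
    # Two vectorizers with a few queued items each (made-up data).
    return asyncio.run(run_worker({"v1": [1, 2, 3], "v2": [4, 5]}))
```

In this model every item of `v1` finishes before any item of `v2` starts, which is why two long-running vectorizers need two worker processes to make progress at the same time.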

@dberardo-com
Author

OK. If a vectorizer encounters an error, it currently waits 5 minutes before retrying. Is this the desired behavior?

@JamesGuthrie
Member

OK. If a vectorizer encounters an error, it currently waits 5 minutes before retrying. Is this the desired behavior?

Yes, but TBH we haven't thought too hard about this, so we could change it if there's a strong motivation for it.

The vectorizer worker processes vectorizers every "poll interval" (configurable through the --poll-interval argument). The default poll interval is 5 minutes. This is the mechanism that causes the retry behaviour that you see - it's not an explicit "retry on error".
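A rough sketch of how a plain poll loop produces this implicit retry, assuming the worker logs a failure and moves on. This is not the worker's real implementation: `poll_loop`, `process`, `max_polls`, and the injectable `sleep` are hypothetical names used only for illustration.

```python
import time


def poll_loop(vectorizers, process, poll_interval=300.0, max_polls=None, sleep=time.sleep):
    """Visit every vectorizer once per poll; a failure just waits for the next poll."""
    polls = 0
    while max_polls is None or polls < max_polls:
        for vectorizer_id in vectorizers:
            try:
                process(vectorizer_id)
            except Exception as exc:
                # No explicit retry here: the pending work is picked up
                # again the next time the loop comes around.
                print(f"vectorizer {vectorizer_id} failed: {exc}")
        polls += 1
        if max_polls is None or polls < max_polls:
            sleep(poll_interval)  # default of 300 s matches the 5-minute interval
```

With a 5-minute interval, a vectorizer that fails because Ollama is not reachable yet is simply attempted again on the next pass, which matches the delay reported above. Lowering the poll interval shortens that delay at the cost of more frequent polling.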

We could additionally build in some "retry on error" behaviour (with a configurable retry count, and backoff behaviour), but didn't see the need for it yet.
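For reference, an explicit "retry on error" with a configurable retry count and exponential backoff could look roughly like the sketch below. Nothing like this exists in the worker according to this thread; `retry_with_backoff` and all of its parameters are hypothetical.

```python
import time


def retry_with_backoff(operation, max_retries=3, base_delay=1.0, factor=2.0, sleep=time.sleep):
    """Call `operation`, retrying up to `max_retries` times with exponential backoff."""
    delay = base_delay
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error to the caller
            sleep(delay)
            delay *= factor  # wait longer before each subsequent attempt
```

A caller would wrap the failing step, for example the call to the embedding provider, so that a short outage is retried within seconds instead of waiting for the next poll.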


3 participants