Timeout for huge UfedChats #2304

Closed
aberenguel opened this issue Aug 22, 2024 · 10 comments

@aberenguel
Contributor

I'm processing a UFDR that contains chats with ~50k messages.
This is triggering a timeout in UfedChatParser.

Since the UFED chat item stream has no byte size, the timeout doesn't take the size of the chat into account.

@aberenguel
Contributor Author

aberenguel commented Aug 23, 2024

In one example, a msgstore.db of 2,052 MB had 1,118,533 messages, which works out to about 545 messages per MB (disregarding other data in the database).
With timeoutPerMB = 2, that gives 2 seconds to process 545 messages, or ~4 seconds per 1000 messages.
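
The arithmetic above can be checked with a short sketch. The class and method names are hypothetical and are not part of IPED; `timeoutSecondsForChat` is only an illustration of how a per-message budget could be applied to a chat of known length, not how the referenced commit does it.

```java
import java.util.Locale;

class ChatTimeoutEstimate {

    // Average number of messages stored per MB of database.
    static double messagesPerMB(long messageCount, double databaseSizeMB) {
        return messageCount / databaseSizeMB;
    }

    // Seconds of timeout budget that 1000 messages would get,
    // given a budget of timeoutPerMB seconds per MB.
    static double secondsPerThousandMessages(long messageCount, double databaseSizeMB, double timeoutPerMB) {
        return 1000.0 * timeoutPerMB / messagesPerMB(messageCount, databaseSizeMB);
    }

    // Hypothetical helper: budget for a chat with a known message count, rounded up.
    static long timeoutSecondsForChat(int chatMessages, double secondsPerThousand) {
        return (long) Math.ceil(chatMessages / 1000.0 * secondsPerThousand);
    }

    public static void main(String[] args) {
        // Figures taken from the comment above.
        double perMB = messagesPerMB(1_118_533L, 2_052.0);
        double perThousand = secondsPerThousandMessages(1_118_533L, 2_052.0, 2.0);
        System.out.printf(Locale.ROOT, "%.0f messages/MB, %.2f s per 1000 messages%n", perMB, perThousand);
        System.out.println(timeoutSecondsForChat(50_000, perThousand) + " s for a 50k-message chat");
    }
}
```

The unrounded figure is slightly below 4 seconds per 1000 messages, so rounding up to 4 leaves a small safety margin.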

aberenguel added a commit to aberenguel/IPED that referenced this issue Aug 23, 2024
@lfcnassif
Member

lfcnassif commented Aug 23, 2024

Hi @aberenguel, thanks for reporting this and proposing a PR! I would like to suggest an alternative fix: just putting contentHandler.characters(""); in the parser loop that is taking the most time should reset the timeout counter. If it works, it would require far fewer code changes.

@lfcnassif lfcnassif added the bug label Aug 23, 2024
@aberenguel
Contributor Author

aberenguel commented Aug 23, 2024

Great! I think handler.characters() is already being called in ParsingTask.parseEmbedded (when adding chat HTML fragments).
Maybe the timeout is happening for another reason.

 [WARN]  [engine.task.AbstractTask]  Worker-32 TIMEOUT processing item04.ufdr/_DecodedData/Chat/Chat_39b3e7f2-6593-40dd-a39a-b387d5b3ab4e (null bytes)     iped.engine.io.TimeoutException

This specific chat has 3591 messages.

@aberenguel
Contributor Author

Is it possible to get a worker thread dump when a TIMEOUT occurs?

@lfcnassif
Member

Is it possible to get a worker thread dump when a TIMEOUT occurs?

I think it is possible using JMS.
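
For reference, one in-process way to capture such a dump is the JDK's `java.lang.management.ThreadMXBean`. The sketch below is an illustration only: it is not IPED code, it may not be the mechanism meant in the reply above, and the class name, method name, and the idea of filtering by a "Worker-" name prefix (taken from the log line earlier in this thread) are assumptions.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class ThreadDumpSketch {

    // Returns the stack traces of all live threads whose name starts with namePrefix.
    static String dumpThreads(String namePrefix) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // false, false: skip lock details, which not every JVM supports.
        for (ThreadInfo info : bean.dumpAllThreads(false, false)) {
            if (info == null || !info.getThreadName().startsWith(namePrefix)) {
                continue;
            }
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            // Print every frame; ThreadInfo.toString() truncates long stacks.
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("\tat ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // An empty prefix dumps every thread; "Worker-" would limit it to worker threads.
        System.out.print(dumpThreads(""));
    }
}
```

Calling something like `dumpThreads("Worker-")` at the point where the timeout is detected would show which frame the worker was stuck in.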

@aberenguel
Contributor Author

Hi @aberenguel, thanks for reporting this and proposing a PR! I would like to suggest an alternative fix: just putting contentHandler.characters(""); in the parser loop that is taking the most time should reset the timeout counter. If it works, it would require far fewer code changes.

@lfcnassif The contentHandler.characters suggestion worked, with a small modification as shown below. If I pass an empty string, the timeout counter is not reset.

char[] nameChars = (message.getName() + "\n").toCharArray();
handler.characters(nameChars, 0, nameChars.length);

It was caused by pre-processing messages in #2286.
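
The workaround above can be sketched as a self-contained loop. Everything here other than the two lines quoted above is an assumption: `CountingHandler` is a hypothetical stand-in for whatever handler wrapper tracks parser progress, the list of names stands in for the real message objects, and the sketch does not show how IPED actually resets its timeout counter.

```java
import java.util.List;

import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class ProgressSignalSketch {

    // Hypothetical stand-in for a handler that treats emitted characters as progress.
    static class CountingHandler extends DefaultHandler {
        long charsSeen = 0;

        @Override
        public void characters(char[] ch, int start, int length) {
            charsSeen += length;
        }
    }

    // Emits a non-empty chunk of text once per message inside the slow loop.
    static void processMessages(List<String> messageNames, ContentHandler handler) {
        for (String name : messageNames) {
            // ... expensive per-message pre-processing would happen here ...
            char[] nameChars = (name + "\n").toCharArray();
            try {
                handler.characters(nameChars, 0, nameChars.length);
            } catch (SAXException e) {
                // Wrapped only to keep the sketch short; a real parser would propagate it.
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        CountingHandler handler = new CountingHandler();
        processMessages(List.of("Alice", "Bob"), handler);
        System.out.println(handler.charsSeen + " characters emitted");
    }
}
```

Appending "\n" guarantees that at least one character is emitted per message, which matters given the observation that an empty string did not reset the counter.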

@aberenguel
Contributor Author

I think this issue can be closed.

@lfcnassif
Member

Thanks @aberenguel. So it doesn't happen in the last release or master branch, right?

@lfcnassif
Member

Closing, but I would appreciate an answer to the question above, thank you!

@aberenguel
Contributor Author

Thanks @aberenguel. So it doesn't happen in the last release or master branch, right?

I haven't seen that in master, only in the branch I'm working on. The contentHandler tip mentioned above solved the problem.
