re-aggregation (a mapreduce approach) #4

Open
AltiMario opened this issue Dec 27, 2015 · 1 comment

Comments

@AltiMario
Collaborator

Currently, when you send a stream of messages to validate, they are processed by multiple threads, so the original order of the messages is not preserved during processing.
Usually this is not a problem, but in some cases it is. What happens if I have to validate the data of a CSV file where the sequence must be preserved?
The strategy adopted for the forecasting validation is to store the data in a db, "synchronizing" this piece of code, and at the end analyze the ordered data with the forecasting algorithm.
This solution has too much overhead.
For the full integration with SeerCore I need to aggregate the streams into a single file (because that is its standard input). This means that, if I want a multithreaded solution, I need to re-aggregate the file while preserving the index (as in a mapreduce technique).
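
To make the idea concrete, here is a minimal sketch in Python of what I mean by re-aggregating while preserving the index. It is only an illustration, not the project's code: `validate`, `validate_in_order` and `write_aggregated` are hypothetical names, and the real validation step and the SeerCore input format are assumptions. Each message is tagged with its position before being handed to a worker thread, and each result is written back into the slot with the same position, so the aggregated output keeps the original sequence without going through a db.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def validate(message: str) -> str:
    # Hypothetical stand-in for the real validation step (assumption).
    return message.strip().upper()


def validate_in_order(messages, max_workers=4):
    """Validate messages in parallel and return results in input order."""
    results = [None] * len(messages)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # "Map": remember the original index of every submitted message.
        futures = {pool.submit(validate, msg): i for i, msg in enumerate(messages)}
        # Workers may finish in any order.
        for future in as_completed(futures):
            # "Reduce": put each result back into its original slot.
            results[futures[future]] = future.result()
    return results


def write_aggregated(results, path):
    # Write the re-ordered results into one file, one message per line.
    with open(path, "w", encoding="utf-8") as out:
        for line in results:
            out.write(line + "\n")
```

For example, `validate_in_order([" a ", "b", "c "])` gives back the three validated messages in the same order they were sent, whichever thread finishes first. The trade-off is that all results are held in memory until the end; for a very large stream a chunked variant would probably be needed.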

@mastrogiovanni
Contributor

We need to discuss the architecture...
