Journey on how to fix the pipeline ⛏️ #689
@larbizard this is a dup of #473. As you can see from https://github.com/e-mission/e-mission-docs/blob/6a0c1ec9ddc31036d64dfc087458bfa46b94cecf/docs/dev/archi/pipeline_details.md#curr_run_ts---while-processing-pipeline, curr_run_ts essentially acts as a lock for a particular (user, stage) combination. If two processes try to operate on the same stage for the same user, the process that enters the stage first grabs the lock; the process that arrives second cannot grab the lock and skips the stage. This means that you either have two copies of the pipeline running at the same time, or a previous run was interrupted and left the lock set.
You can fix it by resetting the pipeline.
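As an illustration, here is a hedged way to see which (user, stage) combinations are currently holding that lock directly in MongoDB; the Stage_database and Stage_pipeline_state names are assumptions based on the e-mission docs, not confirmed in this thread:

# Database and collection names are assumptions; any document returned here has a
# non-null curr_run_ts, i.e. a stage that is locked (or was left locked by a crashed run).
mongo Stage_database --eval 'db.Stage_pipeline_state.find({ curr_run_ts: { $ne: null } }).forEach(printjson)'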
Since last May 15 the pipeline started crashing again.
Was the pipeline analysing data for your account? Neither me nor Raouf have analysed data; we have only had the Draft icon on our phones since then, and it is the same for many users.
We also noticed in the logs that there are pipeline errors for some users. These users have probably not used the app since our last update; they may even have deleted it. Shall we purge them?
We are thinking of executing the pipeline only for specific users (known active users who installed the new version after April 29) using bin/debug/intake_single_user.py. We tried to execute the pipeline for Raouf and it passed, but nothing was updated:
(emission) root@e-mission-server-fabmob-qc:~/e-mission-server# ./e-mission-py.bash bin/debug/intake_single_user.py -u cbeb55e5969943a6911bb44b283b45d1
We reset again and we had the following error:
(emission) root@e-mission-server-fabmob-qc:~/e-mission-server# ./e-mission-py.bash bin/debug/intake_single_user.py -u cbeb55e5969943a6911bb44b283b45d1
We tried with my account as well (reset and execute the pipeline); there were no errors, but nothing changed and I still have Draft trips.
First, the curr_run_ts assertion means that either your reset failed, or you are running two copies of the pipeline at the same time.
Second, the other error is similar: we expect there to be one inference per section and we found two.
Regarding this, can you send me the logs from executing your pipeline? If you executed the pipeline using …
Hi @shankari, I did a pipeline reset using:
It seemed to work correctly, with no errors. Then I used the following script to start the pipeline for my user:
Unfortunately, at the last step (UUID aee1cab9-8e29-4ddd-a677-26dea5abd8a3: storing views to cache) we got an error. Since the log file is too big, I pasted it in this Google Doc file 👍 Thank you for your help, Larbi
tl;dr: You are running …
Details: the backtrace is …
So it looks like there are two inferred sections for the cleaned section. The erroneous trip appears to be …
And the related section seems to be: …
Looking at the …
It doesn't look like …
@shankari How shall I use the reset_pipeline script to fix the issue? Without the date?
Also, I saw that you noticed duplicates ("aka duplicate entries from 2022-05-15"). I have twice restored the machine I am working on from a snapshot taken a week earlier. Is it possible that I created the duplicates? Does the app sync all the missing data or just partial data?
Seems unlikely, given that we don't see duplicates anywhere else.
The duplicates are not in the app-sensed data, but in the analysis results.
As an aside, I note that all the pipeline timestamps are from May 14th. Did you stop collecting data then, or restore a backup from May, or …? Just trying to see if there is a reason why you don't have more recent data.
We did not stop collecting data. However, after May 14th we noticed that the data stopped being analysed for Raouf and me. At that time we tried to reset our respective pipelines and restart the single-user intake analysis, unsuccessfully.
I restored a backup a month ago, on June the 15th (backups are done on Mondays at 10 pm, so probably a backup from the Monday before, June the 13th). Yesterday …
Is there a way to reset all analysis and restart it for all users?
Also, when I start a pipeline using the single-user intake script, it takes about 30 to 40 minutes to execute, and meanwhile a scheduled bin/intake_multiprocess.py 3 run is executed. That may be what causes the first pipeline to crash.
I wonder if this is related to the incorrect call for …
Ah yes, you do not want to execute two pipelines in parallel. I would suggest pulling your data to your laptop and running the pipeline there, to ensure that it works and to debug any issues. Then, on production, just reset the pipeline and don't run the …
As you can see, reset all analysis is the …
I turned off the crontab that executes the multiprocess.
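For anyone following along, a sketch of what that looks like; the schedule line below is an assumption, not copied from this deployment:

# Show the root crontab to find the scheduled pipeline run.
crontab -l
# While debugging single-user runs, comment out (or remove) the multiprocess entry,
# which might look something like this (schedule and paths are assumptions):
#   0 * * * * cd /root/e-mission-server && ./e-mission-py.bash bin/intake_multiprocess.py 3
# Re-enable it with `crontab -e` once the single-user pipeline runs cleanly.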
I used: e-mission-py.bash bin/purge_analysis_database.py
According to the code, the analysis results, common places, common trips and pipeline states are removed.
Then I executed: e-mission-py.bash bin/debug/intake_single_user.py -u aeelcab98e294ddda67726dea5abd8a3
I didn't output the logs to a file; I will execute the script again and send them.
I searched for the last synced data on the server for my user, and the last data is from May 15th; however, other users had their data synced correctly yesterday.
My data: …
All other users: …
I had a look at my loggerDB file from my phone, filtered by "Sync", and found the following error:
ServerSyncAdapter: 10 Error java.io.IOException while posting converted trips to JSON
I sent you by email a link to my loggerDB (1.4 GB).
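For reference, one hedged way to check the most recent synced data from the mongo shell; the database, collection, and field names (Stage_database, Stage_timeseries, metadata.write_ts) are assumptions based on the e-mission schema and may differ on this server:

# Print the newest raw entry across all users (the most recent server write).
# Filtering by a single user_id would need the UUID in its stored binary form.
mongo Stage_database --eval 'db.Stage_timeseries.find({}, {"metadata.write_ts": 1, "user_id": 1}).sort({"metadata.write_ts": -1}).limit(1).forEach(printjson)'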
You don't need to; the output is automatically logged to …
That seems to explain why all your recent trips are "draft". They were never sent to the server.
This is pretty straightforward. You have not bumped up the message size on nginx. From your logs: …
Hi @shankari, thank you for your help. I was able to execute the pipeline after increasing the memory to 4 GB 👍. Thank you, Larbi. This issue can be closed ✔️

Journey on how to fix the pipeline ⛏️
I was able to fix my issue of the pipeline not executing. ✔️ The discussion has diverged a little bit from the title of the issue.

How to fix the assertion error in pipeline_queries.py (AssertionError: curr_state.curr_run_ts = 1635335308.6691191)
I fixed the assertion error by using the reset pipeline script:
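For reference, assuming the reset script lives at bin/reset_pipeline.py (an assumption, not confirmed in this thread), its options can be listed rather than guessed:

# Path is an assumption; list the script's flags instead of guessing them.
./e-mission-py.bash bin/reset_pipeline.py --help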
How to fix missing data on the server
The data was not synced between the users' phones and the server. The main cause was the server's Nginx configuration not allowing large files. After 2 months of not synchronizing, due to another problem explained in the next section (Last trip end was at -1.0), the accumulated data to be synced grew to around 1 GB. The fix was to increase the allowed message size in the Nginx configuration file:
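As a rough sketch, the relevant nginx directive is client_max_body_size, which defaults to 1 MB; the 100m value and the conf.d path below are assumptions, not the exact values used on this server:

# Allow large sync payloads from the phones instead of rejecting them with
# "413 Request Entity Too Large" (value and file location are assumptions).
echo 'client_max_body_size 100m;' | sudo tee /etc/nginx/conf.d/e-mission-upload.conf
# Validate the configuration and reload nginx so the new limit takes effect.
sudo nginx -t && sudo nginx -s reload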
Data not synced: root issue. The last trip didn't end correctly
BuiltinUserCache: Last trip end was at -1.0
If the data is not synced with the server, check the loggerDB file in the mobile app.
If you find the error above, the solution is: …
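For reference, a hedged way to search an exported loggerDB for sync errors from a laptop, assuming the export is a standard SQLite file; the filter strings simply mirror the messages mentioned above:

# List the tables without assuming the schema, then grep a full dump for sync-related messages.
sqlite3 loggerDB ".tables"
sqlite3 loggerDB ".dump" | grep -i -E "ServerSyncAdapter|IOException|Last trip end was at"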
Failing pipeline due to lack of memory: Out Of Memory (OOM) 🔢
./e-mission-py.bash: line 8: 3578 Killed PYTHONPATH=. python "$@"
If you execute the pipeline and see this, it is probably because you have an Out Of Memory (OOM) issue and the process was killed. You will need to increase your memory. I used 4 GB to analyse 2 months' worth of data.
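A hedged way to confirm that the kernel OOM killer was responsible and to check how much memory is available (standard Linux commands, nothing e-mission specific):

# Look for OOM-killer messages around the time the pipeline process died.
dmesg -T | grep -i -E "out of memory|killed process" | tail -n 20
# Check how much memory (and swap) the machine actually has.
free -h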
Thanks @lgharib for the detailed explanation. Changed the subject and closing this issue now.
Original issue description:

Steps to reproduce:
Execute analysis pipeline using /e-mission-py.bash bin/debug/intake_single_user.py -e "[email protected]"
Expected result:
The pipeline executes correctly and the data is analyzed and stored in the Staging_analysis_timeseries collection.
Actual result:
(emission) root@e-mission-server-fabmob-qc:~/e-mission-server# ./e-mission-py.bash bin/debug/intake_single_user.py -e "[email protected]"
Connecting to database URL db
google maps key not configured, falling back to nominatim
nominatim not configured either, place decoding must happen on the client
transit stops query not configured, falling back to default
expectations.conf.json not configured, falling back to sample, default configuration
ERROR:root:habitica not configured, game functions not supported
Traceback (most recent call last):
File "/root/e-mission-server/emission/net/ext_service/habitica/proxy.py", line 22, in
key_file = open('conf/net/ext_service/habitica.json')
FileNotFoundError: [Errno 2] No such file or directory: 'conf/net/ext_service/habitica.json'
2021-12-07T11:19:40.559267+00:00UUID f5e92e19-39fa-4598-ad66-ebfc47cb9e34: moving to long term
2021-12-07T11:19:40.600720+00:00UUID f5e92e19-39fa-4598-ad66-ebfc47cb9e34: updating incoming user inputs
2021-12-07T11:19:40.886939+00:00UUID f5e92e19-39fa-4598-ad66-ebfc47cb9e34: filter accuracy if needed
2021-12-07T11:19:41.452110+00:00UUID f5e92e19-39fa-4598-ad66-ebfc47cb9e34: segmenting into trips
Found error curr_state.curr_run_ts = 1635335308.6691191 while processing pipeline for user f5e92e19-39fa-4598-ad66-ebfc47cb9e34, skipping
Traceback (most recent call last):
File "/root/e-mission-server/emission/pipeline/intake_stage.py", line 73, in run_intake_pipeline
run_intake_pipeline_for_user(uuid)
File "/root/e-mission-server/emission/pipeline/intake_stage.py", line 122, in run_intake_pipeline_for_user
eaist.segment_current_trips(uuid)
File "/root/e-mission-server/emission/analysis/intake/segmentation/trip_segmentation.py", line 52, in segment_current_trips
time_query = epq.get_time_range_for_segmentation(user_id)
File "/root/e-mission-server/emission/storage/pipeline_queries.py", line 55, in get_time_range_for_segmentation
return get_time_range_for_stage(user_id, ps.PipelineStages.TRIP_SEGMENTATION)
File "/root/e-mission-server/emission/storage/pipeline_queries.py", line 308, in get_time_range_for_stage
assert curr_state.curr_run_ts is None, "curr_state.curr_run_ts = %s" % curr_state.curr_run_ts
AssertionError: curr_state.curr_run_ts = 1635335308.6691191