Develop model to label replacement modes when not filled out by user #783

Open
zackAemmer opened this issue Aug 25, 2022 · 2 comments

zackAemmer commented Aug 25, 2022

TODO:

  • Test models in notebook to come up with some reasonably accurate solutions.
  • Add a new model class similar to "TripModel" which expects aggregate data (instead of individual user ids).
  • Implement the aggregate model in the data processing pipeline, saving the model parameters periodically.
  • Use the model to periodically fill in replacement labels for users who do not label them in their trips.
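A minimal sketch of how the last three items could fit together, assuming a class that trains on trips pooled across all users, serializes its parameters, and fills in only the missing labels. Everything here is hypothetical and for illustration only: the class name `AggregateReplacementModel`, the `mode` / `replaced_mode` / `replaced_mode_inferred` keys, and the JSON storage format are not taken from "TripModel" or from e-mission-server. The per-mode majority vote is a stand-in estimator, not one of the models under evaluation.

```python
import json
from collections import Counter, defaultdict


class AggregateReplacementModel:
    """Hypothetical sketch: predicts a replacement mode from pooled trips.

    The estimator is a per-mode majority vote, used only as a placeholder
    for whichever classifier is eventually chosen.
    """

    def __init__(self):
        self.by_mode = {}     # travel mode -> most common replacement mode
        self.fallback = None  # most common replacement mode overall

    def fit(self, trips):
        # trips: iterable of dicts pooled across all users (no user ids needed)
        counts = defaultdict(Counter)
        overall = Counter()
        for trip in trips:
            replaced = trip.get("replaced_mode")
            if replaced is None:
                continue  # only user-labeled trips contribute to training
            counts[trip["mode"]][replaced] += 1
            overall[replaced] += 1
        if not overall:
            raise ValueError("no labeled trips to fit on")
        self.by_mode = {m: c.most_common(1)[0][0] for m, c in counts.items()}
        self.fallback = overall.most_common(1)[0][0]
        return self

    def predict(self, trip):
        # Unseen travel modes fall back to the overall majority label.
        return self.by_mode.get(trip["mode"], self.fallback)

    def fill_missing(self, trips):
        # Returns copies; labels provided by users are never overwritten.
        filled = []
        for trip in trips:
            out = dict(trip)
            if out.get("replaced_mode") is None:
                out["replaced_mode"] = self.predict(out)
                out["replaced_mode_inferred"] = True
            filled.append(out)
        return filled

    def to_json(self):
        # Parameters as a JSON string, so they can be saved periodically.
        return json.dumps({"by_mode": self.by_mode, "fallback": self.fallback})

    @classmethod
    def from_json(cls, text):
        state = json.loads(text)
        model = cls()
        model.by_mode = state["by_mode"]
        model.fallback = state["fallback"]
        return model
```

In a pipeline run, this would roughly amount to calling `fit` on all labeled trips, storing `to_json()`, and later calling `from_json(...)` followed by `fill_missing(...)` on new trips. Flagging inferred labels separately keeps them distinguishable from user input.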

Model Storage:
e-mission/e-mission-server#874

Model Structure:
e-mission/e-mission-server#852


zackAemmer commented Aug 25, 2022

So far I have tested Random Forest, Multinomial/Mixed Logit (MNL/MXL), and Gradient Boosted Decision Tree (GBDT) models for replacement labeling on all labeled data:
https://github.com/zackAemmer/em-public-dashboard/blob/trb-analysis/viz_scripts/biogeme_test.ipynb

Performance is about the same for the MNL/MXL and the Gradient Boosted Decision Tree models: ~75% accuracy across all classes and a ~71% F1 score when weighted by class support. Given how simple the scikit-learn models (GBDT/RF) are to implement, it makes the most sense to use those in the pipeline, while keeping things modular enough to plug in other models such as the MNL/MXL in the future.
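For reference, the two metrics quoted above can be written out as follows. This is a plain-Python sketch rather than the notebook's code, and the function names `accuracy` and `weighted_f1` are illustrative. The weighted variant computes a one-vs-rest F1 per class and averages those scores weighted by each class's support (its count among the true labels), which is the same definition scikit-learn uses for `f1_score(..., average="weighted")`.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    # Fraction of trips whose predicted label matches the true label.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    # Per-class F1, averaged with weights equal to each class's support.
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += n * f1
    return total / len(y_true)
```

Because rare replacement modes carry little weight in this average, a model can score reasonably on weighted F1 while still predicting them poorly, so per-class scores are worth checking alongside it.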

This notebook also contains many "replacement" visualizations that were used in the TRB paper. These may be useful for the dashboard when it is configured in "program" mode (not yet implemented, see #781).

@zackAemmer zackAemmer moved this to Current week sprint in OpenPATH Tasks Overview Sep 1, 2022
@zackAemmer zackAemmer moved this to Current two week sprint in OpenPATH Tasks Overview Dec 1, 2022

zackAemmer commented Dec 1, 2022

The testing notebook is now here:
e-mission/em-public-dashboard#76

And the implementation is here:
e-mission/e-mission-server#890

Both are works in progress, though most of the current effort is going into the implementation.
