Changes to webapp to allow debugging of upload problems and providing feedback out of sequence #39

Closed
wants to merge 4 commits into from


@mixuala mixuala commented Dec 7, 2017

I'm still having problems getting the agent to complete a learning task from human feedback. Right now I'm stuck on #38, which keeps me from offering feedback after about 3000 sec.

But I made a few changes to the webapp to make it easier to debug uploads from the webapp, and also to provide feedback out of sequence.

  • add /experiments/[name]/[comparison_id] for offering feedback out of sequence
  • add media UUID to allow checking upload completion using the GCP console
  • change the ordering for comparisons from order_by("-created_at") to Comparison.order_by("+id"), because the learning phase was uploading media using Comparison.order_by("+created_at")
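
To illustrate the ordering change in the last bullet, here is a minimal sketch in plain Python (no Django). The `Comparison` dataclass and the `order_by` helper are hypothetical stand-ins for the webapp's model and query, and the sketch assumes that ids and creation times increase together. Under that assumption, descending `created_at` shows the newest comparison first, while media is uploaded oldest first, so the two sequences run in opposite directions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Comparison:
    """Hypothetical stand-in for the webapp's Comparison model."""
    id: int
    created_at: datetime


def order_by(rows, key):
    """Mimic an order_by string: '-' sorts descending, '+' or no prefix ascending."""
    reverse = key.startswith("-")
    field = key.lstrip("+-")
    return sorted(rows, key=lambda row: getattr(row, field), reverse=reverse)


# Assumption: ids are assigned in creation order, one comparison per second.
start = datetime(2017, 12, 7)
rows = [Comparison(id=i, created_at=start + timedelta(seconds=i)) for i in range(1, 6)]

upload_order = [c.id for c in order_by(rows, "+created_at")]       # learning phase
old_display_order = [c.id for c in order_by(rows, "-created_at")]  # webapp before
new_display_order = [c.id for c in order_by(rows, "+id")]          # webapp after

print(upload_order, old_display_order, new_display_order)
```

With ascending `id`, the webapp walks the comparisons in the same sequence as the uploads, so the first comparison offered is also the first one whose media should be available.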

…server

- skips upload to Google Cloud Platform
add nav for browsing comparisons
@mixuala
Author

mixuala commented Dec 13, 2017

I added a new command-line argument to serve videos from the local server. By skipping the Google Cloud uploads, I was able to complete a training run with 600 pieces of feedback before my video rendering processes died.

…learner separately; continue training reward predictor after loading weights; and serve content locally (skip GCP)

--load_predictor_weights [walker-splits-3-20_weights.150.h5]
--train_predictor [0,1]
--load_policy_weights [walker-splits-4-0.policy_weights.120.ckpt]
--content_server [local]

E.g. python rl_teacher/teach.py -p human -e $RL_ENV -n $RL_NAME-$RL_LABELS --content_server local --load_predictor_weights walker-splits-3-20_weights.150.h5 --load_policy_weights walker-splits-4-0.policy_weights.120.ckpt --train_predictor 1
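
For readers who want to see how the four options listed above might be declared, here is a rough argparse sketch. It is not the code from this pull request: the types, defaults, and allowed values are assumptions inferred from the bracketed hints in the list, and `build_parser` is a hypothetical helper name.

```python
import argparse


def build_parser():
    """Hypothetical parser covering only the four options described above."""
    parser = argparse.ArgumentParser(description="Sketch of the extra teach.py options")
    # Assumed: path to saved reward predictor weights; None means start fresh.
    parser.add_argument("--load_predictor_weights", default=None)
    # Assumed: 1 keeps training the reward predictor after loading, 0 freezes it.
    parser.add_argument("--train_predictor", type=int, choices=[0, 1], default=1)
    # Assumed: path to a saved policy checkpoint; None means start fresh.
    parser.add_argument("--load_policy_weights", default=None)
    # Assumed: "local" serves videos from the local server and skips the GCP upload.
    parser.add_argument("--content_server", choices=["local"], default=None)
    return parser


# Parse the new options from the example command shown above.
args = build_parser().parse_args([
    "--content_server", "local",
    "--load_predictor_weights", "walker-splits-3-20_weights.150.h5",
    "--load_policy_weights", "walker-splits-4-0.policy_weights.120.ckpt",
    "--train_predictor", "1",
])
print(args)
```

Restricting `--train_predictor` to the integers 0 and 1 means a mistyped value fails at startup rather than partway through a training run.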
@nottombrown
Owner

Hey @mixuala, I'd like to keep this version of the code simple and not add new features to it.

However, I'd bet that many people would find your code useful. Perhaps you could put together a description of the fork that you're working on, and then we can link to it from the Extensions section of the README.
