only pretraining comparisons appear in the labeling interface #36
Comments
What is the URL for the key that does not exist? Perhaps your |
I'm not an expert in I added the following hack and it seems to fix the problem. But I think
But I'm not exactly clear how RL with human feedback is supposed to work. I'm running the experiments on an old MacBook Pro, so the recorded video available for labeling always lags behind the latest comparison, as shown by what's uploading in the logfile. I give feedback on 3-5 comparisons, then come back 10-20 mins later for the next batch. But it seems to me that the most recent comparison/video segments have the benefit of more Q-learning, and rating these comparisons would have a greater learning benefit. If I only provide feedback on a few comparisons every 20 mins, would I get better results by giving feedback for the most recent ones? Does the learning algorithm still work if I offer sparse feedback, or do I need to provide feedback for every comparison? if |
I got to this point following the RL-teacher Usage docs.

I was able to use the human-feedback-api webapp to provide feedback for the 175 pre-training labels. After that, the agent began to learn based on the pre-training feedback.

But joint training failed. The human-feedback-api webapp displayed only blank screens. When I checked the URL for the videos in a separate tab, I got an XML error message that said "The specified key does not exist".

At the same time, the teacher.py script continued to generate video samples and upload them to Google Cloud. I can manually confirm that the media files exist in Google Cloud.

I waited many minutes, refreshed the webapp, even clicked "can't tell" a few times, but the video never reappeared after the (successful) pre-training.
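
One way to narrow down a "The specified key does not exist" response when the files are visibly present in the bucket is to compare the object the webapp requests with the object that was actually uploaded. The sketch below is not part of rl-teacher; the function names, bucket name, and paths are hypothetical and only illustrate the comparison. It assumes the two standard Google Cloud Storage URL forms (path-style and bucket-subdomain) plus gs:// URIs.

```python
from urllib.parse import urlparse, unquote

GCS_HOST = "storage.googleapis.com"


def gcs_object_from_url(url):
    """Split a gs:// URI or a storage.googleapis.com URL into (bucket, key)."""
    parsed = urlparse(url)
    path = unquote(parsed.path.lstrip("/"))
    if parsed.scheme == "gs":
        # gs://bucket/key
        return parsed.netloc, path
    if parsed.netloc == GCS_HOST:
        # https://storage.googleapis.com/bucket/key
        bucket, _, key = path.partition("/")
        return bucket, key
    if parsed.netloc.endswith("." + GCS_HOST):
        # https://bucket.storage.googleapis.com/key
        return parsed.netloc[: -len("." + GCS_HOST)], path
    raise ValueError("not a recognised Cloud Storage URL: %s" % url)


def explain_mismatch(requested_url, uploaded_url):
    """Return None if both URLs name the same object, else a short reason."""
    req_bucket, req_key = gcs_object_from_url(requested_url)
    up_bucket, up_key = gcs_object_from_url(uploaded_url)
    if req_bucket != up_bucket:
        return "bucket differs: %r vs %r" % (req_bucket, up_bucket)
    if req_key != up_key:
        return "key differs: %r vs %r" % (req_key, up_key)
    return None


# Hypothetical example values, not taken from the issue.
print(explain_mismatch(
    "https://storage.googleapis.com/my-bucket/videos/segment-1.mp4",
    "gs://my-bucket/videos/segment-01.mp4",
))
```

Paste the URL from the blank webapp page as the first argument and the upload destination reported in the teacher.py log as the second. A differing bucket or key would explain why the media files exist in Google Cloud while the webapp still gets the XML error.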