Patch off object (again) #146
Conversation
📚 Documentation Preview ✅
A preview of the documentation changes in this PR is available for maintainers at:
Last updated: 2025-01-23 18:47 UTC
Wow very nice! You really dug deep there. This solution is nice and hopefully it all works now. Also nice additional refactors and renaming :)
```python
if not on_target_object:
    on_target_object = self.is_on_target_object(sensor_id)
num_attempts = 0
while not on_target_object and num_attempts < max_orientation_attempts:
```
Should we raise an error or warning if `max_orientation_attempts` is reached?
The calling function (`DataLoader.pre_episode`) will raise the error when `on_target_object` is returned as `False`.
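To make that division of labor concrete, here is a minimal, self-contained sketch. All names besides the ideas are hypothetical stand-ins, notOnty's actual API: the orienting loop retries up to a maximum number of attempts and returns a boolean, while the caller (mirroring `DataLoader.pre_episode`) is the one that raises.

```python
# Hypothetical sketch of the scheme discussed above (names are illustrative,
# not Monty's actual API): the orienting loop retries up to a maximum number
# of attempts and returns a boolean; the caller raises if the sensor never
# lands on the target object.

def get_good_view(is_on_target, orient_step, max_orientation_attempts=3):
    """Return True once the sensor is on the target object, or False when
    all orientation attempts are exhausted."""
    on_target_object = is_on_target()
    num_attempts = 0
    while not on_target_object and num_attempts < max_orientation_attempts:
        orient_step()  # re-orient toward the target
        on_target_object = is_on_target()
        num_attempts += 1
    return on_target_object

def pre_episode(is_on_target, orient_step):
    # The error lives here rather than inside the policy loop above.
    if not get_good_view(is_on_target, orient_step):
        raise RuntimeError("Sensor did not land on the target object.")
```

Here `is_on_target` and `orient_step` are stand-in callables for the real sensor check and orientation action.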
Wow, super thorough and nicely documented, thanks for digging into this!
Just some small suggestions on the refactor.
Hi @scottcanoe. I removed the
Thanks @vkakerbeck @nielsleadholm and @tristanls for the review, great comments and suggestions as always. Since you last saw it, I reworked

At some point in the future, I plan to contribute some documentation about the coordinate system(s) in use. I learned a lot working on this.
Wow, this is awesome! I like how you rewrote the `compute_look_amounts` function. Looks like you really got a deep understanding of all the spatial transformations and different reference frames (which I've been banging my head against numerous times, so I know it's not an easy feat!). A writeup (in another PR) of some of the insights you had while doing that would be great! We started writing up a couple of conventions that helped us debug before here: https://thousandbrainsproject.readme.io/docs/common-issues-and-how-to-fix-them but feel free to start a new page just focused on this.
Just added one note on the benchmark table where it looks like the format may be a bit off but otherwise I'm happy with merging this.
Yeah awesome work, thanks for going even deeper on this problem. Just a clarifying comment.
```python
    to the agent's current position and rotation.

    TODO: Test whether this function works when the agent is facing in the
```
Is it possible to give a bit of context for this TODO, e.g. why/when it is expected things might fail?
Sure, the thing I have in mind relates to `arctan2` and the discontinuity at 180 degrees. It may work totally fine when looking down positive z, but since it took some trial and error to work out the conversion to angles in body coordinates with the positive z-axis pointing opposite from the body's rotation, I wanted to leave a note in case there's some debugging in the future.
I'll make a plan to add some unit tests for this function. I'll have to work out how to initialize the agent and environment to test different scenarios, but it's pretty doable.
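For reference, a small NumPy snippet illustrating the `arctan2` seam at ±180 degrees that the TODO is guarding against (the wrap-around helper at the end is a generic trick, not code from this PR):

```python
import numpy as np

# np.arctan2 returns angles in (-pi, pi], so directions near "straight back"
# sit on opposite sides of the +/-180 degree discontinuity.
a = np.degrees(np.arctan2(1e-6, -1.0))   # just to one side of straight back
b = np.degrees(np.arctan2(-1e-6, -1.0))  # just to the other side
print(round(a, 3), round(b, 3))  # ~180.0 and ~-180.0

# A naive angular difference across the seam is wildly wrong...
naive = a - b  # ~360 degrees, though the true separation is tiny
# ...unless it is wrapped back into (-180, 180].
wrapped = (naive + 180.0) % 360.0 - 180.0
print(round(naive, 3), round(wrapped, 3))
```

Two nearly identical directions come out ~360 degrees apart until the difference is re-wrapped, which is exactly the kind of surprise worth a unit test.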
Also adding what I just said to the docstring.
Problem Statement
Note: This PR has been tested against the long benchmark experiments, but I haven't run the whole suite. It has also been tested with the experiments that revealed the problem. I've left it as a draft since changes like these usually require a complete re-benchmarking.
My previous attempt at fixing `patch_off_object` episodes was flawed. Those modifications were sufficient to prevent `patch_off_object` episodes for all of the benchmark experiments, but I started (rarely) getting `patch_off_object` episodes when running longer experiments with more random rotations.

The primary change to Monty in the last `patch_off_object` PR was to add `get_good_view_with_patch_refinement`. The idea is to take a first pass at finding a good view using the view finder and then, if the central patch wasn't centered on the target object, do another pass of `get_good_view` using the central patch. The aim was to make finer adjustments that would ensure the central patch was on-object at the start of the episode. Here's where it failed:

- The function that computes how much the agent should rotate in order to have the sensor land on the targeted location did not account for the current agent/sensor rotation. It assumed the agent and sensor had not yet been rotated, so it always works fine for the first call to `orient_to_object`. The fact that the last PR still "worked" on benchmark experiments without having fixed this is pretty surprising, but the incorrect behavior is clear once you collect all the pre-episode observations and actions and visualize them (which is a little tricky since Monty's built-in loggers aren't set up for this).
- Looking up/down and then left/right is not the same as pitching and yawing to a target simultaneously. This is partly why attempting to orient through a multi-action procedure (down-then-left) didn't always land the sensor on the object the first time.
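The order-dependence point is easy to verify numerically. Below is a minimal sketch (assuming a Habitat-style convention where the agent faces −z; the helpers are illustrative, not from the codebase) showing that pitching then yawing lands the view direction somewhere different from yawing then pitching:

```python
import numpy as np

def rot_x(a):
    """Rotation about the x-axis (pitch, look up/down)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(a):
    """Rotation about the y-axis (yaw, look left/right)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

forward = np.array([0.0, 0.0, -1.0])  # assumed Habitat-style forward direction
pitch, yaw = np.radians(30.0), np.radians(40.0)

# "Look down, then look left" vs. the reverse order: rotations about
# different axes do not commute, so the final view directions differ.
pitch_then_yaw = rot_y(yaw) @ (rot_x(pitch) @ forward)
yaw_then_pitch = rot_x(pitch) @ (rot_y(yaw) @ forward)
print(np.allclose(pitch_then_yaw, yaw_then_pitch))  # False
```

Since each single-axis step changes the frame the next step should have been computed in, a fixed down-then-left sequence generally overshoots or undershoots the target.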
Solutions

- Rewrote `InformedPolicy.compute_look_amounts` so that it returns movements relative to the current agent/sensor rotation.
- Modified `InformedEnvironmentDataLoader.get_good_view` to allow multiple orienting attempts, similar to how it can perform multiple move-forward actions.
- `InformedPolicy` needed some refactoring to make the multiple-reorienting steps doable in a relatively clean way.

Smaller Changes
- Renamed `view_sensor_id` to `sensor_id` in several `InformedPolicy` functions. The naming of `view_sensor_id` sounds like a holdover from when the view finder was the only sensor to use these functions. The same applied to `find_location_to_look_at`, and that's been changed as well.
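As a rough illustration of the `compute_look_amounts` fix described under Solutions (the function below is a hypothetical sketch, not the actual implementation; axis and sign conventions are assumptions): the key step is expressing the target direction in the agent's *current* body frame before deriving pitch/yaw angles, which is exactly what the old code skipped.

```python
import numpy as np

def look_amounts(agent_rotation, agent_pos, target_pos):
    """Hypothetical sketch: pitch/yaw (radians) needed to face target_pos,
    computed relative to the agent's current rotation.

    agent_rotation: 3x3 world-from-agent rotation matrix.
    """
    # Express the target direction in the agent's body frame first;
    # skipping this inverse-rotation step was the bug described above.
    offset_world = np.asarray(target_pos) - np.asarray(agent_pos)
    offset_body = agent_rotation.T @ offset_world  # inverse of an orthonormal matrix

    x, y, z = offset_body
    # Assumed convention: the agent looks down -z in its own frame.
    yaw = np.arctan2(-x, -z)               # left/right amount
    pitch = np.arctan2(y, np.hypot(x, z))  # up/down amount
    return pitch, yaw
```

With the identity rotation and a target straight ahead, both amounts are zero; the same holds for an already-rotated agent whose body frame happens to face the target, which is the invariant the world-frame version violated.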