
Patch off object (again) #146

Merged
merged 20 commits into main on Jan 23, 2025

Conversation

scottcanoe
Contributor

@scottcanoe scottcanoe commented Jan 20, 2025

Problem Statement

Note: This PR has been tested against the long benchmark experiments, but I haven't run the whole suite. It has also been tested against the experiments that revealed the problem. I've left it as a draft since changes like these usually require a complete re-benchmarking.

My previous attempt at fixing patch_off_object episodes was flawed. Those modifications were sufficient to prevent patch_off_object episodes for all of the benchmark experiments, but I started (rarely) getting patch_off_object when running longer experiments with more random rotations.

The primary change to Monty in the last patch_off_object PR was to add get_good_view_with_patch_refinement. The idea is to take a first pass at finding a good view using the view finder and then, if the central patch isn't centered on the target object, do another pass of get_good_view using the central patch. The goal was to make finer adjustments to ensure the central patch is on-object at the start of the episode. Here's where it failed:

  • The function that computes how much the agent should rotate in order to have the sensor land on the targeted location did not account for the current agent/sensor rotation. It assumed the agent and sensor had not yet been rotated, so it always works fine for the first call to orient_to_object. The fact that the last PR still "worked" on benchmark experiments without having fixed this is pretty surprising, but the incorrect behavior is clear once you collect all the pre-episode observations and actions and visualize them (which is a little tricky since Monty's built-in loggers aren't set up for this).

  • Looking up/down and then left/right is not the same as pitching and yawing to a target simultaneously. This is partly why attempting to orient through a multi-action procedure (down-then-left) didn't always land the sensor on the object the first time.
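The second bullet comes down to rotation order. A minimal numpy sketch (illustrative only, not Monty code; the axis conventions are assumptions) showing that down-then-left and left-then-down generally produce different gaze directions:

```python
# Illustrative sketch (not Monty code): rotations about different axes do
# not commute, so a down-then-left action sequence generally lands the
# sensor somewhere other than where a single combined pitch+yaw would.
import numpy as np

def rot_x(deg):
    """Rotation matrix for a pitch of `deg` degrees about the x-axis."""
    a = np.radians(deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(deg):
    """Rotation matrix for a yaw of `deg` degrees about the y-axis."""
    a = np.radians(deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

forward = np.array([0.0, 0.0, -1.0])       # assume the sensor looks down -z
down_then_left = rot_y(45) @ rot_x(-30)    # pitch first, then yaw
left_then_down = rot_x(-30) @ rot_y(45)    # yaw first, then pitch

# Same angle amounts, different final gaze directions.
print(down_then_left @ forward)
print(left_then_down @ forward)
```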

Solutions

  • Fix InformedPolicy.compute_look_amounts so that it returns movements relative to the current agent/sensor rotation.
  • Modify InformedEnvironmentDataLoader.get_good_view to allow multiple orienting attempts, similar to how it can perform multiple move-forward actions.
  • Some functions, especially in InformedPolicy, were refactored to make the multiple reorientation steps doable in a relatively clean way.
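As a rough illustration of the first fix: the look amounts have to be computed in the sensor's current local frame, not the world frame. The sketch below is hypothetical; the real InformedPolicy.compute_look_amounts differs, and the frame conventions (sensor looking down -z, y up) are assumptions.

```python
import numpy as np

def compute_look_amounts(target_world, sensor_pos, sensor_rot):
    """Return (pitch_deg, yaw_deg) that would aim the sensor at target_world.

    sensor_rot is a 3x3 matrix giving the sensor's current world orientation.
    Transforming the offset by its inverse makes the angles relative to the
    current rotation; skipping that step is essentially the bug described above.
    """
    offset = np.asarray(target_world, dtype=float) - np.asarray(sensor_pos, dtype=float)
    local = sensor_rot.T @ offset                       # world -> sensor-local frame
    x, y, z = local                                     # sensor assumed to look down -z
    yaw = np.degrees(np.arctan2(-x, -z))                # left/right adjustment
    pitch = np.degrees(np.arctan2(y, np.hypot(x, z)))   # up/down adjustment
    return pitch, yaw

# A target straight ahead needs no adjustment: both angles come out 0.
print(compute_look_amounts([0, 0, -5], [0, 0, 0], np.eye(3)))
```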

Smaller Changes

  • I changed the name of the parameter view_sensor_id to sensor_id in several InformedPolicy functions. The name view_sensor_id sounds like a holdover from when the view finder was the only sensor to use these functions.
  • The view finder was also hard-coded into find_location_to_look_at, and that's been changed as well.


github-actions bot commented Jan 20, 2025

📚 Documentation Preview

✅ A preview of the documentation changes in this PR is available for maintainers at:
https://thousandbrainsproject.readme.io/v0.0-scottcanoe-patch-off-object

Last updated: 2025-01-23 18:47 UTC

@scottcanoe scottcanoe marked this pull request as ready for review January 20, 2025 23:16
@scottcanoe scottcanoe changed the title WIP: Patch off object (again) Patch off object (again) Jan 21, 2025
@scottcanoe scottcanoe added the bug Something isn't working label Jan 21, 2025
Contributor

@vkakerbeck vkakerbeck left a comment


Wow very nice! You really dug deep there. This solution is nice and hopefully it all works now. Also nice additional refactors and renaming :)

```python
if not on_target_object:
    on_target_object = self.is_on_target_object(sensor_id)
    num_attempts = 0
    while not on_target_object and num_attempts < max_orientation_attempts:
```
Contributor


Should we raise an error or warning if max_orientation_attempts is reached?

Contributor Author


The calling function (DataLoader.pre_episode) will raise the error when on_target_object is returned as False.
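In other words, the loop returns a success flag and the caller enforces it. A toy sketch of that division of labor (hypothetical names; the real logic lives in get_good_view and DataLoader.pre_episode):

```python
def orient_with_retries(try_orient, is_on_target, max_attempts=3):
    """Retry orienting until the sensor lands on the target or attempts run out."""
    attempts = 0
    while not is_on_target() and attempts < max_attempts:
        try_orient()
        attempts += 1
    return is_on_target()  # the caller, not this loop, raises on failure

def require_on_target(on_target_object):
    # Mirrors the behavior described above for DataLoader.pre_episode.
    if not on_target_object:
        raise RuntimeError("Could not place the sensor patch on the target object")
```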

Contributor

@nielsleadholm nielsleadholm left a comment


Wow, super thorough and nicely documented, thanks for digging into this!

Just some small suggestions on the refactor.

Review threads on src/tbp/monty/frameworks/environments/embodied_data.py (outdated, resolved)
@tristanls tristanls added triaged This issue or pull request was triaged and removed bug Something isn't working labels Jan 21, 2025
@tristanls
Contributor

tristanls commented Jan 21, 2025

Hi @scottcanoe. I removed the bug label during triage. I've been putting a bug label on Issues to identify potential bugs. However, I feel that this being a PR, it is a fix or solution to a bug, not a bug itself. If you'd rather keep the bug label on PRs marking them as a bug fix, please re-add it, and I'll leave it alone.

@scottcanoe
Contributor Author

Thanks @vkakerbeck @nielsleadholm and @tristanls for the review, great comments and suggestions as always.

Since you last saw it, I reworked compute_look_amounts, and it is now completely dead-on. Between the benchmarks and DMC experiments, I ran close to 10k episodes, and not a single one actually required multiple orientation steps with the patch. So I guess the question is, should we leave in the new multiple-attempts machinery? I'm fine leaving it in just in case some super rare edge case comes up where it's necessary, but other than that, I'm good to merge.

At some point in the future, I plan to contribute some documentation about the coordinate system(s) in use. I learned a lot working on this.

Contributor

@vkakerbeck vkakerbeck left a comment


Wow, this is awesome! I like how you rewrote the compute_look_amounts function. Looks like you really got a deep understanding of all the spatial transformations and different reference frames (which I've been banging my head against numerous times, so I know it's not an easy feat!). A writeup (in another PR) of some of the insights you had while doing that would be great! We started writing up a couple of conventions that helped us debug here: https://thousandbrainsproject.readme.io/docs/common-issues-and-how-to-fix-them but feel free to start a new page just focused on this.

Just added one note on the benchmark table where it looks like the format may be a bit off but otherwise I'm happy with merging this.

Review thread on docs/overview/benchmark-experiments.md (outdated, resolved)
Contributor

@nielsleadholm nielsleadholm left a comment


Yeah awesome work, thanks for going even deeper on this problem. Just a clarifying comment.

```python
to the agent's current position and rotation.

TODO: Test whether this function works when the agent is facing in the
```
Contributor


Is it possible to give a bit of context for this TODO, e.g. why/when it is expected things might fail?

Contributor Author


Sure, the thing I have in mind regards arctan2 and its discontinuity at 180 degrees. It may work totally fine when looking down positive z, but since it took some trial and error to work out the conversion to angles in body coordinates with the positive z-axis pointing opposite the body's rotation, I wanted to leave a note in case there's some debugging in the future.

I'll make a plan to add some unit tests for this function. I'll have to work out how to initialize the agent and environment to test different scenarios, but it's pretty doable.
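For reference, the 180-degree discontinuity mentioned above is easy to demonstrate on its own (illustrative only):

```python
import numpy as np

# arctan2 jumps from about +180 degrees to about -180 degrees as a direction
# crosses the negative x-axis, so two nearly identical "facing backwards"
# directions can yield wildly different angles; exactly the kind of edge a
# unit test should pin down.
just_above = np.degrees(np.arctan2(1e-9, -1.0))   # just above the -x axis
just_below = np.degrees(np.arctan2(-1e-9, -1.0))  # just below the -x axis
print(just_above, just_below)  # nearly +180 vs nearly -180
```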

Contributor Author


Also adding what I just said to the docstring.

@scottcanoe scottcanoe merged commit caf2ef4 into thousandbrainsproject:main Jan 23, 2025
14 checks passed