pip install: Fetch platform-independent data #15774

Closed
Tracked by #15954
BetsyMcPhail opened this issue Sep 14, 2021 · 14 comments · Fixed by #19046
Labels: component: distribution, priority: medium

@BetsyMcPhail
Contributor

Working towards #1183

Follow up to #15628

BetsyMcPhail added the "component: distribution" and "unused team: kitware" labels Sep 14, 2021
BetsyMcPhail self-assigned this Sep 14, 2021
@jwnimmer-tri
Collaborator

To make sure that we all understand this ticket -- the victory condition here is that a user of pip install drake is able to easily load the models & meshes for all of Drake's included robots and manipulands, e.g., the iiwa, atlas, ycb objects, etc., with the expectation that under the hood we're fetching them from urls. (At the moment, we've had to exclude certain model files from the whl because they are too large.)

Is that accurate?

@BetsyMcPhail
Contributor Author

The exact implementation details still need to be worked out but that is my understanding

@jwnimmer-tri
Collaborator

jwnimmer-tri commented Nov 3, 2021

Relates to #13942 and #15024 somewhat, and #9498 and #11913 more directly.

Another comment (channeling Russ) -- something like https://pytorch.org/tutorials/beginner/basics/data_tutorial.html may be the best solution here. Either that library directly, or some Drake-compatible simpler implementation.

@jwnimmer-tri
Collaborator

For better issue search, here's the list of mesh file paths that are excluded from the wheel:

rm -rf \
    ${WHEEL_DATA_DIR}/manipulation/models/franka_description/meshes \
    ${WHEEL_DATA_DIR}/manipulation/models/tri-homecart/*.obj \
    ${WHEEL_DATA_DIR}/manipulation/models/tri-homecart/*.png \
    ${WHEEL_DATA_DIR}/manipulation/models/ur3e/*.obj \
    ${WHEEL_DATA_DIR}/manipulation/models/ur3e/*.png \
    ${WHEEL_DATA_DIR}/manipulation/models/ycb/meshes \
    ${WHEEL_DATA_DIR}/examples/atlas \
    ${WHEEL_DATA_DIR}/examples/hydroelastic/spatula_slip_control

@SwappyG

SwappyG commented Sep 28, 2022

I'm running into this issue when trying to load models (@jwnimmer-tri redirected me here from my Stack Overflow question). I'd be happy to help with making a PR for this issue if it's not actively being worked on.

@jwnimmer-tri
Collaborator

@SwappyG thanks for your interest!

This feature will require a relatively large chain of multiple pull requests (some of which will be quite intricate), as well as corresponding changes to Drake's release process and binary distribution architecture.

I don't say that to discourage you, rather to say that it will be a somewhat involved process, and therefore a somewhat difficult starting place for a first-time contributor, especially with the overall software design for this feature not yet finalized.

My thought here is that I'm going to work on writing up a software design for how this is all supposed to work. I'll probably also need to hack together a prototype to show that the design is practical. I'll post those ideas into the ticket here, at which point anyone interested is welcome to help push it forward.

@jwnimmer-tri
Collaborator

jwnimmer-tri commented Oct 11, 2022

Here's my thinking towards a design...

Background:

Proposal:

(1) Step away from the forward_files idea, by adding a package://drake_models.

Add package.xml to https://github.com/RobotLocomotion/models. Update our SDFormat/URDF references to cite the new package, e.g., drake/manipulation/models/ycb/sdf/003_cracker_box.sdf would change from citing <uri>package://drake/manipulation/models/ycb/meshes/003_cracker_box_textured.obj</uri> to <uri>package://drake_models/ycb/meshes/003_cracker_box_textured.obj</uri>.

(2) Add the drake_models package to our PackageMap by default (i.e., in addition to the drake package), so that all of the models continue to load out-of-the-box.

(3) Stop incorporating any of the models repository into Drake's install rules, thereby effectively removing those files from our pre-compiled binaries as well.

(4) In source builds, the PackageMap entry for drake_models will refer to the download fetched by bazel (i.e., basically no change from today).

(5) In pre-compiled builds, the PackageMap entry for drake_models will map to a URI, instead of a filesystem path. The first time any model file is requested, the URI will be downloaded into a temporary folder and re-used from that point on. Users could add entries for other URLs also -- in case they want to load models from web servers as well.

(6) Maybe docker images could have the drake_models data already pre-fetched on disk? Or maybe easiest to just keep them the same as everywhere else. In any case the pip and tgz will drop the model data; probably apt as well.

(7) We could have a "prefetch" post-install script that any user could run from any install mechanism, to download the models and place them somewhere the package map would find them by default, with no ongoing downloading.

(8) Some users might balk at the idea of Drake hitting the internet by default. The download will at least need to have an opt-out config setting; possibly it needs to be opt-in. Possibly we could obey the default environment variable for proxying (http_proxy IIRC).
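To make (2), (5), and (8) concrete, here is a rough Python sketch of a package map that resolves package:// URIs either to a local directory or to a base URL that is fetched on first use and cached afterwards. Everything in it is an assumption for illustration: ToyPackageMap, add_local, add_remote, resolve, and the TOY_MODELS_NO_DOWNLOAD opt-out variable are hypothetical names, not Drake's PackageMap API, and it picks the "base URL, one file at a time" option rather than a single archive.

```python
import os
import urllib.request
from pathlib import Path

PREFIX = "package://"
# Hypothetical opt-out switch; the thread does not name the real setting.
OPT_OUT_ENV = "TOY_MODELS_NO_DOWNLOAD"


def fetch_with_urllib(url):
    """Download url and return its bytes.

    urllib's default opener reads proxy settings such as http_proxy
    from the environment, which covers the proxy idea in (8).
    """
    with urllib.request.urlopen(url) as response:
        return response.read()


class ToyPackageMap:
    """Illustrative stand-in; this is not Drake's PackageMap API."""

    def __init__(self, cache_dir, fetch=fetch_with_urllib):
        self._local = {}    # package name -> directory on disk
        self._remote = {}   # package name -> base URL
        self._cache_dir = Path(cache_dir)
        self._fetch = fetch  # injectable so tests can avoid the network

    def add_local(self, name, directory):
        self._local[name] = Path(directory)

    def add_remote(self, name, base_url):
        self._remote[name] = base_url.rstrip("/")

    def resolve(self, uri):
        """Map a package:// URI to a local file path, downloading if needed."""
        if not uri.startswith(PREFIX):
            raise ValueError(f"not a package URI: {uri}")
        name, _, rel = uri[len(PREFIX):].partition("/")
        if name in self._local:
            return self._local[name] / rel
        if name not in self._remote:
            raise KeyError(name)
        cached = self._cache_dir / name / rel
        if cached.exists():
            return cached  # already downloaded; no network access
        if os.environ.get(OPT_OUT_ENV):
            raise RuntimeError(f"downloads disabled by {OPT_OUT_ENV}: {uri}")
        data = self._fetch(f"{self._remote[name]}/{rel}")
        cached.parent.mkdir(parents=True, exist_ok=True)
        cached.write_bytes(data)
        return cached
```

In this sketch a source build would call add_local("drake_models", ...) with the bazel-fetched directory, while a pre-compiled build would call add_remote with a pinned URL; callers of resolve would not need to know which one is in effect.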

Miscellany:

(a) Drake already build-depends on libcurl; we would use that for the downloading. We might need to activate its https support. (IIRC, we are http only so far.)

(b) I'm not sure whether the package map URL should refer to a base url (where we could grab files one by one) or an archive (that we'd download all at once and then decompress).

(c) Should the downloads rely on https to certify the file, or a sha256 (or 512) checksum, or both?

(d) We need a careful mechanism to keep the model repository pinned and mirrored, possibly with updates to the release playbook.

(e) Anything in models will no longer be accessible to FindResourceOrThrow. That means if we move e.g. the IIWA URDFs to models as proposed in #13942, the only way to load them will be as URIs from the package map, not as bazel resources. This is probably better in any case. Users are confused by FindResource stuff.
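On (c), the checksum half of the question is cheap to illustrate. A minimal sketch, assuming each pinned download ships with an expected SHA-256 hex digest (verify_download is a hypothetical helper name, not an existing Drake function):

```python
import hashlib


def verify_download(data: bytes, expected_sha256: str) -> bytes:
    """Return data unchanged if its SHA-256 matches, else raise ValueError."""
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256.strip().lower():
        raise ValueError(
            f"checksum mismatch: expected {expected_sha256}, got {actual}")
    return data
```

A check like this guards against a corrupted or swapped file even if the transport is plain http; whether it should replace or complement https is the open question in (c).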

@SwappyG

SwappyG commented Oct 17, 2022

@jwnimmer-tri I think I understand most of the proposal. Making the models repo into a package xml + changing URIs to point to that seems like the lowest risk task for a first-time contribution. Does that seem accurate?

(1), and maybe (2), (5) and (7) seem like something I could help with, though I don't know too much about certifying files as mentioned in (c).
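For scale, the URI change in (1) looks like a mechanical prefix rewrite. A minimal sketch, assuming every directory under drake/manipulation/models moves to the top level of drake_models the same way the ycb example in the proposal does (that mapping is only confirmed for ycb; rewrite_model_uris is a hypothetical helper):

```python
# Prefixes taken from the ycb example in the proposal; applying them to
# other model directories is an assumption.
OLD_PREFIX = "package://drake/manipulation/models/"
NEW_PREFIX = "package://drake_models/"


def rewrite_model_uris(text: str) -> str:
    """Rewrite model URIs in SDFormat/URDF text to cite drake_models."""
    return text.replace(OLD_PREFIX, NEW_PREFIX)
```

URIs outside manipulation/models, such as those under drake/examples, are left untouched by this rewrite.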

@RussTedrake
Contributor

Related to the discussion above, I had at least one very reasonable question about this from a student: "I'd be happy to just download the model repo separately into a subdirectory of Drake (e.g. after a pip install). Why do you make it so hard to just grab the files? Your bazel script puts them everywhere."

@jwnimmer-tri
Collaborator

"I'd be happy to just download the model repo separately into a subdirectory of Drake (e.g. after a pip install)."

The https://drake.mit.edu/from_binary.html downloads have all of the model files, in their expected relative locations.

@jwnimmer-tri
Collaborator

jwnimmer-tri commented Mar 1, 2023

For my reference:

See $XDG_CACHE_HOME per https://wiki.archlinux.org/title/XDG_Base_Directory, for our temporary downloads.

Per pypa/packaging-problems#64, it seems like there is not any way to populate the cache at install-time, only upon first use.
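A minimal sketch of the $XDG_CACHE_HOME lookup, following the XDG Base Directory rules that an unset, empty, or relative value falls back to $HOME/.cache. The "drake" subdirectory name and the model_cache_dir helper are assumptions for illustration, not a settled layout:

```python
import os
from pathlib import Path


def model_cache_dir(environ=None, home=None) -> Path:
    """Pick the directory where first-use downloads would be cached."""
    env = os.environ if environ is None else environ
    home_dir = Path.home() if home is None else Path(home)
    base = env.get("XDG_CACHE_HOME", "")
    # The XDG spec says unset, empty, or relative values must be ignored.
    if base and Path(base).is_absolute():
        root = Path(base)
    else:
        root = home_dir / ".cache"
    return root / "drake"  # subdirectory name is an assumption
```

Since the cache cannot be filled at install time, the directory would be created lazily on the first model request.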

@jwnimmer-tri
Collaborator

jwnimmer-tri commented Mar 13, 2023

A few thoughts from f2f chat today:

  • If we yank the models out of our deb releases, consider adding a separate deb that users can install with the models, to avoid fetching in that case.

=> #19079
