Make induced_slot recursive #332

Draft: wants to merge 7 commits into base: main

sneakers-the-rat
Contributor

Related to: linkml/linkml#2219
and all the other times i've talked about making schemaview recursive.

Two main things we're trying to do here:

  • make it go faster
  • make it be more predictable

There are a lot of places where schemaview will iterate over the whole schema in some way, sometimes in nested ways. We get a lot of mileage out of caching, but it also puts some functional barriers in front of us: nonlocal effects become hard to diagnose, things get very bound up together, and implementing stuff like structured imports, where we have to be able to handle lots of layers of different schemas with long inheritance chains, gets very hard to do.

induced_slot is like my white whale: it takes an outsized amount of time because of how much checking up and down inheritance trees needs to get done, and it's also a critical stepping stone that we need to be sure is rock solid, so that it is never in doubt whether we're looking at "the right" model class/etc. It's currently in some exponential time complexity state because for each slot for each class one needs to check the entire inheritance tree.

This is a start in the direction of a schemaview that only looks one step out at a time, recursively, so each step can be simpler. There are some bugs that i caused, and there are also some bugs that i think i am revealing (but have to figure out what they mean first), so it's not ready for review, but i'm opening this as a draft. The general strategy is just that: only look at the immediate parents of slots and classes, so each slot/class combination is induced exactly once. In doing so, i am trying to keep each object as minimal as possible and only touch what is defined at each stage, but some of the methods of doing so are a bit costly, and i am also not sure where to put the mutation guards yet, so there are some missed/unnecessary copies done, but that's all tbd.
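To make the "one step out at a time" strategy concrete, here is a minimal toy sketch. It is not the code in this PR and does not use the linkml-runtime API: the `CLASSES` dict, its `parents`/`slot_usage` keys, and this standalone `induced_slot` function are all hypothetical stand-ins. Each call only looks at a class's immediate parents, the cache means each slot/class pair is computed once, and the read-only return value is one possible (assumed) place for a mutation guard.

```python
from functools import lru_cache
from types import MappingProxyType

# Hypothetical toy schema (not the linkml metamodel):
# class name -> immediate parents and locally defined slot properties.
CLASSES = {
    "Thing": {"parents": [], "slot_usage": {"name": {"required": False, "range": "string"}}},
    "Person": {"parents": ["Thing"], "slot_usage": {"name": {"required": True}}},
    "Employee": {"parents": ["Person"], "slot_usage": {}},
    "Manager": {"parents": ["Employee"], "slot_usage": {}},
}

# Records every time the function body actually runs (cache misses only).
CALLS = []


@lru_cache(maxsize=None)
def induced_slot(slot_name: str, class_name: str) -> MappingProxyType:
    """Induce a slot for a class by consulting only its immediate parents."""
    CALLS.append((slot_name, class_name))
    cls = CLASSES[class_name]
    merged = {}
    # Reversed so that earlier-listed parents win over later ones.
    for parent in reversed(cls["parents"]):
        # The parent's result is already fully induced, so one step is enough.
        merged.update(induced_slot(slot_name, parent))
    # Locally defined properties override anything inherited.
    merged.update(cls["slot_usage"].get(slot_name, {}))
    # Read-only view: callers must copy before mutating the cached result.
    return MappingProxyType(merged)


employee_name = dict(induced_slot("name", "Employee"))
manager_name = dict(induced_slot("name", "Manager"))
```

After both calls, `CALLS` holds one entry per slot/class pair: inducing `Manager` reuses the cached `Employee` result rather than walking the whole tree again.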

anyway here ya go, will return later.

perf status

current state of linkml and linkml-runtime (run on all non-slow tests, so the difference would probably be greater, since the slow tests are where it gets really expensive)

Screenshot 2024-07-24 at 1 51 43 AM

this pr:

Screenshot 2024-07-24 at 1 51 52 AM

sort of weird result to me that there is 3.1s total time spent in the body of the function but snakeviz is showing 40s all collected there, probably just a visualization bug tho
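One way to cross-check numbers like these independently of the visualization is to read the raw profiler table. The sketch below assumes the profile was collected with `cProfile`; `walk` is a hypothetical toy function standing in for a recursive `induced_slot`, not anything from this PR. It pulls out own time, cumulative time, and primitive vs. total (recursive) call counts for a single function.

```python
import cProfile
import pstats


def walk(depth: int) -> int:
    """Toy recursive function standing in for a recursive induced_slot."""
    if depth == 0:
        return 0
    return 1 + walk(depth - 1)


profiler = cProfile.Profile()
profiler.enable()
walk(50)
profiler.disable()

stats = pstats.Stats(profiler)

# Each entry maps (filename, lineno, funcname) to
# (primitive calls, total calls, own time, cumulative time, callers).
walk_row = None
for (filename, lineno, funcname), row in stats.stats.items():
    if funcname == "walk":
        primitive_calls, total_calls, tottime, cumtime, callers = row
        walk_row = {
            "primitive_calls": primitive_calls,  # non-recursive entries
            "total_calls": total_calls,          # includes recursive calls
            "tottime": tottime,                  # time in the function body only
            "cumtime": cumtime,                  # body plus everything it calls
        }

print(walk_row)
```

If total calls far exceed primitive calls, the function is being profiled while recursing, which is worth keeping in mind when comparing body time against what the visualization attributes to that function.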
