feat: Legacy Libraries Migration (Prototype) #35758

Closed

Member

@kdmccormick kdmccormick commented Nov 1, 2024

Part of #32457

What's ready for feedback

Firstly, the migration data models in openedx/core/djangoapps/content_libraries/models.py

  • Question: Is the content_libraries app the right place for these? Would they be better in contentstore, since they are CMS-focused?
  • Question: Alternatively, do we want to generalize these models so that they could be used as a basis for the course migration as well? Imagine, for example:
    # split_django_modulestore/models.py
    class SplitModulestoreIndexMigration(Model):
        source = ForeignKey(SplitModulestoreCourseIndex)  # a legacy library for now, eventually courses too
        target = ForeignKey(LearningPackage)
        # Not sure how to capture collections in this version of the model.
        # Perhaps we'd want to supplement this model with the ContentLibraryMigration model.

The new migration Python API function, defined in openedx/core/djangoapps/content_libraries/migration_api.py.

  • If we move the models to contentstore or split_django_modulestore, we'd move this API function there as well.

Finally, the new upstream link computation for LegacyLibraryContentBlock children, which would allow them to seamlessly switch over to syncing from V2 Libraries once their source is migrated, without any backwards incompatibility, user intervention, or spooky background content changes. Defined in cms/lib/xblock/upstream_sync.py and xmodule/library_content_block.py.

image

This LegacyLibraryContentBlock's source library has been migrated. You can see that the child is now considered to be "Sourced from a library" based on the icon. If I publish an edit to this problem in its new library, instead of seeing the legacy "Update Now" messaging from the LegacyLibraryContentBlock, I'd see the new "Update Available" button on the problem itself (I would show this but my authoring env is currently broken).

What's missing

  • Optimization. In the PR currently, rendering each LegacyLibraryContentBlock child requires looping through all of the source library's ContentLibraryBlockMigration objects in order to find a key match. This could easily be optimized with one or both of:
    • some request-local caching
    • a table where we persist upstream-downstream mappings (which I think we need anyway for Teak).
  • UI for migrating a legacy library, accessible to all library authors with the requisite access.
  • Shutting off write access to migrated legacy libraries, so that we don't end up with two divergent sources of truth for any library.
  • Shutting off the ability to reference migrated legacy libraries
  • Updating migrated LegacyLibraryContentBlocks' UI to match the ItemBankBlock's UI, since a migrated LLCB is essentially just an ItemBankBlock.
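
The request-local caching idea from the Optimization bullet could be sketched roughly as follows. This is only an illustration under stated assumptions: `BlockMigration`, `MigrationLookup`, and `load_migrations` are hypothetical names that do not appear in the PR, keys are treated as plain strings, and the sketch assumes the child's v1 source key is already known. A plain dict stands in for whatever request-scoped cache the platform would actually use.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional


@dataclass(frozen=True)
class BlockMigration:
    """Hypothetical stand-in for a ContentLibraryBlockMigration row."""
    source_usage_key: str  # v1 library block key, as a string
    target_usage_key: str  # v2 library block key, as a string


class MigrationLookup:
    """Builds one source -> target index per library, then answers lookups from it."""

    def __init__(self, load_migrations: Callable[[str], Iterable[BlockMigration]]):
        # load_migrations is a hypothetical loader: library key -> its block migrations.
        self._load_migrations = load_migrations
        self._index_by_library: Dict[str, Dict[str, str]] = {}

    def target_for(self, library_key: str, source_usage_key: str) -> Optional[str]:
        index = self._index_by_library.get(library_key)
        if index is None:
            # First lookup for this library: scan its migrations once and index them.
            index = {
                m.source_usage_key: m.target_usage_key
                for m in self._load_migrations(library_key)
            }
            self._index_by_library[library_key] = index
        # Later lookups for the same library are a single dict access.
        return index.get(source_usage_key)
```

Creating one `MigrationLookup` per request would keep the index from going stale across requests, at the cost of one full scan per library per request.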

Hacky stuff in this PR, for demo purposes

This PR has an admin action on the split_django_modulestore index, and a new migrate_legacy_library CMS management command. It could be good to clean one or both of these up and merge them into master so that developers can experiment with the migration. (This would not replace the need for a similar end-user-facing UI in Teak).

image

This PR also modifies the titles and URL targets of the Legacy Libraries home tab. Clicking on a migrated library will send you to the new library and/or collection within that library. For Teak, you could imagine a nicer-looking version of this, showing users which legacy libraries have been migrated where. Alternatively, we could just hide migrated legacy libraries from this list.

image

Sandbox env

I'm still getting Meili set up on this...

Link: https://studio.pr-35758-7daceb.sandboxes.opencraft.hosting/
UN/PW: openedx / openedx

Settings

TUTOR_GROVE_COMMON_SETTINGS : |
  SEARCH_ENGINE: "openedx.core.djangoapps.content.search.engine.MeilisearchEngine"

PLUGINS:
- mfe
- grove
- meilisearch
- s3
PLUGIN_INDEXES:
- https://overhang.io/tutor/main

Tutor requirements

git+https://github.com/overhangio/tutor.git@nightly
git+https://github.com/overhangio/tutor-mfe.git@nightly
git+https://gitlab.com/opencraft/dev/tutor-contrib-grove.git@4d4ce5c0e258c95c02abce17cc22138ce678fd22
git+https://github.com/hastexo/[email protected]
git+https://github.com/openedx/[email protected]#egg=tutor-contrib-harmony-plugin&subdirectory=tutor-contrib-harmony-plugin
git+https://github.com/open-craft/tutor-contrib-meilisearch.git@main

@kdmccormick kdmccormick force-pushed the kdmccormick/library-migration branch 2 times, most recently from 22c4880 to 6fb0955 Compare November 4, 2024 21:39
@kdmccormick kdmccormick changed the title feat: Backend for Migrating Legacy Libraries (wip) feat: Legacy Libraries Migration (Prototype) Nov 4, 2024
@kdmccormick kdmccormick force-pushed the kdmccormick/library-migration branch 2 times, most recently from 664021b to 4252351 Compare November 6, 2024 18:06
@kdmccormick kdmccormick added the create-sandbox label ("open-craft-grove should create a sandbox environment from this PR") Nov 6, 2024
@open-craft-grove

Sandbox deployment successful 🚀
🎓 LMS
📝 Studio
ℹ️ Grove Config, Tutor Config, Tutor Requirements

@kdmccormick kdmccormick force-pushed the kdmccormick/library-migration branch from 4252351 to 985a912 Compare November 6, 2024 19:26
# If so, then we know that this block was derived from block in a legacy (v1) content library.
# Try to get that block's migrated (v2) content library equivalent and use it as our upstream.
elif downstream.parent and downstream.parent.block_type == "library_content":
from xmodule.library_content_block import LegacyLibraryContentBlock
Contributor

nit: do we need this import here?

Member Author

@kdmccormick kdmccormick Nov 7, 2024

For the type annotation on the next line. Importing at the top of the file would be a cyclical import.

Alternatively, I could just # type: ignore the next line, but I'd rather avoid that.

EDIT: I think I could do this at the top of the file:

if t.TYPE_CHECKING:
    from xmodule.library_content_block import LegacyLibraryContentBlock

I'll try that when this PR is ready for actual review.
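
A minimal sketch of that pattern, assuming a quoted annotation is acceptable at the use site. `describe_parent` is a hypothetical helper used only for illustration and is not part of the PR. Because `typing.TYPE_CHECKING` is `False` at runtime, the guarded import is seen only by type checkers, so it cannot create a runtime import cycle.

```python
import typing as t

if t.TYPE_CHECKING:
    # Evaluated by type checkers only; skipped at runtime, so no circular import.
    from xmodule.library_content_block import LegacyLibraryContentBlock


def describe_parent(parent: "LegacyLibraryContentBlock") -> str:
    """Hypothetical helper; the quoted annotation is never evaluated at runtime."""
    return f"parent type: {type(parent).__name__}"
```

The annotation has to stay quoted (or the module has to defer annotation evaluation) because the name `LegacyLibraryContentBlock` does not exist at runtime in this module.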

@Ian2012
Contributor

Ian2012 commented Nov 7, 2024

I'm in favor of having an explicit table for a downstream - upstream mapping.

Question: Alternatively, do we want to generalize these models so that they could be used as a basis for the course migration as well? Imagine, for example:

As long as the relation is only a mapping between a source - target, I don't see any issues with that.

@kdmccormick kdmccormick force-pushed the kdmccormick/library-migration branch from 985a912 to 2d66c67 Compare November 7, 2024 15:37
@open-craft-grove

Sandbox deployment successful 🚀
🎓 LMS
📝 Studio
ℹ️ Grove Config, Tutor Config, Tutor Requirements

Contributor

@bradenmacdonald bradenmacdonald left a comment

@kdmccormick Exciting! Thanks for the prototype here.

Question: Is the content_libraries app the right place for these? Would they be better in contentstore, since they are CMS-focused?

Where possible, I do like to keep legacy/transition code separate from new/permanent code. So I'd lean toward openedx/core/djangoapps/content_libraries/migration/* as the home. But split_modulestore_django or contentstore are also fine with me.

These models can be deleted as soon as v1 libraries are fully removed, right?

Alternatively, do we want to generalize these models so that they could be used as a basis for the course migration as well?

We're so far from being able to migrate courses that I don't think we know enough to say what the data model requirements will be. For example, I think we'll have a 1:1 mapping from courses to learning package, but that's not the case for libraries. So I think it's premature to try and share the code. I'm not opposed to putting the library migration tracking model in split_modulestore_django though.

This LegacyLibraryContentBlock's source library has been migrated. You can see that the child is now considered to be "Sourced from a library" based on the icon. If I publish an edit to this problem in its new library, instead of seeing the legacy "Update Now" messaging from the LegacyLibraryContentBlock, I'd see the new "Update Available" button on the problem itself (I would show this but my authoring env is currently broken).

Help me understand here. So, we're letting users migrate libraries one at a time. As they do, if they happen to navigate into a course where a LegacyLibraryContentBlock references that, then our "upstream detection" code will treat them the same way as children of Problem Banks (that is, they are downstream copies from an upstream v2 library). But we haven't actually modified the LegacyLibraryContentBlock nor its children yet at this stage. So what happens if I edit the settings of the LegacyLibraryContentBlock, say by increasing the number of children to display? Or what if I want to choose a different (v2) library/collection, or anything else? Will I have to delete the LegacyLibraryContentBlock and manually replace it with a Problem Bank?

I guess what I'm asking is if we should instead scan for relevant LegacyLibraryContentBlocks and convert them to Problem Banks as an explicit migration process instead of having to handle these various hybrid half-migrated situations as they arise.

Timing

Oh, and please don't merge this PR anytime soon, while we're still in bugfix mode for Sumac :)


# In order identify the new v2 library block, we need to know the v1 library block that this child came from.
# Unfortunately, there's no straightforward mapping from these children back to their v1 library source blocks.
# (ModuleStore does have a get_original_usage function that inspects edit_info, but we can't count on
Contributor

Suggested change
# (ModuleStore does have a get_original_usage function that inspects edit_info, but we can't count on
# (ModuleStore does have a get_block_original_usage function that inspects edit_info, but we can't count on

# (ModuleStore does have a get_original_usage function that inspects edit_info, but we can't count on
# that always working, particularly if this block's course was imported from another instance.)
# However, we can work around this by just looping through every block in the legacy library, and testing to see
# if it's our source block.
Contributor

Can't we at least start with get_block_original_usage and fall back to this loop if it doesn't work?

Contributor

+1, but beware that at least one invocation of get_block_original_usage has cleanup code that implies it can pass back a versioned UsageKey:

orig_key, orig_version = self.runtime.modulestore.get_block_original_usage(usage_key)
return {
"usage_key": str(usage_key),
"original_usage_key": str(orig_key.replace(version=None, branch=None)) if orig_key else None,

Member Author

Good call, and noted
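
The "try get_block_original_usage first, then fall back to the loop" suggestion, together with the versioned-key caveat, could look something like the sketch below. Everything here is a stand-in: `FakeUsageKey`, `find_source_key`, `lookup_original`, and `is_source_of` are hypothetical names, and the only behavior borrowed from the thread is that the lookup returns a `(key, version)` pair whose key may be `None` and may carry version and branch information.

```python
import dataclasses
from typing import Callable, Iterable, Optional, Tuple


@dataclasses.dataclass(frozen=True)
class FakeUsageKey:
    """Hypothetical stand-in for a UsageKey that may carry version/branch info."""
    block_id: str
    version: Optional[str] = None
    branch: Optional[str] = None

    def replace(self, **changes) -> "FakeUsageKey":
        return dataclasses.replace(self, **changes)


def find_source_key(
    child_key: FakeUsageKey,
    lookup_original: Callable[[FakeUsageKey], Tuple[Optional[FakeUsageKey], Optional[str]]],
    library_keys: Iterable[FakeUsageKey],
    is_source_of: Callable[[FakeUsageKey, FakeUsageKey], bool],
) -> Optional[FakeUsageKey]:
    # Fast path: ask the store for the recorded original usage.
    orig_key, _orig_version = lookup_original(child_key)
    if orig_key is not None:
        # Strip version and branch so the key compares equal to unversioned keys.
        return orig_key.replace(version=None, branch=None)
    # Slow path: test every block in the legacy library.
    for candidate in library_keys:
        if is_source_of(candidate, child_key):
            return candidate
    return None
```

The fallback keeps the behavior described in the diff comment for courses imported from another instance, where the recorded original usage may be missing.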

Contributor

@ormsbee ormsbee left a comment

This is really exciting stuff! 😁

"""
source_key = LearningContextKeyField(unique=True, max_length=255)
target = models.ForeignKey(ContentLibrary, on_delete=models.CASCADE)
target_collection = models.ForeignKey(Collection, on_delete=models.SET_NULL, null=True)
Contributor

This model encodes some very reasonable lifecycle behavior that is probably worth explicitly calling out in the comments, (e.g. "deleting the v2 library deletes any record of the migration", "collection is nullable because you can migrate and then delete the target collection, without actually deleting the components in that collection", etc.).

Member Author

Can do


class ContentLibraryBlockMigration(models.Model):
"""
Record of a legacy (v1) content library block that has been migrated into a new (v) content library block.
Contributor

@ormsbee ormsbee Nov 9, 2024

Suggested change
Record of a legacy (v1) content library block that has been migrated into a new (v) content library block.
Record of a legacy (v1) content library block that has been migrated into a new (v2) content library block.

Comment on lines +263 to +264
block_type = models.SlugField()
source_block_id = models.SlugField()
Contributor

This is fine, but I'm curious why you didn't decide to use a UsageKeyField here instead?

Member Author

Perhaps I was over-normalizing. I was trying to make it impossible to encode three invalid cases:

  • The block type of the source doesn't match the block type of the target.
  • The LearningContext of the source doesn't match library_migration.source_key
  • The LearningContext of the target doesn't match library_migration.target

Sounds like it would be simpler to make the source a UsageKeyField and the target a Component, plus a mix of db constraints and app-level validation to avoid those invalid cases.

Member Author

Dave: Also, this is an edge case, but do we want to explicitly disallow having two separate v1 blocks map to the same v2 block?

Yes, I'll add that constraint.
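
The app-level validation half of that idea might be sketched as below. This is an assumption-heavy illustration, not the PR's code: `BlockMigrationRecord` and `validation_errors` are hypothetical, keys are reduced to plain strings, and in a real model the uniqueness rule would more likely be a database constraint than a Python check.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class BlockMigrationRecord:
    """Hypothetical flattened view of one v1 -> v2 block migration."""
    source_context: str     # learning context of the v1 block
    source_block_type: str
    target_context: str     # learning context (library) of the v2 block
    target_block_type: str
    target_id: str          # identifier of the v2 block


def validation_errors(
    record: BlockMigrationRecord,
    library_source_key: str,
    library_target_key: str,
    existing_target_ids: Iterable[str],
) -> List[str]:
    """Return a list of problems; an empty list means the record is acceptable."""
    errors = []
    if record.source_block_type != record.target_block_type:
        errors.append("block type mismatch")
    if record.source_context != library_source_key:
        errors.append("source is outside the migrated v1 library")
    if record.target_context != library_target_key:
        errors.append("target is outside the v2 library")
    if record.target_id in set(existing_target_ids):
        errors.append("target already mapped from another v1 block")
    return errors
```

The first three checks correspond to the three invalid cases listed above, and the fourth to the "two v1 blocks map to the same v2 block" edge case.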

if not self.source_library_id:
raise NoUpstream()
try:
source_library_key = self.source_library_key
Contributor

Nit: I think that properties that can throw exceptions like this are foot-guns waiting to happen. I realize that the source_library_key property method precedes this PR, so take this as a purely optional item.
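
To illustrate the concern in general terms, here is a small sketch contrasting a property that can raise with an explicit method that reports failure in its return value. Both classes, the `InvalidLibraryKey` error, and the prefix check are hypothetical and are not taken from `LegacyLibraryContentBlock`.

```python
from typing import Optional


class InvalidLibraryKey(Exception):
    """Hypothetical error type for this sketch."""


def _looks_valid(library_id: str) -> bool:
    # Illustrative check only; real key parsing is more involved.
    return library_id.startswith("library-v1:")


class BlockWithProperty:
    def __init__(self, source_library_id: str):
        self.source_library_id = source_library_id

    @property
    def source_library_key(self) -> str:
        # Reads like plain attribute access at the call site, but can raise.
        if not _looks_valid(self.source_library_id):
            raise InvalidLibraryKey(self.source_library_id)
        return self.source_library_id


class BlockWithMethod:
    def __init__(self, source_library_id: str):
        self.source_library_id = source_library_id

    def parse_source_library_key(self) -> Optional[str]:
        """Return the key, or None if the stored id is not a valid key."""
        if not _looks_valid(self.source_library_id):
            return None
        return self.source_library_id
```

With the method form, the call site makes it visible that parsing happens and that the result may be absent.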

@kdmccormick
Member Author

kdmccormick commented Jan 8, 2025

Thanks all for your feedback. To avoid spamming you, I'm going to close this PR and open a new one (which I will factor your feedback into). I'd like to implement this on top of the upstream-downstream link models that @navinkarkera is currently taking the lead on, so the migration is blocked by: openedx/modular-learning#242

@kdmccormick kdmccormick closed this Jan 8, 2025
@kdmccormick kdmccormick deleted the kdmccormick/library-migration branch January 8, 2025 20:56
@kdmccormick kdmccormick restored the kdmccormick/library-migration branch January 8, 2025 20:56
5 participants