2u/course optimizer #35887

Open
wants to merge 49 commits into
base: master

Conversation

rayzhou-bit
Contributor

@rayzhou-bit rayzhou-bit commented Nov 20, 2024

Description

This PR creates the backend for the Course Optimizer Link Checker, which scans a published course for broken links. This functionality imitates what is currently in place for export. Two APIs are created here:

link_check POST

  • Queues a task to start the link check process through the celery queue.
  • Results of the broken links scan are stored as a list of tuples: [block_id, broken_link]
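
A minimal sketch of what such a stored result could look like, assuming the pairs are serialized as JSON (the serialization format is an assumption; the block ID and links are borrowed from the example response further down). One consequence worth noting is that tuples come back as two-element lists after a JSON round trip:

```python
import json

# Illustrative (block_id, broken_link) pairs; values are taken from the
# example response in this PR description, not from a real scan.
BLOCK_ID = (
    "block-v1:edX+DemoX+Demo_Course+type@html"
    "+block@030e35c4756a4ddc8d40b95fbbfff4d4"
)
broken_links = [
    (BLOCK_ID, "/definitely.does.notwork"),
    (BLOCK_ID, "https://testing123.whatever"),
]

# Assumption: the artifact file holds JSON. JSON has no tuple type,
# so each pair is restored as a two-element list.
serialized = json.dumps(broken_links)
restored = json.loads(serialized)
```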

link_check_status GET

  • Returns the status of the link check process.
  • Returns results of link_check if process is successful.
  • The result Data Transfer Object returns broken links along with relevant ancestor data for the block in which they are found.
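
A client would typically call link_check_status repeatedly until the scan finishes. The following is a rough sketch of such a polling loop, not code from this PR. The helper name poll_link_check and the injected fetch_status callable are hypothetical, and only the "Succeeded" status appears in this description; the other status names are assumptions.

```python
import time
from typing import Callable

# Only "Succeeded" is shown in this PR description; "Failed" and
# "Canceled" are assumed terminal status names for illustration.
TERMINAL_STATUSES = {"Succeeded", "Failed", "Canceled"}


def poll_link_check(
    fetch_status: Callable[[], dict],
    interval_seconds: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status until a terminal status is seen or attempts run out."""
    for attempt in range(max_attempts):
        payload = fetch_status()
        if payload.get("LinkCheckStatus") in TERMINAL_STATUSES:
            return payload
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError("link check did not finish within the polling budget")


# Usage with a stand-in for the real HTTP GET to link_check_status.
# "In Progress" is an assumed non-terminal status name.
responses = iter([
    {"LinkCheckStatus": "In Progress"},
    {"LinkCheckStatus": "Succeeded", "LinkCheckOutput": {"sections": []}},
])
result = poll_link_check(lambda: next(responses), sleep=lambda _: None)
```

Passing the fetch function in keeps the loop independent of any particular HTTP client or authentication setup.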

Technical considerations:

  • The results of the link check scan are currently saved as a UserTaskArtifact file. While this is the simplest approach to implement, arguments can be made for saving this data in tables instead.
  • Benefits of using a UserTaskArtifact file: easy implementation, as this mimics the current export functionality.
  • Benefits of using a database table: a good foundation for accessing thinner slices of broken-link data. While not needed for the functionality currently being developed, it could be useful for future updates. For example, authors could be notified of the broken links in a quiz a couple of days before learners take the quiz. It would also be easier to analyze the data, such as finding the average number of broken links per course.
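
To make the table option concrete, here is a small sketch using an in-memory SQLite database. The broken_link table, its columns, and the sample rows are all hypothetical and are not part of this PR; the point is only to show the kind of "thin slice" and aggregate queries a table would allow.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per broken link found in a block.
conn.execute("CREATE TABLE broken_link (course_id TEXT, block_id TEXT, url TEXT)")
conn.executemany(
    "INSERT INTO broken_link VALUES (?, ?, ?)",
    [
        ("course-A", "block-1", "https://testing123.whatever"),
        ("course-A", "block-2", "/definitely.does.notwork"),
        ("course-A", "block-2", "google.com"),
        ("course-B", "block-9", "google.com"),
    ],
)

# Thin slice: the broken links of a single block.
block_links = [
    row[0]
    for row in conn.execute(
        "SELECT url FROM broken_link WHERE block_id = ? ORDER BY url",
        ("block-2",),
    )
]

# Aggregate: average number of broken links per course. Courses with no
# broken links have no rows here, so they are not part of this average.
(average_per_course,) = conn.execute(
    "SELECT AVG(n) FROM (SELECT COUNT(*) AS n FROM broken_link GROUP BY course_id)"
).fetchone()
```

With a single artifact file per scan, both queries would require loading and parsing every course's file first.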

Supporting information

https://2u-internal.atlassian.net/browse/TNL-11782

Testing instructions with frontend PR

  1. Make sure you have the frontend code from frontend-app-authoring: Feat course optimizer page frontend-app-authoring#1533
  2. In devstack, run make dev.up.large-and-slow.
  3. In frontend-app-authoring, run npm start.
  4. Enable the waffle flag contentstore.enable_course_optimizer.
  5. Navigate to the Course Optimizer page by going to the Tools dropdown menu and selecting the Optimize Course option.
  6. Click Start Scanning to run a scan of your course. Any broken links will be displayed in the Broken Links Scan section when the scan completes.
  7. Navigate directly to the blocks with the broken links through the links.
  8. Update your course with new broken links and publish it. Scan again to see these new entries in the Broken Links Scan section.

Testing instructions without frontend PR

The following example is for the demo course course-v1:edX+DemoX+Demo_Course.

  1. Find and copy the curl command for an export call in your local environment.
curl 'http://localhost:18010/export/course-v1:edX+DemoX+Demo_Course' \
  -X 'POST' \
  -H 'Accept: application/json, text/javascript, */*; q=0.01' \
  -H 'Accept-Language: en-US,en;q=0.9' \
  -H 'Cache-Control: no-cache' \
  -H 'Connection: keep-alive' \
  -H 'Content-Length: 0' \
  ...
  2. Replace export with link_check.
  3. Make this call in the terminal.
  4. This should return the following if successful.
{
  "LinkCheckStatus": 1
}
  5. Access http://localhost:18010/link_check_status/course-v1:edX+DemoX+Demo_Course in your browser. You should see the results of the link check scan.
{
  "LinkCheckStatus": "Succeeded",
  "LinkCheckCreatedAt": "2025-01-14T19:36:53.178488Z",
  "LinkCheckOutput": {
    "sections": [
      {
        "id": "d8a6192ade314473a78242dfeedfbf5b",
        "displayName": "Introduction",
        "subsections": [
          {
            "id": "edx_introduction",
            "displayName": "Demo Course Overview",
            "units": [
              {
                "id": "vertical_0270f6de40fc",
                "displayName": "Introduction: Video and Sequences",
                "blocks": [
                  {
                    "id": "030e35c4756a4ddc8d40b95fbbfff4d4",
                    "displayName": "Blank HTML Page",
                    "url": "/course/course-v1:edX+DemoX+Demo_Course/editor/html/block-v1:edX+DemoX+Demo_Course+type@html+block@030e35c4756a4ddc8d40b95fbbfff4d4",
                    "brokenLinks": [
                      "/definitely.does.notwork",
                      "/block-v1:edX+DemoX+Demo_Course+type@vertical+block@2152d4a4aadc4cb0af5256394a3d1fc7",
                      "https://testing123.whatever",
                      "google.com",
                      "/container/block-v1:edX+DemoX+Demo_Course+type@vertical+block@2152d4a4aadc4cb0af5256394a3d1fc7",
                      "block-v1:edX+DemoX+Demo_Course+type@vertical+block@2152d4a4aadc4cb0af5256394a3d1fc7"
                    ],
                    "lockedLinks": [
                      "/assets/courseware/v1/506da5d6f866e8f0be44c5df8b6e6b2a/asset-v1:edX+DemoX+Demo_Course+type@asset+block/getting-started_x250.png",
                      "/assets/courseware/v1/506da5d6f866e8f0be44c5df8b6e6b2a/asset-v1:edX+DemoX+Demo_Course+type@asset+block/getting-started_x250.png"
                    ...
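
The nested response above (sections, subsections, units, blocks) can be flattened into one row per broken link. The following is a sketch only: the helper name iter_broken_links is hypothetical, and the sample is a trimmed copy of the response shown above rather than real scan output.

```python
from typing import Iterator


def iter_broken_links(output: dict) -> Iterator[tuple[str, str, str]]:
    """Yield (ancestor path, block URL, broken link) for each broken link."""
    for section in output.get("sections", []):
        for subsection in section.get("subsections", []):
            for unit in subsection.get("units", []):
                for block in unit.get("blocks", []):
                    path = " > ".join(
                        node["displayName"]
                        for node in (section, subsection, unit, block)
                    )
                    # lockedLinks are reported separately and are skipped here.
                    for link in block.get("brokenLinks", []):
                        yield path, block.get("url", ""), link


# Trimmed sample shaped like the LinkCheckOutput shown above.
sample_output = {
    "sections": [{
        "displayName": "Introduction",
        "subsections": [{
            "displayName": "Demo Course Overview",
            "units": [{
                "displayName": "Introduction: Video and Sequences",
                "blocks": [{
                    "displayName": "Blank HTML Page",
                    "url": "/course/example/editor/html/example-block",
                    "brokenLinks": ["/definitely.does.notwork", "google.com"],
                    "lockedLinks": ["/assets/example.png"],
                }],
            }],
        }],
    }],
}

rows = list(iter_broken_links(sample_output))
```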

Other information

@rayzhou-bit rayzhou-bit marked this pull request as draft November 20, 2024 00:23
@rayzhou-bit
Contributor Author

@bszabo I updated a lot of the organization in tasks.py. I agree with you on the iffy code practices (using max / min / integer for status), but this is currently how UserTaskStatus is used and I feel it's better to follow it for now.

@rayzhou-bit rayzhou-bit requested a review from bszabo November 21, 2024 18:35
@bszabo
Copy link
Contributor

bszabo commented Nov 21, 2024

Thanks for the editorial changes, Ray. I'm stepping away from this review with the expectation that Jesper will give it a lookover from a functional perspective. If it's possible to attend to the funky status definition before moving on to new things, I would strongly recommend that, even if it ends up being in a different PR.

@bszabo
Contributor

bszabo commented Nov 21, 2024

If you take a step back, and look at this search for broken links as a first installment towards course optimization, you can see that course optimization will entail a sequence of activities being carried out, with each intended to potentially improve a course. Viewed that way, the natural questions to ask will be "which activity is currently being worked on?" and "what is the status for activity X?". For the latter question the natural answers will be not started, in progress, succeeded, or failed with error message Y.

It seems to me that it would make sense to organize even this first installment somewhat along those lines. The solution you borrowed from import/export is conflating concepts in a way I don't think is good.
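
One possible reading of this suggestion, as a rough sketch: each activity carries an explicit status and an optional error message, rather than an integer. All names here (ActivityStatus, ActivityState, current_activity) are hypothetical and do not come from the PR or from UserTaskStatus.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ActivityStatus(Enum):
    NOT_STARTED = "not started"
    IN_PROGRESS = "in progress"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


@dataclass
class ActivityState:
    """Status of one optimization activity, e.g. the broken link scan."""
    activity: str
    status: ActivityStatus = ActivityStatus.NOT_STARTED
    error_message: Optional[str] = None

    def fail(self, message: str) -> None:
        """Mark the activity as failed and record why."""
        self.status = ActivityStatus.FAILED
        self.error_message = message


def current_activity(states: list[ActivityState]) -> Optional[str]:
    """Answer "which activity is currently being worked on?"."""
    for state in states:
        if state.status is ActivityStatus.IN_PROGRESS:
            return state.activity
    return None


# Usage: a link check in progress, then failing with a message.
link_check = ActivityState("link_check", ActivityStatus.IN_PROGRESS)
in_progress_name = current_activity([link_check])
link_check.fail("scan timed out")
```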

@rayzhou-bit rayzhou-bit marked this pull request as ready for review February 3, 2025 21:58
4 participants