fix: add courserun date validation #3400

Open
wants to merge 12 commits into
base: master
from

Anas12091101
Contributor

@Anas12091101 Anas12091101 commented Feb 11, 2025

What are the relevant tickets?

https://github.com/mitodl/hq/issues/5823

Description (What does it do?)

This PR adds the following date validation at course run creation:

  • Either start_date or enrollment_end should be present.
  • Either start_date or enrollment_end must be in the future.
  • End date must be later than start date.

Note: This PR also updates the tests that started failing because of the validation changes.
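
The three rules listed in the description can be sketched in plain Python as follows. This is a minimal illustration, not the PR's code: the function name `sketch_courserun_date_errors`, the injectable `now` argument, and the treatment of an end date equal to the start date as invalid are all assumptions made so the example runs without Django.

```python
from datetime import datetime, timedelta, timezone


def sketch_courserun_date_errors(start_date, end_date, enrollment_end, now=None):
    """Return an error message for the first failing rule, or None if all pass.

    Hypothetical helper for illustration; not the function added by the PR.
    """
    now = now or datetime.now(timezone.utc)

    # Rule 1: at least one of the two dates must be present.
    if not start_date and not enrollment_end:
        return "Either start_date or enrollment_end must be provided."

    # Rule 2: at least one of the dates that are present must not be in the past.
    start_ok = bool(start_date) and start_date >= now
    enrollment_ok = bool(enrollment_end) and enrollment_end >= now
    if not (start_ok or enrollment_ok):
        return "Either start_date or enrollment_end must be in the future."

    # Rule 3 (assumed strict): when both are set, end must come after start.
    if start_date and end_date and end_date <= start_date:
        return "End date must be later than start date."

    return None


now = datetime(2025, 2, 11, tzinfo=timezone.utc)
future = now + timedelta(days=30)
past = now - timedelta(days=30)

print(sketch_courserun_date_errors(None, None, None, now=now))
print(sketch_courserun_date_errors(past, None, None, now=now))
print(sketch_courserun_date_errors(future, now, None, now=now))
print(sketch_courserun_date_errors(None, None, future, now=now))
```

Passing `now` explicitly keeps the checks deterministic, which is convenient when writing tests around date boundaries.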

How can this be tested?

  • Creating a Course Run via Django Admin:

    • Try creating a course run without any dates → You will get this error:
      Either start_date or enrollment_end must be provided.
    • Add a start_date or enrollment_end in the past → You will get this error:
      Either start_date or enrollment_end must be in the future.
    • Set the start_date to a future date and end_date before the start_date → You will get this error:
      End date must be later than start date.
    • Set the start_date to a future date and end_date after start_date → The course run will be created successfully.
    • Open the newly created course run, remove the start_date and set the enrollment_end to a future date → The course run will be updated successfully.
  • For external courses:

    • The API provides course run data, some of which may have a past start date.
    • Run the task_sync_external_course_runs from shell. Now, two scenarios can occur:
      • Course Run already exists in the system:
        • If the data from the API matches the existing data, no update is needed.
        • If the data from the API differs, the task will update it.
          - If the API data meets the validation rules, the Course Run will be updated.
          - Otherwise, the following error will be logged:
          Error updating course run for course: course-v1:xPRO+MCPO, course run code: MXP-MCPO-24-12#1, error: e
      • Course Run doesn't exist in the system:
        • If the data from the API fails validation, the following error will be logged:
          Error creating course run for course: course-v1:xPRO+MCPO, course run code: MXP-MCPO-24-12#1, error: e
    • Verify that the task doesn't break in any of these cases.
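
The create/update scenarios above can be sketched as a loop that logs a validation failure and moves on instead of aborting. This is only an illustration under stated assumptions: `ValidationError` is a local stand-in for Django's exception, and `sync_course_runs`, `require_start_date`, and the dict-based data shapes are hypothetical, not the task's real code.

```python
import logging

log = logging.getLogger(__name__)


class ValidationError(Exception):
    """Stand-in for django.core.exceptions.ValidationError (assumption)."""


def sync_course_runs(incoming_runs, existing_runs, validate):
    """Create or update runs, logging and skipping any that fail validation.

    `incoming_runs` and `existing_runs` map a course run code to a dict of
    dates; `validate` raises ValidationError for bad data. All hypothetical.
    """
    stats = {"created": set(), "updated": set(), "failed": set()}
    for code, data in incoming_runs.items():
        if existing_runs.get(code) == data:
            continue  # API data matches what is stored: nothing to update.
        action = "updating" if code in existing_runs else "creating"
        try:
            validate(data)
        except ValidationError as e:
            # Log and move on so one bad run does not break the whole task.
            log.error("Error %s course run, course run code: %s, error: %s", action, code, e)
            stats["failed"].add(code)
            continue
        existing_runs[code] = data
        stats["updated" if action == "updating" else "created"].add(code)
    return stats


def require_start_date(data):
    # Deliberately tiny validator used only for this demonstration.
    if not data.get("start_date"):
        raise ValidationError("Either start_date or enrollment_end must be provided.")


stored = {"RUN-1": {"start_date": "2025-03-01"}}
incoming = {
    "RUN-1": {"start_date": "2025-04-01"},
    "RUN-2": {"start_date": None},
    "RUN-3": {"start_date": "2025-05-01"},
}
result = sync_course_runs(incoming, stored, require_start_date)
```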

@Anas12091101 Anas12091101 force-pushed the anas/add-courserun-date-validation branch from 29f8bbf to 90185d0 Compare February 13, 2025 11:32
@Anas12091101 Anas12091101 force-pushed the anas/add-courserun-date-validation branch from ab801c5 to 7e0b918 Compare February 13, 2025 12:46
@arslanashraf7 arslanashraf7 changed the title from "Anas/add courserun date validation" to "fix: add courserun date validation" Feb 14, 2025
    not self.enrollment_end or self.enrollment_end < now
):
    raise ValidationError(
        "Either start_date or enrollment_end must be in the future."

Suggested change
- "Either start_date or enrollment_end must be in the future."
+ "Either start date or enrollment end must be in the future."


if not self.start_date and not self.enrollment_end:
    raise ValidationError(
        "Either start_date or enrollment_end must be provided."

Suggested change
- "Either start_date or enrollment_end must be provided."
+ "Either start date or enrollment end must be provided."

Comment on lines 380 to 402
if course_run:
if course_run_created:
stats["course_runs_created"].add(course_run.external_course_run_id)
log.info(
f"Created Course Run, title: {external_course.course_title}, external_course_run_id: {course_run.external_course_run_id}" # noqa: G004
)
elif course_run_updated:
stats["course_runs_updated"].add(course_run.external_course_run_id)
log.info(
f"Updated Course Run, title: {external_course.course_title}, external_course_run_id: {course_run.external_course_run_id}" # noqa: G004
)

if course_run_created:
stats["course_runs_created"].add(course_run.external_course_run_id)
log.info(
f"Created Course Run, title: {external_course.course_title}, external_course_run_id: {course_run.external_course_run_id}" # noqa: G004
)
elif course_run_updated:
stats["course_runs_updated"].add(course_run.external_course_run_id)
log.info(
f"Updated Course Run, title: {external_course.course_title}, external_course_run_id: {course_run.external_course_run_id}" # noqa: G004
f"Creating or Updating Product and Product Version, course run courseware_id: {course_run.external_course_run_id}, Price: {external_course.price}" # noqa: G004
)

log.info(
f"Creating or Updating Product and Product Version, course run courseware_id: {course_run.external_course_run_id}, Price: {external_course.price}" # noqa: G004
)

if external_course.price:
product_created, product_version_created = (
create_or_update_product_and_product_version(
external_course, course_run
if external_course.price:
product_created, product_version_created = (
create_or_update_product_and_product_version(
external_course, course_run
)
)
)
if product_created:
stats["products_created"].add(course_run.external_course_run_id)
if product_created:

nit: maybe we can move this into separate functions to reduce the nesting within a single function.
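
One way the nested created/updated bookkeeping could be pulled into a helper is sketched below. The helper name `record_course_run_result` and its signature are hypothetical; only the `stats` keys and log wording are taken from the diff above.

```python
import logging
from collections import defaultdict

log = logging.getLogger(__name__)


def record_course_run_result(stats, run_id, title, *, created, updated):
    """Record a created/updated course run in `stats` and log it.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if created:
        key, verb = "course_runs_created", "Created"
    elif updated:
        key, verb = "course_runs_updated", "Updated"
    else:
        return  # Nothing changed, so there is nothing to record.
    stats[key].add(run_id)
    log.info("%s Course Run, title: %s, external_course_run_id: %s", verb, title, run_id)


stats = defaultdict(set)
record_course_run_result(stats, "RUN-1", "Demo", created=True, updated=False)
record_course_run_result(stats, "RUN-2", "Demo", created=False, updated=True)
record_course_run_result(stats, "RUN-3", "Demo", created=False, updated=False)
```

With the early return, the caller no longer needs an `if`/`elif` block of its own, which is where most of the nesting came from.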

@asadali145 asadali145 left a comment

@arslanashraf7 We are not validating the enrollment start date as it is not mentioned in the issue. With the current implementation, I can create a course run with enrollment start greater than enrollment end (both in future).
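
For illustration, the missing check described here could look like the sketch below. This PR does not add it; the function name `enrollment_window_error` and the error wording are assumptions.

```python
from datetime import datetime, timedelta, timezone


def enrollment_window_error(enrollment_start, enrollment_end):
    """Return an error message if the enrollment window is inverted, else None.

    Hypothetical check; not part of this PR.
    """
    if enrollment_start and enrollment_end and enrollment_start > enrollment_end:
        return "Enrollment start must not be later than enrollment end."
    return None


base = datetime(2025, 6, 1, tzinfo=timezone.utc)
inverted = enrollment_window_error(base + timedelta(days=10), base)
ordered = enrollment_window_error(base, base + timedelta(days=10))
open_ended = enrollment_window_error(base, None)
```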

Comment on lines 685 to 698
    course_run = CourseRun.objects.create(
        external_course_run_id=external_course.course_run_code,
        course=course,
        title=external_course.course_title,
        courseware_id=course_run_courseware_id,
        run_tag=external_course.course_run_tag,
        start_date=external_course.start_date,
        end_date=external_course.end_date,
        enrollment_end=external_course.enrollment_end,
        live=True,
    )
    is_created = True
except ValidationError as e:
    log.error(
        f"Error creating course run for course: {course.readable_id}, course run code: {external_course.course_run_code}, error: {e}"  # noqa: G004
    )

Instead of trying to create a course run with bad data, can we validate the dates like we are validating the end date in update_external_course_runs? We won't need these extra changes and it would be much simpler. What are your thoughts?
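
A rough sketch of the validate-first approach suggested here: check the dates before any write and skip runs that fail, so no exception handling is needed around creation. `split_valid_runs`, `presence_errors`, and the dict-based run records are hypothetical stand-ins, not the project's code.

```python
def presence_errors(start_date, end_date, enrollment_end):
    """Tiny stand-in validator (assumption): only checks that a date is present."""
    if not (start_date or enrollment_end):
        return "Either start_date or enrollment_end must be provided."
    return None


def split_valid_runs(external_runs, get_date_errors):
    """Partition runs into (valid, skipped) before any database write."""
    valid, skipped = [], []
    for run in external_runs:
        error = get_date_errors(run["start_date"], run["end_date"], run["enrollment_end"])
        if error:
            # Remember why the run was skipped so it can be reported later.
            skipped.append((run["course_run_code"], error))
        else:
            valid.append(run)
    return valid, skipped


runs = [
    {"course_run_code": "RUN-1", "start_date": "2025-03-01", "end_date": None, "enrollment_end": None},
    {"course_run_code": "RUN-2", "start_date": None, "end_date": None, "enrollment_end": None},
]
valid, skipped = split_valid_runs(runs, presence_errors)
```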

@@ -757,14 +757,28 @@ def clean(self):
1. Later than end_date if end_date is set
2. Later than start_date if start_date is set

We should update docs as per current logic.

@arslanashraf7
Contributor

> @arslanashraf7 We are not validating the enrollment start date as it is not mentioned in the issue. With the current implementation, I can create a course run with enrollment start greater than enrollment end (both in future).

When the ticket was created, the main goal was to keep the catalog consistent. The catalog's course criteria take the start date and enrollment end date into account when deciding whether a course should be visible in the catalog.

I do remember discussing this date validation problem somewhere, but I don't remember where, or whether it was verbal or written. I would defer to @cachob and @pdpinch to make a decision on this.

@Anas12091101 Anas12091101 force-pushed the anas/add-courserun-date-validation branch 2 times, most recently from 225498f to 953ba0b Compare February 21, 2025 11:19
@Anas12091101 Anas12091101 force-pushed the anas/add-courserun-date-validation branch from fe39cb1 to 01585ab Compare February 21, 2025 11:24
@Anas12091101 Anas12091101 force-pushed the anas/add-courserun-date-validation branch from a8b09f1 to e8adab1 Compare February 21, 2025 13:11
@pdpinch
Member

pdpinch commented Feb 21, 2025

Can you help me understand this from a systems level? Do we have a history of bad data from the external partners? I understand the validation will raise an error. Will those errors manifest in Sentry? Do we know who will watch out for them and take action?

@Anas12091101 Anas12091101 force-pushed the anas/add-courserun-date-validation branch from 9c1e654 to c7edd99 Compare February 21, 2025 16:00
@Anas12091101 Anas12091101 force-pushed the anas/add-courserun-date-validation branch from 52be026 to 935a91d Compare February 21, 2025 16:12
@Anas12091101
Contributor Author

@pdpinch This PR mainly adds validations for course runs created manually via Django admin. Data from external vendors is usually correct, but we’ve added extra date checks before creating an external course run to prevent validation-related task failures.

Because of this, no validation errors will be raised or logged in Sentry when the task runs. Even if the API sends bad data, the task will just skip those course runs. Also, validation errors from the clean() method won’t show up in Sentry, since Django already handles them and displays them on the form.

@pdpinch
Member

pdpinch commented Feb 21, 2025

That's fine for the django admin.

Silent failures from the external partners' APIs might lead to confusion, but I'm not sure. @cachob may be able to comment, although it's probably out of scope for this PR.

@marslanabdulrauf marslanabdulrauf left a comment

Now that we've added validation to prevent courses from being created with invalid dates, we should update our test cases accordingly.

Bypassing validation to create courses without these dates no longer makes sense.
@arslanashraf7 any thoughts?

Comment on lines +680 to +683
elif (not start_date or start_date < now) and (
    not enrollment_end or enrollment_end < now
):

Suggested change
- elif (not start_date or start_date < now) and (
-     not enrollment_end or enrollment_end < now
- ):
+ elif (start_date and start_date < now) or (enrollment_end and enrollment_end < now):
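
A quick way to compare the original and suggested conditions is to evaluate both on a few inputs. The sketch below (function names and the fixed `now` are assumptions for illustration) shows that they are not logically equivalent: the original flags a run only when neither date is in the future, while the suggestion flags a run as soon as any provided date is in the past.

```python
from datetime import datetime, timedelta, timezone

now = datetime(2025, 2, 21, tzinfo=timezone.utc)
past, future = now - timedelta(days=1), now + timedelta(days=1)


def original_condition(start_date, enrollment_end):
    # True when neither date is a future date.
    return (not start_date or start_date < now) and (
        not enrollment_end or enrollment_end < now
    )


def suggested_condition(start_date, enrollment_end):
    # True when any date that is present lies in the past.
    return bool(
        (start_date and start_date < now) or (enrollment_end and enrollment_end < now)
    )


cases = [(future, past), (past, future), (future, None), (past, past)]
results = [
    (original_condition(s, e), suggested_condition(s, e)) for s, e in cases
]
for row in results:
    print(row)
```

The first two cases (one future date, one past date) are where the two conditions disagree, which matters because the stated rule only requires one of the dates to be in the future.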

mitxpro/utils.py Outdated
now = now_in_utc()
error_msg = None

if not start_date and not enrollment_end:

It's equivalent to:

Suggested change
- if not start_date and not enrollment_end:
+ if not (start_date or enrollment_end):
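
The equivalence follows from De Morgan's law. A small exhaustive check over truthy and falsy stand-in values (the helper name `forms_agree` and the sample values are assumptions) confirms that both spellings give the same result:

```python
from itertools import product


def forms_agree():
    """Check both spellings on every truthy/falsy combination of the two dates."""
    values = [None, "2025-03-01"]  # falsy and truthy stand-ins for a date field
    return all(
        (not start_date and not enrollment_end) == (not (start_date or enrollment_end))
        for start_date, enrollment_end in product(values, repeat=2)
    )


print(forms_agree())
```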

Comment on lines 347 to 351
if not (
    external_course.validate_end_date()
    and not bool(
        get_courserun_date_errors(
            external_course.start_date,
            external_course.end_date,
            external_course.enrollment_end,
        )
    )
):

Suggested change
- if not (
-     external_course.validate_end_date()
-     and not bool(
-         get_courserun_date_errors(
-             external_course.start_date,
-             external_course.end_date,
-             external_course.enrollment_end,
-         )
-     )
- ):
+ if not external_course.validate_end_date() or get_courserun_date_errors(
+     external_course.start_date,
+     external_course.end_date,
+     external_course.enrollment_end,
+ ):

@Anas12091101
Contributor Author

> Bypassing validation to create courses without these dates no longer makes sense.

Sometimes we need to test scenarios where a course run has expired (has past dates). For this, I have added the clean_disabled attribute, which lets us create a course run without running the clean() function.
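
As a rough picture of how such a flag might work, here is a plain-Python sketch. It assumes that `save()` calls `clean()` unless `clean_disabled` is set; the class `CourseRunSketch` and the local `ValidationError` are stand-ins for illustration, and the real model may wire this differently.

```python
class ValidationError(Exception):
    """Stand-in for django.core.exceptions.ValidationError (assumption)."""


class CourseRunSketch:
    """Plain-Python stand-in for a model whose save() runs clean()."""

    def __init__(self, start_date=None, enrollment_end=None, clean_disabled=False):
        self.start_date = start_date
        self.enrollment_end = enrollment_end
        # Assumed semantics: when True, save() skips validation.
        self.clean_disabled = clean_disabled
        self.saved = False

    def clean(self):
        if not (self.start_date or self.enrollment_end):
            raise ValidationError("Either start_date or enrollment_end must be provided.")

    def save(self):
        if not self.clean_disabled:
            self.clean()
        self.saved = True


# A test fixture can opt out of validation to build an otherwise invalid run.
expired_run = CourseRunSketch(clean_disabled=True)
expired_run.save()
```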

mitxpro/utils.py Outdated
@@ -646,3 +646,52 @@ def strip_datetime(date_str, date_format, date_timezone=None):

date_timezone = date_timezone if date_timezone else datetime.UTC
return datetime.datetime.strptime(date_str, date_format).astimezone(date_timezone)


def get_courserun_date_errors(

A better name would be validate_courserun_dates. It can return a tuple of (True/False, message).
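
A minimal sketch of that shape, assuming a simplified signature: the reviewer's proposed name is used, but the parameter list (no `end_date`, explicit `now`) and the empty-string message on success are assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone


def validate_courserun_dates(start_date, enrollment_end, now):
    """Return (is_valid, message); message is empty when the dates are valid.

    Simplified illustration of the tuple-returning shape, not the PR's code.
    """
    if not (start_date or enrollment_end):
        return False, "Either start_date or enrollment_end must be provided."
    if (not start_date or start_date < now) and (
        not enrollment_end or enrollment_end < now
    ):
        return False, "Either start_date or enrollment_end must be in the future."
    return True, ""


now = datetime(2025, 2, 21, tzinfo=timezone.utc)
is_valid, message = validate_courserun_dates(now - timedelta(days=1), None, now)
if not is_valid:
    print(message)
```

Returning a tuple lets callers branch on the boolean while still having the message available for logging or for raising a form error.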

@@ -339,11 +344,20 @@ def update_external_course_runs(external_courses, keymap): # noqa: C901, PLR091
     stats["course_runs_skipped"].add(external_course.course_run_code)
     continue

- if not external_course.validate_end_date():
+ if not (
+     external_course.validate_end_date()

We are validating this in get_courserun_date_errors. Do we need this now?

Contributor Author

> We are validating this in get_courserun_date_errors. Do we need this now?

I think we are not validating end_date in get_courserun_date_errors. Should we add this validation there as well? Initially, the ticket asked for validations on the start date and enrollment_end only, which is why I didn't add the end date validations in get_courserun_date_errors.


if self.end_date and self.expiration_date < self.end_date:
    raise ValidationError("Expiration date must be later than end date.")  # noqa: EM101
errorMsg = get_courserun_date_errors(

Suggested change
- errorMsg = get_courserun_date_errors(
+ error_msg = get_courserun_date_errors(

@cachob

cachob commented Feb 26, 2025

@pdpinch / @Anas12091101 - I believe the checks we have in place for external courses should suffice. I don't see any urgent reason why these should take up Sentry overhead. If there is ever bad data in these endpoints, the vendors should clean it up before sending it over, as a best practice. I agree that this is out of scope, and it could be dealt with by the reporting tools we're looking to develop for our vendor APIs.

@Anas12091101 Anas12091101 force-pushed the anas/add-courserun-date-validation branch 2 times, most recently from 1ac7031 to e270dbf Compare February 27, 2025 13:07
@Anas12091101 Anas12091101 force-pushed the anas/add-courserun-date-validation branch from dcf3f2a to 0bd3168 Compare February 27, 2025 14:41
7 participants