-
Notifications
You must be signed in to change notification settings - Fork 72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Wyscout V3] Add substitutions to the event stream #368
base: master
Are you sure you want to change the base?
[Wyscout V3] Add substitutions to the event stream #368
Conversation
# Conflicts: # kloppy/infra/serializers/event/wyscout/deserializer_v3.py # kloppy/tests/test_wyscout.py
@probberechts does this look like a clean implementation? |
# Step 2: Sort substitution events globally by period and timestamp | ||
substitution_events.sort(key=lambda e: (e.period.id, e.timestamp)) | ||
|
||
# Step 3: Merge events and substitutions in ascending order | ||
merged_events = [] | ||
sub_index = 0 | ||
total_subs = len(substitution_events) | ||
|
||
for event in events: | ||
# Insert all substitution events that occur before or at the current event's timestamp | ||
while sub_index < total_subs: | ||
sub_event = substitution_events[sub_index] | ||
if sub_event.period.id < event.period.id or ( | ||
sub_event.period.id == event.period.id | ||
and sub_event.timestamp <= event.timestamp | ||
): | ||
merged_events.append(sub_event) | ||
sub_index += 1 | ||
else: | ||
break | ||
|
||
# Add the current event to the merged list | ||
merged_events.append(event) | ||
|
||
# Step 4: Add any remaining substitution events | ||
while sub_index < total_subs: | ||
merged_events.append(substitution_events[sub_index]) | ||
sub_index += 1 | ||
|
||
return merged_events |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can be write much simpler using
import bisect
def insert(event, sorted_events):
pos = bisect.bisect_left([e.time for e in sorted_events], event.time)
events.insert(pos, event)
for sub_event in substitution_events:
insert(sub_event, events)
I am also not entirely sure whether sorting only on time is sufficient. I suppose it can happen that a regular event (e.g., a ball out or throw-in) and a substitution happen in the same second. In that case, the substitution should become before events that restart the game and after events that stop the game.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the help. I've refactored the code to use your more simple design.
In the test file that we use, I did not find any regular events that occur in the same second as the substitutions. But I'm indeed not sure, whether this will be generally true.
Do you think we should already start with merging this implementation of adding substitution events for the Wyscout V3 parser, and later in a separate issue/PR work on better handling the edge case of when a substitution event and a regular event happen on the same second?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@probberechts do you think we can already merge this PR? And in a separate issue/PR work on better handling the edge case of when a substitution event and a regular event happen on the same second (if this even ever occurs)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@probberechts what do you think? :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@DriesDeprest See #415
# Conflicts: # kloppy/infra/serializers/event/wyscout/deserializer_v3.py
Discussed in: #367