Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

i#7157 dyn inject: Improve ord tracking with synthetic records #7299

Open
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

abhinav92003
Copy link
Contributor

@abhinav92003 abhinav92003 commented Feb 24, 2025

Fixes handling of some cases related to ordinal tracking when there are queued synthetic records.

get_input_record_ordinal(), which has existing logic to discount records that are read-ahead by the scheduler, was missing logic to correctly account for synthetic records in the queue. This caused a higher count to be subtracted from the
record ordinal returned by the input stream, particularly when the queued records include the (possibly many) injected switch records and a new input is switched to (in which case the guarding condition for the read-ahead adjustment was satisfied).

Adds tracking for whether each queued record is real, that is, read from the input reader (it could alternatively be synthetic). Also adds an assert to ensure we don't underflow in get_input_record_ordinal() which is triggered without this fix.

Fixes an issue with instrs_pre_read tracking which should be decremented only when a real record is passed on to the caller. There is likely an additional issue with premature decrement of instrs_pre_read; added a FIXME for now as it is beyond the scope of this PR.

Adds a scheduler unit_test that verifies the bookkeeping for the USE_INPUT_ORDINALS case, which was affected by the above issues.

The added scheduler unit_test also tests the on_context_switch() change made by #7302 (which removed the use of get_instruction_ordinal() to detect whether to insert the context switch sequence) for the USE_INPUT_ORDINALS case. Adds a FIXME by a use of get_instruction_ordinal() that should be fixed as the intended value there is the global one, which may be more complicated.

Issue: #7157

Fixes handling of some cases related to ordinal tracking when there are queued synthetic records.

get_input_record_ordinal, which has existing logic to discount records that are read-ahead by the scheduler, was missing
logic to correctly account for synthetic records in the queue. This caused a higher count to be subtracted from the
record ordinal returned by the input stream, particularly when a new input is switched to (in which case the guarding
condition for the read-ahead adjustment was satisfied).

Added tracking for whether each queued record is real, that is, read from the input reader or synthetic.

Fixes an issue with instrs_pre_read  tracking which should be decremented only when a real record is passed on to the caller.

Adds a test that verifies the bookkeeping for use_input_ordinals case, which was affected by the above issue.

Issue: #7157
@abhinav92003 abhinav92003 changed the title i#7157 dyn inject: Improve ord tracking for synthetic records i#7157 dyn inject: Improve ord tracking with synthetic records Feb 24, 2025
abhinav92003 added a commit that referenced this pull request Feb 25, 2025
Adds the count of instances where the context switch sequence was injected to the schedule_stats tool output.

Modifies the switch_insertion test to use schedule_stats instead of basic_counts. This helps reveal a bug in context switch trace template insertion that causes the injection to happen many more times than the context switch events.

This diff is split out from #7299 which fixes the above bug.

Issue: #7157
abhinav92003 added a commit that referenced this pull request Feb 25, 2025
Fixes a bug in the dynamic injection logic in the drmemtrace scheduler
that caused the context switch sequence to be injected more times than
actual context switches. It was erroneously injected also when
set_cur_input() was invoked without any change to the input.

update_switch_stats() is a better control point for switch sequence
injection, which is renamed to on_context_switch() here. Now, in the
switch_insertion offline test, we have 11 context switch injections (==
input-to-input + idle-to-input switches), instead of the prior 363.

We deliberately do not inject the context switch sequence on
input-to-idle transitions. Note that we already inject on idle-to-input
transitions, and it is not clear yet whether we want to repeat it at
input-to-idle also or maybe inject some different sequence.

The above move also removes the use of get_instruction_ordinal() to
determine whether switch records should be inserted. Existing logic
checked for get_instruction_ordinal() > 0 to avoid inserting switch
records before the very first input on an output. This does not work as
required when USE_INPUT_ORDINALS is set, as then the input-local
instruction ordinal instead (which is non-zero as it is not adjusted for
the read-ahead instructions) is returned, which ends up inserting the
switch sequence before the very first input also. A scheduler unit_test
that uses USE_INPUT_ORDINALS is coming up in #7299 with more fixes.

Issue: #7157
bool cur_from_queue;
bool is_cur_record_real;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add an initial value. Ditto for line 210 (I think it's set in the code before read so no current bug I don't think, but safest to have an initial value).

input.queue.push_front(pid);
input.queue.push_front(tid);
input.queue.emplace_front(pid, IS_SYNTHETIC);
input.queue.emplace_front(tid, IS_SYNTHETIC);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm so this doesn't need to update real_records_in_queue b/c these are synthetic.

Maybe a helper is needed so adding goes through central places so no addition overlooks updating real_records_in_queue? I'm afraid someone will see this code and add a real record to the queue here and not update the counter b/c the existing code doesn't update it. Or just adding a comment here saying we don't need to increment is enough?

@@ -1821,10 +1832,14 @@ scheduler_impl_tmpl_t<RecordType, ReaderType>::get_input_record_ordinal(
return 0;
uint64_t ord = inputs_[index].reader->get_record_ordinal();
if (get_instr_ordinal(inputs_[index]) == 0) {
uint64_t adjust =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems this should go after the comment on lines 1839-1841.

@@ -1932,8 +1950,9 @@ scheduler_impl_tmpl_t<RecordType, ReaderType>::advance_region_of_interest(
if (input.cur_region > 0) {
VPRINT(this, 3, "skip_instructions input=%d: inserting separator marker\n",
input.index);
input.queue.push_back(record);
input.queue.emplace_back(record, is_record_real);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this need to update the count if is_record_real? That helper is looking better now.

}
}
VPRINT(this, 5,
"next_record[%d]: candidate record from %d (@%" PRId64 "): ", output,
input->index, get_instr_ordinal(*input));
if (input->instrs_pre_read > 0 && record_type_is_instr(record))
// FIXME: This is likely too premature; we should either move it to later,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an issue #?

@@ -2727,6 +2763,9 @@ scheduler_impl_tmpl_t<RecordType, ReaderType>::update_next_record(output_ordinal
VPRINT(this, 2, "output %d base timestamp = %zu\n", output,
outputs_[output].base_timestamp);
}
// FIXME: When USE_INPUT_ORDINALS is enabled, this returns the input-local
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my head USE_INPUT_ORDINALS is never set for a dynamic schedule: but I actually don't see the docs or code disallowing that. I think we should disallow it. Please list that as the preferred solution; maybe that downgrades this to XXX as no known use case would get here.

@@ -5770,9 +5770,11 @@ test_unscheduled()
}

static void
test_kernel_switch_sequences()
test_kernel_switch_sequences(bool use_input_ordinals)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my other comment: I don't think we should bother trying to support USE_INPUT_ORDINALS with a dynamic schedule: no known use case wants that. I think we should just disallow it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dynamic schedule and USE_INPUT_ORDINALS seems orthogonal to each other. USE_INPUT_ORDINALS is supposed to affect only what the stream APIs return, not how the scheduler behaves. I don't see why we should disallow that combination.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no known use case wants that

Even if that is the case, the restriction is not intuitive. Are there other concerns in handling this combination?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This combination is adding complexity and causing extra work with no benefit: better to scope-limit than complicate the code for no benefit. USE_INPUT_ORDINALS was added solely for analyzer use.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the restriction is not intuitive.

I disagree: USE_INPUT_ORDINALS is all about ignoring the output streams and only looking at the inputs; which implies you don't care where the inputs are located -- you don't care about the schedule or whether they are interleaved.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

I guess I need to first find some other real case that would hit this need of tracking real/synthetic record count in the queue. I can think of some possible cases but they're harder to write a test for to confirm:

  • missing correction for record/instr ordinal when some real record was put back in the queue
  • premature decrement of instr_pre_read causing get_instr_record to return a smaller value

I'll go ahead with #7277 now (which is where I started looking into this issue), because it doesn't look like it is affected: syscall sequences are never injected without seeing an instr on an input, which means the record/instr ordinal over-adjustment gating logic (get_instr_ordinal() == 0) would never be invoked when syscall injected instrs are in the queue. They may be some tricky interaction with instr_pre_read (since syscall injected instrs would also cause it to be decremented) but maybe that's not an issue either because the instr after the syscall would not even be read from the input reader until all syscall injected instrs are done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants