Epoch issues #1544

Closed
1 task done
aggieNick02 opened this issue Mar 22, 2023 · 1 comment · Fixed by #1621

@aggieNick02
Contributor

Please acknowledge the following before creating a ticket

Description of the bug:
By default, fio log files use timestamps based on an epoch that is the beginning of the job. If you have multiple jobs, especially with some jobs waiting on other jobs before starting, this means you have different epochs and the times in the logs for each job cannot be compared/ordered.

Using log_alternate_epoch solves this problem, as the logs then all share a common epoch. Unfortunately, it introduces a new one: the start time of each job relative to this common epoch is not known, so the gap between a job starting and its first log entry is indeterminate (although typically small, 0-1 ms). (Less critically, log_alternate_epoch also increases log file size, because each log entry timestamp needs more characters.)

The shortcomings of each approach can be resolved by keeping the beginning-of-job epoch for log files but also recording the start time of each job against a common (alternate) epoch.

I put something like this together in this PR from a while back: https://github.com/axboe/fio/pull/1353/commits
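To make the idea concrete, here is a minimal Python sketch of how per-job logs with zero-based timestamps could be merged once each job's start time against a common epoch is known. It is not the code from the PR. The function name `merge_job_logs`, the dictionary layout, and the use of milliseconds for both the start times and the log timestamps are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical helper (not part of fio): merge per-job logs onto one timeline.
# Assumption: job start times and log timestamps are both in milliseconds.
def merge_job_logs(
    job_start_ms: Dict[str, int],
    logs: Dict[str, List[Tuple[int, int]]],
) -> List[Tuple[int, str, int]]:
    merged = []
    for job, entries in logs.items():
        start = job_start_ms[job]
        for t_ms, value in entries:
            # Absolute time = job start on the common epoch + zero-based log time.
            merged.append((start + t_ms, job, value))
    merged.sort()  # order all entries by absolute time
    return merged

# Made-up sample data: job2 starts 5 seconds after job1.
job_start_ms = {"job1": 1_000_000, "job2": 1_005_000}
logs = {
    "job1": [(0, 120), (2500, 130)],
    "job2": [(1, 90), (10, 95)],
}
timeline = merge_job_logs(job_start_ms, logs)
```

With this layout the log files stay small (timestamps start near zero), and the gap between a job's start and its first log entry is simply that entry's own timestamp.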

Back then I was more concerned about correlating log entries against other events on the same machine, and the PR was aimed at simplifying log files while still being able to know time against a machine-based epoch. It was nice to have but not super important (imho).

But as I've started to work with fio experiments with multiple jobs, having timekeeping shortcomings both with the normal epoch and log_alternate_epoch has led me to the same solution (the PR) to a different problem.

I was hoping the PR could be reconsidered or maybe some other solution brainstormed? I'm happy to implement a solution, whether it looks like the PR or something completely different.

Environment:
All

fio version:
All

Reproduction steps
A simple fio run with per-sample logging, each job waiting on the previous, illustrates the issues.
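The report does not include a job file. As one hypothetical setup, the Python sketch below generates a small fio job file in which each job waits on the previous one and latency samples are logged. The helper `build_job_file`, the job names, the log prefix `epoch_demo`, and the workload settings (`rw`, `size`) are assumptions chosen for illustration; `wait_for` and `write_lat_log` are existing fio options.

```python
# Hypothetical generator for a reproduction job file (not from the report).
def build_job_file(num_jobs: int, log_prefix: str = "epoch_demo") -> str:
    lines = [
        "[global]",
        "rw=randread",                  # assumed workload, any would do
        "size=16m",                     # assumed size, kept small for a quick run
        f"write_lat_log={log_prefix}",  # log latency samples for each job
        "",
    ]
    for i in range(1, num_jobs + 1):
        lines.append(f"[job{i}]")
        if i > 1:
            # Each job starts only after the previous job has finished.
            lines.append(f"wait_for=job{i - 1}")
        lines.append("")
    return "\n".join(lines)

job_file_text = build_job_file(3)
```

Writing `job_file_text` to a file and passing it to fio should produce one latency log per job. With the default epoch, each log's timestamps start near zero even though the jobs ran one after another.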

aggieNick02 added a commit to PCPartPicker/fio that referenced this issue Aug 24, 2023
aggieNick02 added a commit to PCPartPicker/fio that referenced this issue Aug 24, 2023
points.

Add a new key in the json per-job output, clock_gettime_job_start, that
records the job start time obtained via a call to clock_gettime using
the clock_id specified by the clock_id option. This allows times of fio
jobs and log entries to be compared/ordered against each other and
against other system events recorded against the same clock_id.

Rename the alternate_epoch_clock_id option to clock_id, as now this
clock_id serves two purposes. The primary purpose is to be the clock_id
for recording clock_gettime_job_start. The secondary purpose is to be
the clock_id used if log_alternate_epoch is specified, in which case
each log file timestamp is based on the epoch specified by clock_id.
(Each such timestamp is obtained by taking the traditional zero-based
timestamps and adding clock_gettime_job_start to them.)

We also make log_unix_epoch an official alias of log_alternate_epoch,
instead of maintaining both redundant options.

Fixes axboe#1544

Signed-off-by: Nick Neumann [email protected]
aggieNick02 added a commit to PCPartPicker/fio that referenced this issue Aug 24, 2023
aggieNick02 added a commit to PCPartPicker/fio that referenced this issue Aug 24, 2023
@aggieNick02
Contributor Author

Suggested modified approach is up: #1617

aggieNick02 added a commit to PCPartPicker/fio that referenced this issue Sep 1, 2023
Add a new key in the json per-job output, job_start, that records the
job start time obtained via a call to clock_gettime using the clock_id
specified by the new job_start_clock_id option. This allows times of fio
jobs and log entries to be compared/ordered against each other and
against other system events recorded against the same clock_id.

Also make log_unix_epoch an official alias of log_alternate_epoch,
instead of maintaining both redundant options.

Add a note to the documentation for group_reporting about how there are
several per-job values for which only the first job's value is recorded
in the json output format when group_reporting is enabled.

Fixes axboe#1544

Signed-off-by: Nick Neumann [email protected]
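
As a rough illustration of how the new key might be consumed, the Python sketch below reads `job_start` for each job from fio's JSON output. It assumes the output has a top-level `jobs` array whose entries carry a `jobname` field, and that `job_start` is a number in milliseconds. The sample document is made up and heavily trimmed, and `job_start_times` is a hypothetical helper.

```python
import json
from typing import Dict

# Hypothetical helper: map each job name to its recorded job_start value.
# Assumption: "jobs" is a top-level array and each entry has "jobname".
def job_start_times(fio_json_text: str) -> Dict[str, int]:
    data = json.loads(fio_json_text)
    return {job["jobname"]: job["job_start"] for job in data.get("jobs", [])}

# Made-up, trimmed sample of JSON output (values are illustrative only).
sample = """
{
  "jobs": [
    {"jobname": "job1", "job_start": 1694000000000},
    {"jobname": "job2", "job_start": 1694000005000}
  ]
}
"""
starts = job_start_times(sample)
offset_ms = starts["job2"] - starts["job1"]  # how much later job2 started
```

As the commit message notes, with group_reporting enabled only the first job's value is recorded for several per-job fields, so a lookup like this is only meaningful per job when group_reporting is off.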
aggieNick02 added a commit to PCPartPicker/fio that referenced this issue Sep 8, 2023
axboe closed this as completed in 12d325c on Sep 11, 2023
jwieleRH pushed a commit to jwieleRH/fio that referenced this issue Nov 11, 2023
Add a new key in the json per-job output, job_start, that records the
job start time obtained via a call to clock_gettime using the clock_id
specified by the new job_start_clock_id option. This allows times of fio
jobs and log entries to be compared/ordered against each other and
against other system events recorded against the same clock_id.

Add a note to the documentation for group_reporting about how there are
several per-job values for which only the first job's value is recorded
in the json output format when group_reporting is enabled.

Fixes axboe#1544

Signed-off-by: Nick Neumann [email protected]