RPS mode with cold/warm ratio #557

Merged
cvetkovic merged 32 commits into main from rps_feature
Dec 6, 2024

Conversation

cvetkovic
Contributor

@cvetkovic cvetkovic commented Nov 15, 2024

  • Support for RPS mode
  • Reimplementation of individual trace driver
  • Specification generator bugfixing
  • Code stability improvement (a lot of new unit and integration tests)
  • Workload code deduplication
  • Knative deployment parallelization
  • Code moving and beautification

@cvetkovic cvetkovic changed the title RPS mode - new feature RPS mode with cold/warm ratio Nov 15, 2024
@cvetkovic
Contributor Author

@leokondrashov: This is the last PR split off from the big one. Please review this one thoroughly, since it affects the core functionality of the loader. After it is merged, I will follow up with a new PR that extracts the Dirigent-specific features and settings.

Contributor

@leokondrashov leokondrashov left a comment

We should clarify the interpretation of the IAT array; only then can I check the driver and spec generation code. Right now, it seems inconsistent in the tests for RPS and trace modes.

pkg/trace/parser.go Outdated Show resolved Hide resolved
pkg/workload/openwhisk/workload_openwhisk.go Outdated Show resolved Hide resolved
pkg/driver/trace_driver.go Outdated Show resolved Hide resolved
pkg/driver/trace_driver.go Outdated Show resolved Hide resolved
pkg/driver/trace_driver.go Outdated Show resolved Hide resolved
pkg/generator/rps_test.go Outdated Show resolved Hide resolved
pkg/common/specification_types.go Show resolved Hide resolved
pkg/generator/rps_test.go Outdated Show resolved Hide resolved
pkg/generator/specification.go Outdated Show resolved Hide resolved
pkg/generator/rps_test.go Show resolved Hide resolved
Signed-off-by: Lazar Cvetković <[email protected]>
Signed-off-by: Lazar Cvetković <[email protected]>
Signed-off-by: Lazar Cvetković <[email protected]>
Signed-off-by: Lazar Cvetković <[email protected]>
Signed-off-by: Lazar Cvetković <[email protected]>
Signed-off-by: Lazar Cvetković <[email protected]>
Signed-off-by: Lazar Cvetković <[email protected]>
Signed-off-by: Lazar Cvetković <[email protected]>
Signed-off-by: Lazar Cvetković <[email protected]>
pkg/generator/specification_test.go Outdated Show resolved Hide resolved
pkg/generator/specification.go Outdated Show resolved Hide resolved
pkg/generator/specification.go Outdated Show resolved Hide resolved
Signed-off-by: Lazar Cvetković <[email protected]>
Contributor

@leokondrashov leokondrashov left a comment

Some minor issues in the tests for the generator; otherwise, looks good. But I have concerns about the driver now. Since invocations do not necessarily land on a minute boundary, we should be careful about how that is handled (especially startOfMinute, previousIATSum, and minuteIndex for empty minutes).
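
To make the empty-minute concern concrete, here is a minimal sketch, not the loader's actual driver code. It assumes each IAT is given in microseconds and that the first IAT is measured from the start of the experiment; the helper name minuteIndices is hypothetical. The point is that the minute index should be derived from the accumulated time, so a minute without invocations is skipped rather than counted.

```go
package main

import "fmt"

// minuteUs is one minute expressed in microseconds.
const minuteUs = 60_000_000

// minuteIndices maps each invocation to the minute it falls in.
// Assumption: iatUs holds inter-arrival times in microseconds and the
// first entry is the offset from the experiment start.
// Hypothetical helper for illustration only.
func minuteIndices(iatUs []float64) []int {
	indices := make([]int, 0, len(iatUs))
	elapsed := 0.0
	for _, iat := range iatUs {
		elapsed += iat
		// Deriving the index from elapsed time handles empty minutes:
		// the index simply jumps past them.
		indices = append(indices, int(elapsed/minuteUs))
	}
	return indices
}

func main() {
	// Invocations at 30 s, 150 s, and 170 s: minute 1 has no invocations.
	iats := []float64{30_000_000, 120_000_000, 20_000_000}
	fmt.Println(minuteIndices(iats)) // [0 2 2]
}
```

A driver that instead increments a minute counter once per processed invocation batch would report minute 1 for the second invocation here, which is the kind of drift worth guarding against in tests.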

pkg/generator/specification_test.go Outdated Show resolved Hide resolved
pkg/generator/specification_test.go Outdated Show resolved Hide resolved
pkg/generator/specification_test.go Outdated Show resolved Hide resolved
pkg/generator/rps_test.go Outdated Show resolved Hide resolved
pkg/generator/rps.go Outdated Show resolved Hide resolved
pkg/driver/trace_driver.go Outdated Show resolved Hide resolved
docs/configuration.md Outdated Show resolved Hide resolved
pkg/driver/trace_driver.go Outdated Show resolved Hide resolved
pkg/driver/trace_driver.go Outdated Show resolved Hide resolved
pkg/driver/trace_driver.go Outdated Show resolved Hide resolved
Contributor

@leokondrashov leokondrashov left a comment

Some minor fixes in the specification tests again. There is a change in the driver that is inconsistent with the rest of the code: experiment duration with second granularity. In the front end we interpret the experiment duration as seconds, but the driver now tries to interpret it as a number of minutes. See the comment for functionsDriver.

pkg/generator/specification_test.go Outdated Show resolved Hide resolved
pkg/generator/specification_test.go Outdated Show resolved Hide resolved
pkg/generator/specification_test.go Outdated Show resolved Hide resolved
pkg/generator/specification_test.go Outdated Show resolved Hide resolved
pkg/generator/rps_test.go Outdated Show resolved Hide resolved
pkg/driver/trace_driver.go Outdated Show resolved Hide resolved
pkg/driver/trace_driver.go Outdated Show resolved Hide resolved
pkg/driver/trace_driver.go Outdated Show resolved Hide resolved
Contributor

@leokondrashov leokondrashov left a comment

Please fix warmup handling. Looks fine to me, but I want to test-run it on several experiments to be sure that nothing goes wrong.

pkg/generator/rps_test.go Outdated Show resolved Hide resolved
pkg/driver/trace_driver.go Outdated Show resolved Hide resolved
@leokondrashov
Contributor

leokondrashov commented Dec 4, 2024

Unfortunately, I see a problem with warmup in RPS mode (I modified the config file to include 1 minute of warmup):

$ go run cmd/loader.go -config cmd/config_knative_rps.json -verbosity debug
WARN[Dec  4 06:14:18.474] It is recommended that the first 10% of cold starts are discarded from the experiment results for low cold start RPS.
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x999be7]

goroutine 1 [running]:
github.com/vhive-serverless/loader/pkg/trace.profileConcurrency(...)
        /users/lkondras/loader/pkg/trace/trace_profiler.go:87
github.com/vhive-serverless/loader/pkg/trace.DoStaticTraceProfiling({0xc0000908f0, 0x1, 0xc0001ffba8?})
        /users/lkondras/loader/pkg/trace/trace_profiler.go:38 +0x107
github.com/vhive-serverless/loader/pkg/driver.(*Driver).RunExperiment(0xc0000b9980)
        /users/lkondras/loader/pkg/driver/trace_driver.go:488 +0x38
main.runRPSMode(0xc00019cc60, 0x0, 0x0)
        /users/lkondras/loader/cmd/loader.go:236 +0x29e
main.main()
        /users/lkondras/loader/cmd/loader.go:111 +0x479
exit status 2

After closer inspection, I see that RPS mode and warmup do not work together: we generate invocations only for the experiment duration, but run for the experiment duration plus warmup.

I think the whole warmup might be broken now: previously, we were skipping one minute for profiling during driver execution, and now we do the invocation even for the minute that we were using for static profiling.
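
A guard along the following lines would turn the mismatch described above into an explicit error instead of a nil pointer dereference. This is a sketch under assumptions: validateSpec and its parameters are hypothetical, durations are taken to be in minutes, and the per-minute slice stands in for whatever the specification generator actually produces.

```go
package main

import "fmt"

// validateSpec checks that the generated per-minute invocation counts
// cover both the warmup and the measured experiment.
// Hypothetical helper; durations are assumed to be in minutes.
func validateSpec(perMinuteCount []int, warmupMinutes, experimentMinutes int) error {
	need := warmupMinutes + experimentMinutes
	if len(perMinuteCount) < need {
		return fmt.Errorf(
			"specification covers %d minute(s), need %d (%d warmup + %d experiment)",
			len(perMinuteCount), need, warmupMinutes, experimentMinutes)
	}
	return nil
}

func main() {
	// Generated for a 2-minute experiment only, but run with 1 minute of warmup.
	if err := validateSpec([]int{10, 10}, 1, 2); err != nil {
		fmt.Println("invalid specification:", err)
	}
}
```

Running such a check before profiling and before the driver starts would fail fast with a readable message when the generator and the driver disagree about the total run length.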

Signed-off-by: Lazar Cvetković <[email protected]>
Signed-off-by: Lazar Cvetković <[email protected]>
@cvetkovic
Contributor Author

@leokondrashov: I think we should also rename the ExperimentDuration and WarmupDuration parameters to add a Minute suffix, since it is unclear how they behave when Granularity is second. This may be covered in the docs, but it could still cause bugs in the code now or in the future. What do you think?

@leokondrashov
Contributor

I agree that it is extremely confusing. We should take care of that.

I've checked the loader once more, and it now works as expected. I think we can merge this one and open an issue about the granularity, then work through the existing issues one by one so that the review process stays fast.

@cvetkovic
Contributor Author

Ok. I have created an issue. I will proceed with merging then.

@cvetkovic cvetkovic merged commit 9b40286 into main Dec 6, 2024
14 checks passed
@cvetkovic cvetkovic deleted the rps_feature branch December 6, 2024 13:20
2 participants