Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LambdaLoadTests not executing correctly #379

Closed
lumpfish opened this issue Nov 24, 2020 · 7 comments
Closed

LambdaLoadTests not executing correctly #379

lumpfish opened this issue Nov 24, 2020 · 7 comments
Assignees
Labels
Milestone

Comments

@lumpfish
Copy link
Contributor

lumpfish commented Nov 24, 2020

#216 added setInactivityLimit as a parameter to the LambdaLoadTest.

However, a bug in STF means that this has a side effect of preventing most of the tests in the test suite from running.

A typical test output is currently:

17:50:15  STF 17:50:14.928 - +------ Step 1 - Run lambda and stream load test
17:50:15  STF 17:50:14.928 - | Run foreground process
17:50:15  STF 17:50:14.928 - |   Program:     /home/jenkins/workspace/Test_openjdk8_j9_sanity.system_s390x_linux/openjdkbinary/j2sdk-image/bin/java
17:50:15  STF 17:50:14.928 - |   Mnemonic:    LT
17:50:15  STF 17:50:14.928 - |   Echo:        ECHO_ON
17:50:15  STF 17:50:14.928 - |   Expectation: CLEAN_RUN within 1h
17:50:15  STF 17:50:14.928 - |
17:50:15  STF 17:50:14.928 - Running command: /home/jenkins/workspace/Test_openjdk8_j9_sanity.system_s390x_linux/openjdkbinary/j2sdk-image/bin/java -Xgcpolicy:gencon -Xgc:concurrentScavenge -classpath /home/jenkins/workspace/Test_openjdk8_j9_sanity.system_s390x_linux/jvmtest/system/stf/stf.load/bin:/home/jenkins/workspace/Test_openjdk8_j9_sanity.system_s390x_linux/jvmtest/system/stf/stf.core/bin:/home/jenkins/workspace/Test_openjdk8_j9_sanity.system_s390x_linux/jvmtest/system/systemtest_prereqs/log4j-2.13.3/log4j-api-2.13.3.jar:/home/jenkins/workspace/Test_openjdk8_j9_sanity.system_s390x_linux/jvmtest/system/systemtest_prereqs/log4j-2.13.3/log4j-core-2.13.3.jar:/home/jenkins/workspace/Test_openjdk8_j9_sanity.system_s390x_linux/jvmtest/system/systemtest_prereqs/junit-4.12/junit-4.12.jar:/home/jenkins/workspace/Test_openjdk8_j9_sanity.system_s390x_linux/jvmtest/system/systemtest_prereqs/junit-4.12/hamcrest-core-1.3.jar:/home/jenkins/workspace/Test_openjdk8_j9_sanity.system_s390x_linux/jvmtest/system/openjdk-systemtest/openjdk.test.lambdasAndStreams/bin net.adoptopenjdk.loadTest.LoadTest -resultsDir /home/jenkins/workspace/Test_openjdk8_j9_sanity.system_s390x_linux/openjdk-tests/TKG/test_output_16058941429755/LambdaLoadTest_ConcurrentScavenge_0/20201120-175013-LambdaLoadTest/results -resultsPrefix 1.LT. -inactivityLimit 60m -reportFailureLimit 1 -abortAtFailureLimit 10 -maxTotalLogFileSpace 200M -maxSingleLogSize 1/25 -suite.lambda.threadCount 2 -suite.lambda.inventoryFile /home/jenkins/workspace/Test_openjdk8_j9_sanity.system_s390x_linux/openjdk-tests/TKG/test_output_16058941429755/LambdaLoadTest_ConcurrentScavenge_0/20201120-175013-LambdaLoadTest/results/1.LT.inventory/openjdk.test.load/config/inventories/lambdasAndStreams/lambda.xml -suite.lambda.inventoryExcludeFile none -suite.lambda.totalNumberTests 200 -suite.lambda.selection random -suite.lambda.seed -1 -suite.lambda.repeatCount 1 -suite.lambda.thinkingTime 0ms..0ms
17:50:15  STF 17:50:14.928 - Redirecting stderr to /home/jenkins/workspace/Test_openjdk8_j9_sanity.system_s390x_linux/openjdk-tests/TKG/test_output_16058941429755/LambdaLoadTest_ConcurrentScavenge_0/20201120-175013-LambdaLoadTest/results/1.LT.stderr
17:50:15  STF 17:50:14.928 - Redirecting stdout to /home/jenkins/workspace/Test_openjdk8_j9_sanity.system_s390x_linux/openjdk-tests/TKG/test_output_16058941429755/LambdaLoadTest_ConcurrentScavenge_0/20201120-175013-LambdaLoadTest/results/1.LT.stdout
17:50:15  STF 17:50:14.933 - Monitoring processes: LT 
17:50:15  LT  17:50:15.548 - Load test parameters
17:50:15  LT  17:50:15.552 -   Time limited         = true
17:50:15  LT  17:50:15.552 -   Time limit         = null
17:50:15  LT  17:50:15.552 -   abortIfOutOfMemory   = true
17:50:15  LT  17:50:15.552 -   reportFailureLimit   = 1
17:50:15  LT  17:50:15.552 -   abortAtFailureLimit  = 10
17:50:15  LT  17:50:15.552 -   maxTotalLogFileSpace = 209715200
17:50:15  LT  17:50:15.552 -   maxSingleLogSize     = 8388608
17:50:15  LT  17:50:15.552 - Parameters for suite 0
17:50:15  LT  17:50:15.552 -   Suite name     = lambda
17:50:15  LT  17:50:15.552 -   Number threads = 2
17:50:15  LT  17:50:15.552 -   Supplied seed  = -1
17:50:15  LT  17:50:15.553 -   Inventory file = /home/jenkins/workspace/Test_openjdk8_j9_sanity.system_s390x_linux/openjdk-tests/TKG/test_output_16058941429755/LambdaLoadTest_ConcurrentScavenge_0/20201120-175013-LambdaLoadTest/results/1.LT.inventory/openjdk.test.load/config/inventories/lambdasAndStreams/lambda.xml
17:50:15  LT  17:50:15.553 -   Exclude file   = none
17:50:15  LT  17:50:15.553 -   Number tests   = 200
17:50:15  LT  17:50:15.553 -   Repeat count   = 1
17:50:15  LT  17:50:15.553 -   Thinking time  = 0ms..0ms
17:50:15  LT  17:50:15.553 -   Selection mode = random
17:50:15  LT  17:50:15.553 -   Actual seed    = 1605434495915
17:50:15  LT  17:50:15.557 - Parsing inventory file. Root=/home/jenkins/workspace/Test_openjdk8_j9_sanity.system_s390x_linux/openjdk-tests/TKG/test_output_16058941429755/LambdaLoadTest_ConcurrentScavenge_0/20201120-175013-LambdaLoadTest/results/1.LT.inventory File=openjdk.test.load/config/inventories/lambdasAndStreams/lambda.xml
17:50:15  LT  17:50:15.570 - Final test list:
17:50:15  LT  17:50:15.570 -   0 JUnit[net.adoptopenjdk.test.lambda.ClassLibraryLambdaTests]  Weighting=1 
17:50:15  LT  17:50:15.570 -   1 JUnit[net.adoptopenjdk.test.lambda.TestLambdaCapture]  Weighting=1 
17:50:15  LT  17:50:15.570 -   2 JUnit[net.adoptopenjdk.test.lambda.TestLambdaContexts]  Weighting=1 
17:50:15  LT  17:50:15.570 -   3 JUnit[net.adoptopenjdk.test.lambda.TestLambdaDefaultsAndStatics]  Weighting=1 
17:50:15  LT  17:50:15.570 -   4 JUnit[net.adoptopenjdk.test.lambda.TestLambdaInferredTyping]  Weighting=1 
17:50:15  LT  17:50:15.570 -   5 JUnit[net.adoptopenjdk.test.lambda.TestLambdaJavaInterfaces]  Weighting=1 
17:50:15  LT  17:50:15.571 -   6 JUnit[net.adoptopenjdk.test.lambda.TestLambdaLexicalScoping]  Weighting=1 
17:50:15  LT  17:50:15.571 -   7 JUnit[net.adoptopenjdk.test.lambda.TestLambdaMethodReferences]  Weighting=1 
17:50:15  LT  17:50:15.571 -   8 JUnit[net.adoptopenjdk.test.lambda.TestLambdaMultithreaded]  Weighting=1 
17:50:15  LT  17:50:15.571 -   9 JUnit[net.adoptopenjdk.test.lambda.TestLambdaNested]  Weighting=1 
17:50:15  LT  17:50:15.571 -   10 JUnit[net.adoptopenjdk.test.lambda.TestLambdaRecursive]  Weighting=1 
17:50:15  LT  17:50:15.571 -   11 JUnit[net.adoptopenjdk.test.lambda.TestLambdasUserDefinedInterface]  Weighting=1 
17:50:15  LT  17:50:15.571 -   12 JUnit[net.adoptopenjdk.test.streams.TestParallelStreamOperations]  Weighting=1 
17:50:15  LT  17:50:15.571 -   13 JUnit[net.adoptopenjdk.test.streams.TestStreamOperations]  Weighting=1 
17:50:15  LT  17:50:15.609 - Starting thread. Suite=0 thread=0
17:50:15  LT  17:50:15.610 - Starting thread. Suite=0 thread=1
17:50:16  LT  17:50:15.768 - Thread completed. Suite=0 thread=1
17:50:16  LT  17:50:15.781 - Thread completed. Suite=0 thread=0
17:50:16  LT  17:50:15.810 - Test stopped due to reaching runtime limit
17:50:16  LT  17:50:15.810 - Load test completed
17:50:16  LT  17:50:15.811 -   Ran     : 2
17:50:16  LT  17:50:15.811 -   Passed  : 2
17:50:16  LT  17:50:15.811 -   Failed  : 0
17:50:16  LT  17:50:15.811 -   Result  : PASSED
17:50:16  STF 17:50:15.826 - Monitoring Report Summary:
17:50:16  STF 17:50:15.826 -   o Process LT  ended with the expected exit code (0)
17:50:16  STF 17:50:15.826 - EXECUTE stage completed
17:50:16  STF 17:50:15.832 - 
17:50:16  STF 17:50:15.832 - ====================   T E A R D O W N   ====================
17:50:16  STF 17:50:15.832 - Running teardown: perl /home/jenkins/workspace/Test_openjdk8_j9_sanity.system_s390x_linux/openjdk-tests/TKG/../TKG/test_output_16058941429755/LambdaLoadTest_ConcurrentScavenge_0/20201120-175013-LambdaLoadTest/tearDown.pl
17:50:16  STF 17:50:15.864 - TEARDOWN stage completed
17:50:16  STF 17:50:15.870 - 
17:50:16  STF 17:50:15.870 - =====================   R E S U L T S   =====================
17:50:16  STF 17:50:15.870 - Stage results:
17:50:16  STF 17:50:15.870 -   setUp:     pass
17:50:16  STF 17:50:15.870 -   execute:   pass
17:50:16  STF 17:50:15.870 -   teardown:  pass
17:50:16  STF 17:50:15.870 - 
17:50:16  STF 17:50:15.870 - Overall result: PASSED
17:50:16  
17:50:16  LambdaLoadTest_ConcurrentScavenge_0_PASSED

Note the following lines in the output:

17:50:15  LT  17:50:15.552 -   Time limited         = true
17:50:15  LT  17:50:15.552 -   Time limit         = null
......
17:50:16  LT  17:50:15.810 - Test stopped due to reaching runtime limit

When inactivityLimit is specified, STF is incorrectly setting timeLimitedTest to true. This in turn means that once the test starts its elapsed time is compared with the timeLimit of null which is interpreted as being exceeded and the test in terminated.

The test output should really look something like:

LT  15:26:34.103 - Starting thread. Suite=0 thread=0
LT  15:26:34.104 - Starting thread. Suite=0 thread=1
LT  15:26:54.154 - Completed 3.5%. Number of tests started=7
LT  15:27:14.183 - Completed 3.5%. Number of tests started=7 (+0)
LT  15:27:34.187 - Completed 3.5%. Number of tests started=7 (+0)
LT  15:27:54.151 - Completed 30.0%. Number of tests started=60 (+53)
LT  15:28:14.102 - Completed 30.0%. Number of tests started=60 (+0)
LT  15:28:34.106 - Completed 30.0%. Number of tests started=60 (+0)
LT  15:28:54.125 - Completed 54.5%. Number of tests started=109 (+49)
LT  15:29:14.190 - Completed 54.5%. Number of tests started=109 (+0)
LT  15:29:34.142 - Completed 54.5%. Number of tests started=109 (+0)
LT  15:29:54.099 - Completed 67.5%. Number of tests started=135 (+26)
LT  15:30:14.122 - Completed 67.5%. Number of tests started=135 (+0)
LT  15:30:34.097 - Completed 67.5%. Number of tests started=135 (+0)
LT  15:30:54.165 - Completed 78.0%. Number of tests started=156 (+21)
LT  15:31:14.100 - Completed 78.0%. Number of tests started=156 (+0)
STF 15:31:32.135 - Heartbeat: Process LT  is still running
LT  15:31:34.095 - Completed 78.0%. Number of tests started=156 (+0)
LT  15:31:54.128 - Completed 84.0%. Number of tests started=168 (+12)
LT  15:32:14.188 - Completed 84.0%. Number of tests started=168 (+0)
LT  15:32:31.557 - Thread completed. Suite=0 thread=0
LT  15:32:34.098 - Completed 100.0%. Number of tests started=200 (+32)
LT  15:32:54.189 - Completed 100.0%. Number of tests started=200 (+0)
LT  15:32:56.642 - Thread completed. Suite=0 thread=1
LT  15:32:56.699 - Load test completed
LT  15:32:56.699 -   Ran     : 200
LT  15:32:56.699 -   Passed  : 200
LT  15:32:56.699 -   Failed  : 0
LT  15:32:56.699 -   Result  : PASSED
STF 15:32:56.762 - Monitoring Report Summary:
STF 15:32:56.762 -   o Process LT  ended with the expected exit code (0)
STF 15:32:56.762 - EXECUTE stage completed
STF 15:32:57.190 - 
@lumpfish
Copy link
Contributor Author

lumpfish commented Dec 4, 2020

Raised adoptium/STF#90 to fix the inactivity timeout limit in STF.

@lumpfish
Copy link
Contributor Author

lumpfish commented Dec 4, 2020

After more testing with the fix for adoptium/STF#90 shows a couple of other issues:

  1. The TestParallelStreamOperations.java test case in the workload test requires the heap size to be set explicitly - relying on the JVM's default calculation based on the machine configuration may mean that insufficient heap is allocated causing the test to throw OutOfMemory exceptions. See LambdaLoadTest hang eclipse-openj9/openj9#10638 for more details.
  2. The TestParallelStreamOperations.java test case when run alongside the TestLambdaRecursive.java test case caused intermittent StackOverflow exceptions which do not occur when TestParallelStreamOperations.java is excluded.

The TestParallelStreamOperations.java uses c. 300Mb heap per invocation on 64bit jvms, and the test contains code which limits the number of concurrent instances of the test within a jvm at one time to two. Also, because these tests take much longer to complete than any other tests in the workload a test workload with the current settings of 200 tests actually runs the two instances of TestParallelStreamOperations.java and 198 tests from the remaining tests in the workload (or invocations of TestParallelStreamOperations.java which terminate immediately because they detect there are already two instances running).

I suggest therefore that TestParallelStreamOperations.java test be split out from the current LambdaLoadTest replacing the current workload by two different ones:

  1. ParallelStreamsTest - runs only the TestParallelStreamOperations.java once in two concurrent threads with the required heap size
    2, LambdaLoadTest which runs the remaining tests in nbr_cpus - 2 threads for a few minutes with the jvm default heap size (the nbr of tests to be calibrated to achieve that target).

@Mesbah-Alam , @smlambert - are you OK with this suggestion? If so I'll go ahead and test whether it resolves the issues we are seeing.

@smlambert
Copy link
Contributor

That seems like a reasonable approach @lumpfish (and gets these tests back to running...)

I am assuming you are satisfied that the proposal still means the (newly split out) LambdaLoadTests exercise the JVM sufficiently to achieve their stated intent (from my high level read of this, that is the case).

@Mesbah-Alam
Copy link
Contributor

The change sounds reasonable to me too. Thanks for discovering this issue @lumpfish. Please create the PR with the change.

@lumpfish
Copy link
Contributor Author

lumpfish commented Dec 7, 2020

I am assuming you are satisfied that the proposal still means the (newly split out) LambdaLoadTests exercise the JVM sufficiently to achieve their stated intent (from my high level read of this, that is the case).

The test workload is comprised of tests which exercise specific areas of jdk functionality which are then run repeatedly and on multiple threads, The goal is primarily to uncover any reliability issues in the jdk class libraries, jit compilation or gc. The tests are not specifically designed to coexist with others, and resource hungry tests (such as the TestParallelStreamsOperations test) can affect other tests in the workload in unforeseen ways, Running that test separately is the best approach I think - the jvm can be given the options required to ensure that test will run successfully, so that any failures can be relied on as being issues with the jdk), whereas the remaining tests in the workload can be treated more flexibly in terms of allowing the jdk to select a heap size based on the memory available in the test machine and allowing the load test framework to select the number of threads to run in the workload based on the cpus available.

I'll work on a PR to split the tests and calibrate both the new workloads.

@lumpfish
Copy link
Contributor Author

lumpfish commented Feb 3, 2021

PR's:

  1. Change to the tests to run the TestParallelStreamsOperations test in its own workload: Move TestParallelStreamOperations to its own workload #401
  2. Change to the playlist to run the new ParallelStreamsLoadTest: Create ParallelStreamsLoadTest aqa-tests#2229

@lumpfish
Copy link
Contributor Author

lumpfish commented Feb 8, 2021

Fixed by #401 and adoptium/aqa-tests#2229, closing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants