Refactor Load and Performance Doc #2140

rachelwhitton · 2017-01-26T18:12:59Z

Closes #1851
Replaces #2109

Effect

PR includes the following changes:

Rewrite of performance test documentation

Todo:

Peer review from another EOM (@aeligature?) and/or @ari-gold
Additional copy review from @alexfornuto following peer review


rachelwhitton · 2017-01-26T18:13:56Z

source/_docs/load-and-performance-testing.md

+###Warning {.info}
+We do not recommend load testing on the Live environment if the site has already launched because you risk overwhelming your live site and causing downtime.
+</div>
+Note the start time for the test. As the test executes, keep a close eye on [log files](/docs/logs). Note any errors and warnings that appear during the test so you can fix them.


@bentekwork Which log files should I watch? Do they differ based on the type of test being run?

rachelwhitton · 2017-01-26T18:14:21Z

source/_docs/load-and-performance-testing.md

+3. Determine how much load to apply for your test.
+
+  * **Performance Tests**: Smaller loads should suffice, as you should be able to see transactional bottlenecks with 10-20 concurrent users.
+  * **Load Tests**: Determine how many concurrent users the site is expected to serve based on historical analytics for the site. Identify the peak hourly sessions and average session duration, then do some math: `hourly_sessions / (60 / average_duration) = Concurrent Users`


@bentekwork How do I determine load to apply in the test after calculating concurrent users?
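As a worked example of the formula quoted in the hunk above, here is a minimal sketch. It assumes `average_duration` is measured in minutes (the formula divides 60 by it); the function name and the traffic figures are hypothetical and only illustrate the arithmetic.

```python
def concurrent_users(hourly_sessions: float, average_duration_min: float) -> float:
    """Apply hourly_sessions / (60 / average_duration) from the doc."""
    if average_duration_min <= 0:
        raise ValueError("average_duration_min must be positive")
    return hourly_sessions / (60 / average_duration_min)


# Illustrative numbers only: 3,000 peak hourly sessions, 3-minute average session.
print(concurrent_users(3000, 3))  # 150.0
```

This only covers the concurrent-user estimate; how that number maps to a load profile in a given tool is still the open question here.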

adamedgmond

Seems thorough on first and second reading.

ari-gold

Great update! This is coming along nicely. Added a few comments. Would be happy to review again.

ari-gold · 2017-01-27T17:51:21Z

source/_docs/load-and-performance-testing.md

+### Performance Testing
+Performance testing is the process in which you measure an application's response time to proactively expose bottlenecks. These tests should be regularly executed as part of routine maintenance. Additionally, you should run these tests before any load testing. If your application is not performing well, the load test will not go well either.
+
+The scope of performance tests should be limited to the application itself on a development environment (Dev or [Multidev](/docs/multidev)) without caching. This will give you an honest look into your application and show exactly how uncached requests will perform. You can bypass cache by [setting the `no-cache` HTTP headers](/docs/cache-control) in responses.


Offer alternatives to bypassing cache with a no-cache header? How about just disabling cache completely on Dev/Multidev during testing through the Drupal/WordPress admin UI?

My understanding is that the Dev environment has a default time-to-live of zero, which implies no caching, but that things like Pantheon Advanced Page Cache may override this with a non-zero value. While a no-cache header may help, its effect may depend on when it gets executed. Suggesting to disable caching via the UI is an option, with an emphasis on remembering to re-enable it prior to pushing to prod.
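To make the header discussion concrete, here is a rough sketch of checking whether a response's `Cache-Control` value would keep a shared cache from reusing it. It is a simplified reading of the standard directives, not Pantheon's edge logic; the function name is hypothetical, and a real CDN may apply additional rules.

```python
def bypasses_cache(cache_control: str) -> bool:
    """Return True if the header value tells shared caches not to reuse the response as-is."""
    directives = {}
    for part in cache_control.split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = value.strip('"')
    if {"no-cache", "no-store", "private"} & directives.keys():
        return True
    # For shared caches, s-maxage takes precedence over max-age.
    ttl = directives.get("s-maxage", directives.get("max-age"))
    return ttl is not None and ttl.isdigit() and int(ttl) == 0


print(bypasses_cache("no-cache, must-revalidate"))  # True
print(bypasses_cache("public, max-age=600"))        # False
```

Running a check like this against a few responses before a performance test is one way to confirm that the requests being measured are really uncached.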

ari-gold · 2017-01-27T17:53:09Z

source/_docs/load-and-performance-testing.md

+### Load Testing
+Load testing is the process in which you apply requests to your site that will represent the most load that your site will face once it is live.  This test will ensure that the site can withstand the peak traffic spikes after launch. This test should be done on the Live environment before the site has launched, after performance testing.
+
+If your site is already live, then you should run load tests on the Test environment. Keep in mind that the Test environment has one application container, while Live environments on sites with a service level of Business and above can have multiple application containers serving the site. So try to run a proportionate amount of traffic based on how many containers you currently have on your Live environment.


Offer concrete example with math?

The EOM team is the best source for the algorithm we use.
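Until the EOM team supplies the actual algorithm, the "proportionate amount of traffic" idea from the hunk can be sketched as follows. This assumes load spreads evenly across Live application containers and that Test has exactly one, as the hunk states; the function name and numbers are hypothetical, and this is not the algorithm Pantheon uses.

```python
def users_for_single_container(live_concurrent_users: float, live_containers: int) -> float:
    """Scale a Live concurrency target down to one application container."""
    if live_containers < 1:
        raise ValueError("live_containers must be at least 1")
    return live_concurrent_users / live_containers


# Illustrative numbers only: 150 concurrent users expected on Live, served by 3 containers.
print(users_for_single_container(150, 3))  # 50.0
```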

ari-gold · 2017-01-30T23:17:52Z

source/_docs/load-and-performance-testing.md

+3. Determine how much load to apply.
+
+  * **Performance Tests**: Smaller loads should suffice, as you should be able to see transactional bottlenecks with 10-20 concurrent users.
+  * **Load Tests**: Determine how many concurrent users the site is expected to serve based on historical analytics for the site. Identify the peak hourly sessions and average session duration, then do some math: `hourly_sessions / (60 / average_duration) = Concurrent Users`


Let's reiterate the difference between load tests on Live vs non-live, and include app containers in the calculation for this scenario.

Load tests should not be run on Test; performance tests can and should be run there instead. Providing formulas is complicated by the fact that running a "proportionate amount of traffic" on Test involves knowing the number of appservers on Live, which clients can't determine on their own (other than by asking Support, or by looking at New Relic, which will include decommissioned appservers for some time).

ari-gold · 2017-01-30T23:21:10Z

source/_docs/load-and-performance-testing.md

+
+Finally, review the **Error analytics** tab in New Relic. PHP errors often indicate huge performance bottlenecks. If you have errors, fix them.
+
+### Calculating Load Capacity After Launch


How can we highlight this scenario? And flesh it out with a concrete example explaining how to collect RPM and response time from New Relic?
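The hunk shown here does not include the capacity formula, so the following is only one common way to relate the two numbers mentioned, using Little's Law (average concurrency equals arrival rate times average time in the system). The function name and the sample values are hypothetical, and this is not necessarily the calculation the doc describes.

```python
def requests_in_flight(rpm: float, avg_response_ms: float) -> float:
    """Little's Law: average concurrency = arrival rate x average time in system."""
    requests_per_second = rpm / 60
    return requests_per_second * (avg_response_ms / 1000)


# Illustrative numbers only: 600 requests per minute at a 500 ms average response time.
print(requests_in_flight(600, 500))  # 5.0
```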

ari-gold · 2017-01-30T23:26:29Z

source/_docs/load-and-performance-testing.md

+## Load vs Performance Testing
+Before you start, it's important to understand the difference between load and performance testing and know when to use each.
+### Performance Testing
+Performance testing is the process in which you measure an application's response time to proactively expose bottlenecks. These tests should be regularly executed as part of routine maintenance. Additionally, you should run these tests before any load testing. If your application is not performing well, the load test will not go well either.


Why should these tests be run regularly as part of routine maintenance? To ensure performance doesn't degrade with a code or configuration change?

In general, I'd favor suggesting that clients:

* refer to New Relic reports regularly to identify improvements or degradation of performance,
* perform performance tests occasionally to proactively expose potential bottlenecks and to identify opportunities for optimization, and
* perform load tests in advance of anticipated major-traffic events, or prior to launching sites after major overhauls, remembering to provide enough time to fix any issues identified.

Added some of these notions.

ari-gold · 2017-01-30T23:28:20Z

source/_docs/load-and-performance-testing.md

+  * [JMeter](http://jmeter.apache.org)
+  * [Locust](http://locust.io/)
+
+  The Pantheon onboarding team uses Locust, an open source load testing tool. Locust makes it easy to build out test scripts, and it allows you to crawl the site instead of using predefined URLs. Crawling the site has the added benefit of loading every page that is linked to anywhere on the site. This exposes edge-case performance bottlenecks that would have gone undetected under tests with predefined URLs.


"makes it easy" -- link to example script?

The EOM team should be asked to update this section.
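Pending a real example script from the EOM team, the crawling idea can be illustrated with a standard-library sketch. This is not Locust's API and not the onboarding team's script; it only shows the step a crawling test would repeat: take a fetched page, collect its same-site links, and use them as the next requests. All names and URLs are hypothetical.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags in one HTML page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_site_links(page_url: str, html: str) -> List[str]:
    """Return absolute same-host URLs linked from the page, in first-seen order."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(page_url).netloc
    seen = []
    for href in collector.hrefs:
        # Resolve relative links and drop fragments so /about and /about#team match.
        absolute = urljoin(page_url, href).split("#")[0]
        if urlparse(absolute).netloc == host and absolute not in seen:
            seen.append(absolute)
    return seen


sample = (
    '<a href="/about">About</a> '
    '<a href="https://other.example/x">X</a> '
    '<a href="/about#team">Team</a>'
)
print(same_site_links("https://example.com/", sample))
# ['https://example.com/about']
```

A load-testing tool would call something like this on each response body and queue the returned URLs, which is how pages that nobody listed up front end up in the test.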

ari-gold · 2017-01-31T00:29:39Z

source/_docs/load-and-performance-testing.md

+
+  The Pantheon onboarding team uses Locust, an open source load testing tool. Locust makes it easy to build out test scripts, and it allows you to crawl the site instead of using predefined URLs. Crawling the site has the added benefit of loading every page that is linked to anywhere on the site. This exposes edge-case performance bottlenecks that would have gone undetected under tests with predefined URLs.
+
+  Ultimately, it doesn't matter what tool you use as long as you test your site properly. Be sure to allow for any authenticated traffic as well as anonymous.


"Be sure to allow for any authenticated traffic as well as anonymous" - Not sure we should just assert this in passing. Load testing authenticated users can be difficult.

I agree that authenticated user testing is a complex task, so the generic statement should be along the lines of: "It is important for load testing to test against the anticipated traffic patterns of the site, both in terms of traffic volume and the proportion of authenticated to anonymous traffic. Note that testing authenticated workflows is considerably more complex, requiring more time, skills, and iterations."
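The "authenticated/anonymous proportion" part of that statement can be sketched as a simple split of the concurrent-user target. The 20% share below is a made-up figure; in practice it would come from the site's analytics, and the function name is hypothetical.

```python
def split_users(total_users: int, authenticated_share: float) -> tuple:
    """Split a concurrent-user target into (authenticated, anonymous) counts."""
    if not 0 <= authenticated_share <= 1:
        raise ValueError("authenticated_share must be between 0 and 1")
    authenticated = round(total_users * authenticated_share)
    return authenticated, total_users - authenticated


# Illustrative numbers only: 150 concurrent users, 20% of them logged in.
print(split_users(150, 0.2))  # (30, 120)
```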

ari-gold · 2017-01-31T00:34:44Z

source/_docs/load-and-performance-testing.md

+
+3. Determine how much load to apply.
+
+  * **Performance Tests**: Smaller loads should suffice, as you should be able to see transactional bottlenecks with 10-20 concurrent users.


Why 10-20? A single request can give you all you need, no?
We should explain how to use tools like Google Dev Tools for website performance optimization or at least link to resources like:
https://hpbn.co/
https://www.udacity.com/course/website-performance-optimization--ud884

IMO, you want to generate more than a single request to tease out potential bottlenecks.

Also, I know that we have a Quicksilver example that uses a free loader.io account to automatically run this level of test on each push to the Test environment. Not only does this automate the testing procedure, it also provides a standard profile that you can see in New Relic. Here's a related link, but we need a better one: pantheon-systems/quicksilver-examples#110

IMO, this is good to go (i.e. no edits needed). A separate issue should be created, if/when we want to include reference to the loader.io Quicksilver example.

ari-gold · 2017-01-31T00:36:21Z

source/_docs/load-and-performance-testing.md

-
-High-performance is the ability to deliver a page in under a second; scalability is the ability to deliver that page in under a second for many requests. It's important to understand the difference between these two dimensions and that there are trade-offs between performance and scalability.
-
-## Verify Varnish is Working


Seems like verifying Varnish is working is still important before doing a load test? Maybe this can be more concise?

Is this still the case now that Global CDN is in place?

rachelwhitton · 2017-03-16T22:38:58Z

We're going to deploy this as an iterative improvement and circle back to address suggestions not implemented here. See #2251 to track them.

Ben Routson and others added 3 commits January 18, 2017 15:25

Rewrite the performance test documentation

2616395

Merge branch 'load-test-1851' of https://github.com/bentekwork/docume…

3391233

…ntation into bentekwork-load-test-1851

Copy edits, first pass

8a36c5f

rachelwhitton added the WIP label Jan 26, 2017

rachelwhitton self-assigned this Jan 26, 2017

rachelwhitton added the review label Jan 26, 2017

rachelwhitton commented Jan 26, 2017


rachelwhitton and others added 3 commits January 26, 2017 12:19

Adjust headers

a3beb7c

Calculate capacity after launch

9dc4568

copy edits

9e575f2

adamedgmond approved these changes Jan 30, 2017


ari-gold suggested changes Jan 31, 2017


rachelwhitton assigned bentekwork and unassigned rachelwhitton Feb 1, 2017

rachelwhitton mentioned this pull request Mar 16, 2017

Implement Feedback on Load and Performance doc #2251

Closed

rachelwhitton merged commit 63a07f0 into master Mar 16, 2017

rachelwhitton mentioned this pull request Mar 16, 2017

Update load-and-performance-testing.md #2249

Closed

obicke mentioned this pull request Sep 8, 2017

Implement Feedback on Performance testing doc. #2870

Merged

2 tasks

alexfornuto deleted the bentekwork-load-test-1851 branch October 13, 2017 14:50

