
Milestone 2.1 With 1 pod.


System Details: All the tests below were run on a local machine with a memory limit of 2 GB and 2 CPUs.

Kubernetes is started and each service is scaled to one container: (screenshots of the running pods)
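
For reference, scaling each service down to a single replica can be done with kubectl as in the sketch below; the deployment names are placeholders and may not match the ones used in this project.

```sh
# Scale each service's deployment down to a single pod (deployment names are hypothetical)
kubectl scale deployment weather-forecast --replicas=1
kubectl scale deployment plotting-service --replicas=1

# Confirm that exactly one pod per service is running
kubectl get pods
```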

Testing results of the weather forecast service:

Summary:

| No. of users | Ramp-up duration (s) | Success rate | Comment |
|---|---|---|---|
| 100 | 10 | 100% | Application is stable |
| 500 | 10 | 20% | Application was down |
| 100 | 60 | 100% | Application is stable |
| 500 | 60 | 93% | Application became unresponsive for a few seconds |
| 1000 | 60 | 61% | Application was down for some time due to the massive load |

We use a third-party API (Open Weather API) internally to forecast certain weather data. Since our application uses the basic (unpaid) tier, the results depend on the responsiveness of this external API. With just one pod/replica, the weather forecasting service was able to handle 100 users in 60 seconds successfully.
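
As a rough illustration of this external dependency (not the exact code of the service), a forecast lookup against the free OpenWeather tier looks roughly like the sketch below; the endpoint, parameters, and function name are assumptions.

```python
import os
import requests

# Free-tier 5-day/3-hour forecast endpoint of OpenWeather (assumed to be the one used)
OPENWEATHER_URL = "https://api.openweathermap.org/data/2.5/forecast"

def fetch_forecast(city: str) -> dict:
    """Call the external OpenWeather API. On the unpaid tier, its latency and
    rate limits directly affect our own response times under load."""
    params = {"q": city, "appid": os.environ["OPENWEATHER_API_KEY"], "units": "metric"}
    resp = requests.get(OPENWEATHER_URL, params=params, timeout=5)
    resp.raise_for_status()  # surfaces failures when the external API is slow or throttling
    return resp.json()
```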

Detailed Analysis:

100 users, ramp-up time 60 seconds.

100% success rate. (screenshot)

Response Time Graph: (screenshot)

With 100 users in 60 seconds there were no failures.


500 users, ramp-up time 60 seconds.

Summary: (screenshot)

Statistics: (screenshot)

Response Time: (screenshot)

With this load, the weather forecasting service started to crash: 36 of the 500 users had their requests fail.


1000 users, ramp-up time 60 seconds.

Summary: (screenshot)

Statistics: (screenshot)

Response Time: (screenshot)

Total Transactions (Pass vs. Fail): (screenshot)

With 1000 users in 60 seconds the weather forecasting service was down several times, and consequently the response times for some users increased.


Plotting Service:

Testing results of the plotting and download service:

Summary:

| No. of users | Ramp-up duration (s) | Success rate | Comment |
|---|---|---|---|
| 100 | 10 | 100% | Application is stable |
| 1000 | 10 | 74% | Application became unresponsive for a few seconds |
| 1000 | 100 | 100% | Application is stable but became a bit laggy |
| 3000 | 100 | 61% | Application was down intermittently due to the massive load |

Note: The reason the response times are better, and sometimes under 1–10 seconds, is the local cache. Every time an image is processed, we store the request metadata (start_date, end_date, station) along with the publicly accessible URL for the plot. Hence only the initial request takes time; subsequent requests for the same data are served directly from the local database without performing the memory-intensive plotting again.
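
A minimal sketch of that caching idea, assuming a simple local table keyed by the request metadata (the table, column, and function names below are illustrative, not the project's actual schema):

```python
import sqlite3

# Illustrative local cache: maps (start_date, end_date, station) to the public plot URL.
conn = sqlite3.connect("plot_cache.db")
conn.execute("""CREATE TABLE IF NOT EXISTS plot_cache (
    start_date TEXT, end_date TEXT, station TEXT, plot_url TEXT,
    PRIMARY KEY (start_date, end_date, station))""")

def get_or_create_plot(start_date, end_date, station, render_plot):
    """Return a cached plot URL if present; otherwise render the plot once and cache it."""
    row = conn.execute(
        "SELECT plot_url FROM plot_cache WHERE start_date=? AND end_date=? AND station=?",
        (start_date, end_date, station)).fetchone()
    if row:
        return row[0]  # cache hit: skip the memory-intensive plotting
    url = render_plot(start_date, end_date, station)  # expensive path, only on the first request
    conn.execute("INSERT INTO plot_cache VALUES (?, ?, ?, ?)",
                 (start_date, end_date, station, url))
    conn.commit()
    return url
```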

100 users in 10 seconds:

(screenshot)

Success rate: 100%

Response Time: (screenshot)

1000 users in 10 seconds:

Requests Summary: (screenshot)

Transactions per Second: (screenshot)

Statistics: (screenshot)

1000 users in 100 seconds:

Statistics: (screenshot)

Response Times: (screenshot)

Observation: 1000 users in 10 seconds (an arrival rate of roughly 100 requests per second) is a massive load for a 2 GB, 2-CPU machine running many memory-intensive tasks. With 1000 users spread over 100 seconds, however, the service performed well without any errors, since the transactions per second are reduced by a factor of 10.

3000 users in 100 seconds:

Response over time: (screenshot)

This is an interesting graph; it resembles the one from the lecture on ZooKeeper (a cluster of five ZooKeeper servers with manually injected failures). Here, as the load gradually increased, the Kubernetes pod crashed and was restarted. Once the pod was restarted, the application became stable for some time and eventually ran into the same load issue again.

Conclusion:

  • Spike testing was performed on the plotting and data retrieval service by using small ramp-up times: a huge number of requests is sent in just 10 seconds (a sketch of such a run is included after this list). As seen above, the maximum number of users the application could handle in 10 seconds was roughly 100–150.
  • Load testing was performed on the plotting and data retrieval service with increasing load. We tested with 100, 500, and up to 5k concurrent users. The plotting service was able to handle 1000 users in 100 seconds, whereas the weather forecast service was able to handle 100 users in 60 seconds.
  • With one pod the results are not great: when the pod crashes, all incoming requests are lost until the pod comes back alive.
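
As a sketch of how such a ramp-up run can be driven (assuming an Apache JMeter test plan; the plan file and property names below are placeholders, since the exact tool setup is not documented on this page):

```sh
# Non-GUI JMeter run: -n = no GUI, -t = test plan, -J = set a property, -l = results file
jmeter -n -t load_test.jmx -Jusers=1000 -Jrampup=60 -l results_1000u_60s.jtl

# Build the HTML dashboard (response times, transactions per second) from the results
jmeter -g results_1000u_60s.jtl -o report_1000u_60s
```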