RPS is lower when queue-proxy in request path #15627

Open
jokerwenxiao opened this issue Nov 25, 2024 · 3 comments
Labels
kind/question Further information is requested

Comments


jokerwenxiao commented Nov 25, 2024

Same question as #10085.
When I curl pod-ip:user-container-port directly, RPS is normal, but when I curl pod-ip:queue-proxy-port, RPS is lower.

root@master:~# hey -z 60s -c 70 http://172.22.28.105

Summary:
  Total:        60.0059 secs
  Slowest:      0.2129 secs
  Fastest:      0.0007 secs
  Average:      0.0064 secs
  Requests/sec: 10906.5445

  Total data:   53665474 bytes
  Size/request: 82 bytes

Response time histogram:
  0.001 [1]     |
  0.022 [654035]        |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.043 [313]   |
  0.064 [74]    |
  0.086 [3]     |
  0.107 [1]     |
  0.128 [11]    |
  0.149 [13]    |
  0.170 [0]     |
  0.192 [5]     |
  0.213 [1]     |


Latency distribution:
  10% in 0.0057 secs
  25% in 0.0061 secs
  50% in 0.0064 secs
  75% in 0.0066 secs
  90% in 0.0070 secs
  95% in 0.0072 secs
  99% in 0.0090 secs

Details (average, fastest, slowest):
  DNS+dialup:   0.0000 secs, 0.0007 secs, 0.2129 secs
  DNS-lookup:   0.0000 secs, 0.0000 secs, 0.0000 secs
  req write:    0.0000 secs, 0.0000 secs, 0.0071 secs
  resp wait:    0.0063 secs, 0.0004 secs, 0.2128 secs
  resp read:    0.0001 secs, 0.0000 secs, 0.0280 secs

Status code distribution:
  [200] 654457 responses



root@master:~# hey -z 60s -c 70 http://172.22.28.105:8012

Summary:
  Total:        60.0100 secs
  Slowest:      0.2062 secs
  Fastest:      0.0011 secs
  Average:      0.0112 secs
  Requests/sec: 6232.7635

  Total data:   30670296 bytes
  Size/request: 82 bytes

Response time histogram:
  0.001 [1]     |
  0.022 [354717]        |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.042 [19194] |■■
  0.063 [38]    |
  0.083 [35]    |
  0.104 [1]     |
  0.124 [2]     |
  0.145 [1]     |
  0.165 [2]     |
  0.186 [28]    |
  0.206 [9]     |


Latency distribution:
  10% in 0.0062 secs
  25% in 0.0073 secs
  50% in 0.0095 secs
  75% in 0.0141 secs
  90% in 0.0190 secs
  95% in 0.0217 secs
  99% in 0.0257 secs

Details (average, fastest, slowest):
  DNS+dialup:   0.0000 secs, 0.0011 secs, 0.2062 secs
  DNS-lookup:   0.0000 secs, 0.0000 secs, 0.0000 secs
  req write:    0.0000 secs, 0.0000 secs, 0.0097 secs
  resp wait:    0.0111 secs, 0.0010 secs, 0.2061 secs
  resp read:    0.0000 secs, 0.0000 secs, 0.0282 secs

Status code distribution:
  [200] 374028 responses


queue-proxy resource config:
  queue-sidecar-cpu-limit: 20000m
  queue-sidecar-cpu-request: 20000m
  queue-sidecar-ephemeral-storage-limit: 2048Mi
  queue-sidecar-ephemeral-storage-request: 2048Mi
  queue-sidecar-memory-limit: 2048Mi
  queue-sidecar-memory-request: 2048Mi

To improve the RPS through queue-proxy, I have set the queue-proxy container's resource allocation very high. With the default queue-proxy resource settings in the config-deployment ConfigMap, RPS is less than 100.
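
For reference, a minimal sketch of how these keys sit in the config-deployment ConfigMap (name and namespace assume a default Knative Serving install; the values are the ones listed above):

apiVersion: v1
kind: ConfigMap
metadata:
  name: config-deployment
  namespace: knative-serving
data:
  # Queue-proxy sidecar resources, applied cluster-wide to every revision.
  queue-sidecar-cpu-request: "20000m"
  queue-sidecar-cpu-limit: "20000m"
  queue-sidecar-memory-request: "2048Mi"
  queue-sidecar-memory-limit: "2048Mi"
  queue-sidecar-ephemeral-storage-request: "2048Mi"
  queue-sidecar-ephemeral-storage-limit: "2048Mi"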

Knative version: v1.1.2

jokerwenxiao added the kind/question (Further information is requested) label on Nov 25, 2024
skonto (Contributor) commented Nov 26, 2024

Hi @jokerwenxiao, in cases of CPU contention you may have to configure the queue-proxy (QP) with higher resources; see this old ticket as well. What is the user container doing (is it a hello-world app or something more computationally expensive)? What resources are assigned to the user container? What is the CPU utilization on the node where the pod is running?
Btw, when you hit the QP your request goes through one more hop and two containers are using the CPU, so there is some overhead anyway.

jokerwenxiao (Author) commented

Hi @skonto, this is my user-container code:

// main.go
package main

import (
	"log"
	"os"

	"github.com/valyala/fasthttp"
)

// requestHandler echoes the pod hostname and the "param" query parameter.
func requestHandler(ctx *fasthttp.RequestCtx) {
	args := ctx.QueryArgs()
	hostname := os.Getenv("HOSTNAME")
	ctx.WriteString("response from host " + hostname + ", query parameter is " + string(args.Peek("param")))
}

func main() {
	address := ":80"
	log.Printf("Starting server on %s", address)
	if err := fasthttp.ListenAndServe(address, requestHandler); err != nil {
		log.Fatalf("Error starting server: %s", err)
	}
}

user-container resources:

    Limits:
      cpu:     2
      memory:  4G
    Requests:
      cpu:     2
      memory:  4G

user-container CPU utilization:
[screenshot: user-container CPU utilization, 2024-11-27]

skonto (Contributor) commented Nov 27, 2024

Could you run kubectl describe node <node> and list the running pods' utilization? How many CPUs do you have on the node? Could you also run the user container with lower resources and report back (it seems you are allocating a lot to the user container)? In general, QP does several things, e.g. proxying, draining requests, emitting metrics, etc. That means there is a penalty to pay, and that is why the queue.sidecar.serving.knative.dev/resource-percentage annotation was introduced in the past, with an upper bound, to allow flexibility in how much is allocated to the QP relative to the user container's resources.
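
For illustration, a minimal sketch of where that annotation goes on a Knative Service revision template (the service name, image, and percentage value are placeholders, not taken from this issue, and this assumes the annotation is supported by your Knative version):

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: fasthttp-echo               # hypothetical service name
spec:
  template:
    metadata:
      annotations:
        # Size the queue-proxy sidecar as a percentage of the user
        # container's resource requests (example value only).
        queue.sidecar.serving.knative.dev/resource-percentage: "50"
    spec:
      containers:
        - image: example.com/fasthttp-echo:latest   # placeholder image
          resources:
            requests:
              cpu: "2"
              memory: 4G
            limits:
              cpu: "2"
              memory: 4G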
