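Preface
Our production environment hit an error caused by concurrency: calls to the service timed out.
Investigation showed that in a Flask + uWSGI deployment, the request backlog queue (the listen queue) defaults to only 100 entries. If the service receives a burst of requests in a short time and cannot process them fast enough, the backlog grows past 100 and the system stops responding to further requests.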
Note that a “listen backlog” of 100 connections doesn’t mean that your server can only handle 100 simultaneous (or total) connections - this is instead dependent on the number of configured processes or threads. The listen backlog is a socket setting telling the kernel how to limit the number of outstanding (as yet unaccepted) connections in the listen queue of a listening socket. If the number of pending connections exceeds the specified size, new ones are automatically rejected. A functioning server regularly servicing its connections should not require a large backlog size.
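So there are two angles for optimization:
(1) Increase the size of the backlog queue, from 100 to 1000 or more (the configuration below uses 10240)
(2) Increase Flask's processing throughput by raising the number of processes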
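The effect of the two knobs can be illustrated with a deliberately simplified model. This is only a rough sketch: it assumes every request in a burst arrives at the same instant, that each process handles one request at a time, and it ignores kernel details such as retries. The function names and the "before" value of 4 processes are hypothetical, chosen only for illustration.

```python
import math


def overflow(burst: int, workers: int, backlog: int) -> int:
    """Connections that fit neither in a worker nor in the listen queue."""
    return max(0, burst - workers - backlog)


def drain_seconds(queued: int, workers: int, service_time: float) -> float:
    """Rough time for `workers` processes to work through `queued` requests."""
    return math.ceil(queued / workers) * service_time


if __name__ == "__main__":
    burst = 500  # hypothetical burst size
    # (workers, backlog): an assumed "before" setup vs. the tuned values
    for workers, backlog in [(4, 100), (8, 10240)]:
        print(
            f"workers={workers} backlog={backlog} "
            f"overflow={overflow(burst, workers, backlog)}"
        )
```

A larger backlog only lets requests wait instead of being rejected. Raising the number of processes is what shortens the wait, which is why the two changes are applied together.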
Solution
1. Modify the uwsgi configuration file
[uwsgi]
module = main
callable = app
processes = 8
master = 1
listen = 10240
Set processes to 8 and listen to 10240.
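As a quick sanity check before deploying, the two tuned values can be read back from the ini text. The sketch below uses Python's standard `configparser`, which is not uWSGI's own parser and may reject ini files that repeat a key, so treat it as an approximation. The helper name `load_uwsgi_settings` is hypothetical. The fallbacks mirror the uWSGI defaults of 1 process and a listen queue of 100.

```python
import configparser

# The configuration from this section, inlined so the sketch is self-contained.
UWSGI_INI = """\
[uwsgi]
module = main
callable = app
processes = 8
master = 1
listen = 10240
"""


def load_uwsgi_settings(text: str) -> dict:
    """Return the process count and listen backlog from uwsgi ini text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["uwsgi"]
    return {
        "processes": section.getint("processes", fallback=1),
        "listen": section.getint("listen", fallback=100),
    }


if __name__ == "__main__":
    print(load_uwsgi_settings(UWSGI_INI))
```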
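2. Raise the host's network connection limit
On Linux the kernel silently caps a socket's listen backlog at net.core.somaxconn, so this limit must be at least as large as the uwsgi listen value.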
sysctl -w net.core.somaxconn=10240
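To confirm the setting took effect, the current limit can be read from `/proc`. The sketch below assumes a Linux system. The helper names are hypothetical, and `effective_backlog` simply models the behaviour documented in listen(2), where a backlog larger than `somaxconn` is capped to it. Run inside a container, it reports the value for that container's network namespace.

```python
from pathlib import Path
from typing import Optional

SOMAXCONN_PATH = Path("/proc/sys/net/core/somaxconn")


def read_somaxconn(path: Path = SOMAXCONN_PATH) -> Optional[int]:
    """Return the kernel's somaxconn value, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def effective_backlog(requested: int, somaxconn: int) -> int:
    """Backlog the kernel would actually apply for a requested value."""
    return min(requested, somaxconn)


if __name__ == "__main__":
    limit = read_somaxconn()
    if limit is None:
        print("could not read somaxconn on this system")
    else:
        print("somaxconn =", limit)
        print("effective backlog for listen=10240:", effective_backlog(10240, limit))
```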
For Docker container deployments, set the kernel parameter inside the container instead.
For docker-compose:
https://www.runoob.com/docker/docker-compose.html
sysctls:
For k8s:
https://kubernetes.io/zh/docs/tasks/administer-cluster/sysctl-cluster/
apiVersion: v1