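The server used to show TCP timeouts (after a dozen or so wget requests, one would time out). After re-plugging the network cable and rebooting, it went back to normal.
It now sees nearly 20,000 users online per hour, so here are some notes on what I did.
System information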
======================================
dell R410
2 x E5504 (4 cores, 2.0 GHz), 2 x 4 GB RAM, 2 x 146 GB 15K SAS
Intel(R) Xeon(R) CPU E5504 @ 2.00GHz
centos 5.2 64bit
nginx+php+mysql
nginx: 6 processes
php: 96 processes
Linux bora 2.6.18-128.el5 #1 SMP Wed Jan 21 10:41:14 EST 2009 x86_64 x86_64 x86_64 GNU/Linux
sysctl kernel parameters
net.ipv4.tcp_max_syn_backlog = 65536
net.core.netdev_max_backlog = 32768
net.core.somaxconn = 32768
net.core.wmem_default = 8388608
net.core.rmem_default = 8388608
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 2
net.ipv4.tcp_tw_recycle = 1
#net.ipv4.tcp_tw_len = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_mem = 94500000 915000000 927000000
net.ipv4.tcp_max_orphans = 3276800
#net.ipv4.tcp_fin_timeout = 30
#net.ipv4.tcp_keepalive_time = 120
net.ipv4.ip_local_port_range = 1024 65535
Raising the file handle limits
vi /etc/security/limits.conf
* soft nofile 51200
* hard nofile 51200
vi /etc/rc.local
ulimit -SHn 51200
===================
Active connections: 2419
server accepts handled requests
73668795 73668795 232420556
Reading: 11 Writing: 28 Waiting: 2380
Members online: 13433 in total
top - 13:29:01 up 33 days, 22:53, 2 users, load average: 1.22, 1.60,
Tasks: 265 total, 1 running, 264 sleeping, 0 stopped, 0 zombie
Cpu(s): 9.3%us, 1.4%sy, 0.0%ni, 88.2%id, 0.5%wa, 0.0%hi, 0.6%si,
Mem: 8168412k total, 6691148k used, 1477264k free, 917728k buffe
Swap: 4096532k total, 228k used, 4096304k free, 3841696k cache
netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
TIME_WAIT 4845
SYN_SENT 1
FIN_WAIT1 185
ESTABLISHED 2698
FIN_WAIT2 381
SYN_RECV 162
CLOSING 5
LAST_ACK 137
netstat -n |wc -l
8441
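The per-state count above can be wrapped in a small helper that also sorts the states by frequency. This is only a sketch: the function name `tcp_states` is made up for illustration, and it reads `netstat -n` output from stdin.

```shell
# Count TCP connections per state from `netstat -n` output on stdin,
# most common state first. (tcp_states is a hypothetical helper name.)
tcp_states() {
    awk '/^tcp/ { ++count[$NF] } END { for (s in count) print count[s], s }' | sort -rn
}
```

Usage: `netstat -n | tcp_states`. Non-TCP lines (headers, unix sockets) are ignored, which is why the per-state total is smaller than `netstat -n | wc -l`.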
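Changing tcp_no_metrics_save
By default, when a TCP connection closes, the kernel saves some of its parameters, such as the slow-start threshold (snd_ssthresh), the congestion window (snd_cwnd) and srtt, in the dst_entry. As long as the dst_entry is still valid, the next connection to the same destination can be initialized from the saved values. The tcp_no_metrics_save option is normally off (0), so the metrics are saved.
echo '1' > /proc/sys/net/ipv4/tcp_no_metrics_save
Still timing out.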
vi /etc/sysctl.conf
Add one line:
net.ipv4.tcp_no_metrics_save =1
sysctl -p
Five minutes after enabling it, the system load had risen slightly.
Active connections: 2520
server accepts handled requests
73715479 73715479 232593589
Reading: 11 Writing: 14 Waiting: 2495
top - 13:46:30 up 33 days, 23:10, 2 users, load average: 1.49, 1.74, 1.64
Tasks: 265 total, 3 running, 262 sleeping, 0 stopped, 0 zombie
Cpu(s): 13.8%us, 2.4%sy, 0.0%ni, 82.9%id, 0.1%wa, 0.1%hi, 0.7%si, 0.0
Mem: 8168412k total, 7225928k used, 942484k free, 925236k buffers
Swap: 4096532k total, 228k used, 4096304k free, 3893016k cached
vi /etc/sysctl.conf
Change it back to:
net.ipv4.tcp_no_metrics_save =0
sysctl -p
After five minutes the load dropped again; net.ipv4.tcp_no_metrics_save did not help much.
top - 13:52:18 up 33 days, 23:16, 2 users, load average: 1.18, 1.46, 1.56
Tasks: 265 total, 1 running, 264 sleeping, 0 stopped, 0 zombie
Cpu(s): 12.9%us, 2.1%sy, 0.0%ni, 83.5%id, 0.8%wa, 0.1%hi, 0.6%si, 0.0
Mem: 8168412k total, 7274212k used, 894200k free, 927504k buffers
Swap: 4096532k total, 228k used, 4096304k free, 3911832k cached
=======================================
cat /proc/sys/net/ipv4/tcp_fin_timeout
60
cat /proc/sys/net/ipv4/tcp_keepalive_time
7200
vi /etc/sysctl.conf
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_keepalive_time = 120
sysctl -p
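The "edit /etc/sysctl.conf, then sysctl -p" step is repeated several times in these notes. A rough sketch of doing the edit non-interactively is below; `set_sysctl_conf` is a hypothetical helper, not a standard tool. It rewrites an existing `key = value` line or appends one, and leaves commented-out lines untouched.

```shell
# Set KEY = VALUE in a sysctl-style file (hypothetical helper).
# An existing uncommented line for KEY is rewritten; extra duplicates are
# dropped; if KEY is absent, a new line is appended.
set_sysctl_conf() {
    key=$1 value=$2 file=$3
    tmp=$(mktemp) || return 1
    if awk -v k="$key" -v v="$value" '
        {
            split($0, parts, "=")
            name = parts[1]
            gsub(/[ \t]/, "", name)      # strip blanks around the key
            if (name == k) {
                if (!found) print k " = " v
                found = 1
            } else {
                print $0
            }
        }
        END { if (!found) print k " = " v }
    ' "$file" > "$tmp"; then
        cat "$tmp" > "$file"             # keeps the file owner and mode
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return $status
}
```

Usage, assuming root: `set_sysctl_conf net.ipv4.tcp_fin_timeout 30 /etc/sysctl.conf && sysctl -p`.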
========================
ifconfig eth0 txqueuelen 1000
I checked, and it was already 1000:
ifconfig
TX packets:1924158031 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:339974963014 (316.6 GiB) TX bytes:2165156209254 (1.9 TiB)
================================
cat /proc/sys/net/ipv4/netfilter/ip_conntrack_max
65536
=========================
cat /proc/sys/fs/file-nr
4590 0 765985
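The three fields of /proc/sys/fs/file-nr are the number of allocated file handles, the number of allocated but unused handles, and the system-wide maximum. A small sketch for turning that line into a usage figure follows; `file_nr_usage` is an illustrative name only.

```shell
# Read one /proc/sys/fs/file-nr line from stdin and print how much of the
# system-wide file handle limit is allocated. (Hypothetical helper.)
file_nr_usage() {
    awk '{ printf "%d of %d handles allocated (%.2f%%)\n", $1, $3, 100 * $1 / $3 }'
}
```

Usage: `file_nr_usage < /proc/sys/fs/file-nr`. With the values above it reports well under 1% of the limit in use.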
=======================================
cat /proc/sys/net/ipv4/route/gc_interval
60
cat /proc/sys/net/ipv4/route/gc_timeout
300
cat /proc/sys/net/ipv4/route/gc_elasticity
8
eaccelerator
[eaccelerator]
zend_extension="/opt/php/lib/php/extensions/no-debug-non-zts-20060613/eaccelerator.so"
eaccelerator.shm_size="32"
eaccelerator.cache_dir="/opt/php/eaccelerator_cache"
eaccelerator.enable="1"
eaccelerator.optimizer="1"
eaccelerator.check_mtime="1"
eaccelerator.debug="0"
eaccelerator.filter=""
eaccelerator.shm_max="0"
eaccelerator.shm_ttl="3600"
eaccelerator.shm_prune_period="3600"
eaccelerator.shm_only="0"
eaccelerator.compress="1"
eaccelerator.compress_level="9"
nginx
user www website;
worker_processes 6;
error_log /var/log/nginx/nginx_error.log crit;
pid /dev/shm/nginx.pid;
#Specifies the value for maximum file descriptors that can be opened by this process.
worker_rlimit_nofile 51200;
events
{
use epoll;
worker_connections 51200;
}
http
{
include mime.types;
default_type application/octet-stream;
log_format access '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" $http_x_forwarded_for';
server_names_hash_bucket_size 128;
client_header_buffer_size 32k;
large_client_header_buffers 4 32k;
client_body_timeout 60;
client_max_body_size 8m;
#linux 2.4+
sendfile on;
tcp_nopush on;
tcp_nodelay on;
server_name_in_redirect off;
keepalive_timeout 60;
fastcgi_intercept_errors on;
fastcgi_hide_header X-Powered-By;
fastcgi_connect_timeout 180;
fastcgi_send_timeout 180;
fastcgi_read_timeout 180;
fastcgi_buffer_size 128k;
fastcgi_buffers 4 128K;
fastcgi_busy_buffers_size 128k;
fastcgi_temp_file_write_size 128k;
fastcgi_temp_path /dev/shm;
gzip on;
gzip_min_length 1k;
gzip_comp_level 5;
gzip_buffers 4 16k;
gzip_http_version 1.1;
gzip_types text/plain application/x-javascript text/css application/xml;
limit_zone one $binary_remote_addr 10m;
server
{
listen 80;
server_name bbs.xxx.com *.bbs.xxx.com;
index index.html index.htm index.php;
root /opt/lampp/htdocs/bbs;
error_page 404 403 /404.html;
location ~/\.ht {
deny all;
}
location ~ /bbs/attachment\.php?$ {
include fcgi.conf;
fastcgi_pass 127.0.0.1:9000;
fastcgi_index index.php;
limit_conn one 1;
limit_rate 30k;
}
location ~ .*\.php?$
{
#fastcgi_pass unix:/tmp/php-cgi.sock;
fastcgi_pass 127.0.0.1:9000;
fastcgi_index index.php;
include fcgi.conf;
}
rewrite ^(.*)/archiver/((fid|tid)-[\w\-]+\.html)$ $1/archiver/index.php?$2 last;
rewrite ^(.*)/forum-([0-9]+)-([0-9]+)\.html$ $1/forumdisplay.php?fid=$2&page=$3 last;
rewrite ^(.*)/thread-([0-9]+)-([0-9]+)-([0-9]+)\.html$ $1/viewthread.php?tid=$2&extra=page\%3D$4&page=$3 last;
rewrite ^(.*)/profile-(username|uid)-(.+)\.html$ $1/viewpro.php?$2=$3 last;
rewrite ^(.*)/space-(username|uid)-(.+)\.html$ $1/space.php?$2=$3 last;
location ~(favicon.ico) {
log_not_found off;
expires 99d;
break;
}
location ~(robots.txt) {
log_not_found off;
expires 7d;
break;
}
location ~* ^.+\.(jpg|jpeg|gif|png|swf|rar|zip|css|js)$ {
valid_referers none blocked *.xxx.com *.xxx.net localhost;
if ($invalid_referer) {
rewrite ^/ ;
return 412;
}
access_log off;
root /opt/lampp/htdocs/bbs;
expires 7d;
break;
}
access_log /var/log/nginx/bbs.xxx.com.log access;
}
}
==================
18518 users online in total
top - 17:40:04 up 5:49, 2 users, load average: 1.81, 2.08, 2.20
Tasks: 265 total, 6 running, 259 sleeping, 0 stopped, 0 zombie
Cpu(s): 26.0%us, 3.4%sy, 0.0%ni, 69.0%id, 0.7%wa, 0.0%hi, 0.9%si,
Mem: 8168412k total, 7173156k used, 995256k free, 486980k buffe
Swap: 4096532k total, 0k used, 4096532k free, 3559140k cache
Active connections: 2306
server accepts handled requests
297904 297904 980053
Reading: 7 Writing: 10 Waiting: 2289
netstat -n|wc -l
8157
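In the nginx status output, "Active connections" is the sum of Reading, Writing and Waiting, and requests divided by handled gives the average number of requests per connection (keep-alive reuse). The sketch below extracts both from the status text on stdin; `nginx_status_summary` is a made-up name, and the input is assumed to have the four-line stub_status layout shown above.

```shell
# Summarise nginx stub_status output read from stdin (hypothetical helper).
nginx_status_summary() {
    awk '
        /^Active connections:/          { active = $3 }
        /^ *[0-9]+ +[0-9]+ +[0-9]+ *$/  { handled = $2; requests = $3 }
        /^Reading:/                     { states = $2 + $4 + $6 }
        END {
            printf "active=%d states_sum=%d requests_per_connection=%.2f\n",
                   active, states, (handled ? requests / handled : 0)
        }'
}
```

Fed the status block above, this gives 2306 active connections (7 + 10 + 2289) and roughly 3.3 requests per connection.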
sysctl
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.shmmax = 68719476736
kernel.shmall = 4294967296
net.ipv4.tcp_max_syn_backlog = 65536
net.core.netdev_max_backlog = 32768
net.core.somaxconn = 32768
net.ipv4.tcp_max_tw_buckets = 10000
net.core.wmem_default = 8388608
net.core.rmem_default = 8388608
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 2
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_mem = 196608 262144 393216
net.ipv4.tcp_max_orphans = 3276800
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_keepalive_time = 120
net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_syncookies = 1
===================================
Update, 2009/11/2
The network has been somewhat unstable lately, and the system log contains the following messages:
tail /var/log/messages
Nov 1 19:37:32 bora kernel: ip_conntrack: table full, dropping packet.
Nov 1 19:38:31 bora kernel: ip_conntrack: table full, dropping packet.
Nov 1 19:43:15 bora kernel: ip_conntrack: table full, dropping packet.
Nov 1 20:38:31 bora kernel: ip_conntrack: table full, dropping packet.
Nov 1 20:42:16 bora last message repeated 2 times
Nov 1 20:51:42 bora last message repeated 2 times
Nov 1 21:23:38 bora last message repeated 3 times
Nov 1 21:28:14 bora last message repeated 2 times
cat /var/log/messages |grep 'ip_conntrack: table full'
cat /var/log/messages |grep 'syslogd 1.4.1: restart'
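To see how the "table full" messages are spread over time, the log can be grouped by day. A minimal sketch, with `conntrack_full_per_day` as an illustrative name; it counts only lines that contain the message itself, so syslog's "last message repeated N times" lines are not added in.

```shell
# Count "ip_conntrack: table full" lines per day from syslog text on stdin.
# The key is the month and day (first two fields). (Hypothetical helper.)
conntrack_full_per_day() {
    awk '/ip_conntrack: table full/ { ++n[$1 " " $2] }
         END { for (d in n) print d ": " n[d] }'
}
```

Usage: `conntrack_full_per_day < /var/log/messages`.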
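I noticed a pattern:
the "ip_conntrack: table full" messages typically show up about every 5 days; a day later the system hangs and has to be rebooted by hand.
So this looks like an ip_conntrack problem.
Check the current setting: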
cat /proc/sys/net/ipv4/ip_conntrack_max
65536
65536 is the system default for 1 GB of RAM.
Formula for ip_conntrack_max
Reference: http://www.wallfire.org/misc/netfilter_conntrack_perf.txt
CONNTRACK_MAX = RAMSIZE (in bytes) / 16384 / (x / 32)
#where x is the number of bits in a pointer (for example, 32 or 64 bits)
With 1 GB of RAM: 1024*1024*1024/16384/(32/32)=65536
My machine has 8 GB and is 64-bit:
8192*1024*1024/16384/(64/32)=262144
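The same arithmetic as a small shell function, so other memory sizes can be plugged in. This is just the formula above written out; `conntrack_max` is a hypothetical name, and the arguments are RAM in bytes and the pointer width in bits.

```shell
# CONNTRACK_MAX = RAMSIZE (in bytes) / 16384 / (pointer bits / 32)
# (conntrack_max is an illustrative helper, not a system command.)
conntrack_max() {
    ram_bytes=$1 pointer_bits=$2
    echo $(( $ram_bytes / 16384 / ($pointer_bits / 32) ))
}
```

`conntrack_max $((1024*1024*1024)) 32` gives 65536, and `conntrack_max $((8192*1024*1024)) 64` gives 262144, matching the two calculations above.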
Check the conntrack hash table size (HASHSIZE, by default CONNTRACK_MAX / 8)
cat /proc/sys/net/ipv4/netfilter/ip_conntrack_buckets
8192
Check the ip_conntrack timeout for established connections
cat /proc/sys/net/ipv4/netfilter/ip_conntrack_tcp_timeout_established
432000
# Change 432000 (5 days) to 36000 (10 hours)
vi /etc/sysctl.conf
Append two lines at the end of the kernel settings:
net.ipv4.ip_conntrack_max = 262144
net.ipv4.netfilter.ip_conntrack_tcp_timeout_established = 36000
Apply immediately:
sysctl -p
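After raising the limit it helps to watch how full the table actually gets. The sketch below only formats a count against a maximum; `conntrack_usage` is a made-up name, and the ip_conntrack_count path in the usage line is an assumption about this kernel's /proc layout.

```shell
# Print conntrack table usage as "count/max (percent)".
# (Hypothetical helper; both values are passed in as arguments.)
conntrack_usage() {
    count=$1 max=$2
    awk -v c="$count" -v m="$max" \
        'BEGIN { printf "%d/%d (%.1f%%)\n", c, m, 100 * c / m }'
}
```

Usage, assuming the count file exists: `conntrack_usage "$(cat /proc/sys/net/ipv4/netfilter/ip_conntrack_count)" "$(cat /proc/sys/net/ipv4/netfilter/ip_conntrack_max)"`.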