

BUG: soft lockup - CPU#3 stuck for 10s!

Symptoms: on a machine that had just gone into service, the website could not be reached and SSH did not respond either.

/var/log/messages contained a flood of errors:

Aug 19 21:35:23 bora kernel: BUG: soft lockup - CPU#3 stuck for 10s! [php-cgi:23997]
Aug 19 21:35:23 bora kernel: CPU 3:
Aug 19 21:35:23 bora kernel: Modules linked in: ip_conntrack_netbios_ns xt_state ip_conntrack nfnetlink iptable_filter ip_tables deflate zlib_deflate ccm serpent blowfish twofish ecb xcbc crypto_hash cbc md5 sha256 sha512 des aes_generic testmgr_cipher testmgr crypto_blkcipher aes_x86_64 ipcomp6 ipcomp ah6 ah4 esp6 xfrm6_esp esp4 xfrm4_esp aead crypto_algapi xfrm4_tunnel tunnel4 xfrm4_mode_tunnel xfrm4_mode_transport xfrm6_mode_transport xfrm6_mode_tunnel xfrm6_tunnel tunnel6 ipv6 xfrm_nalgo crypto_api af_key autofs4 hidp rfcomm l2cap bluetooth sunrpc ipt_REJECT xt_limit xt_tcpudp x_tables ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi scsi_transport_iscsi cpufreq_ondemand acpi_cpufreq freq_table dm_mirror dm_multipath scsi_dh video hwmon backlight sbs i2c_ec i2c_core button battery asus_acpi acpi_memhotplug ac parport_pc lp parport sr_mod cdrom serio_raw sg bnx2 pcspkr dm_raid45 dm_message dm_region_hash dm_log dm_mod dm_mem_cache ata_piix libata shpchp mptsas mptscsih mptbase sc
Aug 19 21:35:23 bora kernel: i_transport_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Aug 19 21:35:23 bora kernel: Pid: 23997, comm: php-cgi Not tainted 2.6.18-128.el5 #1
Aug 19 21:35:23 bora kernel: RIP: 0010:[ ] [] .text.lock.spinlock+0x2/0x30
Aug 19 21:35:23 bora kernel: RSP: 0000:ffff81013399bc68 EFLAGS: 00000282
Aug 19 21:35:23 bora kernel: RAX: ffff810224273c80 RBX: ffff810223ca7980 RCX: ffff810224273c80
Aug 19 21:35:23 bora kernel: RDX: 0000000000000000 RSI: 00000000000001f4 RDI: ffff810224273cc0
Aug 19 21:35:23 bora kernel: RBP: ffff81013399bbe0 R08: 0000000000000002 R09: 0000000000000000
Aug 19 21:35:23 bora kernel: R10: ffff81022b310680 R11: 0000000000000000 R12: ffffffff8005dc8e
Aug 19 21:35:23 bora kernel: R13: ffff810224273c80 R14: ffffffff800774da R15: ffff81013399bbe0
Aug 19 21:35:23 bora kernel: FS: 00002b5378ee4c20(0000) GS:ffff81012fc4e6c0(0000) knlGS:0000000000000000
Aug 19 21:35:23 bora kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 19 21:35:23 bora kernel: CR2: 00002b5380d78000 CR3: 0000000215b15000 CR4: 00000000000006e0
Aug 19 21:35:23 bora kernel:
Aug 19 21:35:23 bora kernel: Call Trace:
Aug 19 21:35:23 bora kernel: [] udp_rcv+0x431/0x5d1
Aug 19 21:35:23 bora kernel: [ ] ip_local_deliver+0x19d/0x263
Aug 19 21:35:23 bora kernel: [ ] ip_rcv+0x53a/0x57d
Aug 19 21:35:23 bora kernel: [ ] netif_receive_skb+0x370/0x39c
Aug 19 21:35:23 bora kernel: [ ] :bnx2:bnx2_poll_work+0xf7d/0x10b5
Aug 19 21:35:23 bora kernel: [ ] sk_free+0xc3/0x105
Aug 19 21:35:23 bora kernel: [ ] sched_balance_self+0x154/0x2f0
Aug 19 21:35:23 bora kernel: [ ] ip_local_deliver_finish+0x0/0x1e9
Aug 19 21:35:23 bora kernel: [ ] nf_hook_slow+0x58/0xbc
Aug 19 21:35:23 bora kernel: [ ] ip_local_deliver+0x19d/0x263
Aug 19 21:35:23 bora kernel: [ ] ip_rcv+0x53a/0x57d
Aug 19 21:35:23 bora kernel: [ ] :bnx2:bnx2_poll_msix+0x2e/0xc5
Aug 19 21:35:24 bora kernel: [ ] net_rx_action+0xa4/0x1a4
Aug 19 21:35:24 bora kernel: [ ] __do_softirq+0x89/0x133
Aug 19 21:35:24 bora kernel: [ ] call_softirq+0x1c/0x28
Aug 19 21:35:24 bora kernel: [ ] do_softirq+0x2c/0x85
Aug 19 21:35:24 bora kernel: [ ] do_IRQ+0xec/0xf5
Aug 19 21:35:24 bora kernel: [ ] ret_from_intr+0x0/0xa
Aug 19 21:35:24 bora kernel:
Aug 19 21:35:26 bora kernel: BUG: soft lockup - CPU#6 stuck for 10s! [pluto:3927]
Aug 19 21:35:26 bora kernel: CPU 6:
Aug 19 21:35:26 bora kernel: Modules linked in: ip_conntrack_netbios_ns xt_state ip_conntrack nfnetlink iptable_filter ip_tables deflate zlib_deflate ccm serpent blowfish twofish ecb xcbc crypto_hash cbc md5 sha256 sha512 des aes_generic testmgr_cipher testmgr crypto_blkcipher aes_x86_64 ipcomp6 ipcomp ah6 ah4 esp6 xfrm6_esp esp4 xfrm4_esp aead crypto_algapi xfrm4_tunnel tunnel4 xfrm4_mode_tunnel xfrm4_mode_transport xfrm6_mode_transport xfrm6_mode_tunnel xfrm6_tunnel tunnel6 ipv6 xfrm_nalgo crypto_api af_key autofs4 hidp rfcomm l2cap bluetooth sunrpc ipt_REJECT xt_limit xt_tcpudp x_tables ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi scsi_transport_iscsi cpufreq_ondemand acpi_cpufreq freq_table dm_mirror dm_multipath scsi_dh video hwmon backlight sbs i2c_ec i2c_core button battery asus_acpi acpi_memhotplug ac parport_pc lp parport sr_mod cdrom serio_raw sg bnx2 pcspkr dm_raid45 dm_message dm_region_hash dm_log dm_mod dm_mem_cache ata_piix libata shpchp mptsas mptscsih mptbase sc
Aug 19 21:35:26 bora kernel: i_transport_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Aug 19 21:35:26 bora kernel: Pid: 3927, comm: pluto Not tainted 2.6.18-128.el5 #1
Aug 19 21:35:26 bora kernel: RIP: 0010:[ ] [] .text.lock.spinlock+0x2/0x30
Aug 19 21:35:26 bora kernel: RSP: 0018:ffff81012e5f5bb0 EFLAGS: 00000282
Aug 19 21:35:26 bora kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff810224273c80
Aug 19 21:35:26 bora kernel: RDX: 0000000000000000 RSI: 00000000000001e0 RDI: ffff810224273cc0
Aug 19 21:35:26 bora kernel: RBP: 0000000000000206 R08: ffff81012e5f5a38 R09: 0000000000000000
Aug 19 21:35:26 bora kernel: R10: ffff81012e5f5ab8 R11: 0000000000000048 R12: ffff810224273c80
Aug 19 21:35:26 bora kernel: R13: 0000100000000011 R14: 0000000400000000 R15: 0000000000000000
Aug 19 21:35:26 bora kernel: FS: 00002b4ce020bdb0(0000) GS:ffff81013397ce40(0000) knlGS:0000000000000000
Aug 19 21:35:26 bora kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 19 21:35:26 bora kernel: CR2: 00002b5380d78000 CR3: 00000002269a2000 CR4: 00000000000006e0
Aug 19 21:35:26 bora kernel:
Aug 19 21:35:26 bora kernel: Call Trace:
Aug 19 21:35:26 bora kernel: [ ] release_sock+0x6b/0xaa
Aug 19 21:35:26 bora kernel: [ ] udp_sendmsg+0x4de/0x5ce
Aug 19 21:35:26 bora kernel: [ ] sock_sendmsg+0xf3/0x110
Aug 19 21:35:26 bora kernel: [ ] inode_has_perm+0x56/0x63
Aug 19 21:35:26 bora kernel: [ ] autoremove_wake_function+0x0/0x2e
Aug 19 21:35:26 bora kernel: [ ] selinux_inode_getattr+0x50/0x5e
Aug 19 21:35:26 bora kernel: [ ] _atomic_dec_and_lock+0x39/0x57
Aug 19 21:35:26 bora kernel: [ ] sys_sendto+0x11c/0x14f
Aug 19 21:35:26 bora kernel: [ ] tracesys+0xd5/0xe0
Aug 19 21:35:26 bora kernel:
Aug 19 21:35:33 bora kernel: BUG: soft lockup - CPU#3 stuck for 10s! [php-cgi:23997]
Aug 19 21:35:33 bora kernel: CPU 3:
Aug 19 21:35:33 bora kernel: Modules linked in: ip_conntrack_netbios_ns xt_state ip_conntrack nfnetlink iptable_filter ip_tables deflate zlib_deflate ccm serpent blowfish twofish ecb xcbc crypto_hash cbc md5 sha256 sha512 des aes_generic testmgr_cipher testmgr crypto_blkcipher aes_x86_64 ipcomp6 ipcomp ah6 ah4 esp6 xfrm6_esp esp4 xfrm4_esp aead crypto_algapi xfrm4_tunnel tunnel4 xfrm4_mode_tunnel xfrm4_mode_transport xfrm6_mode_transport xfrm6_mode_tunnel xfrm6_tunnel tunnel6 ipv6 xfrm_nalgo crypto_api af_key autofs4 hidp rfcomm l2cap bluetooth sunrpc ipt_REJECT xt_limit xt_tcpudp x_tables ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi scsi_transport_iscsi cpufreq_ondemand acpi_cpufreq freq_table dm_mirror dm_multipath scsi_dh video hwmon backlight sbs i2c_ec i2c_core button battery asus_acpi acpi_memhotplug ac parport_pc lp parport sr_mod cdrom serio_raw sg bnx2 pcspkr dm_raid45 dm_message dm_region_hash dm_log dm_mod dm_mem_cache ata_piix libata shpchp mptsas mptscsih mptbase sc

Hardware: Dell R410, E5504 (quad-core) x2, 4 GB memory x2, SAS 15k 146 GB x2

System: CentOS 5.2 64-bit, nginx-0.7.61, php-5.2.10, mysql-5.1.37, eaccelerator-0.9.5.3

# top
top - 14:03:42 up 16:11, 1 user, load average: 0.30, 0.40, 0.43
Tasks: 260 total, 1 running, 259 sleeping, 0 stopped, 0 zombie
Cpu(s): 3.4%us, 1.0%sy, 0.0%ni, 94.8%id, 0.5%wa, 0.0%hi, 0.3%si, 0.0%st
Mem: 8168412k total, 5068160k used, 3100252k free, 510276k buffers
Swap: 4096532k total, 0k used, 4096532k free, 3251992k cached

# cat /proc/version
Linux version 2.6.18-128.el5 ([email protected]) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-44)) #1 SMP Wed Jan 21 10:41:14 EST 2009

# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel(R) Xeon(R) CPU E5504 @ 2.00GHz
stepping : 5
cpu MHz : 1596.000
cache size : 4096 KB
physical id : 1
siblings : 4
core id : 0
cpu cores : 4
apicid : 16
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx rdtscp lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr popcnt lahf_lm
bogomips : 3993.25
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual

# iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
PING       icmp --  anywhere             anywhere            icmp echo-request state NEW
ACCEPT     all  --  anywhere             anywhere
DROP       all  --  127.0.0.0/8          anywhere
DROP       all  --  anywhere             anywhere            state INVALID
DROP       tcp  --  anywhere             anywhere            tcp flags:FIN,SYN,RST,PSH,ACK,URG/FIN,PSH,URG
DROP       tcp  --  anywhere             anywhere            tcp flags:FIN,SYN,RST,PSH,ACK,URG/FIN,SYN,RST,PSH,ACK,URG
DROP       tcp  --  anywhere             anywhere            tcp flags:FIN,SYN,RST,PSH,ACK,URG/FIN,SYN,RST,ACK,URG
DROP       tcp  --  anywhere             anywhere            tcp flags:FIN,SYN,RST,PSH,ACK,URG/NONE
DROP       tcp  --  anywhere             anywhere            tcp flags:SYN,RST/SYN,RST
DROP       tcp  --  anywhere             anywhere            tcp flags:FIN,SYN/FIN,SYN
ACCEPT     tcp  --  anywhere             anywhere            tcp dpt:smtp
ACCEPT     tcp  --  anywhere             anywhere            tcp dpt:http
ACCEPT     tcp  --  anywhere             anywhere            tcp dpt:mysql
ACCEPT     tcp  --  anywhere             anywhere            tcp dpt:webcache
ACCEPT     tcp  --  anywhere             anywhere            tcp dpt:15666
ACCEPT     tcp  --  anywhere             anywhere            tcp dpt:ssh
DROP       tcp  --  anywhere             anywhere            tcp flags:FIN,SYN,RST,ACK/SYN

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     all  --  anywhere             anywhere
ACCEPT     all  --  anywhere             anywhere

Chain PING (1 references)
target     prot opt source               destination
RETURN     icmp --  anywhere             anywhere            icmp echo-request limit: avg 1/sec burst 5
REJECT     icmp --  anywhere             anywhere            reject-with icmp-port-unreachable

Chain SYNFLOOD (0 references)
target     prot opt source               destination

===========================

Cause: this appears to be a conflict involving kernel-2.6.18-128. I am still observing the machine.

Related material:
http://bugs.centos.org/view.php?id=3582
https://bugzilla.redhat.com/show_bug.cgi?id=484590
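
While the machine is under observation, it can help to count how often each CPU and process shows up in the soft lockup reports. The following is only a minimal Python sketch, not part of the original troubleshooting: the function names `summarize_lockups` and `summarize_file` are made up for illustration, and the regular expression assumes the "BUG: soft lockup" line has the shape seen in the log above (it accepts either a hyphen or an en dash, since pasted logs sometimes have the dash converted).

```python
import re
from collections import Counter

# Matches report headers such as:
#   Aug 19 21:35:23 bora kernel: BUG: soft lockup - CPU#3 stuck for 10s! [php-cgi:23997]
# Assumption: the header format is the one seen in this post's log.
LOCKUP_RE = re.compile(
    r"BUG: soft lockup\s*[-\u2013]\s*CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s!"
    r"\s*\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def summarize_lockups(text):
    """Return a Counter keyed by (cpu, command, pid) for every lockup header in text."""
    counts = Counter()
    for m in LOCKUP_RE.finditer(text):
        key = (int(m.group("cpu")), m.group("comm"), int(m.group("pid")))
        counts[key] += 1
    return counts


def summarize_file(path="/var/log/messages"):
    """Read a syslog file (hypothetical helper) and summarize its lockup headers."""
    with open(path, errors="replace") as fh:
        return summarize_lockups(fh.read())
```

Calling `summarize_file()` as root on the affected host and printing the resulting counter would show whether the lockups concentrate on particular CPUs or processes (here php-cgi and pluto), which may be useful detail to attach to the bug reports linked above.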

Posted in LINUX, Technology.
