Gentoo Archives: gentoo-server

From: Christian Parpart <trapni@g.o>
To: gentoo-server@l.g.o
Subject: [gentoo-server] DoS Analysis and Prevemption
Date: Mon, 15 Apr 2013 15:07:44
Message-Id: CA+qvzFOR1RJZUmDET-_3Sq86vD-hnX+95JTP9P0Q+Pa4Gu4Sqg@mail.gmail.com
Hey all,

We hit some nice traffic last night that took our main gateway down.
Pacemaker was configured to fail over to our second one, but that one died
as well.

In a brief post-analysis, I found the following in the logs:

Apr 14 21:42:11 cesar1 kernel: [27613652.439846] BUG: soft lockup - CPU#4 stuck for 22s! [swapper/4:0]
Apr 14 21:42:11 cesar1 kernel: [27613652.440319] Stack:
Apr 14 21:42:11 cesar1 kernel: [27613652.440446] Call Trace:
Apr 14 21:42:11 cesar1 kernel: [27613652.440595] <IRQ>
Apr 14 21:42:12 cesar1 kernel: [27613652.440828] <EOI>
Apr 14 21:42:12 cesar1 kernel: [27613652.440979] Code: c1 51 da 03 81 48 c7 c2 4e da 03 81 e9 dd fe ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 55 b8 00 00 01 00 48 89 e5 f0 0f c1 07 <89> c2
Apr 14 21:42:12 cesar1 CRON[13599]: nss_ldap: could not connect to any LDAP server as cn=admin,dc=rz,dc=dawanda,dc=com - Can't contact LDAP server
Apr 14 21:42:12 cesar1 CRON[13599]: nss_ldap: could not search LDAP server - Server is unavailable
Apr 14 21:42:24 cesar1 crmd: [7287]: ERROR: process_lrm_event: LRM operation management-gateway-ip1_stop_0 (917) Timed Out (timeout=20000ms)
Apr 14 21:42:48 cesar1 kernel: [27613688.611501] BUG: soft lockup - CPU#7 stuck for 22s! [named:32166]
Apr 14 21:42:48 cesar1 kernel: [27613688.611914] Stack:
Apr 14 21:42:48 cesar1 kernel: [27613688.612036] Call Trace:
Apr 14 21:42:48 cesar1 kernel: [27613688.612200] <IRQ>
Apr 14 21:42:48 cesar1 kernel: [27613688.612408] <EOI>
Apr 14 21:42:48 cesar1 kernel: [27613688.612626] Code: c1 51 da 03 81 48 c7 c2 4e da 03 81 e9 dd fe ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 55 b8 00 00 01 00 48 89 e5 f0 0f c1 07 <89> c2
Apr 14 21:42:55 cesar1 kernel: [27613695.946295] BUG: soft lockup - CPU#0 stuck for 21s! [ksoftirqd/0:3]
Apr 14 21:42:55 cesar1 kernel: [27613695.946785] Stack:
Apr 14 21:42:55 cesar1 kernel: [27613695.946917] Call Trace:
Apr 14 21:42:55 cesar1 kernel: [27613695.947137] Code: c4 00 00 81 a8 44 e0 ff ff ff 01 00 00 48 63 80 44 e0 ff ff a9 00 ff ff 07 74 36 65 48 8b 04 25 c8 c4 00 00 83 a8 44 e0 ff ff 01 <5d> c3

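To get a quick overview of which CPUs and tasks the soft lockup reports name, the excerpt above could be tallied with something like the following. This is only a minimal sketch: the function name `summarize_lockups` and the regular expression are illustrative assumptions, and it expects each report on a single line, as in the original syslog file rather than in a mail-wrapped copy.

```python
import re
from collections import Counter

# Assumed pattern for the kernel's soft lockup report, modelled on the
# three reports quoted in the log excerpt.
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<task>.+):(?P<pid>\d+)\]"
)

def summarize_lockups(lines):
    """Count soft lockup reports per CPU number and per task name."""
    per_cpu, per_task = Counter(), Counter()
    for line in lines:
        match = LOCKUP_RE.search(line)
        if match:
            per_cpu[int(match.group("cpu"))] += 1
            per_task[match.group("task")] += 1
    return per_cpu, per_task

# The three soft lockup lines from the excerpt, unwrapped.
SAMPLE = [
    "Apr 14 21:42:11 cesar1 kernel: [27613652.439846] BUG: soft lockup - CPU#4 stuck for 22s! [swapper/4:0]",
    "Apr 14 21:42:48 cesar1 kernel: [27613688.611501] BUG: soft lockup - CPU#7 stuck for 22s! [named:32166]",
    "Apr 14 21:42:55 cesar1 kernel: [27613695.946295] BUG: soft lockup - CPU#0 stuck for 21s! [ksoftirqd/0:3]",
]

per_cpu, per_task = summarize_lockups(SAMPLE)
for cpu, count in sorted(per_cpu.items()):
    print(f"CPU#{cpu}: {count} report(s)")
for task, count in sorted(per_task.items()):
    print(f"{task}: {count} report(s)")
```

Run against the full log file instead of `SAMPLE`, this would show whether the lockups cluster on a few cores or are spread across all of them.
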
We're using irqbalance so that hardware interrupts from the Ethernet card
are not all handled by the first CPU when traffic comes in (a lesson
learned from our last, much more intensive DDoS).
However, since this did not help, I'd like to find out what else we can do.
Our gateway has to do NAT and has a few other iptables rules it needs in
order to run OpenStack behind it,
so I can't just drop it.

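Whether the NIC interrupts really end up spread over several CPUs can be checked from `/proc/interrupts`. The following is a rough sketch only: the helper name `irq_counts_per_cpu`, the device name `eth0`, and all numbers in `SAMPLE` are made up for illustration and are not taken from the gateway.

```python
def irq_counts_per_cpu(proc_interrupts_text, device_substring):
    """Sum interrupt counts per CPU for rows whose description
    contains device_substring (for example "eth0")."""
    lines = proc_interrupts_text.splitlines()
    ncpus = len(lines[0].split())          # header row: CPU0 CPU1 ...
    totals = [0] * ncpus
    for line in lines[1:]:
        _label, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        counts = fields[:ncpus]
        description = " ".join(fields[ncpus:])
        # Skip short rows such as "ERR:" that lack per-CPU columns.
        if len(counts) < ncpus or not all(c.isdigit() for c in counts):
            continue
        if device_substring in description:
            for cpu, count in enumerate(counts):
                totals[cpu] += int(count)
    return totals

# Hypothetical /proc/interrupts content; every number here is invented.
SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3
  0:         40          0          0          0   IO-APIC-edge      timer
 64:     900000        100        100        100   PCI-MSI-edge      eth0-rx-0
 65:        100        200        100        100   PCI-MSI-edge      eth0-tx-0
NMI:          5          5          5          5   Non-maskable interrupts
ERR:          0
"""

totals = irq_counts_per_cpu(SAMPLE, "eth0")
for cpu, total in enumerate(totals):
    print(f"CPU{cpu}: {total}")
```

On a live system the text would come from reading `/proc/interrupts` itself. A heavily lopsided result, as in the invented sample, would suggest that one core is still taking most of the receive interrupts.
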
Regarding the logs, I can see that something caused the CPU cores to get
stuck in a number of different processes.
Has anyone encountered error messages like the ones I quoted above, or
does anyone know of other things one might do in order to prevent huge
unsolicited incoming traffic from bringing a Linux node down?

Best regards,
Christian.

Replies

Subject Author
Re: [gentoo-server] DoS Analysis and Prevemption Kerin Millar <kerframil@×××××××××××.uk>