On Thursday 29 January 2009 03:47:48 Tobias Klausmann wrote:
> Hi!
>
> On Wed, 28 Jan 2009, Mike Frysinger wrote:
> > > On the wire between the client and the firewall, this happens:
> > >
> > > a packet 1 is sent
> > > b packet 2 is sent
> > > c answer 1 is received
> > > d answer 2 is received
> > >
> > > Sometimes d doesn't happen because b is lost in the firewall
> > > along the way (where the race condition happens).
> >
> > does this affect actual userspace behavior ? in other words,
> > does this lead to lost lookups and errors from the resolver ?
>
> The most visible effect (and the way we found out about it first)
> is a 5s hang on ssh connects.

this is why i turn off dns lookup in all my sshd_config's (well, not because
of this bug, but because DNS lookup on ssh can cause annoying delays). plus,
that info is largely useless: for the logged attempts from "bad" people, the
dns is usually screwed up / wrong / unavailable anyways.

> Thing is: how long that timeout is
> is program dependent (whatever they use in select()). A recvfrom()
> simply hangs. I wrote a simple C program to do what glibc does
> (simplified for brevity):
> ...

so glibc will not trigger hangs, just delays in some cases.

> A "quickfix" would indeed be using two different ports for the
> queries - but the bug in Netfilter would still be there.

sure, the bug still exists in netfilter (kernel). but if we can easily
mitigate the effects seen by applications using glibc's resolver code, that
seems sane to me. i haven't perused the glibc resolver code in a while ... do
you know if it can easily be tweaked to use different ports, or would such a
change be invasive ? if the latter, well i guess we'll have to suck it up.
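
For reference, the "different ports" idea boils down to something like the following. This is only a sketch of the approach, not a patch against glibc's resolver; open_udp_socket, local_port and open_query_sockets are hypothetical names. Each socket is bound to port 0 so the kernel assigns it its own ephemeral port, which means the two queries no longer leave from the same source port.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a UDP socket bound to an ephemeral port picked by the kernel.
 * Returns the descriptor, or -1 on error. */
int open_udp_socket(void)
{
    struct sockaddr_in local;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;

    memset(&local, 0, sizeof local);
    local.sin_family = AF_INET;
    local.sin_addr.s_addr = htonl(INADDR_ANY);
    local.sin_port = 0; /* 0 = let the kernel choose a free port */

    if (bind(fd, (struct sockaddr *)&local, sizeof local) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Return the local port of fd in host byte order, or -1 on error. */
int local_port(int fd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;

    if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0)
        return -1;
    return ntohs(addr.sin_port);
}

/* Open one socket per query so each query gets its own source port.
 * Returns 0 on success, -1 on error (no descriptors left open). */
int open_query_sockets(int fds[2])
{
    fds[0] = open_udp_socket();
    if (fds[0] < 0)
        return -1;

    fds[1] = open_udp_socket();
    if (fds[1] < 0) {
        close(fds[0]);
        return -1;
    }
    return 0;
}
```

Each query would then go out with sendto() on its own descriptor, and the caller has to wait on both descriptors for the answers. How invasive that is inside the real resolver is exactly the open question.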
-mike