Blocking for too long in SO_LINGER...

Steve Kann stevek en SteveK.COM
Jue Ene 27 14:55:32 CST 2000



	I saw a little bit of a conversation in the linux-kernel
mailing list between Alan Cox and Steven Clarke, and it seems to
describe the problem I'm having.  It seems that the kernel is blocking
on close for sockets set to SO_LINGER for a _very_ long time..

	http://www.uwsg.indiana.edu/hypermail/linux/kernel/9909.2/0350.html

	Even where the LINGER was set (elsewhere in the code) it was set
	to 120 (units - seconds? - claims to be hundreths of seconds in
	the setsockopt man page but I think that's wrong). In the
	problem that I demonstrated, close took 11 minutes to return! 

I've been using pnserver (crappy software) for a while on linux 2.0.x,
and it's for the most part worked without any problem.  I have it now
set up on a new 2.2.12-20 (redhat kernel) machine, and it sometimes just
locks up for many minutes.

I did a strace on the process, and found that it was blocking on the
close() of a socket with SO_LINGER set.  I'm not sure if strace is
reporting this properly, but I think that the timeout being set is 8
seconds (which seems like an awfully bad choice by real if so).

Anyways, I've fouind it actually blocking for _much_ longer than 8
seconds -- in fact, in once case, I

Here it is blocking for about 130 seconds:

setsockopt(23, SOL_SOCKET, SO_LINGER, [1], 8) = 0
time([949008208])                       = 949008208
write(1, "170.153.121.76 - - [27/Jan/2000:"..., 155) = 155
close(23)                               = 0
send(14, "?", 1, 0)                     = 1
oldselect(39, [4 5 6 8 9 10 13 14 16 17 21], [], NULL, {0, 46775}) = 9 (in [4 8 9 10 13 14 16 17 21], left {0, 50000})
gettimeofday({949008338, 669581}, NULL) = 0


And here, I figure I better get this restarted after about 10 minutes,
before the people listening to the streams pack up and leave:

16:53:28.079382 setsockopt(13, SOL_SOCKET, SO_LINGER, [1], 8) = 0
16:53:28.079638 time([949010008])       = 949010008
16:53:28.079861 write(1, "170.153.121.102 - - [27/Jan/2000:21:53:28 +0000] \"GET c1/use"..., 163) = 163
16:53:28.080184 close(13)               = 0
17:03:07.471132 --- SIGHUP (Hangup) ---
17:03:07.472156 gettimeofday({949010587, 472268}, NULL) = 0


Any idea what is happening here?  Any kind of quick-fix I can do here:
(I have a high-proofile event tomorrow, and naturally, I don't have the
source to the silly application)..  Maybe I can quickly get the kernel
to ignore SO_LINGER either entirely (and I guess I'll find out what
breaks), or for just particular processes.


-SteveK


-- 
    Steve Kann  - Horizon Live Distance Learning - 841 Broadway, Suite 502
   P:stevek en SteveK.COM - B:stevek en HorizonLive.com - R:KC2FBU (212) 533-1775
  "The box said 'Requires Windows 95, NT, or better,' so I installed Linux."

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo en vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



Más información sobre la lista de distribución Ayuda