Blocking for too long in SO_LINGER...

Steve Kann stevek at SteveK.COM
Thu Jan 27 18:08:44 CST 2000


While I was out to lunch (not mentally, but to get some food), I
realized that in addition to using the newer kernel, this machine is
also different from the other machines I've been using this software on
in that it is an SMP machine...

Are there some known SMP races in this area?

-SteveK

On Thu, Jan 27, 2000 at 12:44:29PM -0500, Steve Kann wrote:
> 
> 
> 	I saw a little bit of a conversation in the linux-kernel
> mailing list between Alan Cox and Steven Clarke, and it seems to
> describe the problem I'm having.  It seems that the kernel is blocking
> on close for sockets set to SO_LINGER for a _very_ long time..
> 
> 	http://www.uwsg.indiana.edu/hypermail/linux/kernel/9909.2/0350.html
> 
> 	Even where the LINGER was set (elsewhere in the code) it was set
> 	to 120 (units - seconds? - claims to be hundredths of seconds in
> 	the setsockopt man page but I think that's wrong). In the
> 	problem that I demonstrated, close took 11 minutes to return! 
> 
> I've been using pnserver (crappy software) for a while on linux 2.0.x,
> and it's for the most part worked without any problem.  I have it now
> set up on a new 2.2.12-20 (redhat kernel) machine, and it sometimes just
> locks up for many minutes.
> 
> I did a strace on the process, and found that it was blocking on the
> close() of a socket with SO_LINGER set.  I'm not sure if strace is
> reporting this properly, but I think that the timeout being set is 8
> seconds (which seems like an awfully bad choice by Real, if so).
> 
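A note on reading that strace line: the last argument strace prints for
setsockopt() is the option length, not the timeout. On Linux, struct linger
is two ints (l_onoff, l_linger), so its size is 8 bytes, and the [1] shown
is most likely just the first field (l_onoff); the timeout itself would not
appear in that output. Per socket(7), l_linger is in seconds on Linux. The
sketch below is only an illustration in Python, assuming a Linux-style
layout of two native ints; set_linger and get_linger are hypothetical
helper names, not anything from pnserver.

```python
import socket
import struct

# struct linger { int l_onoff; int l_linger; } packed as two native ints.
# Assumption: Linux-style layout; other platforms may differ.
LINGER_FMT = "ii"


def set_linger(sock, seconds):
    """Enable SO_LINGER on sock with the given timeout in seconds."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                    struct.pack(LINGER_FMT, 1, seconds))


def get_linger(sock):
    """Return (l_onoff, l_linger) as currently set on sock."""
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                          struct.calcsize(LINGER_FMT))
    return struct.unpack(LINGER_FMT, raw)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    set_linger(s, 120)
    # The first number printed is the option length strace would show.
    print(struct.calcsize(LINGER_FMT), get_linger(s))
    s.close()
```

Reading the value back with getsockopt() on a test socket is one way to
confirm what timeout a given setting really requests.
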
> Anyways, I've found it actually blocking for _much_ longer than 8
> seconds -- in fact, in one case, I
> 
> Here it is blocking for about 130 seconds:
> 
> setsockopt(23, SOL_SOCKET, SO_LINGER, [1], 8) = 0
> time([949008208])                       = 949008208
> write(1, "170.153.121.76 - - [27/Jan/2000:"..., 155) = 155
> close(23)                               = 0
> send(14, "?", 1, 0)                     = 1
> oldselect(39, [4 5 6 8 9 10 13 14 16 17 21], [], NULL, {0, 46775}) = 9 (in [4 8 9 10 13 14 16 17 21], left {0, 50000})
> gettimeofday({949008338, 669581}, NULL) = 0
> 
> 
> And here, I figure I better get this restarted after about 10 minutes,
> before the people listening to the streams pack up and leave:
> 
> 16:53:28.079382 setsockopt(13, SOL_SOCKET, SO_LINGER, [1], 8) = 0
> 16:53:28.079638 time([949010008])       = 949010008
> 16:53:28.079861 write(1, "170.153.121.102 - - [27/Jan/2000:21:53:28 +0000] \"GET c1/use"..., 163) = 163
> 16:53:28.080184 close(13)               = 0
> 17:03:07.471132 --- SIGHUP (Hangup) ---
> 17:03:07.472156 gettimeofday({949010587, 472268}, NULL) = 0
> 
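For anyone checking the numbers in the two traces above, the stall can be
read off either from the epoch seconds returned by time() and
gettimeofday(), or from the strace timestamps around close(). A small
sketch of that arithmetic, using only values copied from the traces
(gap_seconds and wallclock_gap are hypothetical helper names):

```python
from datetime import datetime


def gap_seconds(before, after):
    """Difference between two epoch-second readings."""
    return after - before


def wallclock_gap(start, end, fmt="%H:%M:%S.%f"):
    """Seconds between two same-day strace timestamps (HH:MM:SS.usec)."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()


if __name__ == "__main__":
    # First trace: time() before close(23), gettimeofday() after it.
    print(gap_seconds(949008208, 949008338))
    # Second trace: time() before close(13), gettimeofday() after SIGHUP.
    print(gap_seconds(949010008, 949010587))
    # Second trace again, from the strace timestamps on close() and SIGHUP.
    print(wallclock_gap("16:53:28.080184", "17:03:07.471132"))
```

The first gap comes to 130 seconds and the second to 579 seconds, a little
under ten minutes, which matches the description of each trace.
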
> 
> Any idea what is happening here?  Any kind of quick-fix I can do here?
> (I have a high-profile event tomorrow, and naturally, I don't have the
> source to the silly application)..  Maybe I can quickly get the kernel
> to ignore SO_LINGER either entirely (and I guess I'll find out what
> breaks), or for just particular processes.
> 
> 
> -SteveK
> 
> 
> -- 
>     Steve Kann  - Horizon Live Distance Learning - 841 Broadway, Suite 502
>    P:stevek at SteveK.COM - B:stevek at HorizonLive.com - R:KC2FBU (212) 533-1775
>   "The box said 'Requires Windows 95, NT, or better,' so I installed Linux."
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo at vger.rutgers.edu
> Please read the FAQ at http://www.tux.org/lkml/
> 
