Blocking for too long in SO_LINGER...
Steve Kann
stevek at SteveK.COM
Thu Jan 27 18:08:44 CST 2000
While I was out to lunch (not mentally, but to get some food), I
realized that in addition to using the newer kernel, this machine is
also different from the other machines I've been using this software on
in that it is an SMP machine...
Are there some known SMP races in this area?
-SteveK
On Thu, Jan 27, 2000 at 12:44:29PM -0500, Steve Kann wrote:
>
>
> I saw a little bit of a conversation in the linux-kernel
> mailing list between Alan Cox and Steven Clarke, and it seems to
> describe the problem I'm having. It seems that the kernel is blocking
> on close for sockets set to SO_LINGER for a _very_ long time..
>
> http://www.uwsg.indiana.edu/hypermail/linux/kernel/9909.2/0350.html
>
> Even where the LINGER was set (elsewhere in the code) it was set
> to 120 (units - seconds? - claims to be hundredths of seconds in
> the setsockopt man page but I think that's wrong). In the
> problem that I demonstrated, close took 11 minutes to return!
>
> I've been using pnserver (crappy software) for a while on linux 2.0.x,
> and it's for the most part worked without any problem. I have it now
> set up on a new 2.2.12-20 (redhat kernel) machine, and it sometimes just
> locks up for many minutes.
>
> I did a strace on the process, and found that it was blocking on the
> close() of a socket with SO_LINGER set. I'm not sure if strace is
> reporting this properly, but I think that the timeout being set is 8
> seconds (which seems like an awfully bad choice by Real, if so).
>
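For reference, a minimal sketch of how SO_LINGER is normally set from C. On Linux, socket(7) documents l_linger in seconds. Note also that the last argument to setsockopt() is the length of the option structure, not a timeout: struct linger is two ints, so on common Linux platforms that length is 8, which may be where the 8 in the strace output comes from. The helper name set_and_read_linger is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper: enable SO_LINGER with a timeout in seconds,
 * then read the setting back from the kernel into *out.
 * Returns 0 on success, -1 on error (errno is set by the failing call). */
int set_and_read_linger(int fd, int seconds, struct linger *out)
{
    struct linger lg;
    socklen_t len = sizeof *out;

    assert(out != NULL);

    lg.l_onoff = 1;         /* turn lingering on */
    lg.l_linger = seconds;  /* linger time; seconds on Linux per socket(7) */

    /* The final argument is sizeof(struct linger), i.e. the option length. */
    if (setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof lg) < 0)
        return -1;

    if (getsockopt(fd, SOL_SOCKET, SO_LINGER, out, &len) < 0)
        return -1;

    return 0;
}
```

Calling this on a fresh TCP socket with seconds set to 120 and then inspecting out->l_linger is a quick way to check what the kernel thinks the timeout is, independent of what strace prints.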
> Anyways, I've found it actually blocking for _much_ longer than 8
> seconds -- in fact, in one case, I
>
> Here it is blocking for about 130 seconds:
>
> setsockopt(23, SOL_SOCKET, SO_LINGER, [1], 8) = 0
> time([949008208]) = 949008208
> write(1, "170.153.121.76 - - [27/Jan/2000:"..., 155) = 155
> close(23) = 0
> send(14, "?", 1, 0) = 1
> oldselect(39, [4 5 6 8 9 10 13 14 16 17 21], [], NULL, {0, 46775}) = 9 (in [4 8 9 10 13 14 16 17 21], left {0, 50000})
> gettimeofday({949008338, 669581}, NULL) = 0
>
>
> And here, I figured I'd better get this restarted after about 10 minutes,
> before the people listening to the streams pack up and leave:
>
> 16:53:28.079382 setsockopt(13, SOL_SOCKET, SO_LINGER, [1], 8) = 0
> 16:53:28.079638 time([949010008]) = 949010008
> 16:53:28.079861 write(1, "170.153.121.102 - - [27/Jan/2000:21:53:28 +0000] \"GET c1/use"..., 163) = 163
> 16:53:28.080184 close(13) = 0
> 17:03:07.471132 --- SIGHUP (Hangup) ---
> 17:03:07.472156 gettimeofday({949010587, 472268}, NULL) = 0
>
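The stall lengths can be cross-checked from the epoch values in the two traces: the time() result just before close() and the gettimeofday() result just after it. A small sketch of that arithmetic follows; the function and struct names are made up for illustration.

```c
#include <assert.h>

/* Hypothetical helper: seconds elapsed between two epoch timestamps
 * taken from the trace (time() before close, gettimeofday() after). */
long stall_seconds(long before, long after)
{
    assert(after >= before);
    return after - before;
}

struct duration {
    long minutes;
    long seconds;
};

/* Split a number of seconds into whole minutes plus leftover seconds. */
struct duration split_duration(long total_seconds)
{
    struct duration d;
    d.minutes = total_seconds / 60;
    d.seconds = total_seconds % 60;
    return d;
}
```

With the values from the traces, 949008338 - 949008208 gives 130 seconds for the first stall, and 949010587 - 949010008 gives 579 seconds (9 minutes 39 seconds) for the second, which matches the 16:53:28 to 17:03:07 wall-clock gap.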
>
> Any idea what is happening here? Any kind of quick-fix I can do here?
> (I have a high-profile event tomorrow, and naturally, I don't have the
> source to the silly application).. Maybe I can quickly get the kernel
> to ignore SO_LINGER either entirely (and I guess I'll find out what
> breaks), or for just particular processes.
>
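One way to make a single process ignore SO_LINGER without touching the kernel or the application source might be an LD_PRELOAD shim that swallows the setsockopt() call. This is only a sketch under assumptions: it requires that the binary is dynamically linked against a libc whose loader honors LD_PRELOAD and supports dlsym(RTLD_NEXT, ...), and it has not been tried against pnserver. The helper name is_linger_request and the file names in the note below are made up.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <errno.h>
#include <sys/socket.h>

/* Hypothetical helper: true when the call is a SOL_SOCKET/SO_LINGER request. */
int is_linger_request(int level, int optname)
{
    return level == SOL_SOCKET && optname == SO_LINGER;
}

/* Replacement for setsockopt(): silently drops SO_LINGER requests and
 * forwards everything else to the real libc implementation. */
int setsockopt(int fd, int level, int optname,
               const void *optval, socklen_t optlen)
{
    static int (*real_setsockopt)(int, int, int, const void *, socklen_t);

    if (is_linger_request(level, optname))
        return 0;  /* report success, leave the socket's default close() behavior */

    if (!real_setsockopt) {
        /* Look up the next definition of setsockopt in library search order. */
        real_setsockopt = (int (*)(int, int, int, const void *, socklen_t))
            dlsym(RTLD_NEXT, "setsockopt");
        if (!real_setsockopt) {
            errno = ENOSYS;
            return -1;
        }
    }
    return real_setsockopt(fd, level, optname, optval, optlen);
}
```

Built as a shared object (for example with gcc -shared -fPIC -o nolinger.so nolinger.c -ldl) and loaded by setting LD_PRELOAD=./nolinger.so in the server's environment, this would leave every other socket option untouched. With lingering off, close() returns immediately and the kernel keeps trying to deliver unsent data in the background, so the trade-off is that the application no longer learns whether the final data was delivered.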
>
> -SteveK
>
>
> --
> Steve Kann - Horizon Live Distance Learning - 841 Broadway, Suite 502
> P:stevek at SteveK.COM - B:stevek at HorizonLive.com - R:KC2FBU (212) 533-1775
> "The box said 'Requires Windows 95, NT, or better,' so I installed Linux."
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo at vger.rutgers.edu
> Please read the FAQ at http://www.tux.org/lkml/
>
--
Steve Kann - Horizon Live Distance Learning - 841 Broadway, Suite 502
P:stevek at SteveK.COM - B:stevek at HorizonLive.com - R:KC2FBU (212) 533-1775
"The box said 'Requires Windows 95, NT, or better,' so I installed Linux."