Interesting analysis of Linux kernel threading by IBM

David Lang dlang at diginsite.com
Mon Jan 24 06:59:10 CST 2000


I apologize to all on the list that I did not clarify things enough. Larry
asked for examples of people who had more than 10 running processes; I
thought it would be self-evident that the bottleneck in my system was
caused by CPU load, given the details that I included in the report. It
appears that several people did not understand that (or did not understand
that I understand that :-)

So far the only reports I have seen of systems that normally sit above 10
running processes have been systems with known problems (in my case the
CPU overhead of SSL connections; in some others, problems caused by poor
CGIs). This should help make the point that if you have that many runnable
processes you have other severe problems, and changing the scheduler is
not the fix.
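
As an aside, the quickest way to see whether a box really is sitting
above 10 runnable processes is the fourth field of /proc/loadavg, which
reports runnable/total tasks directly. A minimal sketch of mine, not
something from the original thread:

    #include <stdio.h>

    int main(void)
    {
            FILE *f = fopen("/proc/loadavg", "r");
            double l1, l5, l15;
            int running, total;

            if (!f)
                    return 1;
            /* fourth field is "runnable/total", e.g. "3/142"; note
             * that the runnable count includes this process itself */
            if (fscanf(f, "%lf %lf %lf %d/%d",
                       &l1, &l5, &l15, &running, &total) != 5) {
                    fclose(f);
                    return 1;
            }
            fclose(f);
            printf("loadavg %.2f %.2f %.2f, %d runnable of %d tasks\n",
                   l1, l5, l15, running, total);
            return 0;
    }

A runnable count that stays well above the number of CPUs points at a
compute problem (SSL, bad CGIs), exactly as above.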

Even so, there have been enough debates about the scheduler (I am thinking
of the sched_idle stuff from last year) that it may be a good idea to
provide compile-time hooks to change schedulers and make it easy for
people to experiment with different ones.
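
As a standalone illustration of what such a compile-time hook could look
like: the two policies and the USE_ALT_SCHED switch below are made up for
this example (they are not real kernel symbols), and the default policy
only loosely imitates the flavor of the stock goodness() weighting.

    #include <stdio.h>

    struct task {
            int counter;    /* timeslice ticks remaining */
            int nice;       /* -20 (high prio) .. 19 (low prio) */
    };

    /* loosely the flavor of the stock goodness() weighting */
    static int default_goodness(const struct task *t)
    {
            return t->counter + 20 - t->nice;
    }

    /* a made-up alternative policy to experiment with */
    static int alt_goodness(const struct task *t)
    {
            return t->counter + 2 * (20 - t->nice);
    }

    /* the compile-time hook: build with -DUSE_ALT_SCHED to swap */
    #ifdef USE_ALT_SCHED
    #define sched_goodness alt_goodness
    #else
    #define sched_goodness default_goodness
    #endif

    int main(void)
    {
            struct task a = { 10, 0 }, b = { 4, -5 };

            printf("goodness(a)=%d goodness(b)=%d\n",
                   sched_goodness(&a), sched_goodness(&b));
            return 0;
    }

Rebuilding with or without -DUSE_ALT_SCHED swaps the policy at zero
runtime cost, which is the appeal of a compile-time hook.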

David Lang


On 21 Jan 2000 torvalds at www.transmeta.com wrote:

> In article <Pine.LNX.4.21.0001201818410.148-100000 at dlang.diginsite.com>,
> David Lang <dlang at diginsite.com> wrote:
> >Shortly before I went and purchased $10,000 encryption co-processors for
> >my SSL web servers, it was not unusual to see a 5 minute load average of
> >30-50 (and one time I saw it go up to 144!!). This was on quad-233MHz,
> >2GB RAM AIX RS/6000 servers; after I got the encryption co-processors
> >the load average dropped to 5-10.
> 
> Ok, before this gets out of hand, let me just clarify:
> 
>  - under loads like the above, scheduling speed MEANS ABSOLUTELY
>    NOTHING.
> 
> Nada. Zilch. Zero.
> 
> You aren't spending any time scheduling - you're spending all the time
> COMPUTING.  The high load is because you're compute-bound, and you
> probably end up scheduling maybe a few hundred times a second. If that.
> 
> With a default timeslice in the 10-200 millisecond range (Linux defaults
> to 200ms, and that's probably too dang long), compute-active processes
> aren't really much of an issue.
> 
> Basically, the situation where you have BOTH
>  - large number of runnable processes
> AND
>  - lots of scheduling activity
> 
> is very rare indeed.  It is rare even in threaded code, unless that
> threaded code has a lot of synchronization points and a lot of
> synchronous inter-thread communication.
> 
> The IBM numbers are very interesting. The cache-line optimization is an
> obvious performance advantage, and has been incorporated into the recent
> kernels. But I think some people think that this is a common problem,
> and think that "load average" automatically equals scheduling. It isn't,
> and it doesn't.
> 
> 		Linus
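
Linus's "few hundred times a second" figure is easy to check: the ctxt
line in /proc/stat is a cumulative count of context switches, so
sampling it twice gives the rate. A rough sketch of mine, not from the
thread:

    #include <stdio.h>
    #include <unistd.h>

    /* read the cumulative context-switch count from /proc/stat */
    static long read_ctxt(void)
    {
            char line[256];
            long n = -1;
            FILE *f = fopen("/proc/stat", "r");

            if (!f)
                    return -1;
            while (fgets(line, sizeof(line), f))
                    if (sscanf(line, "ctxt %ld", &n) == 1)
                            break;
            fclose(f);
            return n;
    }

    int main(void)
    {
            long a, b;

            if ((a = read_ctxt()) < 0)
                    return 1;
            sleep(1);
            if ((b = read_ctxt()) < 0)
                    return 1;
            printf("%ld context switches/sec\n", b - a);
            return 0;
    }

Even at a load average of 30-50, a compute-bound box will typically show
a rate in the hundreds, nowhere near high enough for scheduling overhead
to matter.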
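
On the cache-line point: the optimization keeps the fields the scheduler
reads while walking the run queue packed together, so scanning each
candidate task touches one cache line rather than several. A standalone
sketch of the technique only; the field names and the 32-byte line size
are illustrative, not the kernel's actual task_struct layout (and
__attribute__((aligned)) is a GCC extension):

    #include <stdio.h>

    #define CACHE_LINE 32   /* a typical L1 line size of the era */

    struct task {
            /* hot fields: everything the run-queue scan needs,
             * packed at the front of a line-aligned structure */
            long counter;
            long nice;
            long policy;
            void *mm;
            int processor;
            struct task *next_run;
            /* cold fields (signals, fs state, ...) would follow */
    } __attribute__((aligned(CACHE_LINE)));

    int main(void)
    {
            printf("sizeof(struct task) = %lu, aligned to %d\n",
                   (unsigned long)sizeof(struct task), CACHE_LINE);
            return 0;
    }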


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo at vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/


