On optimising the scheduler for large run queues
Marco Colombo
marco at esi.it
Mon Jan 31 11:31:13 CST 2000
On Sat, 29 Jan 2000, Jamie Lokier wrote:
> I will summarise.
>
> Heavyweight kernel developers (no that doesn't mean >40 years old :-)
> believe that, for all well designed real applications, scheduler
> overhead is dominated by cache overhead. Therefore optimising cache
> overhead takes priority.
>
> I agree. Though I personally do not have figures to support it, I have
> the impression that others do.
I can't call myself a "kernel developer", and I don't have figures.
But I believe it anyway [someone told me he measured it, and also gave
good explanations].
[...]
> Multi-threaded I/O can be reduced to select() I/O.
>
> This isn't true for Java at the application level. You might call it a
> flaw in Java itself, and it probably is for now[1]. But there isn't a
> better alternative yet. Rewriting Java "business objects" in C is not
> plausible. User space threading is often a better idea in Java. If
^^^^^^^^^
That's a Java problem. Not a Linux one.
> they worked perfectly run queues would not be an issue, but user space
> threads introduce their own overheads.[2] Maybe dynamic compilers
> should try dynamically optimising thread switches outside the kernel.
>
> Real applications that are fully optimised for performance do not have
> large run queues.
>
> Apparently this is true.
They have many "good" properties. Good cache behaviour is one of the
most important. Avoiding unnecessary (kernel) schedules is another.
Both are difficult to obtain.
> Therefore optimising the scheduler for large run queues, at a cost for
> small run queues (in maintenance, footprint and overhead) is
> counterproductive for the most critical cases.
>
> Here is a logical reasoning error.[3] By the kernel heavyweights.
>
> The overhead of scheduler changes is *only* relevant to high switching
> rate applications. And that doesn't include purely select() based
> single-threaded monolithic servers.
>
> No-one has shown any real, well designed, well tuned, short run queue
> applications that have a high switching rate!
>
> They certainly haven't used such examples in their arguments! They've
> used other examples. That's why it's a reasoning error: Those examples
> aren't relevant!
Good point. It's true for me, at least.
> Now, I expect there are examples of real, well designed, well tuned
> applications that switch very often.
>
> Until someone demonstrates that *those* applications have small run
> queues, and only those, then we have to consider the large run queue
> patches seriously. Remember the other applications, including
I'm sorry, Jamie, but this is *your* reasoning error. You should
demonstrate that real, well designed, well tuned applications
that switch very often (provided they exist: they lack one of the
"good" properties, a low switching rate, so they may not be "so well
designed and well tuned") also have long RQs.
In other words, I won't call an application that has both a high
switch rate and a long RQ "well designed, well tuned".
I can hardly think of such an application that has good cache
behaviour at the same time (that's my impression).
So I think that an application that has both a high switch rate and
a long RQ is NOT "well designed, well tuned", and it is the application
you should optimize.
[...]
> Dear experienced kernel developers, please give examples of real world
> application, properly written for performance (as you like), that would
> be adversely affected by the proposed scheduler changes.
That's a good point.
But:
you have failed to show any real world application, properly written for
performance (as we ALL like), that would take advantage of the proposed
scheduler change.
If you show me a badly written one, I can do the same. B-)
Unless someone more experienced than me has something to say about it,
the point now is:
- we can't show any good application that will take advantage of the patch;
- we can't show any good application that will be adversely affected by it.
So, right now, IMHO there are no reasons to consider the patch...
[ this leaves out systems with >16 CPUs... it should be considered for them,
if numbers show some real improvement for RQs of around 20 ]
TM.
--
____/ ____/ /
/ / / Marco Colombo
___/ ___ / / Technical Manager
/ / / ESI s.r.l.
_____/ _____/ _/ Colombo at ESI.it
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo at vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/