Auto-Adaptive scheduler - Final chapter ( the numbers ) ...
Richard Gooch
rgooch at ras.ucalgary.ca
Fri Jan 28 07:35:02 CST 2000
Larry McVoy writes:
> : The RQ = 2 case give me :
> :
> : Old = 759000 switches / sec = 1.317 us
> : New = 735000 switches / sec = 1.360 us
>
> I certainly hope that no one would use this as a basis for accepting or
> rejecting this patch. For a number of reasons:
>
> . this shows a 3% difference. My experience with linux and context
> switching is that the lack of page coloring can cause different
> runs of the same test to vary more than this, so these numbers
> may be right and then again, they may not. It's pretty hard to
> know for sure because you don't know how the OS placed the pages.
> The page placement can make all the difference in whether you
> collide or fit in the cache.
Tell me about it. Having page colouring would be very nice. However,
it would slow down memory allocation.
Larry: what's your position on whether Linux should have a coloured
page allocator?
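
For readers unfamiliar with the term, here is a minimal sketch of what "page colour" means and why matching colours adds work to allocation. It is not taken from the allocator patch mentioned in this thread or from any Linux source: the function names, the free-page array, and the linear scan are all hypothetical, and it assumes a physically indexed cache whose size, associativity and page size are passed in by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Colours available in a physically indexed cache: the span of one way
 * (cache size / associativity) divided by the page size. */
static inline size_t num_colours(size_t cache_bytes, size_t ways,
                                 size_t page_bytes)
{
    assert(ways > 0 && page_bytes > 0);
    size_t n = (cache_bytes / ways) / page_bytes;
    return n ? n : 1;   /* a way no larger than a page: one colour */
}

/* Colour of the page containing addr (physical or virtual address). */
static inline size_t page_colour(uintptr_t addr, size_t page_bytes,
                                 size_t ncolours)
{
    assert(page_bytes > 0 && ncolours > 0);
    return (size_t)((addr / page_bytes) % ncolours);
}

/* Hypothetical allocation step: scan the free physical pages for one
 * whose colour matches the virtual page being mapped.  Returns an index
 * into free_pages, or -1 if no free page has the wanted colour.  This
 * search is the extra cost compared with taking any free page. */
static inline long pick_coloured_page(const uintptr_t *free_pages,
                                      size_t nfree, uintptr_t vaddr,
                                      size_t page_bytes, size_t ncolours)
{
    size_t want = page_colour(vaddr, page_bytes, ncolours);

    for (size_t i = 0; i < nfree; i++)
        if (page_colour(free_pages[i], page_bytes, ncolours) == want)
            return (long)i;
    return -1;
}
```

Two pages with the same colour compete for the same cache sets, which is why uncoloured placement can make one run collide in the cache and the next one fit. A real allocator would presumably keep per-colour free lists rather than scan, but it still has to handle the case where the wanted colour has run out.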
> I'm sure many of you think I'm a raving lunatic to care about one
> stinking cache miss. I'm sorry, but the only way you prevent an OS
> from becoming bloated is to care about each and every cache miss,
> each and every cache line, and actually weigh the cost vs. the
> benefit.
Nah. Getting cache misses is like death by a thousand cuts.
Unfortunately, reducing cache misses can be very hard, with strange
subtleties around every corner. If we had a coloured page allocator,
it would be a lot easier to get deterministic results, which in turn
would make tuning easier. When I was doing the RT run queue work, I
experimented with re-ordering bits of struct task_struct to reduce
cache line usage for the scheduler inner loop. It wasn't always a
win. I got a few percent speedup, IIRC, but there was one case with a
fraction of a percent loss. Bugger :-(
One day I plan on shoe-horning in a coloured page allocator and
playing with this again, using the PMC patch to figure out what went
wrong with the re-ordering.
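
To illustrate the kind of re-ordering meant here, the sketch below counts how many cache lines a set of "hot" fields spans in two layouts of the same structure. Everything in it is an assumption made for illustration: the two structs are invented stand-ins and not the real struct task_struct, the choice of hot fields is arbitrary, the 32-byte line size is assumed, and the structure is assumed to start on a line boundary.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE 32   /* assumed cache line size in bytes */
#define MAX_LINES  64   /* upper bound on lines tracked by this sketch */

/* Hypothetical layout with hot fields separated by cold data. */
struct task_scattered {
    long counter;        /* hot */
    char comm[64];       /* cold */
    long priority;       /* hot */
    char cold_pad[64];   /* cold */
    void *mm;            /* hot */
};

/* Same fields, with the hot ones grouped at the front. */
struct task_grouped {
    long counter;
    long priority;
    void *mm;
    char comm[64];
    char cold_pad[64];
};

/* Count the distinct cache lines covered by n fields, each given as an
 * offset and a length, for a struct that starts on a line boundary. */
static inline size_t lines_touched(const size_t *off, const size_t *len,
                                   size_t n)
{
    unsigned char seen[MAX_LINES] = { 0 };
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        assert(len[i] > 0);
        size_t first = off[i] / CACHE_LINE;
        size_t last = (off[i] + len[i] - 1) / CACHE_LINE;
        assert(last < MAX_LINES);
        for (size_t l = first; l <= last; l++) {
            if (!seen[l]) {
                seen[l] = 1;
                count++;
            }
        }
    }
    return count;
}

static inline size_t scattered_lines(void)
{
    const size_t off[] = { offsetof(struct task_scattered, counter),
                           offsetof(struct task_scattered, priority),
                           offsetof(struct task_scattered, mm) };
    const size_t len[] = { sizeof(long), sizeof(long), sizeof(void *) };
    return lines_touched(off, len, 3);
}

static inline size_t grouped_lines(void)
{
    const size_t off[] = { offsetof(struct task_grouped, counter),
                           offsetof(struct task_grouped, priority),
                           offsetof(struct task_grouped, mm) };
    const size_t len[] = { sizeof(long), sizeof(long), sizeof(void *) };
    return lines_touched(off, len, 3);
}
```

With these made-up layouts the scattered version spreads the three hot fields over three lines while the grouped version keeps them in one. A line count like this only says how many lines the loop can touch; it says nothing about which lines collide in the cache, so it cannot by itself predict a measured speedup or loss.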
> In this case, I certainly don't see these numbers as conclusive, for
> all I know, the new code could be faster rather than slower - we
> need to look at the cycle counters to find out. If Richard is
> reading this, I'll bet he can tell us how to do that, I think he did
> a patch for that.
Yeah, I'm reading it (barely: I've been deleting most of my email
because I've been on holidays). The PMC (Performance Monitoring
Counter) patch I wrote is available at:
http://www.atnf.csiro.au/~rgooch/linux/
It's a little stale, since I've been too busy to maintain it, but it
did work and should still work (with little or no porting effort: I
got a success report from someone the other day). When I used it last,
it told me Linux needs a coloured page allocator :-) (which is also
available from that URL, but is pretty brute-force). I will be getting
back in the saddle with the PMC patch, though. It doesn't support
virtual (per process) PMCs yet (someone else maintains a patch that
does), but I'll probably add that too. I just need to sit down and
think about how to do it cleanly and minimise the impact on the
scheduler.
Regards,
Richard....
Old: rgooch at atnf.csiro.au
Current: rgooch at ras.ucalgary.ca