Auto-Adaptive scheduler - Final chapter ( the numbers ) ...

Richard Gooch rgooch at ras.ucalgary.ca
Fri Jan 28 07:35:02 CST 2000


Larry McVoy writes:
> : The RQ = 2 case gives me :
> : 
> : Old = 759000 switches / sec = 1.317 us
> : New = 735000 switches / sec = 1.360 us
> 
> I certainly hope that no one would use this as a basis for accepting or
> rejecting this patch.  For a number of reasons:
> 
>     . this shows a 3% difference.  My experience with linux and context
>       switching is that the lack of page coloring can cause different
>       runs of the same test to vary more than this, so these numbers 
>       may be right and then again, they may not.  It's pretty hard to
>       know for sure because you don't know how the OS placed the pages.
>       The page placement can make all the difference in whether you
>       collide or fit in the cache.
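
As a quick check on the figures quoted above, here is a minimal sketch
of the arithmetic. The helper names us_per_switch and slowdown_percent
are made up for illustration; only the two switch rates come from the
thread.

```c
#include <assert.h>

/* Convert a context-switch rate (switches per second) into the
 * average cost of one switch, in microseconds. */
double us_per_switch(double switches_per_sec)
{
    assert(switches_per_sec > 0.0);
    return 1e6 / switches_per_sec;
}

/* Relative slowdown of the new rate versus the old one, in percent.
 * A negative result would mean the new code is faster. */
double slowdown_percent(double old_rate, double new_rate)
{
    assert(old_rate > 0.0);
    return 100.0 * (old_rate - new_rate) / old_rate;
}
```

With the quoted rates, us_per_switch(759000.0) is about 1.317 and
us_per_switch(735000.0) is about 1.360, and slowdown_percent(759000.0,
735000.0) comes out a little over 3, which matches the "3% difference"
above. It says nothing about whether that gap exceeds run-to-run noise.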

Tell me about it. Having page colouring would be very nice. However,
it would slow down memory allocation.
Larry: what's your position on whether Linux should have a coloured
page allocator?
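
For readers unfamiliar with the term, a rough sketch of what a page
"colour" is, assuming a physically indexed, set-associative cache. The
function names and the cache geometry in the note below are
assumptions for illustration, not taken from any kernel patch.

```c
#include <assert.h>
#include <stddef.h>

/* Number of page colours: the size of one cache way divided by the
 * page size. Pages of the same colour compete for the same sets. */
size_t num_colours(size_t cache_bytes, size_t ways, size_t page_bytes)
{
    assert(ways > 0 && page_bytes > 0);
    size_t way_bytes = cache_bytes / ways;
    return way_bytes > page_bytes ? way_bytes / page_bytes : 1;
}

/* Colour of the physical page containing phys_addr. */
size_t page_colour(unsigned long long phys_addr, size_t page_bytes,
                   size_t colours)
{
    assert(page_bytes > 0 && colours > 0);
    return (size_t)((phys_addr / page_bytes) % colours);
}
```

For a hypothetical 512 KB, 4-way cache with 4 KB pages this gives 32
colours, so physical pages 0 and 32 share a colour and can evict each
other's lines, while pages 0 and 33 cannot. A coloured allocator tries
to hand out pages so that a process's working set is spread evenly
over the colours; an uncoloured one leaves that to chance, which is
where the run-to-run variation comes from.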

> I'm sure many of you think I'm a raving lunatic to care about one
> stinking cache miss.  I'm sorry, but the only way you prevent an OS
> from becoming bloated is to care about each and every cache miss,
> each and every cache line, and actually weigh the cost vs. the
> benefit.

Nah. Getting cache misses is like death by a thousand cuts.
Unfortunately, reducing cache misses can be very hard with strange
subtleties around every corner. If we had a coloured page allocator,
it would be a lot easier to get deterministic results, which in turn
would make tuning easier. When I was doing the RT run queue work, I
experimented with re-ordering bits of struct task_struct to reduce
cache line usage for the scheduler inner loop. It wasn't always a
win. I got a few percent speedup, IIRC, but there was one case with a
fraction of a percent loss. Bugger :-(
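
To illustrate the kind of re-ordering meant here, a minimal sketch
with two made-up structs. They are not the real struct task_struct
layout, and the 32-byte line size is an assumption. The idea is simply
to count how many cache lines the fields read in a hot loop fall on.

```c
#include <assert.h>
#include <stddef.h>

#define LINE_BYTES 32   /* assumed cache line size */

/* Hypothetical layout: hot fields separated by cold data. */
struct task_spread {
    long counter;
    char cold1[64];
    long priority;
    char cold2[64];
    long policy;
};

/* Same fields, with the hot ones grouped at the front. */
struct task_packed {
    long counter;
    long priority;
    long policy;
    char cold1[64];
    char cold2[64];
};

/* Cache line index of a field, assuming the struct itself starts on
 * a line boundary. */
size_t line_of(size_t offset)
{
    return offset / LINE_BYTES;
}

/* Number of distinct cache lines covered by three field offsets. */
size_t lines_touched(size_t a, size_t b, size_t c)
{
    size_t la = line_of(a), lb = line_of(b), lc = line_of(c);
    size_t n = 1;
    if (lb != la)
        n++;
    if (lc != la && lc != lb)
        n++;
    return n;
}
```

Feeding in offsetof() for counter, priority and policy gives three
lines for task_spread and one for task_packed. Fewer lines touched is
the goal, but as noted above it is not guaranteed to be a win in every
case: this count ignores which other data ends up sharing or
conflicting with those lines.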

One day I plan on shoe-horning in a coloured page allocator, playing
with this again, and using the PMC patch to figure out what went
wrong with the re-ordering.

> In this case, I certainly don't see these numbers as conclusive, for
> all I know, the new code could be faster rather than slower - we
> need to look at the cycle counters to find out.  If Richard is
> reading this, I'll bet he can tell us how to do that, I think he did
> a patch for that.

Yeah, I'm reading it (barely, I've been deleting most of my email
because I've been on holidays). The PMC (Performance Monitoring
Counter) patch I wrote is available at:
http://www.atnf.csiro.au/~rgooch/linux/

It's a little stale, since I've been too busy to maintain it, but it
did work and should still work (with little or no porting effort: I
got a success report from someone the other day). When I used it last,
it told me Linux needs a coloured page allocator :-) (which is also
available from that URL, but is pretty brute-force). I will be getting
back in the saddle with the PMC patch, though. It doesn't support
virtual (per process) PMCs yet (someone else maintains a patch that
does), but I'll probably add that too. I just need to sit down and
think about how to do it cleanly and minimise the impact on the
scheduler.

				Regards,

					Richard....
Old:     rgooch at atnf.csiro.au
Current: rgooch at ras.ucalgary.ca

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo at vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/


