Scheduler & semaphore patch for 2.2.14 ...

Dom Ene 23 19:32:54 CST 2000

Hi,

Here are some numbers.  My (java) application is moving lots of data thru
a pipeline, where each stage of the pipeline is a thread.  The code for each
stage is small (1-20K) and the support class are about 20K each (plus netrexx 
and java).  This implies that most of code should be cachable.  Another point,
all the locks are encapsulated in the support class(s).  When using my stuff
you do not normally have to worry about locking. 

In the example run below I am looking to see just how fast data is moved in the 
pipeline.  Normally there will only be one thread that is runnable.  Suspect that
the goodness check in wakeup is helping me.  In any case I see slightly better
performance with Davide's patch applied.

The stats include the run of my test pipeline.  The numbers in it are in ms.  
I have also included vmstat output, in the vmstat you can see fewer runnable
processes and more context switches with the patch applied.

I would be very interested to see a SAP benchmark run with and without this
patch.

Ed Tomlinson <tomlins en cam.org>
http://www.cam.org/~tomlins/njpipes.html

Davide Libenzi wrote:
> 
> Hi guys,
> 
> here's the patch coded for 2.2.14.
> It contains :
> 
> * A clustering subdivision of processes in function of their goodness
>         This avoid the linear scan of the running queue to find the
>         next process to run.
>         My test prove that even in high workloaded environment
>         not more than two goodnesses are calculated.
> 
> * As IBM guys suggested I've reorganized a bit the fields in task_struct
>         to reduce the cache footprint
> 
> * A wake_up patch that has the goodness calculation inside it.
>         This avoid flushing all waiting threads onto the scheduler
>         shoulders.
>         IMVHO this is better than the current because avoid the thread
>         flushing, and is better then a simple FIFO coz it try to select the
>         best taks to run.
>         This is 100% precise in UP while it's not in SMP due to the fact
>         that we don't know which CPU will reschedule the task.
>         Anyway even if in SMP the precision is not 100%, it's better
>         than FIFO.
> 
> I don't know if this patch will be usefull or not, the thing the is true is that
> it reduces scheduling times on high loaded runqueues, and that can speed up
> process wakeup in long waitqueue.
> 
> I hope someone having a better suite than mine, should test the patch
> reporting me ( and others ) own impressions and measurements.
------------ próxima parte ------------
K6-3 400 192M (128dimm/64simm) 
Linux oscar 2.2.15pre2 #1 Sat Jan 22 20:42:18 EST 2000 i586 unknown 

2.2.14pre2

oscar% vmstat 2 20
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 1  0  0   1764   5464  57680  70604   0   1     8     5  155  1176  64   2  34
 2  0  0   1764   4280  57680  70604   0   0     0     0  151   269  97   3   0
 4  0  0   1712   3148  57656  69732   0   0     0     0  186 61343  67  33   0
 2  0  0   1712   3148  57656  69732   0   0     0     1  137 88462  54  46   0
 3  0  0   1712   3148  57656  69732   0   0     0     0  129 88785  52  48   0
 3  0  0   1712   3004  57796  69732   0   0     0    35  120 88175  54  46   0
 6  0  0   1712   3132  57796  69604   0   0     0     0  102 85732  59  41   0
 2  0  0   1712   3100  57796  69604   0   0     0     0  102 87082  55  45   0
 3  0  0   1712   3088  57796  69604   0   0     0     1  103 84098  61  39   0
 3  0  0   1712   3072  57796  69604   0   0     0     0  102 79152  61  39   0
 6  0  0   1712   3192  57784  69488   0   0     0     0  102 80071  54  46   0
 2  0  0   1708   6868  57784  69484   0   0     0     0  217 63028  65  35   0
 2  0  0   1708   6100  57784  69484   0   0     0     0  117   116  99   1   0
 2  0  0   1708   6100  57784  69484   0   0     0     0  152   175  99   1   0

oscar% java bench
speed1
timer_3 79
timer_3 25 104 RC=0
speed2
timer_3 3
timer_3 9269 9272 RC=0
speed3
timer_3 4
timer_3 990 994 RC=0
timer_3 2
timer_3 965 967 RC=0
timer_3 2
timer_3 888 890 RC=0
timer_3 1
timer_3 1004 1005 RC=0
timer_3 1
timer_3 1050 1051 RC=0
timer_3 1
timer_3 994 995 RC=0
timer_3 1
timer_3 1037 1038 RC=0
timer_3 2
timer_3 992 994 RC=0
timer_3 2
timer_3 1044 1046 RC=0
timer_3 1
timer_3 1031 1032 RC=0
oscar% 

2.2.15pre2 with scheduler patch

   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 1  0  0      0  96176  11340  40068   0   0    34     8  197 10900  12  10  78
 0  0  0      0  96176  11340  40068   0   0     0     0  170   344   1   1  98
 1  0  0      0  92588  11340  40068   0   0     0     0  103 29184  48  19  33
 1  0  0      0  92588  11340  40068   0   0     0     0  102 93694  55  45   0
 1  0  0      0  92588  11340  40068   0   0     0     1  103 93444  50  50   0
 2  0  0      0  93040  11340  40068   0   0     0     0  102 93403  51  49   0
 3  0  0      0  93036  11340  40068   0   0     0     0  102 90483  55  45   0
 4  0  0      0  93016  11340  40068   0   0     0     6  108 83894  53  47   0
 1  0  0      0  93000  11340  40068   0   0     0     0  102 84280  58  42   0
 2  0  0      0  92988  11340  40068   0   0     0     0  102 84494  58  42   0
 3  0  0      0  92976  11340  40068   0   0     0     0  102 85225  54  46   0
 0  0  0      0  96588  11340  40068   0   0     0     0  102 62206  38  32  30
 0  0  0      0  96588  11340  40068   0   0     0     0  133   116   0   1  99
 0  0  0      0  96588  11340  40068   0   0     0     0  102   116   0   1  99

oscar% java bench
speed1
timer_3 65
timer_3 21 86 RC=0
speed2
timer_3 3
timer_3 8656 8659 RC=0
speed3
timer_3 5
timer_3 993 998 RC=0
timer_3 2
timer_3 978 980 RC=0
timer_3 1
timer_3 1018 1019 RC=0
timer_3 1
timer_3 891 892 RC=0
timer_3 1
timer_3 979 980 RC=0
timer_3 1
timer_3 958 959 RC=0
timer_3 1
timer_3 978 979 RC=0
timer_3 2
timer_3 945 947 RC=0
timer_3 1
timer_3 975 976 RC=0
timer_3 2
timer_3 906 908 RC=0