anyone measured context-switch cost on Linux/ia32?

Mikael Pettersson mikpe at csd.uu.se
Thu Jan 20 16:02:52 CST 2000


Has anyone measured the cost (in cycles or time) of the
context-switch path in Linux/ia32?

The reason I'm asking is this comment in Ingo's latest
SMP patch announcement:

> - optimized TLB flushes a bit, we actually do not have to read %cr3 and
>   %cr4 (which is a slow instruction), we can calculate all the data.

According to my measurements, Pentium classic and MMX need 4 cycles
to read and 18 cycles to write %cr4, while Pentium II/III need 2
cycles to read and 42 cycles to write %cr4.

%cr4 contains the CR4.PCE flag which controls user-mode use of
the RDPMC instruction. Therefore, my performance-monitoring counters
driver reads and writes %cr4 to toggle CR4.PCE each time a process
using virtual per-process performance counters is suspended or
resumed. On a Pentium II/III, this amounts to 44 cycles (2 to read
plus 42 to write) added to the suspend and resume paths.
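
To make the read-modify-write and its cost concrete, here is a minimal
sketch, not the actual driver code. It models %cr4 as a plain variable so
that it runs in user space; a real driver would use the privileged
"mov %cr4, reg" and "mov reg, %cr4" instructions instead. The names
sim_cpu, sim_read_cr4, sim_write_cr4, perfctr_resume and perfctr_suspend
are hypothetical. The cycle costs are the Pentium II/III figures quoted
above, and CR4.PCE is bit 8 of %cr4.

```c
#include <stdint.h>

/* CR4.PCE is bit 8 of %cr4 on ia32. */
#define CR4_PCE (UINT32_C(1) << 8)

/* Pentium II/III cycle costs, taken from the measurements quoted above. */
enum { CR4_READ_CYCLES = 2, CR4_WRITE_CYCLES = 42 };

/* Hypothetical stand-in for the real register, so the sketch runs
 * unprivileged. It also tallies the cycles charged to %cr4 accesses. */
struct sim_cpu {
    uint32_t cr4;
    unsigned cycles;
};

static uint32_t sim_read_cr4(struct sim_cpu *cpu)
{
    cpu->cycles += CR4_READ_CYCLES;
    return cpu->cr4;
}

static void sim_write_cr4(struct sim_cpu *cpu, uint32_t val)
{
    cpu->cycles += CR4_WRITE_CYCLES;
    cpu->cr4 = val;
}

/* Resume path: allow the process to execute RDPMC in user mode. */
static void perfctr_resume(struct sim_cpu *cpu)
{
    sim_write_cr4(cpu, sim_read_cr4(cpu) | CR4_PCE);
}

/* Suspend path: revoke user-mode RDPMC again. */
static void perfctr_suspend(struct sim_cpu *cpu)
{
    sim_write_cr4(cpu, sim_read_cr4(cpu) & ~CR4_PCE);
}
```

Each call performs one read and one write, so in this model each path is
charged 2 + 42 = 44 cycles, and all other bits of %cr4 are left untouched.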

Now, the question is: are these 44 cycles insignificant or
enough to hurt?

If the latter, I can see two options for my driver:

- Have CR4.PCE globally enabled, allowing all processes to
  read the performance counters. This is my preferred solution,
  but I can imagine some security-by-limiting-information-leakage-
  no-matter-how-unimportant people disagreeing.
- Don't toggle CR4.PCE; user-space then always has to copy the counter
  values via a system call. This is going to hurt since user-space
  can sample the counters with zero system calls in my current
  architecture, which is important when measuring short pieces
  of code.
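
For reference, sampling a counter with zero system calls could look
roughly like the sketch below. This is an illustration under stated
assumptions, not the driver's user-space interface: the names pmc_join,
pmc_delta and pmc_read are hypothetical, the counters are assumed to be
40 bits wide (as on the P6 family), and pmc_read faults in user mode
unless CR4.PCE is set.

```c
#include <stdint.h>

/* Assumption: 40-bit counters, as on P6-family processors. */
#define PMC_MASK ((UINT64_C(1) << 40) - 1)

/* RDPMC returns the counter in EDX:EAX; join the two halves. */
static uint64_t pmc_join(uint32_t lo, uint32_t hi)
{
    return (((uint64_t)hi << 32) | lo) & PMC_MASK;
}

/* Events counted between two samples, tolerating one 40-bit wraparound. */
static uint64_t pmc_delta(uint64_t start, uint64_t end)
{
    return (end - start) & PMC_MASK;
}

#if defined(__i386__) || defined(__x86_64__)
/* Read performance counter number `counter` directly from user space.
 * Faults unless CR4.PCE is set (or the caller runs in ring 0). */
static uint64_t pmc_read(uint32_t counter)
{
    uint32_t lo, hi;
    __asm__ __volatile__("rdpmc" : "=a"(lo), "=d"(hi) : "c"(counter));
    return pmc_join(lo, hi);
}
#endif
```

A short piece of code would be measured by calling pmc_read before and
after it and passing the two samples to pmc_delta, with no kernel entry
in between.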

/Mikael
