anyone measured context-switch cost on Linux/ia32?
Mikael Pettersson
mikpe at csd.uu.se
Thu Jan 20 16:02:52 CST 2000
Has anyone measured the cost (in cycles or time) of the
context-switch path in Linux/ia32?
The reason I'm asking is this comment in Ingo's latest
SMP patch announcement:
> - optimized TLB flushes a bit, we actually do not have to read %cr3 and
> %cr4 (which is a slow instruction), we can calculate all the data.
According to my measurements, Pentium classic and MMX need 4 cycles
to read and 18 cycles to write %cr4, while Pentium II/III need 2
cycles to read and 42 cycles to write %cr4.
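To put those cycle counts in time terms, here is a rough sketch. The helper name cycles_to_ns and any clock frequency passed to it are assumptions for illustration only; they are not part of the measurements above.

```c
/* Hypothetical helper: convert a cycle count into nanoseconds for a
 * given core clock in MHz. One cycle at f MHz lasts 1000/f ns. */
static double cycles_to_ns(unsigned cycles, double clock_mhz)
{
    return (double)cycles * 1000.0 / clock_mhz;
}
```

For example, at an assumed 500 MHz clock a 42-cycle %cr4 write would take 84 ns.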
%cr4 contains the CR4.PCE flag which controls user-mode use of
the RDPMC instruction. Therefore, my performance-monitoring counters
driver reads and writes %cr4 to toggle CR4.PCE each time a process
using virtual per-process performance counters is suspended or
resumed. On a Pentium II/III, this amounts to 44 cycles (2 to read
plus 42 to write) added to each of the suspend and resume paths.
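A minimal sketch of the read-modify-write described above, not the driver's actual code. The helper names are hypothetical, and the privileged %cr4 read and write themselves are left out because they only work in ring 0. The one hardware fact relied on is that CR4.PCE is bit 8 of %cr4.

```c
#include <stdint.h>

/* CR4.PCE is bit 8 of %cr4 on ia32. */
#define CR4_PCE (UINT32_C(1) << 8)

/* Hypothetical helper: given the value just read from %cr4, return the
 * value to write back with PCE set (enable != 0) or cleared. */
static uint32_t cr4_with_pce(uint32_t cr4, int enable)
{
    return enable ? (cr4 | CR4_PCE) : (cr4 & ~CR4_PCE);
}

/* Hypothetical helper: cycles spent on one toggle, which is one %cr4
 * read plus one %cr4 write. */
static unsigned cr4_toggle_cycles(unsigned read_cycles, unsigned write_cycles)
{
    return read_cycles + write_cycles;
}
```

With the figures above, cr4_toggle_cycles(2, 42) gives the 44 cycles for Pentium II/III, and cr4_toggle_cycles(4, 18) gives 22 cycles for Pentium classic and MMX.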
Now, the question is: are these 44 cycles insignificant or
enough to hurt?
If the latter, I can see two options for my driver:
- Have CR4.PCE globally enabled, allowing all processes to
read the performance counters. This is my preferred solution,
but I can imagine some security-by-limiting-information-leakage-
no-matter-how-unimportant people disagreeing.
- Don't toggle CR4.PCE at all, so user-space always has to copy the counter
values via a system call. This is going to hurt since user-space
can sample the counters with zero system calls in my current
architecture, which is important when measuring short pieces
of code.
/Mikael
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo at vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/