[PATCH] 2.3.40: <linux/linkage.h> generates incorrect cache alignments
Chris Sears
cbsears at ix.netcom.com
Fri Jan 28 05:04:16 CST 2000
Werner,
that was a pretty quick analysis. I think what you were saying
is that 3/5% of the kernel footprint is duplicated strings!? Wow.
I certainly wouldn't chase it down, but it is interesting.
Actually, I should have gone a little further with my example.
The gratuitous alignments show up in basic blocks of code
as well.
...
orl %edx,(%ebp)
jmp .L1482
.p2align 4,,7
.L1470:
notl %edx
andl %edx,(%ebp)
.L1482:
addl $4,%ebp
addl $-32,%ebx
.L1468:
xorl %edx,%edx
...
and
.size sys_ioperm,.Lfe2-sys_ioperm
.align 4
.globl sys_iopl
.type sys_iopl,@function
sys_iopl:
pushl %ebx
movl 8(%esp),%ecx
This is probably the entry to a loop. The .p2align 4,,7 says,
if I can trust the documentation, to pad modulo 16,
with an upper bound of 7 bytes. Again, I didn't ask
the compiler for this special service and it seems to be
beyond my control.
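
As a rough sketch of the .p2align 4,,7 rule as described above (pad to
a multiple of 16, but only if that takes at most 7 bytes), the padding
the assembler would emit at a given address can be modelled in C. The
function name p2align_padding and its parameters are made up for this
illustration; they are not part of gas or the kernel.

```c
#include <stdint.h>

/* Hypothetical helper: number of padding bytes that
 * ".p2align log2_align,,max_skip" would emit at address addr.
 * The location is padded up to the next multiple of 2^log2_align,
 * unless that would take more than max_skip bytes, in which case
 * no padding is emitted at all. */
uint32_t p2align_padding(uint32_t addr, unsigned log2_align, uint32_t max_skip)
{
    uint32_t mask = ((uint32_t)1 << log2_align) - 1;
    uint32_t pad = (0u - addr) & mask;   /* bytes to the next boundary */

    return pad > max_skip ? 0 : pad;
}
```

With log2_align = 4 and max_skip = 7, a label that would land 9 bytes
into a 16-byte block gets 7 bytes of padding, while one that would land
8 bytes in gets none.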
What I like is that gcc aligns function entry points.
When you are called from a random place, the cache has to
load 32 bytes of the new instruction stream. But function
alignments are unfortunately modulo *4* rather
than to a cache line. (.align takes the number of bytes
and .p2align takes log2 of the number of bytes.)
Well now, that's a good idea.
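
To put a number on the entry point argument, here is a small C sketch
that counts how many cache lines the first few bytes of a function
touch, depending on where its entry point lands. The 32-byte line size
is the figure used above; the function name lines_touched and the
sample addresses are only assumptions for the example.

```c
#include <stdint.h>

/* Hypothetical helper: number of cache lines covered by the len bytes
 * starting at addr, for a cache with line_size-byte lines.
 * len must be at least 1 and line_size must be nonzero. */
uint32_t lines_touched(uint32_t addr, uint32_t len, uint32_t line_size)
{
    uint32_t first = addr / line_size;              /* line holding the first byte */
    uint32_t last = (addr + len - 1) / line_size;   /* line holding the last byte */

    return last - first + 1;
}
```

An entry point at offset 28, which satisfies .align 4, needs two
32-byte lines to fetch its first 16 bytes of code; the same function
placed at offset 32 needs only one.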
What I really don't like is alignment for basic blocks,
which strikes me as misguided. What is the point?
It just wastes cache footprint.
Chris Sears
cbsears at ix.netcom
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo at vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/