[PATCH] 2.3.40: <linux/linkage.h> generates incorrect cache alignments

Fri Jan 28 05:04:16 CST 2000

Werner,

that was a pretty quick analysis.  I think that what you were saying
is that 3/5% of the kernel footprint is duplicated strings!?  Wow.
I certainly wouldn't chase it down, but it is interesting.

Actually, I should have gone a little further with my example.
The gratuitous alignments show up in basic blocks of code
as well.

                ...
                orl %edx,(%ebp)
                jmp .L1482
                .p2align 4,,7
        .L1470:
                notl %edx
                andl %edx,(%ebp)
        .L1482:
                addl $4,%ebp
                addl $-32,%ebx
        .L1468:
                xorl %edx,%edx
                ...

and

                .size    sys_ioperm,.Lfe2-sys_ioperm
                .align 4
        .globl sys_iopl
                .type    sys_iopl,@function
        sys_iopl:
                pushl %ebx
                movl 8(%esp),%ecx

The .p2align in the first fragment is probably at the entry to a
loop.  The .p2align 4,,7 says, if I can trust the documentation,
to pad to a 16-byte boundary, but only if that takes at most
7 bytes of padding.   Again, I didn't ask the compiler for this
special service and it seems to be beyond my control.
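
To make the .p2align 4,,7 rule concrete, here is a minimal sketch in C
of how I read the documentation.  The function name p2align_padding
and its parameters are made up for illustration; this is not code from
gas or gcc, and it assumes the third operand is always given.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of padding bytes that
 * ".p2align log2_align,,max_skip" would emit at address addr,
 * as I understand the assembler documentation. */
uint32_t p2align_padding(uint32_t addr, unsigned log2_align, uint32_t max_skip)
{
    assert(log2_align < 32);
    uint32_t align = (uint32_t)1 << log2_align;

    /* Distance from addr up to the next multiple of align (0 if aligned). */
    uint32_t pad = (align - (addr & (align - 1))) & (align - 1);

    /* If more than max_skip bytes would be needed, no padding is done. */
    if (pad > max_skip)
        return 0;
    return pad;
}
```

So a label that would land at 0x1009 gets 7 bytes of padding, while
one at 0x1008 would need 8 and is left where it is.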

What I like is that gcc aligns function entry points.
When a function is called from a random place, the cache
has to load 32 bytes of the new instruction stream.  But function
alignments are unfortunately modulo *4* rather
than to a cache line.  (.align takes the number of bytes
and .p2align takes log2 of the number of bytes.)
Aligning entry points: well now, that's a good idea.
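
A rough sketch of why modulo 4 falls short, again in C.  The helper
names align_up and bytes_left_in_line are hypothetical, and the
32-byte line size is just the figure I used above.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (a power of two). */
uint32_t align_up(uint32_t addr, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Bytes from addr to the end of the cache line that contains it,
 * i.e. how much of the function the first line fill brings in. */
uint32_t bytes_left_in_line(uint32_t addr, uint32_t line_size)
{
    assert(line_size != 0 && (line_size & (line_size - 1)) == 0);
    return line_size - (addr & (line_size - 1));
}
```

If the previous function ends at 0x103a, .align 4 puts the next entry
point at align_up(0x103a, 4) = 0x103c, and only 4 bytes of it sit in
the first line loaded.  Aligned to 32 it would start at 0x1040 and the
whole first line would be useful code.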

What I really don't like is alignment for basic blocks,
which strikes me as misguided.  What is the point?
It just wastes cache footprint.

Chris Sears
cbsears at ix.netcom
