kernel panics at google

Alan Cox alan at lxorguk.ukuu.org.uk
Fri Jan 21 05:02:13 CST 2000


> If I'm interpreting the kernel oops correctly, the error occurs in
> tcp_retrans_try_collapse.  I've attached below the oops, after
> attempting to run it through ksymoops.  (I think I got the right
> symbols, but I'm not absolutely certain.)

It seems so, although the error appears to have occurred at some point
before that, and tcp_retrans_try_collapse gets handed a buffer which appears
to either have been freed, or not to be on a list (skb->next == NULL).
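
To make the "not on a list" case concrete, here is a toy model in plain C.
It is only a sketch: struct toy_skb, struct toy_queue and the toy_*
functions are hypothetical names made up for illustration, not the kernel's
struct sk_buff or its queue helpers. It assumes a circular queue with a
sentinel head, where unlinking a buffer sets its next/prev pointers to NULL,
so a collapse routine handed an already unlinked buffer can detect it
instead of following a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a socket buffer; NOT the kernel's struct sk_buff. */
struct toy_skb {
    struct toy_skb *next;
    struct toy_skb *prev;
    int len;
};

/* Circular doubly linked queue with a sentinel head (an assumption of this sketch). */
struct toy_queue {
    struct toy_skb head;
};

static void toy_queue_init(struct toy_queue *q)
{
    q->head.next = &q->head;
    q->head.prev = &q->head;
    q->head.len = 0;
}

/* Append skb at the tail of the queue. */
static void toy_queue_tail(struct toy_queue *q, struct toy_skb *skb)
{
    skb->prev = q->head.prev;
    skb->next = &q->head;
    q->head.prev->next = skb;
    q->head.prev = skb;
}

/* Remove skb from its queue and mark it as "on no list". */
static void toy_unlink(struct toy_skb *skb)
{
    assert(skb->next != NULL && skb->prev != NULL);
    skb->prev->next = skb->next;
    skb->next->prev = skb->prev;
    skb->next = NULL;
    skb->prev = NULL;
}

/*
 * Try to merge skb with its successor.
 * Returns  1 if the successor was merged into skb,
 *          0 if skb is the last buffer (nothing to merge),
 *         -1 if skb is not on a list (next == NULL).
 */
static int toy_try_collapse(struct toy_queue *q, struct toy_skb *skb)
{
    struct toy_skb *nxt;

    if (skb->next == NULL)
        return -1;          /* stale or unlinked buffer: refuse to touch it */

    nxt = skb->next;
    if (nxt == &q->head)
        return 0;           /* reached the sentinel, no successor */

    skb->len += nxt->len;   /* "collapse": absorb the successor's payload */
    toy_unlink(nxt);
    return 1;
}
```

In this model, queueing two buffers and collapsing the first merges the
second into it; calling toy_try_collapse() again on the merged-away buffer
returns -1, which is the situation the oops suggests the real code ran into
without such a check.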


> We could really use informed suggestions about what could be triggering
> this, or how we might try to work around it.  Thanks very much in
> advance for any help.

Workaround suggestion: echo "0" >/proc/sys/net/ipv4/tcp_retrans_collapse

> >>EIP: c0158892 <tcp_retrans_try_collapse+3a/208>
> Trace: c01462cb <__kfree_skb+97/a0>
> Trace: c0158be7 <tcp_retransmit_skb+a3/164>
> Trace: c0158d0d <tcp_xmit_retransmit_queue+65/e4>

This seems to be the interesting area, and it looks suspicious:

Dave: Side item: tcp_retrans_try_collapse doesn't seem to care about collapsing
in SACK'd data - is that right?

Second question: tcp_xmit_retransmit_queue is entered with tp->retrans_head
pointing part way into the retransmit queue if we bailed part way through
the previous retransmit run (e.g. because cwnd limited us). We guarantee that
tp->retrans_head is still pointing to a valid packet, so right now I
can't see a cause.

Do all the crashes basically look identical?

Alan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo at vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/


