SMP Theory (was: Re: Interesting analysis of linux kernel threading by IBM)

Jeff V. Merkey jmerkey en timpanogas.com
Mie Ene 26 02:37:10 CST 2000


How do you do fault tolerance on Shared-Everything (like what you
describe)?  Sounds like COMA or SCI?  Most folks dismiss
shared-everything architectures as non-resilent (Intel and Microsoft
have traditionally been shared-nothing bigots on this point).

Jeff

Iain McClatchie wrote:
> 
> One of the problems with this forum is that you can't hear the murmur
> of assent ripple through the hardware design crowd when Larry rants
> about this stuff.  Larry has had his head out of the box for a long
> time.
> 
> Look at the ASCI project.  The intention was for SGI to build an
> Origin with around 1000 CPUs.  That Origin had extra cache coherence
> directory RAM and special encodings in that RAM so that the hardware
> could actually keep the memory across all 1000 CPUs coherent.  We
> added extra physical address bits to the R10K to make this machine
> possible.
> 
> Last I heard, the machine is mostly programmed with message passing.
> 
> I remember having a talk with an O/S guy who was implementing some
> sort of message delivery utility inside the O/S.  This was when
> Cellular IRIX was in development, and they were investigating having
> the various O/S images talk to each other with messages across the
> shared memory.  Then someone found out the O/S images could signal
> each other FASTER through the HIPPI connections than they could
> through shared memory.  That is, this machine had a HIPPI port local
> to each O/S image, and all those HIPPI ports were connected together
> via a HIPPI switch.
> 
> Those HIPPI connections were build with the _same_physical_link_ as
> the shared memory - an 800 MB/s source-synchronous channel.  But if
> you're sending a message, it's better to have the I/O system just
> send the bits one way than have the shared memory system do two round
> trips, one to invalidate the mailbox buffer for writing and another to
> process the remote cache miss to receive the message.
> 
> -Iain McClatchie
> www.10xinc.com
> iain en 10xinc.com
> 650-364-0520 voice
> 650-364-0530 FAX
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo en vger.rutgers.edu
> Please read the FAQ at http://www.tux.org/lkml/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo en vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



Más información sobre la lista de distribución Ayuda