No writeup, but if you want to prepare for the discussion, one of the papers from my group might help:

  • Jinsuk Chung, Ikhwan Lee, Michael Sullivan, Jee Ho Ryoo, Dong Wan Kim, Doe Hyun Yoon, Larry Kaplan, and Mattan Erez (2012). Containment Domains: A Scalable, Efficient, and Flexible Resilience Scheme for Exascale Systems. In the Proceedings of SC12, November.