Once edited, this page will contain a proper introduction to the containment domains (CDs) concept for resilience, along with further information. For now, please refer to the early technical report, the research prototype API specification, and the poster listed below. Please stay tuned for a prettier presentation, more information, and real publications.
- The current early prototype for MPI is available at https://bitbucket.org/cdresilience/cdruntime
- An initial prototype, developed by Cray and focusing on CD preservation and provisioning, is available at http://craycontainment.sourceforge.net. Note that this prototype does not conform to the most current semantics.
Demo videos available here.
This research started as a collaboration between The University of Texas at Austin, Cray, and NVIDIA under the DARPA UHPC-sponsored Echelon project. Research on CDs is currently supported by the DOE X-Stack DEGAS project, a DOE Early Career Research Award, and the NVIDIA-led DOE FastForward project.
- CD Team. Containment Domains C++ API v0.1. http://lph.ece.utexas.edu/users/CDAPI, March, 2014. (PDF) (BibTeX)
- Michael Sullivan, Ikhwan Lee, Jinsuk Chung, Kyushick Lee, Song Zhang, Seong-Lyong Gong, Derong Liu, Michael LeBeane, and Mattan Erez. Containment Domains Semantics version 0.2. Technical report TR-LPH-2014–001, LPH Group, Department of Electrical and Computer Engineering, The University of Texas at Austin, February, 2014. (PDF) (BibTeX)
- Jinsuk Chung, Ikhwan Lee, Michael Sullivan, Jee Ho Ryoo, Dong Wan Kim, Doe Hyun Yoon, Larry Kaplan, and Mattan Erez. Containment Domains: A Scalable, Efficient, and Flexible Resilience Scheme for Exascale Systems. In the Proceedings of SC’12. November, 2012. (PDF) (BibTeX)
- Cray Inc. Containment Domains API. lph.ece.utexas.edu/public/CDs, April, 2012. (PDF) (BibTeX)
- CD poster presented at the 2012 Salishan Conference on High-Performance Computing.
- Michael Sullivan, Doe Hyun Yoon, and Mattan Erez. Containment Domains: A Full-System Approach to Computational Resiliency. Technical report TR-LPH-2011–001, LPH Group, Department of Electrical and Computer Engineering, The University of Texas at Austin, January, 2011. (PDF) (BibTeX)