Before each class, submit answers to these questions on Gradescope. I expect a few bullet points for each answer as preparation for discussion. I don’t want to see long paragraphs!
Lecture | Date | Topic (notes) | Reading | Comments |
---|---|---|---|---|
1 | 8/30 | Introduction and Policies | ||
2 | 9/4 | GPP binary compatibility | J. Dehnert et al., “The Transmeta Code Morphing™ Software: using speculation, recovery, and adaptive retranslation to address real-life challenges,” CGO’03 | |
3 | 9/6 | DNN inference | S. Han et al., “EIE: Efficient Inference Engine on Compressed Deep Neural Network” (EZProxy), ISCA’16 | |
4 | 9/11 | Reliability and Power | D. Ernst et al., “Razor: A Low-Power Pipeline Based on Circuit-Level Timing Speculation”, MICRO-36, 2003. Also: a related white paper (I suggest you read it) and a short explanation of metastability | |
5 | 9/13 | NVM Technology | C. Xu et al., “Overcoming the challenges of crossbar resistive memory architectures” (EZProxy), HPCA, 2015 | |
6 | 9/20 | Slack | Catching up on discussions | |
7 | 9/18 | Memory management (OS) | Y. Kwon et al., “Coordinated and Efficient Huge Page Management with Ingens”, OSDI 2016 | |
8 + 9 | 9/25 | Capabilities and security | J. Woodruff et al., “The CHERI Capability Model: Revisiting RISC in an Age of Risk”, ISCA 2014 (EZProxy) | Background |
10 | 10/2 | Side channels | C. Hunger et al., “Understanding Contention-Based Channels and Using Them for Defense”, HPCA 2015 (EZProxy) | |
11 | 10/4 | Fine-grained threads | D. Culler et al., “Fine-Grain Parallelism with Minimal Hardware Support: A Compiler-Controlled Threaded Abstract Machine,” ASPLOS 1991 (EZProxy) | |
12 | 10/9 | HW Active Messages | M. Noakes et al., “The J-Machine Multicomputer: An Architectural Evaluation”, ISCA 20, 1993. PLEASE ALSO READ THIS OVERVIEW WITH PICTURES: W. J. Dally et al., “The J-Machine: A Retrospective”, 1998 (EZProxy). | |
13 | 10/11 | Slack | Discussion continues, but read: K. Fatahalian and M. Houston, “GPUs: a Closer Look”, ACM Queue 6(2), 2008 (EZProxy) | |
14 | 10/16 | GPUs I | W. Fung and T. Aamodt, “Thread Block Compaction for Efficient SIMT Control Flow”, HPCA, 2011 (EZProxy). ALSO HIGHLY RECOMMENDED: V. Narasiman et al., “Improving GPU performance via large warps and two-level warp scheduling”, MICRO 2011 (EZProxy) | |
15 | 10/18 | GPUs II | D. Merrill et al., “Scalable GPU Graph Traversal”, PPoPP 2012 (EZProxy) | |
16 | 10/23 | Exam 1 | ||
17 | 10/25 | Projects | Project breakouts | |
18 | 10/30 | GPU→Dataflow | D. Voitsechov and Y. Etsion, “Single-graph multiple flows: energy efficient design alternative for GPGPUs”, ISCA 2014 (EZProxy) | |
19 | 11/1 | Continued | ||
20 | 11/6 | Datacenters 1 | H. Zhu and M. Erez, “Dirigent: Enforcing QoS for latency-critical tasks on shared multicore systems”, ASPLOS 2016 (EZProxy) | |
21 | 11/8 | QoS (2) | Y. Zhou and D. Wentzlaff, “MITTS: memory inter-arrival time traffic shaping”, ISCA 2016 (EZProxy) | |
22 | 11/13 | Compression | E. Choukse et al., “Compresso: Pragmatic Main Memory Compression”, MICRO 2018 | |
23 | 11/14 | HPC | A. Magni et al., “A large-scale cross-architecture evaluation of thread-coarsening”, SC13, 2013 (EZProxy) | |
24 | 11/20 | Hardware DSL | D. Koeplinger et al., “Spatial: a language and compiler for application accelerators”, PLDI 2018 (EZProxy) | |
25 | 11/27 | Spectre/Meltdown | C. Canella et al., “A Systematic Evaluation of Transient Execution Attacks and Defenses”, arXiv Preprint 2018. | |
26 | 11/29 | Exam 2 | Take home exam | |
27 | 12/4 | Transactional Memory | L. Hammond et al., “Transactional Memory Coherence and Consistency”, ISCA 2004 (EZProxy) | |