Announcements:

  • Lab1 is up. Due dates in the lab description.
  • Lab2 is up. Due dates in the lab description.
  • Lab3 is now fully up (Part1, Part2 and Part3).



EE382N-20 — Computer Architecture: Parallelism and Locality
Spring 2019
Tentative Course Descriptor and Syllabus

Vital Information

Class will meet on Mondays and Wednesdays 1:30 - 3:00pm in EER 1.518.

Please check this page and Piazza for updates, reading material, announcements, and assignments.

How to communicate with the instructors:
For general questions and clarifications please use Piazza whenever possible. There are a few exceptions to this:

  • If you have a question about a specific solution that may “infect” other groups and block their creativity, then post privately on Piazza; otherwise, please post publicly (anonymously if you wish).
  • If you have a personal question, then direct it to a specific instructor.


Instructor

Professor:
Mattan Erez
EER 5.872
mattan.erez@utexas.edu
471–7846
Office Hours: M 3 - 4, Tu 2 - 3, and by appointment

Teaching Assistants:
We have two half-time TAs, so please be aware of their time limitations.
Yongkee Kwon
yongkee.kwon@utexas.edu
Office Hours: F 4 - 7 and by appointment

Kyushick Lee
kyushick@utexas.edu
Office Hours: W 10:30am - 1:30pm

Description

Two major challenges facing computer architects today are dealing with tight power budgets and achieving high performance as off-chip bandwidth diminishes in comparison with available on-chip compute resources. In this course we will explore how the fundamental properties of locality and parallelism can be utilized in both hardware and software to overcome these challenges of power and bandwidth constraints. We will develop hardware cost models and hardware and software techniques through a combination of structured lectures, paper reading, discussions, homework assignments, programming labs, and a collaborative project. Examples of architectures and methods that will be covered include traditional general-purpose processors, massively parallel processors, parallel memory systems, parallel programming and execution models, shared memory systems, distributed shared memory systems, domain decomposition techniques, and cache-aware and cache-oblivious algorithms (tentative syllabus below).


Goals

  • Understand how the fundamental concepts of locality and parallelism affect both performance and power dissipation in modern computer systems.
  • Learn how to use tools, techniques, and models that account for locality and parallelism at the same time and learn some basic programming and optimization techniques for such systems.


Prerequisites

This is an advanced course in computer architecture. We will be developing most of the material during class or through provided reading. However, a solid base in computer architecture is a must, and familiarity with parallelism and concurrency will be very valuable.

Students must know the principles of computer architecture. You should have done well in EE460N or an equivalent class. In talking to students, I have found that many have not taken a 460N-equivalent class; 460N is really much more similar to a graduate class in computer architecture than to an undergraduate class.

Students are strongly encouraged to have some understanding of parallel architectures or concurrency and synchronization (e.g., EE382N-10, CS372, EE445M, EE360P, or equivalent). Other knowledge that is helpful includes algorithms (e.g., EE360C) and compilers (e.g., CS375).

More about prereqs and expectations here.


Required Reading

Reading material will be selected from leading conferences, journals, and magazines including ASPLOS, ISCA, MICRO, SC, as well as active research projects. All required material will be made available on the course web page.


Recommended Reading

There is no required textbook for this class; however, you may find the following useful:

  • Timothy G. Mattson, Beverly A. Sanders, Berna L. Massingill, “Patterns for parallel programming”, 2005, Addison-Wesley Boston.
  • David B. Kirk and Wen-Mei Hwu, “Programming Massively Parallel Processors: A Hands-on Approach”, Morgan Kaufmann.
  • Additional recommended websites and other open material soon.


Class Format

The class will meet twice a week. The class meetings will be a combination of lectures and class discussions. About 15% (most likely) of the class meetings will be an open discussion based on pre-assigned reading material. As class preparation for the discussions, before each class, all students must read the assigned material (equivalent to at most 1 - 2 research papers) and prepare a short writeup of discussion points (in groups of 3 students). During class, I will present the material and lead a discussion that places the material in the broader context of the class. In addition to the occasional short write-up required to vitalize class discussion, the class will include up to 5 collaborative homework assignments and programming labs (not 5 each) as well as a final collaborative project in lieu of a final examination (final examination time slot may be used for mandatory project presentations). Your final grade will also depend on my subjective evaluation of your performance based largely on your contribution to class discussions. Please see below for detailed class policies.


Class Policies

  • Class attendance and participation: Students are required to attend all classes, as well as prepare and actively participate in class discussions. If you know you will not be able to attend a class or missed a class for a good reason please contact me. This is a graduate class and discussion is critical.
  • Grading and auditing: This class should be taken for a grade and not the credit/no-credit option. If this is a serious issue for you, please see me. Auditing the class is generally fine, as long as you take it seriously and are well prepared for the discussion in those classes you choose to attend. If you are interested in auditing the class instead of registering, please let me know.
  • Progress Updates: Each student is expected to give me an update on his or her progress during the first 2 weeks of the semester and once more during the last 2 weeks of the semester. These updates can be either in the form of a report or as a short individual meeting. Additionally, each group collaborating on a project must consult with me at least once per week once projects start (again, by providing written progress reports or through conversations in meetings).
  • Collaboration: The homework and programming assignments, as well as the final project, must be collaborative in nature and require teams of 3 students. Groups will be formed based on a survey, which helps me form high-quality, innovative groups to the best of my abilities. It turns out that groups purposefully formed to optimize the class overall are far more effective than student-formed teams. Each group should submit a single writeup. For a fairer evaluation, each student is expected to hand in a sincere evaluation of each team member’s contribution. For the final project, discussions outside the group are also permitted; however, plagiarism will not be tolerated (see below for the academic dishonesty policy). An important note: putting your name on an assignment you did not substantially contribute to is cheating!
  • Assignments and Labs (39%): To improve understanding and intuition, the class will include up to five collaborative assignments and programming labs. The labs (most likely three) will deal with programming massively-parallel processors (e.g., GPUs) as well as more traditional parallel systems. The assignments and labs will not be graded competitively against one another; I expect you to do well and learn from them. While these assignments will be structured, some aspects will be vague and open-ended. You should be prepared to embrace this. For example, your goal might be “to optimize,” but there will not be a specific target set. Expect the labs to take significant time and effort.
  • Final project (writeup: 26% of final grade; class presentation: 5% of final grade): The final project must be a collaboration of 3 students, and individual work will not be accepted. Please see above for group guidelines and contact me in case of difficulties. The project will be used as a teaching tool to enhance your understanding of the material. The project will be open-ended. I will help you focus your ideas and hone your intuition and skills. Think of the project as an attempt to write a research paper on a topic that is covered in, or related to, the class; it may address hardware and/or software issues. It is not expected that you fully achieve this goal of publishable research, but you are required to be able to describe your idea in depth, place it in the context of the class and of prior related work, phrase hypotheses on how implementing your idea will improve and affect the system, and design an evaluation that will test the hypotheses. I will do my best to help you focus and refine your ideas and to guide you towards achieving significant research if you are interested and motivated to do so. The end result of the project should be a written report that roughly follows the style and length of a typical architecture conference paper, an accompanying detailed presentation, and a shorter presentation that will be given to the entire class. You are free to choose your own project topic, and we will also provide some ideas to get you started. The projects will be graded competitively against each other. This means that if you set out to do a narrowly scoped project and accomplish it, you may get a lower grade than a team that attempted a more challenging task. I want to encourage you to think big and use us as a resource to guide you in accomplishing challenging tasks with reasonable effort. A major goal of the project is to teach me something new :-) In past semesters I was sometimes lenient with grading projects. That will not be the case this semester. Time for me to grow up and be tougher.
  • Exams (10% of final grade): There will likely be one in-class exam.
  • Other (20% of final grade): I will evaluate each student based largely on class participation, contribution to the class in other ways, and individual/group updates.
  • Feedback: Candid and frank (anonymous) feedback and criticism regarding the choice of topics, pace, workload, etc. will be greatly appreciated by me and will positively impact the class.
  • Final examination: There will not be a final written examination in this class; however, the final exam time slot may be used for project presentations.
  • Policy summary:
Component                     % of final grade
Attendance and participation  Part of “other” 20%
Progress Updates              Part of “other” 20%
Assignments and Labs          39%
Quiz                          10%
Final project                 26% (writeup) + 5% (class presentation)
Feedback                      0%


Academic Dishonesty Policy

Plagiarism or any form of academic dishonesty (cheating includes, but is not limited to, copying another person’s work, copying material directly from a book, article, or web site without including appropriate references, falsifying data, doing someone’s work) is a violation of University rules and may result in failing the class or may incur even steeper penalties. For University policies and our honor code see: http://deanofstudents.utexas.edu/sjs/acint_student.php and http://deanofstudents.utexas.edu/sjs/spot_honorcode.php.


College of Engineering Drop/Add Policy

The Dean must approve adding or dropping courses after the fourth class day of the semester.


Students with Disabilities

The University of Texas at Austin provides upon request appropriate academic accommodations for qualified students with disabilities. For more information, contact the Office of the Dean of Students at 471–6259, 471–4641 TTY (http://diversity.utexas.edu/disability/) or the College of Engineering Director of Students with Disabilities at 471–4382.


Religious Holidays

Please contact the instructor with any issues related to religious holidays. We will be quite accommodating.


Emergency Information


Tentative syllabus

(topics that will most likely be covered at some point in the semester)

  • Parallel programming models (threads, arrays, streams, functional)
  • Parallel execution models (SPMD, MIMD, SIMD, streams, dataflow, VLIW)
  • Namespaces and locality
  • Hardware locality mechanisms (caches, scratchpad memories, registers)
  • Area and power models for communication and computation
  • Locality aware programming models
  • Locality and parallelism in real systems (SMP, DSM, stream processors)
  • Tools for increasing locality (domain decomposition, blocking, reordering)
  • Massively parallel processors and accelerators
  • Frameworks for programming massively parallel and heterogeneous platforms

Important Dates (tentative)

1/25         Last official add/drop day
2/6          Twelfth class day (last possibility for refunds on dropping and last day to add grad classes)
2/12         Last chance for first progress update
3/18 - 3/23  Spring Break
4/1          Exam 1
4/3          Project proposal expected
4/10         Project proposal absolutely due
4/29         Last date for third progress update
5/10         Project officially due (late submissions accepted) and progress report due
5/15         Project presentations likely + projects must be turned in