Traditional computer science emphasizes the design and analysis of sequential algorithms, which are meant to be executed on a single computing device. With the rise of large-scale systems, however, came a new algorithmic scenario: multiple computing devices, connected by communication channels or shared objects, that must work together to solve problems. These distributed systems require distributed algorithms.
The world of distributed algorithms is quite different than the world of sequential algorithms. It is a setting where time complexity is often obvious or not relevant, but proving that an algorithm works is difficult. There is less algebra deployed in analyzing distributed algorithms, but more mind-bending logic arguments. Distributed algorithm theorists almost never utter the phrases "master theorem" or "dynamic programming," but instead talk about "byzantine adversaries" and "indistinguishability." Perhaps most interesting, in distributed computing, many problems that seem easy turn out to be impossible to solve, while some difficult sounding puzzles yield simple solutions.
This once quirky corner of computer science has recently become crucial as more and more of the infrastructure supporting our current digital age leverage big distributed systems tamed by smart distributed algorithms. For companies like Facebook, Google, and Amazon, which depend on somehow coaxing hundreds of thousands of machines spread over massive data centers to work well together, this subject has become a necessity. (Which makes this topic particularly good to know for those interested in entering the technology job market!)
This course is listed as a graduate elective and therefore its primary audience is computer science masters and doctoral students. The material, however, is also suitable for senior undergraduate computer science majors interested in graduate study or pursuing jobs at competitive technology companies. If you are an undergraduate interested in this course, please contact me.
The course is divided into three parts. The first two parts cover the fundamentals of designing and analyzing distributed algorithms. The third part covers a selection of more recent results in the study of distributed algorithms.
There will be three non-cumulative exams for the course, one for each of the three parts. There are no problem sets in this course (though I am happy to assign and discuss practice exercises by student request). This course focuses on the design and analysis of distributed algorithms from a theoretical perspective, not their implementation. There are, therefore, no programming assignments.
The required textbook for this course is Distributed Algorithms, Nancy Lynch, 1996. Most of the topics covered in the first two thirds of this course draw from this textbook, though I will also cover material not in this book.
Grades in the course will be based on your three exams. Class attendance is highly recommended as I will cover material not included in the textbook, as well as cover material from the book in a new manner.
I hold regular office hours in my office at 334 Saint Mary's Hall from 12:30 to 1:30 on Tuesday and Thursday.
There will be three exams in the course: two midterms and a final. The exams are non-cumulative (that is, each covers only the new material since the last exam).
Below is the tentative schedule for the course. I will continue to modify it as the semester progresses. The readings column lists the chapters from the Lynch textbook that best correspond to the lecture's material. As noted, however, not all of the material covered is in the textbook, and I will cover some of the textbook material in a new manner.
Note: The syllabus for the third part of the semester will be set during the semester.
|Part 1: Synchronous Networks|
|1||1/14||Introduction to distributed algorithms; course overview|
|2||1/19||Leader election in rings: basic algorithms and impossibilities.||3.1 - 3.3|
|3||1/21||Leader election in rings: better algorithms.||3.4 - 3.5|
|4||1/26||Solving problems in general network graphs: leader election, breadth-first search, and shortest path.||4.1 - 4.3|
|5||1/28||Minimum spanning trees. ||4.4|
|6||2/2||Impossibility of consensus with link failures, algorithms for consensus with process failures.||5.1, 6.1 - 6.2|
|7||2/4||A lower bound on consensus with process failures.||Keidar and Rajsbaum, 2003|
|8||2/9||Set-consensus and distributed commit.||7.1, 7.3|
|9||2/11||Byzantine consensus: algorithms and
|Part 2: Asynchronous Shared Memory and Network Systems|
|12||2/23||Shared memory model and mutual
exclusion with registers.||10.1 - 10.3|
|13||2/25||Improved mutual exclusion with registers.||10.5, 10.7|
|14||3/1||The dining philosophers problem.||11.1 - 11.3|
|15||3/3||The impossibility of consensus with process
|3/8||No Class: Spring Break|
|3/10||No Class: Spring Break|
|16||3/15||Asynchronous network model: solving leader election in rings and building spanning trees.||15.1, 15.3|
|17||3/17||Connecting the asynchronous shared memory and network models||17.1 - 17.2|
|18||3/22||Connecting the asynchronous shared memory and network models (cont); the PAXOS consensus algorithm.||17.1 - 17.2; Lamport 2001|
|3/24||No Class: Easter Break|
|19||3/29||Partially synchronous systems: basic results
for mututal exclusion and consensus.||24.1 - 24.2, 25.1 - 25.3|
|Part 3: Recent and Advanced Results|
|22||4/7||The wait-free hierarchy.||Herlihy 1991|
|23||4/12||Random rumor spreading.||Giakkoupis 2011|
|24||4/14||Consensus with limited network information.||Newport 2014|
|25||4/19||Dynamic networks.||Kuhn, Lynch, and Oshman 2010|
|26||4/21||Radio networks.||Bar-Yehuda, Goldreich, and Itai 1992|
|27||4/26||Lower bound for radio networks.||Newport 2013|
|28||4/28||Overflow; Exam review.|
|Final examination (date and location to be announced)|