Analysis of Concurrent and Distributed Programs
CS 4405, Spring 2024
Course description:
Software systems are becoming highly concurrent and distributed to utilize modern multicore architectures and increasing speed and bandwidth in networks. Shared-memory concurrency in multicore programs and message-passing concurrency in distributed programs share many common abstractions and problems.
In the multicore era, all performance-critical software employs some form of concurrent programming; typically shared memory concurrency. In this setting, programmers use a number of primitives to develop efficient and correct concurrent programs. To do so the programmers have to understand the behaviors of the primitives and reason about them. It is also important to match the programming paradigms and underlying architectures. For instance, traditionally programmers have assumed that a multithreaded program executed simply by interleaving the executions of its threads—a model known as sequential consistency (SC). This assumption is, however, invalidated both by mainstream multicore architectures, which often execute instructions out of order, and by compilers, whose optimizations affect the outcomes of concurrent programs. As a result, concurrent programs have more outcomes than SC allows.
In the distributed setting, the units of concurrency are independent processes that do not share memory but communicate by exchanging asynchronous messages. The execution of such a system involves two main sources of nondeterminism: concurrency and partial failures. As the processes run concurrently, the exchanged messages can be delivered and processed in many different orderings. The distributed set of processes is also prone to network of process failures. The trade-off between the system’s availability in the existence of failures and the consistency between the processes gives rise to a spectrum of weak consistency notions. It is important to reason about concurrency, possible failures, and consistency guarantees to implement distributed programs correctly and understand their behavior.
This course aims to explore analysis techniques for concurrent and distributed programs.
Course organization
- 5 ECTS: You need to devote at least 140 hours of study for this course.
- Online meetings: The course consists of 10 2-hour meetings. You are not required, but you are strongly encouraged, to attend.
- Assignments: Two homework assignments ((2 x 10%) of the course grade)
- Project: Project implementation, report, and a presentation (80% of the course grade)
- Teams: The students are responsible to form teams and communicate them to the course TAs.
Study goals:
By the end of this course, you should be able to:
- Describe the fundamental concurrency models in multicore and distributed systems
- Explain the concurrency nondeterminism in the executions of concurrent and distributed programs
- Discover concurrency bugs in multicore and distributed programs by program analysis and testing techniques
- Evaluate the pros and cons of program analysis and testing techniques for specific multicore and distributed programs
Teachers:
Teaching assistants
Course schedule:
- Feb 12
- Feb 14
- Weak-memory concurrency – I
- SC
- Slide
- Feb 19
- Weak-memory concurrency – II
- SC
- Slide
- Feb 21
- Programming language concurrency and compiler correctness
- SC
- Slide
- Feb 26
- Concurrency analysis for multithreaded programs
- SC
- Slide
- Feb 28
- Distributed concurrency
- BKO
- Introduction to distributed systems
- Time, order, and causality
- Distributed consensus - I
- Mar 04
- Concurrency analysis of distributed systems - I
- BKO
- Distributed consensus - II
- Concurrency analysis for distributed systems
- Mar 06
- Concurrency analysis of distributed systems - II
- BKO
- Concurrency and fault-tolerance bugs in distributed systems
- Concurrency testing of distributed systems
- Mar 11-13
- No lectures
- Mar 18
- Consistency and isolation in distributed systems
- BKO
- Replication and Consistency
- Strong consistency: SC and Linearizability
- Weak consistency and isolation (Weak isolation is for optional read)
- Mar 27
- In-progress project evaluation
- Apr 03
- In-progress project evaluation
- Apr 15-17
- Final project presentations
The schedule is subject to small changes during the term.
You can access the last year’s edition here.
Course projects:
Testing Distributed Systems - 1: Checking linearizability and SC of distributed databases
Testing Distributed Systems - 2: Testing consensus implementations
Alternatively, you can suggest your own project that involves replicating an existing work or proposing a new approach to an existing problem. If you have exciting ideas for the course project, contact the TA’s and the teachers to determine the context of your project.
Example projects from the previous editions:
- Testing consistency of the rqlite distributed database (by Nienke Eijsvogel, Ruben van Baarle, Daan de Graaf) [GitHub] [Hacker News]