COL733 Cloud computing technology fundamentals
2023-2024 Sem I
This course introduces cloud infrastructure. After completing it, students should be more comfortable building cloud services.
Course Information
- Prerequisites: COL331 or equivalent.
Note: The course includes programming assignments and thus expects proficiency with systems programming and debugging.
- Credits: 3-0-2
- Slot: AA, Mondays and Thursdays 2-3:15pm in LH521.
- TAs:
- Abhisek Panda: csz202445 AT cse.iitd.ac.in
- Ravi Ranjan Singh: jcs222659 AT csia.iitd.ac.in
- Reading material: There is no textbook for the course. Most lectures will link to more reading material. Lecture notes can be found here and here.
- Acknowledgements: Thanks to Robert T. Morris, MIT and Mythili Vutukuru, IITB; parts of this course have been inspired by courses made available by them.
Grading criteria
- 30% labs (programming assignments)
- 20% project
- 10% quizzes
- 20% minor exam
- 20% major exam
Supporting systems
- Labs are to be done on Baadal. You will need VPN access to the IITD network!
- Discussions should be done on Piazza.
Policies
- Audit criteria: 40% or more marks in total. 40% or more marks in major+minor exams.
- Ethics: Please re-read the IITD honour code. Cheating will result in an F in the course. Why should I not cheat?
- Late policy: To help you cope with unexpected emergencies, you can hand in your lab solutions late, but the total lateness summed over all lab deadlines must not exceed 72 hours. You can divide up your 72 hours among the labs however you like; you don’t have to ask or tell us. Late hours can be used only for labs.
- There will be no make-up quizzes. We will count the scores from your best (n-1) quizzes, where n is the total number of quizzes.
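The late-hours budget and the best (n-1) quiz rule are simple arithmetic; the sketch below spells them out. The function names are hypothetical, and how partial hours are rounded is not stated in the policy, so the sketch just adds up whatever hour values it is given.

```python
LATE_BUDGET_HOURS = 72  # total lateness allowed across all labs, per the late policy


def late_hours_remaining(hours_late_per_lab):
    """Return the unused late budget; a negative result means the budget is exceeded."""
    return LATE_BUDGET_HOURS - sum(hours_late_per_lab)


def quiz_total(scores):
    """Sum the best (n-1) of n quiz scores, i.e. drop the single lowest score."""
    best_first = sorted(scores, reverse=True)
    return sum(best_first[:max(len(best_first) - 1, 0)])


if __name__ == "__main__":
    print(late_hours_remaining([10, 0, 24.5]))  # hours still available
    print(quiz_total([7, 9, 4, 10]))            # the lowest quiz (4) is dropped
```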
Tentative topics
Computation:
- Translate existing programs to a distributed system (distributed shared memory).
- Batch computation (MapReduce, Spark), streaming computation (Spark Streaming, Flink, Google Dataflow), ML training (TensorFlow)
- The problem of late data in streaming computation (MillWheel, Google Dataflow): watermarks, triggers, windows.
- Fault tolerance strategies: re-run deterministic idempotent functions (MapReduce, Spark), asynchronous consistent checkpoints (Flink), inconsistent checkpoints (TensorFlow).
- Straggler mitigation, scalability, locality, etc.
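To give a flavour of the late-data topic, here is a minimal sketch of tumbling event-time windows with a watermark-based trigger. It is a toy under stated assumptions, not how MillWheel or Dataflow implement it: the watermark simply trails the largest event time seen by a fixed `ALLOWED_LATENESS`, late events are set aside rather than used to refire a window, and all names and constants are made up for illustration.

```python
from collections import defaultdict

WINDOW_SIZE = 10      # length of each tumbling event-time window (illustrative)
ALLOWED_LATENESS = 5  # assumption: watermark = max event time seen - 5


def process(events):
    """Sum values per event-time window, given (event_time, value) in arrival order.

    Returns (fired, still_open, late):
      fired      -> window start -> sum, for windows the watermark has passed
      still_open -> window start -> partial sum, for windows not yet fired
      late       -> events that arrived after their window had closed
    """
    open_windows = defaultdict(int)
    fired = {}
    late = []
    watermark = float("-inf")
    for event_time, value in events:
        start = (event_time // WINDOW_SIZE) * WINDOW_SIZE
        if start + WINDOW_SIZE <= watermark:
            late.append((event_time, value))  # window already closed
            continue
        open_windows[start] += value
        watermark = max(watermark, event_time - ALLOWED_LATENESS)
        # Trigger: fire every open window whose end is at or before the watermark.
        for s in sorted(open_windows):
            if s + WINDOW_SIZE <= watermark:
                fired[s] = open_windows.pop(s)
    return fired, dict(open_windows), late


if __name__ == "__main__":
    arrivals = [(1, 1), (4, 1), (12, 1), (3, 1), (16, 1), (2, 1), (27, 1)]
    print(process(arrivals))
```

In this run the event with time 3 arrives out of order but before the watermark passes its window, so it is still counted; the event with time 2 arrives after window [0, 10) has fired and is reported as late.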
Storage:
- PACELC theorem: If partitioned, choose between availability and consistency, else choose between latency and consistency.
- CP systems:
- Linearizability. Raft: quorums, leader election.
- Serializability. Google Spanner: distributed transactions, TrueTime, hybrid logical clocks.
- AP systems:
- Amazon Dynamo: eventual consistency, hashing, gossip protocols, dotted version vectors, conflict-free replicated data types (CRDTs)
- Somewhere between CP and AP
- Google file system
- Zookeeper
- RedBlue consistency
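As a small taste of the AP side, the sketch below shows a grow-only counter (G-Counter), one of the simplest CRDTs. Each replica increments only its own slot, and replicas converge by taking the element-wise maximum when they merge. The class and method names are illustrative and not taken from any particular library.

```python
class GCounter:
    """Grow-only counter CRDT: one slot per replica, merged by element-wise max."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica id -> increments observed from that replica

    def increment(self, amount=1):
        if amount < 0:
            raise ValueError("a G-Counter can only grow")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Max is commutative, associative, and idempotent, so replicas
        # converge regardless of merge order or repeated merges.
        for rid, count in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), count)


if __name__ == "__main__":
    a, b = GCounter("a"), GCounter("b")
    a.increment(2)
    b.increment(3)
    a.merge(b)
    b.merge(a)
    print(a.value(), b.value())
```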
Hardware-assisted virtualization:
- CPU virtualization: KVM, Popek-Goldberg theorem
- Memory virtualization: 2-d page tables
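The Popek-Goldberg condition can be stated as a set check: an architecture supports classic trap-and-emulate virtualization if every sensitive instruction is also privileged, so that it traps when run in user mode. The sketch below applies that check to a toy instruction list; the sets are a hand-picked illustration rather than a full ISA description, with `POPF` standing in for the well-known x86 case of a sensitive instruction that does not trap.

```python
def non_trapping_sensitive(sensitive, privileged):
    """Return the sensitive instructions that do NOT trap in user mode."""
    return set(sensitive) - set(privileged)


def is_classically_virtualizable(sensitive, privileged):
    """Popek-Goldberg condition: sensitive instructions are a subset of privileged ones."""
    return not non_trapping_sensitive(sensitive, privileged)


if __name__ == "__main__":
    # Toy instruction sets, chosen only for illustration.
    sensitive = {"MOV_TO_CR3", "HLT", "POPF"}
    privileged = {"MOV_TO_CR3", "HLT"}
    print(is_classically_virtualizable(sensitive, privileged))
    print(non_trapping_sensitive(sensitive, privileged))
```

Hardware-assisted virtualization sidesteps such offending instructions by adding a guest execution mode in which they can be made to exit to the hypervisor.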
Disclaimer: Actual course contents may differ slightly depending on student interest. Reach out to the instructor as soon as possible if there is a particular interest in a topic.
Tentative Schedule
Week | Monday | Thursday | Sunday |
---|---|---|---|
1 | 24 Jul LEC 1: Introduction. | 27 Jul LEC 2: Scalability, Task DAGs, FaaS. Ch.5 of Introduction to Parallel Computing | |
2 | 31 Jul LEC 3: Struggles with DSM. | 3 Aug LEC 4: MapReduce. Release Lab 1. | |
3 | 7 Aug LEC 5: Spark: Resilient distributed datasets. | 10 Aug LEC 6: TensorFlow operational semantics. | 12 Aug Lab 1 due |
4 | 14 Aug LEC 7: TensorFlow. | 17 Aug LEC 8: CIEL. Release Lab 2. | |
5 | 21 Aug LEC 9: CIEL. Quiz 1. | 24 Aug LEC 10: Spark streaming. | 26 Aug Lab 2 due |
6 | 28 Aug LEC 11: Virtual time and global states. | 31 Aug LEC 12: Flink. Quiz 2. | |
7 | 4 Sep Mental health discussion. Release Lab 3. | 7 Sep Janmashtami | |
8 | 11 Sep LEC 13: Dataflow model | 14 Sep Mid-term exam | 17 Sep Lab 3 due |
9 | 18 Sep LEC 14: GFS. Discuss Lab 3 | 21 Sep LEC 15: GFS | |
10 | 25 Sep LEC 16: Raft. Release Lab 4 | 27 Sep (Wednesday, in lieu of holiday on 28th) LEC 17: Raft. Release project. | 1 Oct Lab 4 due |
11 | 2 Oct Semester break | 5 Oct Semester break | |
12 | 9 Oct LEC 18: Zookeeper. Quiz 3. | 12 Oct LEC 19: CRAQ. Dynamo. | |
13 | 16 Oct LEC 20: Dynamo. Release Lab 5. | 19 Oct LEC 21: Dotted version vectors. | |
14 | 23 Oct No class day | 26 Oct LEC 22: CRDT. | 29 Oct Lab 5 due |
15 | 30 Oct LEC 23: Spanner. | 2 Nov LEC 24: Spanner. Why virtualization? Quiz 4. | |
16 | 6 Nov Project presentations | 9 Nov Project presentations | |
17 | 13 Nov Govardhan Puja | 16 Nov LEC 25: Hardware-assisted virtualization | 17 Nov Project due |