COL733 Cloud computing technology fundamentals
2021-2022 Sem II
This course introduces cloud infrastructure. After completing it, students should be more comfortable building and deploying their own cloud services.
Course Information
- Prerequisites: COL331 or equivalent.
Note: The course includes programming assignments and thus expects proficiency with systems programming and debugging.
- Credits: 3-0-2
- Slot: AB, Mondays and Thursdays 3:30-4:45pm on MS Teams.
- TAs:
  - Nutesh Sahu: jcs212242 AT csia.iitd.ac.in
  - Soumen Basu: soumen.basu AT cse.iitd.ac.in
  - Abhisek Panda: csz202445 AT cse.iitd.ac.in
- TA Office hours: TBD
- Reading material: There is no textbook for the course. Most lectures will link to more reading material.
Grading criteria
- 30% labs (programming assignments)
- 20% project
- 10% assignments
- 20% minor exam
- 20% major exam
Supporting systems
- Lectures will be held in the course Teams channel.
- Assignments will be released regularly on Gradescope.
- Labs are to be done on Baadal. You will need VPN access to the IITD network.
- Discussions should be done on Piazza.
Acknowledgements
Thanks to Robert T. Morris (MIT) and Mythili Vutukuru (IITB); parts of this course were inspired by courses they have made available.
Policies
Audit criteria
An audit pass requires 30% or more marks.
Ethics
We will employ various methods to catch cheating. Cheating in a lab or assignment will result in a zero for that lab or assignment.
Late policy
- To help you cope with unexpected emergencies, you can hand in your lab solutions late, but the total lateness summed over all the lab deadlines must not exceed 72 hours. You can divide up your 72 hours among the labs however you like; you don’t have to ask or tell us. For example, submitting one lab 24 hours late and another 48 hours late uses up the entire allowance. Late hours can be used only for labs.
- Assignments cannot be submitted late. One assignment in the course can be skipped without penalty.
- COVID addendum: If you are affected by an illness, including COVID-19, you can submit up to one assignment and up to one lab late by one week by emailing Soumen. Please attach proof of illness to the email. This can be used only once in the semester and does not affect the rest of the late policy. In other words, in addition to the one assignment submitted a week late, another assignment can still be skipped without penalty. Similarly, the 72 late hours (3 days) can still be used for the other two labs.
Tentative topics
- Virtualization: containers, orchestration, hypervisors
- Recoverability: journaling, snapshotting
- Fault tolerance: state transfer, replicated state machines
- Consistency and availability: PACELC theorem
- Storage scalability: sharding, consistent hashing
- Cloud programming: dataflow computation, pub-sub, locking, transactions
- Light coverage of other topics: cloud economics, public cloud offerings, security
While discussing these topics, we plan to study popular cloud offerings: containers such as Docker, orchestration in Kubernetes (k8s), key-value stores such as Redis, coordination services such as ZooKeeper, SQL/NoSQL databases, distributed file systems such as HDFS, the pub-sub system Kafka, and dataflow computation in Spark.
Disclaimer: Actual course contents may differ slightly depending on student interest. Reach out to the instructor as soon as possible if you are particularly interested in a topic.
Tentative Schedule
Week | Monday | Thursday | Sunday |
---|---|---|---|
1 | 3 Jan LEC 1: Introduction. | 6 Jan LEC 2: What is scalability? Task DAGs. Ch.5 of Introduction to Parallel Computing | |
2 | 10 Jan LEC 3: Fault-tolerant embarrassingly parallel programs. MapReduce | 13 Jan LEC 4: Work pool model. Introduce Lab 1. Celery. Optional: Celery at Instagram | |
3 | 17 Jan LEC 5: Struggles with distributed shared memory. DSM survey. | 20 Jan LEC 6: Resilient Distributed Datasets. Spark. | 23 Jan Lab 1 DUE |
4 | 24 Jan LEC 7: Streaming computation as mini-batches. Spark streaming. | 27 Jan LEC 8: Real-time stateful streaming (Flink). Introduce Lab 2. Lightweight Asynchronous Snapshots. Redis streams. | |
5 | 31 Jan LEC 9: Large-scale ML. TensorFlow | 3 Feb LEC 10: Google file system. GFS | 6 Feb Lab 2 DUE |
6 | 7 Feb LEC 11: Revisit cycles in real-time stateful streaming. Introduce projects. Lightweight Asynchronous Snapshots | 10 Feb LEC 12: Amazon Dynamo: Decentralization. Dynamo, Gossip protocol in Cassandra | |
7 | 14 Feb Minors | 17 Feb Minors | 20 Feb Project proposal DUE |
8 | 21 Feb LEC 13: Amazon Dynamo: Eventual consistency. Introduce Lab 3. Dynamo, CRDT | 24 Feb LEC 14: Replicated state machines, leader election in Raft. Raft | |
9 | 28 Feb Semester break | 3 Mar LEC 15: Other safety properties in Raft. Linearizability. Raft | 6 Mar Lab 3 DUE |
10 | 7 Mar LEC 16: Improve read throughput, give up on linearizability of reads. Zookeeper | 10 Mar LEC 17: Distributed transactions. Serializability, 2-phase commit. | |
11 | 14 Mar LEC 18: OS background for virtualization. OS book | 17 Mar LEC 19: Popek-Goldberg theorem. CPU/memory paravirtualization in Xen. | |
12 | 21 Mar No class: instructor unwell (viral illness). Makeup class on Apr 2. | 24 Mar LEC 20: I/O virtualization. Parts of VMware paper | 27 Mar Project DUE |
13 | 28 Mar Project presentations. Self-study: Containers: Lec 11 | 31 Mar Project presentations. Self-study: Containers: Lec 11 | 2 Apr LEC 21: Hardware assisted virtualization. KVM, Nested paging |
Encouraging student comments after the course
I found this course to be very thought provoking and knowledgeable. I feel it has given me a wide variety of knowledge in the systems domain. All the topics covered in class were presented in a very connected manner which really helped grasp the bigger picture and also with the retention of knowledge. Overall, I quite liked the course and contents covered here.
Thanks for organising such a great course where we learn many new concepts which are never taught anywhere we just see the implementations of these concepts in real world. I had never seen such transparency in rubrics, grading in any course. I enjoyed and learned at the same time in entire duration of course.
It was a very good course. Had a lot of fun learning about distributed systems.
Exams: