2021-2022 Sem I

This course will survey current important research directions in systems. The course targets entry-level graduate students and upper-level undergraduate students.

In almost every lecture, we will discuss a research paper. The first half hour of the discussion will be led by a student presenter. Each student is expected to submit a review for every paper.

Because you learn systems best when you build one, this course includes a semester-long project, which can either be entrepreneurial or explore a research hypothesis. Near the end of the course, students present their projects. For research projects, students showcase reasonable experiments that validate or invalidate their research hypotheses. For entrepreneurial projects, students should make a business case with a plan to validate or invalidate their market hypotheses, and showcase a minimum viable product. A handful of projects will be marked as “best projects” and will continue to receive instructor support after the course is completed.

Prerequisites

COL331 or equivalent.

Tentative evaluation criteria

  • Course project: 35%
    • This includes project proposal presentation, project report, and project final presentation.
  • Paper presentation: 30%
    • This includes presenting papers and creating a questionnaire on that paper to engage the classroom.
  • Paper reviews: 35%
    • This includes writing a summary of the paper and your commentary on what could be improved.

Ethics

Paper reviews: You are encouraged to discuss the paper with your peers, but each review must be written independently. If two student reviews are found to be copied, both reviews will be penalized. You are encouraged to note in the review header: “Paper discussed with student XYZ”.

Paper presentation: You are allowed to reuse slides from other presentations, but this has to be acknowledged at the start of the presentation.

Project: The report and presentations should clearly distinguish original work from work taken from other sources, and cite those sources.

Entrepreneurial projects

Since this is a systems course, projects will be graded on their systems work. Each project proposal should declare the systems metrics on which the project is to be evaluated.

Sample project idea

Video indexing service

  • Problem: A lot of content is being recorded as video, but long videos are tedious to watch.
  • Solution: Submit a video URL or upload a video, and get indexing information back, similar to what is now shown on YouTube. The index can also sync a video with a presentation when the user provides both.
  • Course evaluation focus: Batch computation backend that can index large amounts of video.
    • Metrics: Scalability, resource consumption, packaging.
      • Scalability: Does doubling the number of servers double the number of videos that the service can handle?
      • Resource consumption: Since renting servers is costly, does the service keep all the rented servers busy?
      • Packaging: Since enterprises may not want to send their private videos to a third party, can the service be deployed easily in private clusters?
      • ML model accuracy is not part of the evaluation.
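
As an illustration of how a proposal might make the scalability and resource consumption metrics above measurable, here is a minimal sketch. The function names, the definition of efficiency as observed speedup divided by ideal linear speedup, and the sample numbers are all assumptions made for illustration; they are not prescribed by the course.

```python
def scaling_efficiency(base_servers, base_throughput,
                       scaled_servers, scaled_throughput):
    """Observed speedup divided by ideal (linear) speedup.

    Throughput is in videos indexed per hour (hypothetical unit).
    A value of 1.0 means doubling the servers doubled the throughput.
    """
    observed_speedup = scaled_throughput / base_throughput
    ideal_speedup = scaled_servers / base_servers
    return observed_speedup / ideal_speedup


def cluster_utilization(busy_seconds_per_server, wall_clock_seconds):
    """Fraction of rented server time that was spent doing useful work."""
    rented_seconds = len(busy_seconds_per_server) * wall_clock_seconds
    return sum(busy_seconds_per_server) / rented_seconds


if __name__ == "__main__":
    # Hypothetical measurements: 4 servers index 100 videos/hour,
    # 8 servers index 180 videos/hour.
    print(f"scaling efficiency: {scaling_efficiency(4, 100.0, 8, 180.0):.2f}")
    # Hypothetical busy time of 4 servers over a 100-second run.
    print(f"utilization: {cluster_utilization([50, 100, 75, 75], 100):.2f}")
```

A proposal could then state targets in these terms, for example an efficiency close to 1.0 when going from 4 to 8 servers, and report the measured values in the final presentation.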

Invalid project idea

  • Explaining computer systems through animation such as this channel for Math.
    • Though useful, it doesn’t involve building systems.

Tentative schedule

  1. Introduction: Course overview, research basics, synergy with startups.
  2. Data flow computation: MapReduce, idempotence, immutability, fault tolerance, scheduling.
  3. Dryad
  4. Spark
  5. TensorFlow
  6. Project pitches
  7. MillWheel
  8. Noria
  9. Big memory: big memory workloads, paging, huge pages, TLB, instruction cache, L1, L2, and L3 caches, mmap, virtual memory areas, soft page faults, memory allocators.
  10. Bonsai
  11. Ingens
  12. Illuminati
  13. RAMCloud
  14. Persistent memory: device characteristics, byte addressability, crash consistency (ordering and write-ahead logging), registers, store buffer, caches, fences, clflushes, write pending queue, eADR.
  15. PMFS
  16. Espresso
  17. Whole system persistence
  18. Record and replay: profilers (CPU profiler, edge profiler, memory profiler, energy profiler), tracers.
  19. Valgrind
  20. Atom
  21. Pintools
  22. ODR: Output-deterministic replay for multicore debugging.
  23. Dthreads: efficient deterministic multithreading.
  24. Eidetic systems
  25. REPL
  26. Performance: program slicing, profile-guided optimization.
  27. Shortcut
  28. lwC
  29. Final project presentation
  30. Final project presentation