Advanced Distributed Systems

Fall 2011

General Course Information:

Instructor: Lakshmish Ramaswamy (laks[AT]cs[dot]uga[dot]edu, 706-542-2737)
 
Time and Venue: Wednesdays - 2:30 PM to 3:20 PM (Aderhold 306) ; Tuesdays & Thursdays - 2:00 PM to 3:15 PM (Aderhold 625)
 
Office Hours: TBA

Course Description:

Distributed systems have become pervasive and have a tremendous impact on many domains of human activity. Today's distributed systems range from ad-hoc networks of tiny sensor devices, to overlay networks such as peer-to-peer systems, to massive web farms of powerful servers. Research in distributed systems has focused on improving the performance, reliability, security, and privacy of various kinds of distributed data processing applications.

In this course we will study the design, implementation and evaluation of a wide class of distributed systems, including content distribution networks, peer-to-peer systems, mobile systems, sensor networks, publish-subscribe systems, and stream processing systems, with the objective of gaining an in-depth understanding of their requirements and design options.

Grading Policy (Tentative)

Course Materials (Tentative -- Will be modified during the course of the semester)

Introduction to Distributed Systems
   
   
Material from the book "Distributed Systems: Principles and Paradigms" by Tanenbaum and Van Steen, 2nd Ed. Pearson Prentice Hall.

Clusters, Data Centers and Cloud Computing

General Reading Materials -- 
Hadoop: The Definitive Guide, The Datacenter as a Computer (Barroso and Hölzle), Above the Clouds: A Berkeley View of Cloud Computing (Armbrust et al.), and "The Eucalyptus Cloud Computing System" (Nurmi et al.). Students are strongly encouraged to read these materials.
  1. Introduction to Data centers, MapReduce, Hadoop clusters and Cloud Computing (Lakshmish -- 09/06)  -- Will use materials from the above books/articles.
  2. J. Dean and S. Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters", OSDI 2004 (PR: Aniruddha; DL: Venkata -- 09/07).
  3. R. Lee et al., "YSmart: Yet Another SQL-to-MapReduce Translator", ICDCS 2011 (PR: Alok; DL: Manuel -- 09/08)
  4. S. Agarwal, J. Dunagan, N. Jain, S. Saroiu, A. Wolman and H. Bhogan, "Volley: Automated Data Placement for Geo-Distributed Cloud Services", NSDI 2010 (PR: Sagar; DL: Sivavenkat -- 09/13).
  5. N. Laoutaris, M. Sirivianos, X. Yang and P. Rodriguez, "Inter-Datacenter Bulk Transfers with NetStitcher", SIGCOMM 2011 (PR: Amna -- 09/14).
  6. A. Qureshi, R. Weber, H. Balakrishnan, J. Guttag and B. Maggs, "Cutting the Electric Bill for Internet-Scale Systems", SIGCOMM 2009 (PR: Michael; DL: Vinay and Muthu -- 09/15).
  7. H. A. Lagar-Cavilla et al. "SnowFlock: Rapid Virtual Machine Cloning for Cloud Computing", EuroSys 2009 (PR: Anuj; DL: Amna -- 09/20)
  8. M. Satyanarayanan et al. "The Case for VM-based Cloudlets in Mobile Computing", IEEE Pervasive 2009 (PR: Vic; DL: Jiva and Michael -- 09/21)
  9. U. Sharma et al. "A Cost-Aware Elasticity Provisioning System for Cloud", ICDCS 2011 (PR: Akshay; DL: Shima and Gowtham -- 09/22)
  10. A. Pavlo et al., "A Comparison of Approaches to Large-Scale Data Analysis", SIGMOD 2009 (PR: Chan il Park; DL: Jiva & Vic -- 10/04)

Large-Scale Data Storage, Retrieval and Analytics

  1. F. Chang et al., BigTable: A Distributed Storage System for Structured Data, OSDI 2006 (PR: Manuel; DL: Shima and Enrico -- 10/05)
  2. A. Gates et al., Building a High-Level Dataflow System on Top of MapReduce: The PIG Experience, VLDB 2009 (PR: Enrico; DL: Vic -- 10/06).
  3. S. Trissl and U. Leser, Fast and Practical Indexing and Querying of Very Large Graphs, SIGMOD 2007 (PR: Venkata Sriram; DL: Sagar -- 10/11).
  4. H. Yildirim, V. Chaoji and M. J. Zaki, "GRAIL: Scalable Reachability Index for Large Graphs", VLDB 2010 (PR: Gowtham; DL: Venkata Sriram -- 10/12).
  5. U. Kang, et al. "PEGASUS: A Peta-Scale Graph Mining System - Implementation and Observations", ICDM 2009 (PR: Vinay; DL: Alok -- 10/13)
  6. C. Ren, E. Lo, B. Kao, X. Zhu and R. Cheng "On Querying Historical Evolving Graph Sequences", VLDB 2011 (PR: Siva Venkat; DL: Muthu and Aniruddha -- 10/18)
  7. L. Zou et al. "gStore: Answering SPARQL Queries via Subgraph Matching", VLDB 2011 (PR: Shima; DL: Amna and Jiva -- 10/19)

Data Stream Processing and Event-based Systems

General reading material -- The Many Faces of Publish-Subscribe, (Eugster et al.), Design and Evaluation of a Wide-Area Event Notification Service (Carzaniga et al.), Monitoring Streams: A New Class of Data Management Applications (Carney et al.)
  1. D. J. Abadi et al. The Design of the Borealis Stream Processing System, CIDR 2005 (PR: Alok -- 11/07).
  2. J-H. Hwang, U. Cetintemel and S. Zdonik, "Fast and Highly-Available Stream Processing over Wide-Area Networks", ICDE 2008 (PR: Akshay -- 11/07).
  3. S. Krishnamurthy et al., "Continuous Analytics over Discontinuous Streams", SIGMOD 2010 (PR: Muthukumar -- 11/08).
  4. N. Backman et al. "C-MR: A Continuous MapReduce Processing Model for Low-Latency Stream Processing on Multi-Core Architectures", Technical Report, Brown University 2010 (PR: Chan Il; DL: Aniruddha and Amna -- 11/09).
  5. I. Rose, R. Murty, P. Pietzuch, J. Ledlie, M. Roussopoulos and M. Welsh, "Cobra: Content-based Filtering and Aggregation of Blogs and RSS Feeds", NSDI 2007 (PR: Jiva; DL: Michael and Enrico -- 11/09).
  6. M. Akdere, U. Cetintemel, N. Tatbul, Plan-based Complex Event Detection across Distributed Sources, VLDB 2008 (PR: Vinay; DL: Vic and Shima -- 11/14).
  7. J. Chen, L. Ramaswamy, D. K. Lowenthal and S. Kalyanaraman, Comet: Decentralized Complex Event Detection in Delay Tolerant Networks, Technical Report, UGA, 2011 (Copies of the paper will be distributed in the class) (PR: Michael; DL: Manuel and Vic -- 11/14).

Pervasive Computing (Includes Mobile, Sensors and Location-Aware Services)

General reading material:  Pervasive Computing: Vision and Challenges
  1. S. R. Madden, M. J. Franklin, J. M. Hellerstein and W. Hong, "TinyDB: An Acquisitional Query Processing System for Sensor Networks", ACM TODS 2005 (Open).
  2. E. Welbourne, N. Khoussainova, J. Letchner, Y. Li, M. Balazinska, G. Borriello and D. Suciu, "Cascadia: A System for Specifying, Detecting and Managing RFID Events", MobiSys 2008 (PR: Aniruddha; DL: Sagar -- 11/17)
  3. J. Lu, T. Sookoor, G. Gao, V. Srinivasan, B. Holben, J. Stankovic, E. Field and K. Whitehouse, "The Smart Thermostat: Using Occupancy Sensors to Save Energy in Homes", SenSys 2010 (PR: Amna; DL: Akshay -- 11/17).
  4. M. B. Kjaergaard, J. Langdal, T. Godsk and T. Toftkjaer, "EnTracked: Energy-Efficient Robust Position Tracking for Mobile Devices", MobiSys 2009 (PR: Vic; DL: Aniruddha and Muthu -- 11/29).
  5. M. F. Mokbel, X. Xiong and W. G. Aref, SINA: Scalable Incremental Processing of Continuous Queries in Spatio-temporal Databases, SIGMOD 2004 (Open)

Security and Privacy in Distributed Systems

  1. Z. Zhong, L. Ramaswamy and K. Li, ALPACAS: A Large-Scale Privacy-Aware Collaborative Anti-Spam System, INFOCOM 2008 (PR: Sagar; DL: Amna -- 11/29).
  2. J. A. Calandrino et al., "You Might Also Like: Privacy Risks of Collaborative Filtering", IEEE Security and Privacy 2011 (PR: Siva Venkat; DL: Vinay -- 11/29).
  3. R. Wang, S. Chen, X. Wang and S. Qadeer, "How to Shop for Free Online: Security Analysis of Cashier-as-a-Service Based Web Stores", IEEE Security and Privacy 2011 (PR: Shima; DL: Enrico and Jiva -- 11/30).
  4. M. Mulazzani et al., "Dark Clouds on the Horizon: Using Cloud Storage as Attack Vector and Online Slack Space", USENIX Security 2011 (PR: Anuj -- 11/30).
  5. C. Mulliner et al., "SMS of Death: From Analyzing to Attacking Mobile Phones on a Large Scale", USENIX Security 2011 (PR: Muthu; DL: Shima and Enrico -- 12/01).
  6. K. Thomas et al., "Design and Evaluation of a Real-Time URL Spam Filtering Service", IEEE Security and Privacy 2011 (PR: Gowtham -- 12/01).

Presentation Slides

Available on ELC.

Miscellaneous Materials