CSCI 8960 Privacy-Preserving Data Analysis

Jaewoo Lee

Spring 2019

E-mail: jwlee@cs.uga.edu Web: www.cs.uga.edu/~jwlee
Office Hours: Tue. 2:00 - 3:00pm (or by appointment)
Office: BOYD 620
Class Hours: Tue./Thu. 12:30-1:45pm, Wed. 12:20 - 1:10pm
Class Room: Forest Resources-1 (Tue. and Thu.), Dawson Hall 312 (Wed.)

The best way to schedule an appointment outside my office hours is to send me an email with some good dates/times that work for you. I will pick one and reply as quickly as I can.

Course Description

These days almost every human activity is monitored and stored electronically. The potential benefits of analyzing such data are substantial, but one obstacle stands in the way of the analysis: concerns about the privacy of the individuals in the data. In this course, we will study a mathematical framework and algorithmic techniques for privacy-preserving analysis, which enable the analysis of sensitive data. The main notion of privacy we will use in this course is differential privacy, a rigorous notion that provides strong privacy protection. In particular, our focus will be on designing differentially private algorithms for popular machine learning tasks such as regression, prediction, and recommendation.

Textbook

Additional materials

Prerequisites/Corequisites

Prerequisites: 
CSCI 1302 (Software development), CSCI 2610 (Discrete math), CSCI 2720 (Data structures)

Recommended Corequisites: 
CSCI 4380/6380 (Data mining) or CSCI 8950 (Machine learning)

Computing Resource

The main programming language we will be using in this course is Python. All programming assignments will be done in Python using the NumPy and SciPy packages, but prior knowledge of Python is not required. A basic tutorial on programming in Python and on using NumPy and SciPy will be given in class.

Course Objectives

Successful students:

  1. will be able to decide, given an application, whether it should be formulated as a data privacy problem. If so, the students will be able to formally define the problem and state what properties can be guaranteed by applying differential privacy.
  2. will understand how (and why) randomness (or uncertainty) provides privacy protection.
  3. will be able to analyze real-world privacy problems, identify which privacy-preserving methods are appropriate, and implement the private algorithms in code.
  4. will be able to evaluate and compare privacy-preserving algorithms.
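
As an illustration of objective 2, here is a minimal sketch of randomized response, a classic technique in which each person randomizes their own yes/no answer before reporting it. It is not taken from the course materials. The coin probabilities, the function names, and the population of 10,000 people with 30% true "yes" answers are all assumptions chosen for the example. With two fair coins, a report of "yes" is at most 3 times more likely under one true answer than under the other, which corresponds to differential privacy with epsilon = ln 3.

```python
import random


def randomized_response(truth, rng):
    """Report one yes/no answer with plausible deniability."""
    if rng.random() < 0.5:
        return truth               # first coin heads: answer truthfully
    return rng.random() < 0.5      # otherwise: report a second fair coin


def estimate_true_fraction(reports):
    """Debias the noisy reports, using E[report] = 0.25 + 0.5 * p."""
    observed = sum(reports) / len(reports)
    return 2.0 * observed - 0.5


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Hypothetical population: 30% of 10,000 people would truthfully say "yes".
true_answers = [i < 3000 for i in range(10000)]
reports = [randomized_response(t, rng) for t in true_answers]
estimate = estimate_true_fraction(reports)
print(f"estimated fraction of 'yes': {estimate:.3f}")
```

No single report reveals a person's true answer with certainty, yet the population-level fraction can still be estimated with reasonable accuracy.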

Grading Policy

Evaluation will consist of two individual homework assignments, two paper presentations, a course project, and an unannounced number of pop quizzes. Each submitted item will be graded on a 100-point scale without a curve, but I reserve the right to curve the scale depending on overall class scores at the end of the semester. Any curve will only ever make it easier to obtain a certain letter grade.


Item             Portion   Description
---------------  -------   ------------------------------------------------------------
Homework         20%       2 individual assignments
Presentations    20%       Paper reading and presentation
                           Deliverables: slides and 1-page summary
Course project   40%       Implementation of private algorithms or theoretical analysis
                           Deliverables: 2 reports (interim and final), project code,
                           a demo of your project
Quizzes          20%       Pop quizzes


The final grade is based on the total score, a weighted sum of the graded items. It is computed using the following equation:

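\[
\text{total score} = \frac{\sum_{i=1}^{2} \mathrm{HW}_i}{2} \times 0.2 + \frac{\sum_{i=1}^{2} \mathrm{PT}_i}{2} \times 0.2 + \mathrm{Proj} \times 0.4 + \frac{\sum_{i=1}^{k} \mathrm{Quiz}_i}{k} \times 0.2
\]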
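
The weighted sum above can be sketched in Python as follows. This is only an illustration: the function name and the sample scores are hypothetical, and each item is assumed to be on the 100-point scale described in this section.

```python
def total_score(homework, presentations, project, quizzes):
    """Weighted total on a 100-point scale, following the equation above."""
    if len(homework) != 2 or len(presentations) != 2:
        raise ValueError("expected exactly 2 homework and 2 presentation scores")
    if not quizzes:
        raise ValueError("at least one quiz score is needed to take an average")
    hw_avg = sum(homework) / len(homework)
    pt_avg = sum(presentations) / len(presentations)
    quiz_avg = sum(quizzes) / len(quizzes)   # k = len(quizzes)
    return hw_avg * 0.2 + pt_avg * 0.2 + project * 0.4 + quiz_avg * 0.2


# Hypothetical scores for one student, with k = 3 quizzes.
example = total_score(
    homework=[90, 80],
    presentations=[85, 95],
    project=88,
    quizzes=[70, 100, 80],
)
print(f"total score: {example:.2f}")
```

Because the weights sum to 1.0, a student who scores 100 on every item receives a total score of 100.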

Course Policies

Attendance Policy

Attendance is expected in all lectures but is not a part of grade determination. For complete attendance and excused absence policies, please see https://provost.uga.edu/policies/academic-affairs-policy-manual/4-06-class-attendance/.

Assignment Submission Policy

All assignments and deliverables are to be submitted via eLC. Allowed file formats are .pdf, .ipynb, and .py. If you need to include images in your answers, you can either embed them in the PDF file or upload them to cloud storage and include only the links in your solution. Email attachments are not considered official submissions and will not be graded.

Policies on Late Assignments

All assignments are expected to be completed and submitted to eLC by the due date. Normally, assignments are due by 11:59pm on Fridays. Any assignment submitted after 00:01am on the day following the due date will be considered late. Late submissions will be penalized by deducting 20% of the total marks for the assignment for each day or partial day (including weekend days) beyond the due time. Note that an assignment that has not been turned in within 5 days of the deadline will not be accepted.
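
The penalty arithmetic can be sketched as below. This is one possible reading of the policy, not an official calculator. It assumes that lateness is counted from the 00:01am cutoff, that a partial day counts as a full day, and that "5 days" means more than five counted days. The function name and the dates are hypothetical.

```python
import math
from datetime import datetime, timedelta

GRACE = timedelta(minutes=2)    # 11:59pm deadline -> 00:01am cutoff (assumed)
PENALTY_PER_DAY = 20.0          # percent of total marks per day or partial day
MAX_LATE_DAYS = 5               # beyond this, the assignment is not accepted


def late_penalty(due, submitted):
    """Return the percentage deducted, or None if the work is not accepted."""
    cutoff = due + GRACE
    if submitted <= cutoff:
        return 0.0
    # Dividing two timedeltas gives a float number of days; round partial days up.
    days_late = math.ceil((submitted - cutoff) / timedelta(days=1))
    if days_late > MAX_LATE_DAYS:
        return None
    return days_late * PENALTY_PER_DAY


# Hypothetical deadline: Friday, Feb. 1, 2019 at 11:59pm.
due = datetime(2019, 2, 1, 23, 59)
print(late_penalty(due, datetime(2019, 2, 2, 9, 0)))   # Saturday morning
print(late_penalty(due, datetime(2019, 2, 4, 0, 2)))   # just past the Monday 00:01am mark
```

Under this reading, a submission on the fifth late day is still accepted but loses all of its marks.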

Academic Integrity and Honesty

All students enrolled in this course are expected to abide by UGA’s academic honesty policy and procedures. Please refer to UGA’s A Culture of Honesty, found at https://honesty.uga.edu/Academic-Honesty-Policy/Introduction/. All the documents linked from that URL are part of this syllabus.

For every individual assignment, students are welcome to discuss the problems and share ideas at a high level. This means that you should not share anything concrete such as write-ups, code fragments, or your laptop screen. The submitted item must be your own work. For example, you can discuss how to solve a homework problem and share an idea, but you have to write your own answer and code. An egregious violation of these academic honesty codes will result in an F for the course.

Accommodations for Disabilities

Reasonable accommodations will be made for students with verifiable disabilities. In order to take advantage of available accommodations, students must register with the Equal Opportunity Office at Suite 119 in the Holmes-Hunter Academic Building. For more information on UGA’s policy on working with students with disabilities, please see https://eoo.uga.edu/disability-services.

Discrimination based on race, color, religion, creed, sex, national origin, age, disability, veteran status, or sexual orientation is a violation of state and federal law and will not be tolerated. Harassment of any person (either in the form of quid pro quo or creation of a hostile environment) based on race, color, religion, creed, sex, national origin, age, disability, veteran status, or sexual orientation also is a violation of state and federal law and/or UGA’s policy and will not be tolerated. Retaliation against any person who complains about discrimination is also prohibited. UGA’s policies and regulations covering discrimination, harassment, and retaliation may be accessed at .

Tentative Topics and Schedule

The schedule is tentative and subject to change. Each quiz will cover the material taught up to the date of the quiz.

Week 01, 01/01 - 01/03: Introduction

Week 02, 01/08 - 01/10: Review: probability and tutorial on Numpy

Week 03, 01/15 - 01/17: Foundation of Differential Privacy

Week 04, 01/22 - 01/24: Local Privacy Models

Week 05, 01/29 - 01/31: Convex Optimization

Week 06, 02/05 - 02/07: Relaxations of Differential Privacy

Week 07, 02/12 - 02/14: Paper Presentation I

Week 08, 02/19 - 02/21: Paper Presentation I cont’d

Week 09, 02/26 - 02/28: Foundation of Deep Learning

Week 10, 03/05 - 03/07: Spring Break - No Classes

Week 11, 03/12 - 03/14: Neural Networks for Supervised Learning

Week 12, 03/19 - 03/21: Neural Networks for Generative Models

Week 13, 03/26 - 03/28: Paper Presentation II

Week 14, 04/02 - 04/04: Paper Presentation II cont’d

Week 15, 04/09 - 04/11: Final Project Presentations

Week 16, 04/16 - 04/18: Final Project Presentations cont’d

Week 17, 04/23 - 04/25: Classes end on Apr. 30