CSCI 3360: Data Science I (Spring 2019)

Instructor: Sheng Li
E-mail: sheng.li@uga.edu
Time and Location of Lecture:
TR: 12:30 pm - 1:45 pm, Chemistry 551
W: 12:20 pm - 1:10 pm, Dawson Hall 208
Instructor Office Hours and Location:
Wednesday: 1:10 pm - 2:30 pm, or by email appointment
Boyd GSRC 549
TA Office Hours and Location:
TA: Hiten Nirmal (hn97292@uga.edu)
Monday: 1:15 pm - 2:15 pm
Wednesday: 11:15 am - 12:15 pm
LAB 307, Boyd GSRC

Course Description

This course presents a rigorous overview of methods for data mining, image processing, natural language processing, and scientific computing. Core concepts in supervised and unsupervised analytics, dimensionality reduction, deep learning, and data visualization will be explored in depth. Please refer to the syllabus for more information.

Textbooks

The main textbook for this course is:

“An Introduction to Statistical Learning with Applications in R” by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani. Springer.

The PDF version of this book is available on the author's homepage.

Grading

Section | Portion | Description
Homework | 35% | 5 individual assignments involving problem solving and programming
Exams | 35% | Midterm (15%) and Final (20%)
Team Project | 25% | Project proposal (5%); Progress review (5%); Final presentation and report (15%)
In-class Participation | 5% |

Homework Submission: Homework should be submitted to eLC by 11:59 pm on the due date.

Late Submission Policy: Late submissions are penalized 10% of the score for each day past the due time.
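As an illustration of the late policy, here is a minimal Python sketch. It rests on assumptions the policy does not spell out: any started 24-hour period past the deadline counts as a full day, the 10% is taken from the earned score, and the result never drops below zero. The function name `late_penalized_score` is hypothetical; check with the instructor for the exact rule.

```python
import math
from datetime import datetime


def late_penalized_score(score, due, submitted, rate=0.10):
    """Return the score after the late deduction (illustrative only)."""
    if submitted <= due:
        return float(score)
    # Assumption: a partial day past the deadline counts as one full day.
    seconds_late = (submitted - due).total_seconds()
    days_late = math.ceil(seconds_late / 86400)
    # Assumption: the deduction applies to the earned score, floored at zero.
    return max(0.0, score * (1 - rate * days_late))


due = datetime(2019, 2, 5, 23, 59)          # HW1 due time from the schedule
submitted = datetime(2019, 2, 6, 10, 0)     # about ten hours late
print(late_penalized_score(90, due, submitted))
```

Under these assumptions, a submission that earned 90 points and arrived about ten hours late is treated as one day late.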

Exams: Both exams are closed-book and closed-notes.

Grade Conversion Table:

Letter Grade | A | A- | B+ | B | B- | C+ | C | C- | D+ | D | D- | F
Range | [93,100] | [90,93) | [87,90) | [83,87) | [80,83) | [77,80) | [73,77) | [70,73) | [67,70) | [63,67) | [60,63) | [0,60)
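To see how the grading weights and the conversion table combine, here is a small Python sketch. It assumes each component is scored on a 0-100 scale and that the weighted total is not rounded before conversion; neither detail is stated above. The component keys and function names are hypothetical.

```python
# Component weights in percent, taken from the Grading table.
WEIGHTS = {
    "homework": 35,
    "midterm": 15,
    "final": 20,
    "project_proposal": 5,
    "project_progress": 5,
    "project_final": 15,
    "participation": 5,
}

# Lower bound of each letter range, highest first.
CUTOFFS = [
    (93, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"), (67, "D+"), (63, "D"), (60, "D-"),
]


def weighted_total(scores):
    """Combine component scores (assumed 0-100 each) into a course total."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(total):
    """Map a course total to a letter using the half-open ranges in the table."""
    for cutoff, letter in CUTOFFS:
        if total >= cutoff:
            return letter
    return "F"


example = {name: 85 for name in WEIGHTS}   # made-up scores for illustration
example["final"] = 95
total = weighted_total(example)
print(total, letter_grade(total))
```

Because each range is closed on the left and open on the right, a total of exactly 90 maps to A-, while any total below 90 maps to B+ or lower.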

Academic Honesty

We will strictly follow UGA’s Academic Honesty Policy. Dishonest behavior will not be tolerated and may result in failing the course. Please contact the instructor if you have any concerns regarding this issue.

Class Schedule (Tentative)

Week Date Topic Notes
1 Jan. 9 (W) Course Overview  
Jan. 10 (R) Introduction to Data Science  
2 Jan. 15 (T) Data Collection
Jan. 16 (W) Data Preprocessing
Jan. 17 (R) Data Visualization  
3 Jan. 22 (T) Data Visualization
Jan. 23 (W) Data Visualization HW1 OUT
Jan. 24 (R) Review of Linear Algebra and Statistics  
4 Jan. 29 (T) Python Programming (I) Guest Speaker
Jan. 30 (W) Python Programming (II) Guest Speaker
Jan. 31 (R) Python Libraries for Data Science Guest Speaker
5 Feb. 5 (T) Linear Regression HW1 DUE (11:59 PM)
Feb. 6 (W) Model Selection
Feb. 7 (R) Ridge regression and Lasso  
6 Feb. 12 (T) Basic Classification Models
Feb. 13 (W) Basic Classification Models
Feb. 14 (R) Basic Classification Models HW2 OUT
7 Feb. 19 (T) Basic Classification Models
Feb. 20 (W) Basic Classification Models
Feb. 21 (R) Project Proposal Presentation
8 Feb. 26 (T) Project Proposal Presentation HW2 DUE (11:59 PM)
Feb. 27 (W) Midterm Review
Feb. 28 (R) Advanced Classification Models
9 Mar. 5 (T) Midterm HW3 OUT
Mar. 6 (W) Advanced Classification Models
Mar. 7 (R) Advanced Classification Models
10 Mar. 12 (T) Spring Break; No Class
Mar. 13 (W) Spring Break; No Class
Mar. 14 (R) Spring Break; No Class
11 Mar. 19 (T) Advanced Classification Models HW3 DUE (11:59 PM)
Mar. 20 (W) Advanced Classification Models
Mar. 21 (R) Project Progress Review HW4 OUT
12 Mar. 26 (T) Clustering
Mar. 27 (W) Clustering
Mar. 28 (R) Clustering
13 Apr. 2 (T) Clustering HW4 DUE (11:59 PM)
Apr. 3 (W) Dimensionality Reduction
Apr. 4 (R) Dimensionality Reduction HW5 OUT
14 Apr. 9 (T) Dimensionality Reduction
Apr. 10 (W) Feature Selection
Apr. 11 (R) Feature Selection
15 Apr. 16 (T) Neural Networks and Deep Learning HW5 DUE (11:59 PM)
Apr. 17 (W) Neural Networks and Deep Learning
Apr. 18 (R) Neural Networks and Deep Learning
16 Apr. 23 (T) Team Project Presentation (I)
Apr. 24 (W) Team Project Presentation (II)
Apr. 25 (R) Team Project Presentation (III)
17 Apr. 30 (T) Course Review