CSCI 3360: Data Science I (Spring 2021)

Instructor: Sheng Li
E-mail: sheng.li@uga.edu
Time and Location of Lecture: M 4:10 pm - 5:00 pm, Online; TR 3:55 pm - 5:10 pm, Online
Instructor Office Hours: Monday 3:00 pm - 4:00 pm, or by email appointment
TA Office Hours and Location: TA: TBA; Time: TBA

Course Description

This course presents a rigorous overview of methods for data mining, image processing, natural language processing, and scientific computing. Core concepts in supervised and unsupervised analytics, dimensionality reduction, deep learning, and data visualization will be explored in depth. Please refer to the syllabus for more information.

Textbooks

The main textbook for this course is:

“An Introduction to Statistical Learning with Applications in R” by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani. Springer.

The PDF version of this book is available on the author's homepage.

Grading

Section Portion Description
Homework 35% 5 individual assignments involving problem solving and programming
Exams 35% Midterm (15%) and Final (20%)
Team Project 25% Project proposal (5%); Progress review (5%); Final presentation and report (15%)
In-class Participation 5%

Homework Submission: Homework should be submitted to eLC by 11:59 pm on the due date.

Late Submission Policy: Late submissions will be penalized by deducting 10% of the score for each day past the deadline.

Exams: Both exams are closed-book and closed-notes.

Grade Conversion Table:

Letter Grade: Range
A: [93,100]
A-: [90,93)
B+: [87,90)
B: [83,87)
B-: [80,83)
C+: [77,80)
C: [73,77)
C-: [70,73)
D+: [67,70)
D: [63,67)
D-: [60,63)
F: [0,60)
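
As an illustration of how the grading weights and the conversion table fit together, here is a minimal Python sketch. It is not an official grade calculator. The function and dictionary names are hypothetical. It assumes each component is scored from 0 to 100, that homework is reported as a single combined score, and that no rounding is applied before the letter lookup; none of these details are specified above.

```python
from bisect import bisect_right

# Component weights (percent of the course grade), taken from the Grading table.
# Assumption: homework is one combined 0-100 score.
WEIGHTS = {
    "homework": 35,
    "midterm": 15,
    "final": 20,
    "project_proposal": 5,
    "project_progress": 5,
    "project_final": 15,
    "participation": 5,
}

# Lower bounds of the ranges in the Grade Conversion Table, in ascending order.
CUTOFFS = [60, 63, 67, 70, 73, 77, 80, 83, 87, 90, 93]
LETTERS = ["F", "D-", "D", "D+", "C-", "C", "C+", "B-", "B", "B+", "A-", "A"]


def course_score(scores):
    """Weighted total on a 0-100 scale; each component score is also 0-100."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(score):
    """Map a 0-100 score to a letter; every lower bound is inclusive."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    # bisect_right counts how many cutoffs are <= score, which indexes LETTERS.
    return LETTERS[bisect_right(CUTOFFS, score)]


if __name__ == "__main__":
    example = {name: 90 for name in WEIGHTS}  # made-up scores for illustration
    total = course_score(example)
    print(total, letter_grade(total))
```

Because each range is closed on the left and open on the right, a score of exactly 90 falls in the A- range and a score of exactly 93 falls in the A range.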

Academic Honesty

We will strictly follow UGA’s Academic Honesty Policy. Dishonest behavior will not be tolerated and may result in failing the course. Please contact the instructor if you have any concerns regarding this issue.

Class Schedule (Tentative)

Week Date Topic Notes
1 Jan. 14 (R) Course Overview  
2 Jan. 18 (M) Holiday; No Class
Jan. 19 (T) Introduction to Data Science  
Jan. 21 (R) Python Programming (I)
3 Jan. 25 (M) Python Programming (II)
Jan. 26 (T) Python Libraries for Data Science  
Jan. 28 (R) Data Collection
4 Feb. 1 (M) Data Preprocessing
Feb. 2 (T) Data Visualization HW1 OUT
Feb. 4 (R) Review of Linear Algebra and Statistics
5 Feb. 8 (M) Linear Regression
Feb. 9 (T) Model Selection
Feb. 11 (R) Ridge Regression and Lasso HW1 DUE (11:59 PM)
6 Feb. 15 (M) Basic Classification Models
Feb. 16 (T) Basic Classification Models HW2 OUT
Feb. 18 (R) Basic Classification Models
7 Feb. 22 (M) Basic Classification Models
Feb. 23 (T) Basic Classification Models
Feb. 25 (R) Project Proposal Presentation HW2 DUE (11:59 PM)
8 Mar. 1 (M) Project Proposal Presentation
Mar. 2 (T) Advanced Classification Models HW3 OUT
Mar. 4 (R) Midterm Review
9 Mar. 8 (M) Advanced Classification Models
Mar. 9 (T) Midterm
Mar. 11 (R) Advanced Classification Models
10 Mar. 15 (M) Advanced Classification Models HW3 DUE (11:59 PM)
Mar. 16 (T) Advanced Classification Models HW4 OUT
Mar. 18 (R) Clustering
11 Mar. 22 (M) Clustering
Mar. 23 (T) Clustering
Mar. 25 (R) Clustering HW4 DUE (11:59 PM)
12 Mar. 29 (M) Dimensionality Reduction
Mar. 30 (T) Project Progress Review HW5 OUT
Apr. 1 (R) Dimensionality Reduction
13 Apr. 5 (M) Dimensionality Reduction
Apr. 6 (T) Feature Selection
Apr. 8 (R) Instructional Break; No Class
14 Apr. 12 (M) Feature Selection HW5 DUE (11:59 PM)
Apr. 13 (T) Neural Networks and Deep Learning
Apr. 15 (R) Neural Networks and Deep Learning
15 Apr. 19 (M) Neural Networks and Deep Learning
Apr. 20 (T) Neural Networks and Deep Learning
Apr. 22 (R) Team Project Presentation (I)
16 Apr. 26 (M) Team Project Presentation (II)
Apr. 27 (T) Team Project Presentation (III)
Apr. 29 (R) Course Review