# CSCI 4360/6360: Data Science II (Spring 2024)

## Course Information

Instructor: Ninghao Liu

Course time and location:

TR: 12:45 pm - 2:00 pm, Physics 221

W: 12:40 pm - 1:30 pm, Boyd 208

Office hours: Thursday, 2:00 PM - 3:00 PM

Office: Boyd 616

TA: TBD

## Course Description

The goal of this course is to familiarize students with fundamental topics in data science, data mining, machine learning and deep learning on different types of data including tabular data, text data and graph data, as well as their application in cybersecurity and recommender systems.

## Textbooks

The main textbook (non-mandatory) for this course is:

“**Dive into Deep Learning**” by Aston Zhang, Alexander J. Smola, Zachary Lipton, Mu Li.

Other textbooks: “Introduction to Information Retrieval” by Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze.

## Course Prerequisite

Students are expected to have a working knowledge of Python. All programming assignments must be completed in Python unless otherwise specified. Some elementary knowledge of calculus, statistics and linear algebra is expected. These fundamentals will be covered as they are needed.

## Grading

| Letter Grade | A | A- | B+ | B | B- | C+ | C | C- | D | F |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Range | [90, 100] | [87, 90) | [84, 87) | [80, 84) | [77, 80) | [74, 77) | [70, 74) | [67, 70) | [60, 67) | [0, 60) |
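For students who want to check where a numeric score falls, the grading scale above can be read as a lookup on the lower bound of each half-open interval. The following is a minimal sketch, not an official grade calculator: the function name `letter_grade` and the constant `GRADE_CUTOFFS` are made up for illustration, and it assumes scores are compared as-is, with no rounding.

```python
# Lower bound of each range in the grading table, highest first.
# Assumption: scores are NOT rounded before the lookup.
GRADE_CUTOFFS = [
    (90, "A"), (87, "A-"), (84, "B+"), (80, "B"), (77, "B-"),
    (74, "C+"), (70, "C"), (67, "C-"), (60, "D"), (0, "F"),
]


def letter_grade(score: float) -> str:
    """Map a numeric score in [0, 100] to a letter grade."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    # The first cutoff the score reaches determines the letter.
    for lower_bound, letter in GRADE_CUTOFFS:
        if score >= lower_bound:
            return letter
    raise AssertionError("unreachable: the last cutoff is 0")
```

Under these assumptions, `letter_grade(86.9)` returns `"B+"` and `letter_grade(90)` returns `"A"`, because every range except the top one excludes its upper endpoint.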

**Late Submission Policy**: For homework assignments, 20% is deducted for each late day, for up to 48 hours (including weekends) after the deadline; later submissions are not accepted. Late presentation materials and project reports are not accepted.
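The arithmetic of the homework late policy can be sketched as below. This is one possible reading, not the official rule: it assumes that any started 24-hour period after the deadline counts as a full late day, and that the 20% is taken from the score the submission earned. The function name `late_adjusted_score` is hypothetical. Please ask the instructor if the exact interpretation matters for you.

```python
import math


def late_adjusted_score(raw_score: float, hours_late: float) -> float:
    """Apply the homework late penalty to a raw score.

    Assumptions (not stated in the syllabus):
    - any started 24-hour period counts as one late day;
    - the 20% per day is deducted from the earned score.
    """
    if hours_late <= 0:
        return raw_score  # on time: no penalty
    if hours_late > 48:
        return 0.0  # past 48 hours: submission is not accepted
    late_days = math.ceil(hours_late / 24)  # 1 or 2
    # Integer percentages keep the arithmetic free of float drift.
    return raw_score * (100 - 20 * late_days) / 100
```

With this reading, a homework that earned 90 points and arrived 5 hours late would receive 72 points, and the same homework arriving 30 hours late would receive 54 points.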

**Exams**: Both exams are closed-book and closed-notes.

## Academic Honesty

We will strictly follow UGA’s Academic Honesty Policy. Dishonest behavior will not be tolerated and may result in failing the course. Please contact the instructor if you have any concerns regarding this issue.

## Course Schedule (Tentative)

| Week | Date | Topic | Notes |
| --- | --- | --- | --- |
| 1 | 01/09 | Course Overview | |
| | 01/10 | kNN | |
| | 01/11 | Preliminary: Calculus and Statistics | |
| 2 | 01/16 | Preliminary: Python Programming | |
| | 01/17 | Preliminary: Linear Algebra | |
| | 01/18 | Linear models | HW1 out |
| 3 | 01/23 | Linear models | |
| | 01/24 | Linear models | |
| | 01/25 | Model evaluation: classification | HW1 due |
| 4 | 01/30 | Naive Bayes classifiers | HW2 out |
| | 01/31 | Naive Bayes classifiers | |
| | 02/01 | Data preprocessing | |
| 5 | 02/06 | Decision tree | |
| | 02/07 | Decision tree | |
| | 02/08 | Overfitting | |
| 6 | 02/13 | Unsupervised learning: clustering | |
| | 02/14 | Unsupervised learning: clustering | |
| | 02/15 | Model evaluation: clustering | |
| 7 | 02/20 | Text mining: Preliminaries | |
| | 02/21 | Text mining: Vector space model | HW2 due |
| | 02/22 | Text mining: Vector space model | |
| 8 | 02/27 | Graph mining | |
| | 02/28 | Graph mining | |
| | 02/29 | Midterm Exam | |
| 9 | 03/05 | - | Spring Break. No class. |
| | 03/06 | - | Spring Break. No class. |
| | 03/07 | - | Spring Break. No class. |
| 10 | 03/12 | Deep learning: Multilayer Perceptrons | |
| | 03/13 | Deep learning: Multilayer Perceptrons | HW3 out |
| | 03/14 | Deep learning: Multilayer Perceptrons | |
| 11 | 03/19 | Text mining: Embedding | |
| | 03/20 | Text mining: Embedding | |
| | 03/21 | Text mining: RNN | |
| 12 | 03/26 | Text mining: RNN | |
| | 03/27 | Large language models | |
| | 03/28 | Large language models | HW3 due, HW4 out |
| 13 | 04/02 | Large language models | |
| | 04/03 | Large language models | |
| | 04/04 | LLM Trustworthiness | |
| 14 | 04/09 | Model interpretation | |
| | 04/10 | Outlier detection | |
| | 04/11 | Outlier detection | |
| 15 | 04/16 | Recommender systems | HW4 due |
| | 04/17 | Recommender systems | |
| | 04/18 | Recommender systems | |
| 16 | 04/23 | Presentation | |
| | 04/24 | Presentation | |
| | 04/25 | Q&A | |
| 17 | 05/02 | Final Exam | |