CSCI 4360/6360: Data Science II (Spring 2024)

Course Information

  • Instructor: Ninghao Liu

  • Course time and location:

    • TR: 12:45 pm - 2:00 pm, Physics 221

    • W: 12:40 pm - 1:30 pm, Boyd 208

  • Office hours: Thursday, 2:00 PM - 3:00 PM

  • Office: Boyd 616

  • TA: TBD

Course Description

The goal of this course is to familiarize students with fundamental topics in data science, data mining, machine learning and deep learning on different types of data including tabular data, text data and graph data, as well as their application in cybersecurity and recommender systems.

Textbooks

The main textbook (non-mandatory) for this course is:

“Dive into Deep Learning” by Aston Zhang, Alexander J. Smola, Zachary Lipton, Mu Li.

Other textbooks: “Introduction to Information Retrieval” by Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze.

Course Prerequisite

Students are expected to have a working knowledge of Python. All programming assignments must be completed in Python unless otherwise specified. Some elementary knowledge of calculus, statistics and linear algebra is expected. Those fundamentals will be reviewed as they are needed.

Grading

Letter Grade  Range
A             [90, 100]
A-            [87, 90)
B+            [84, 87)
B             [80, 84)
B-            [77, 80)
C+            [74, 77)
C             [70, 74)
C-            [67, 70)
D             [60, 67)
F             [0, 60)
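
Since the course uses Python, the grading ranges above can be read as a small lookup. This is only an illustrative sketch: the function name letter_grade and the CUTOFFS/LETTERS lists are made up for this example, and it assumes the final score is a number in [0, 100] that is compared to the cutoffs without rounding (the syllabus does not state a rounding rule).

```python
import bisect

# Lower bounds of the ranges in the grading table, in ascending order.
# Each range includes its lower bound and excludes its upper bound,
# except the A range, which includes 100.
CUTOFFS = [60, 67, 70, 74, 77, 80, 84, 87, 90]
LETTERS = ["F", "D", "C-", "C", "C+", "B-", "B", "B+", "A-", "A"]


def letter_grade(score: float) -> str:
    """Map a final score in [0, 100] to a letter grade (hypothetical helper)."""
    if not 0 <= score <= 100:
        raise ValueError("score must be in [0, 100]")
    # bisect_right counts how many cutoffs are <= score, which is
    # exactly the index of the matching letter.
    return LETTERS[bisect.bisect_right(CUTOFFS, score)]
```

For instance, under these assumptions a score of exactly 87 falls in [87, 90) and maps to A-, while 86.9 maps to B+.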

Late Submission Policy: For homework assignments, 20% is deducted for each late day, for up to 48 hours (including weekends), after which submissions are not accepted. Late presentation materials and project reports are not accepted.
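
The homework late penalty can be sketched in Python as follows. This is one possible reading, not the official rule. It assumes that each started 24-hour period counts as one late day, that the 20% is taken from the score the submission would otherwise earn, and that a submission exactly 48 hours late is still accepted. The name late_adjusted_score is hypothetical; ask the instructor if the exact interpretation matters.

```python
import math
from typing import Optional


def late_adjusted_score(raw_score: float, hours_late: float) -> Optional[float]:
    """Return the homework score after the late penalty, or None if the
    submission is too late to be accepted (hypothetical helper)."""
    if hours_late <= 0:
        return raw_score  # submitted on time, no deduction
    if hours_late > 48:
        return None  # past the 48-hour window: not accepted
    # Assumption: any started 24-hour period counts as a full late day.
    late_days = math.ceil(hours_late / 24)
    # Assumption: 20% of the earned score is deducted per late day.
    return raw_score * (100 - 20 * late_days) / 100
```

Under these assumptions, a homework that would earn 90 points receives 72 if submitted 5 hours late and 54 if submitted 30 hours late.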

Exams: Both exams are closed-book and closed-notes.

Academic Honesty

We will strictly follow UGA’s Academic Honesty Policy. Dishonest behavior will not be tolerated and may result in failing the course. Please contact the instructor if you have any concerns regarding this issue.

Course Schedule (Tentative)

Week  Date   Topic                                  Notes
1     01/09  Course Overview
      01/10  kNN
      01/11  Preliminary: Calculus and Statistics
2     01/16  Preliminary: Python Programming
      01/17  Preliminary: Linear Algebra
      01/18  Linear models                          HW1 out
3     01/23  Linear models
      01/24  Linear models
      01/25  Model evaluation: classification       HW1 due
4     01/30  Naive Bayes classifiers                HW2 out
      01/31  Naive Bayes classifiers
      02/01  Data preprocessing
5     02/06  Decision tree
      02/07  Decision tree
      02/08  Overfitting
6     02/13  Unsupervised learning: clustering
      02/14  Unsupervised learning: clustering
      02/15  Model evaluation: clustering
7     02/20  Text mining: Preliminaries
      02/21  Text mining: Vector space model        HW2 due
      02/22  Text mining: Vector space model
8     02/27  Graph mining
      02/28  Graph mining
      02/29  Midterm Exam
9     03/05  -                                      Spring Break. No class.
      03/06  -                                      Spring Break. No class.
      03/07  -                                      Spring Break. No class.
10    03/12  Deep learning: Multilayer Perceptrons
      03/13  Deep learning: Multilayer Perceptrons  HW3 out
      03/14  Deep learning: Multilayer Perceptrons
11    03/19  Text mining: Embedding
      03/20  Text mining: Embedding
      03/21  Text mining: RNN
12    03/26  Text mining: RNN
      03/27  Large language models
      03/28  Large language models                  HW3 due, HW4 out
13    04/02  Large language models
      04/03  Large language models
      04/04  LLM Trustworthiness
14    04/09  Model interpretation
      04/10  Outlier detection
      04/11  Outlier detection
15    04/16  Recommender systems                    HW4 due
      04/17  Recommender systems
      04/18  Recommender systems
16    04/23  Presentation
      04/24  Presentation
      04/25  Q&A
17    05/02  Final Exam