M: Introduction to Computational Data Science Using ScalaTion, John A. Miller, 2020 (see the August 16, 2022 version).
H: Knowledge Graphs, Hogan et al., 2021.
B: Demystifying Graph Databases: Analysis and Taxonomy of Data Organization, System Designs, and Graph Queries, Besta et al. (revised 2021).
LVB: Principles of Database Management: The Practical Guide to Storing, Managing and Analyzing Big and Small Data, Wilfried Lemahieu, Seppe Vanden Broucke, and Bart Baesens, 2018.
EN: Fundamentals of Database Systems, 7th Edition, Ramez Elmasri and Shamkant B. Navathe, 2016.
Kutner: Applied Regression Analysis, Chapter 13: Introduction to Nonlinear Regression and Neural Networks, Kutner, Nachtsheim, and Neter, 2016.
Class Time
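| Day       | Period 8        | Period 76       |
|-----------|-----------------|-----------------|
| Time      | 4:10 pm-5:00 pm | 3:55 pm-5:10 pm |
| Tuesday   | no              | yes             |
| Wednesday | yes             | no              |
| Thursday  | no              | yes             |
| Room      | Boyd 306        | Boyd 306        |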
Course Description
This is an advanced course on database systems and related information technology. Topics vary from year to year.
Current Focus: Graph Databases, Knowledge Graphs, Machine Learning Related to Graphs
12 Topics for Final Exam
- Graph Database Model - Labeled Property Graph (LPG)
- Graph Database Query Languages
- Graph Algebra
- Graph Database Query Processing
- Knowledge Graphs from RDF
- Knowledge Graphs from LPG
- Graph Algorithms - e.g., Neo4j's Graph Data Science Library
- Neural Networks
- Convolutional Networks
- Knowledge Graph Completion
- Graph Embedding
- Graph Neural Networks
Course Topics
| Topic | Text | URL |
|-------|------|-----|
| NoSQL | M: Ch. 4, EN: Ch. 24 | NoSQL |
| -> Columnar Databases | M: Ch. 4, EN: 24.6 | C-Store |
| -> Graph Databases | M: Ch. 5, EN: 24.5 | Neo4j |
| --> Graph Database Literature | | GDBMS |
| -> Document Databases | EN: 24.3 | MongoDB |
| Parallel and Distributed Databases | EN: Ch. 23 | Massively Parallel Databases and MapReduce Systems |
| Big Data | EN: Ch. 25 | Big Data: A Survey |
| -> Hadoop | EN: 25.2-5 | Apache Hadoop |
| -> Spark | . | Apache Spark |
| -> ScalaTion | . | ScalaTion Project |
| Analytics/Data Mining | M: Chs. 8-13, EN: Ch. 28 | . |
| -> Nonlinear Regression and Neural Networks | M: Ch. 11, EN: 28.5, Kutner Chapter 13 | analytics |
| -> Forecasting | M: Ch. 12, EN: 28.5 | forecaster |
| -> Classification | M: Chs. 8, 9, EN: 28.3 | classifier |
| -> Clustering | M: Ch. 13, EN: 28.3 | clusterer |
Potential Topics from Top-Tier Research Conferences
- SIGMOD / SIGMOD Archive, ICDE / ICDE Archive, VLDB / VLDB Archive, ISWC / ISWC Archive, ICWS / ICWS Archive, SCC / SCC Archive, BigData Congress / BigData Congress Archive, BigData Conference / BigData Conference Archive.
Additional Notes
- Conflict Serializability
- View Serializability
- Semantic Web
- The Elements of Statistical Learning
- An Introduction to Statistical Learning
Research Paper: format and target it for a particular research conference.
| Weight | Item | Due Date |
|--------|------|----------|
| 20% | Final | 12/? |
| 10% | Homework | see below |
| 20% | Group Programs | see below |
| 10% | Group Lecture | see below |
| 40% | Group Project | see below |
| -- 10% -- | 25 min. Presentation | . |
| -- 10% -- | 8 min. Demo | . |
| -- 20% -- | Research Paper | . |
(Subject to Change)
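| Number | Name | Description | Due Date |
|--------|------|-------------|----------|
| 1 | HW-1 | M: section 4.5 Exercises - all | Thur 8/27 |
| 2 | HW-2 | M: section 5.5 Exercises - all | Thur 9/3 |
| 3 | HW-3 | . | . |
| 4 | HW-4 | . | . |
| 5 | HW-5 | . | . |
| 6 | HW-6 | . | . |
| 7 | HW-7 | . | . |
| 8 | HW-8 | . | . |
| 9 | HW-9 | . | . |

Each student should present one homework solution to the class.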
Programs (by Group)
| Program | Description | Restrictions | Due Date |
|---------|-------------|--------------|----------|
| PG1 | Finish Coding of (1) Table, (2) LTable, (3) VTable, (4) GTable or (5) PGraph. See Appendix C.5 pages 679-680 in M: Textbook. | . | . |
| PG2 | Test Efficiency of Query Processing for your software chosen in PG1. Compare with Neo4j and MySQL. | . | . |

Each group must demo and submit each programming assignment (e-mail zip file to jam@cs.uga.edu).
Programs must be coded in Scala 3.2.0-RC3+ (requires Java 17+) or in an approved JVM-based language.
Simulation, Optimization and Analytics Using ScalaTion 2.0
See Code Samples
Student Lectures (by Group)
Each group should provide lecture material (ideally via a Web page). Each group will give three lectures with all members participating. Two goals: (i) teach the class about an important research area and (ii) provide background information for your term project. Each group must develop one homework problem on the material they teach that will help the students study for the Final. URLs for lecture notes, tutorial paper, research paper and homework problem should be ready before the group's first lecture.
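| Group | Topical Area | Topic | Example | Tutorial Paper (pdf) | Research Paper (pdf) | Lecture Notes (pdf) | HW Problem | Lecture Dates |
|-------|--------------|-------|---------|----------------------|----------------------|---------------------|------------|---------------|
| G1 | . | . | . | . | . | . | . | . |
| G2 | . | . | . | . | . | . | . | . |
| G3 | . | . | . | . | . | . | . | . |
| G4 | . | . | . | . | . | . | . | . |
| G5 | . | . | . | . | . | . | . | . |

See GDBMS for example papers.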
- Tuesday Lecture - on Tutorial/Survey Paper; Assign Homework Problem
- Wednesday Lecture - on Research Paper, which should be readable and closely related to the topic of the Term Project.
- Thursday Lecture - on your Research Plan; Homework Presented
Term Projects (by Group)
Research paper title, abstract and target conference due before group lecture. Research paper due 11/?.
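| Group | Topical Area | Research Paper Title | Abstract | Target Conference | Presentation Date |
|-------|--------------|----------------------|----------|-------------------|-------------------|
| G1 | . | . | . | . | . |
| G2 | . | . | . | . | . |
| G3 | . | . | . | . | . |
| G4 | . | . | . | . | . |
| G5 | . | . | . | . | . |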
- Late Points - 10 points off per day late.
- Make-Up Tests - require (i) written pre-approval for travel/time conflicts or (ii) a written explanation for illness.
- A Culture of Honesty -- Examples.
- Copyright Issues -- Regents Guide to Understanding Copyright.
