Compressed Linear Algebra

A Big Data Platform Integrating Compressed Linear Algebra with Columnar Databases

by

Vishnu Gowda Harish

(Under the Direction of John A. Miller)

Abstract

Key foundational components of Big Data frameworks include efficient large-scale storage and high-performance linear algebra. We discuss efficient implementations that use compression techniques inspired by columnar relational databases to improve the space and time profiles of vector and matrix operations. In addition, linear algebra operations are integrated with columnar relational algebra operations in both dense and compressed forms. For several of the operations, substantial speedups are obtained by operating directly on the compressed relations, vectors, and matrices. Advantages of mixing and matching relational and linear algebra operations are also pointed out. Both serial and parallel implementations are provided in the ScalaTion Big Data Analytics Framework.

Index words: Data analysis, Data compression, Data mining, Linear algebra, Parallel programming
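
To make the abstract's idea of operating directly on compressed vectors concrete, the following is a minimal sketch that assumes run-length encoding (RLE), a scheme common in columnar databases. The abstract does not say which compression schemes are used, so RLE is an assumption here. The sketch is written in Python for illustration only; it is not ScalaTion code, and the function names rle_encode, rle_sum, and rle_dot are hypothetical. It shows how a sum and a dot product can be computed run by run, so the work depends on the number of runs rather than on the number of elements.

```python
from itertools import groupby


def rle_encode(values):
    """Compress a sequence into a list of (value, run_length) pairs."""
    return [(v, sum(1 for _ in run)) for v, run in groupby(values)]


def rle_sum(runs):
    """Sum of the logical vector, computed one run at a time."""
    return sum(v * n for v, n in runs)


def rle_dot(a, b):
    """Dot product of two RLE vectors of equal logical length.

    Both run lists are walked in lockstep; each step consumes the
    overlap of the current runs, so neither vector is decompressed.
    """
    i = j = 0
    rem_a = a[0][1] if a else 0   # elements left in the current run of a
    rem_b = b[0][1] if b else 0   # elements left in the current run of b
    total = 0
    while i < len(a) and j < len(b):
        step = min(rem_a, rem_b)              # length of the overlapping segment
        total += a[i][0] * b[j][0] * step     # constant product over that segment
        rem_a -= step
        rem_b -= step
        if rem_a == 0:                        # advance to the next run of a
            i += 1
            if i < len(a):
                rem_a = a[i][1]
        if rem_b == 0:                        # advance to the next run of b
            j += 1
            if j < len(b):
                rem_b = b[j][1]
    return total


# Hypothetical example data, chosen to contain repeated values.
x = rle_encode([2, 2, 2, 2, 0, 0, 5, 5])   # [(2, 4), (0, 2), (5, 2)]
y = rle_encode([1, 1, 3, 3, 3, 3, 3, 4])   # [(1, 2), (3, 5), (4, 1)]
print(rle_sum(x))      # 18
print(rle_dot(x, y))   # 51
```

In this sketch the dot product takes at most len(a) + len(b) loop iterations, which is where a speedup over the dense form would come from when the data contain long runs; on data with few repeats the run lists are no shorter than the dense vectors and no gain should be expected.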