2016 IEEE International Conference on Big Data

A Big Data Platform Integrating Compressed Linear Algebra with Columnar Databases

Vishnu Gowda Harish, Vinay Kumar Bingi and John A. Miller

Department of Computer Science
University of Georgia
Athens, GA, USA

Abstract

Foundational components of Big Data frameworks include efficient large-scale storage and high-performance linear algebra. This paper discusses efficient implementations that use compression techniques inspired by columnar relational databases to improve the space and time profiles of vector and matrix operations. In addition, linear algebra operations are integrated with columnar relational algebra operations in both dense and compressed forms. For several of the operations, substantial speedups are obtained by operating directly on the compressed relations, vectors and matrices. Advantages of mixing and matching relational and linear algebra operations are also pointed out. Both serial and parallel implementations are provided in the ScalaTion Big Data Analytics Framework.

Keywords - Data analysis; data compression; data mining; linear algebra; parallel programming
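
To make the idea of operating directly on compressed vectors concrete, the following is a minimal sketch, not the ScalaTion implementation. It assumes run-length encoding (RLE) as the compression scheme, which is one common columnar technique; the abstract does not name a specific scheme. The function names `rle_encode`, `rle_sum` and `rle_dot` are hypothetical and chosen for illustration only.

```python
from itertools import groupby


def rle_encode(values):
    """Compress a sequence into a list of (value, run_length) pairs."""
    return [(v, sum(1 for _ in run)) for v, run in groupby(values)]


def rle_sum(runs):
    """Sum of the logical vector: one multiply-add per run, not per element."""
    return sum(v * n for v, n in runs)


def rle_dot(a, b):
    """Dot product of two RLE vectors of equal logical length.

    Walks both run lists in step and never materializes the dense vectors.
    """
    i = j = 0
    rem_a = a[0][1] if a else 0  # elements left in the current run of a
    rem_b = b[0][1] if b else 0  # elements left in the current run of b
    total = 0
    while i < len(a) and j < len(b):
        step = min(rem_a, rem_b)  # overlap of the two current runs
        total += a[i][0] * b[j][0] * step
        rem_a -= step
        rem_b -= step
        if rem_a == 0:
            i += 1
            if i < len(a):
                rem_a = a[i][1]
        if rem_b == 0:
            j += 1
            if j < len(b):
                rem_b = b[j][1]
    return total


x = rle_encode([2, 2, 2, 2, 0, 0, 5, 5])
y = rle_encode([1, 1, 3, 3, 3, 3, 3, 4])
print(x)              # [(2, 4), (0, 2), (5, 2)]
print(rle_sum(x))     # 18
print(rle_dot(x, y))  # 51
```

In this sketch the cost of `rle_dot` is proportional to the number of runs rather than the number of elements, which suggests why such operations can be faster on compressed data when the runs are long.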