2016 IEEE International Conference on Big Data
A Big Data Platform Integrating Compressed Linear Algebra with Columnar Databases
Vishnu Gowda Harish, Vinay Kumar Bingi and John A. Miller
Department of Computer Science
University of Georgia
Athens, GA, USA
Abstract
Key foundational components of Big Data frameworks include efficient large-scale storage and high-performance
linear algebra. This paper discusses efficient implementations that use compression techniques inspired
by columnar relational databases to improve the space and time profiles of vector and matrix operations.
In addition, linear algebra operations are integrated with columnar relational algebra operations in both
dense and compressed forms. For several of the operations, substantial speedups are obtained by operating
directly on the compressed relations, vectors and matrices.
Advantages of mixing and matching relational and linear algebra operations are also pointed out.
Both serial and parallel implementations are provided in the ScalaTion Big Data Analytics Framework.
Keywords - Data analysis; data compression; data mining; linear algebra; parallel programming
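As an illustration of what operating directly on compressed vectors can look like, the following is a minimal sketch that assumes run-length encoding (RLE), a compression scheme commonly used in columnar databases. The abstract does not state which schemes the paper uses, and the sketch is written in Python rather than in ScalaTion's Scala. The names rle_compress, rle_sum and rle_dot are hypothetical and are not part of the ScalaTion API.

```python
from typing import List, Tuple

# A run is (value, run_length); a compressed vector is a list of runs.
Run = Tuple[float, int]


def rle_compress(xs: List[float]) -> List[Run]:
    """Run-length encode a dense vector (assumed scheme, for illustration only)."""
    runs: List[Run] = []
    for x in xs:
        if runs and runs[-1][0] == x:
            # Extend the current run.
            runs[-1] = (x, runs[-1][1] + 1)
        else:
            # Start a new run.
            runs.append((x, 1))
    return runs


def rle_sum(runs: List[Run]) -> float:
    """Sum all elements without decompressing: one multiply-add per run."""
    return sum(value * count for value, count in runs)


def rle_dot(a: List[Run], b: List[Run]) -> float:
    """Dot product of two RLE vectors of equal logical length.

    Walks both run lists in step and processes each overlapping
    segment of runs with a single multiplication.
    """
    i = j = 0
    rem_a = a[0][1] if a else 0  # elements left in the current run of a
    rem_b = b[0][1] if b else 0  # elements left in the current run of b
    total = 0.0
    while i < len(a) and j < len(b):
        step = min(rem_a, rem_b)  # length of the overlapping segment
        total += a[i][0] * b[j][0] * step
        rem_a -= step
        rem_b -= step
        if rem_a == 0:
            i += 1
            if i < len(a):
                rem_a = a[i][1]
        if rem_b == 0:
            j += 1
            if j < len(b):
                rem_b = b[j][1]
    return total


x = rle_compress([2.0, 2.0, 2.0, 0.0, 0.0, 5.0])
y = rle_compress([1.0, 1.0, 3.0, 3.0, 3.0, 3.0])
print(rle_sum(x), rle_dot(x, y))
```

Under this assumed scheme, the work is proportional to the number of runs rather than the number of elements, which is one way a compressed operation can be faster than its dense counterpart when the data contain long runs of repeated values.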