Prototype-based compression of time series from telecommunication data

G. Alpsten and S. Sabi

Master's Thesis, Chalmers University of Technology, Jun 2019.

This thesis explores techniques and use cases for compressing time series data by means of prototypes. The methods rest on the idea that a large group of time series can be represented by a much smaller number of prototypes together with the residuals between each time series and its prototype. We evaluate different clustering techniques for developing prototypes, transform the data by forming residual time series, and explore how to store the transformed dataset on file. We implement this approach and compare it with two general-purpose compression techniques, Snappy and Zstandard. Our techniques outperform Snappy and Zstandard for nonconstant time series, with significant improvements from an error-restricted lossy algorithm that we present. The thesis further evaluates the use of the compressed format for predicting missing data and discusses applications.
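The prototype-plus-residual idea can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names are hypothetical, the prototype is simply the pointwise mean of a single group (the thesis evaluates clustering techniques instead), the error bound is enforced by uniform quantization of the residuals, and zlib from the standard library stands in for Snappy and Zstandard as the general-purpose compressor.

```python
import math
import struct
import zlib


def mean_prototype(group):
    """Pointwise mean of equally long series (simplest possible prototype)."""
    n = len(group)
    return [sum(column) / n for column in zip(*group)]


def quantize_residual(series, prototype, max_error):
    """Integer codes for the residuals, with step size 2 * max_error."""
    step = 2.0 * max_error
    return [round((x - p) / step) for x, p in zip(series, prototype)]


def reconstruct(codes, prototype, max_error):
    """Invert quantize_residual; each value is off by at most max_error."""
    step = 2.0 * max_error
    return [p + c * step for c, p in zip(codes, prototype)]


def compressed_size(values, fmt):
    """Bytes after packing with a struct format code and applying zlib."""
    raw = struct.pack(f"<{len(values)}{fmt}", *values)
    return len(zlib.compress(raw))


# Synthetic data (an assumption): eight series sharing one daily shape,
# each with a small individual deviation.
base = [10.0 * math.sin(2 * math.pi * t / 24) for t in range(96)]
group = [
    [b + 0.05 * math.sin(0.7 * t + k) for t, b in enumerate(base)]
    for k in range(8)
]

max_error = 0.01
prototype = mean_prototype(group)
codes = [quantize_residual(s, prototype, max_error) for s in group]
restored = [reconstruct(c, prototype, max_error) for c in codes]

# Largest absolute reconstruction error over all series and time steps.
worst_error = max(
    abs(a - b) for s, r in zip(group, restored) for a, b in zip(s, r)
)

# Baseline: all raw values as 64-bit floats through zlib.
raw_size = compressed_size([x for s in group for x in s], "d")
# Prototype format: one prototype as floats plus 16-bit residual codes.
proto_size = compressed_size(prototype, "d") + compressed_size(
    [c for row in codes for c in row], "h"
)
print(worst_error <= max_error + 1e-9, raw_size, proto_size)
```

Because the residuals are small and fall on a coarse integer grid, they pack into far fewer bytes than the raw floating-point values, while every reconstructed value stays within `max_error` of the original up to floating-point rounding.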

A reprint is available as PDF.

Further publications by Gabriel Alpsten, Sharan Sabi.