Compressed Machine Learning on Time Series Data
N. Gocht and F. Finger
Master's Thesis, University of Gothenburg, Jul 2020.
The extent of time-related data across many fields has led to substantial interest in the analysis of time series. This interest meets growing challenges in storing and processing data. While data is collected at an exponential rate, advancements in processing units are slowing down. Therefore, there is active research into more efficient means of storing and processing data. This can be especially difficult for time series due to their various shapes and scales. In this thesis, we present two variants for optimising a Greedy Clustering algorithm used for lossy time series compression. This study investigates whether the efficient but lossy compression sufficiently preserves the characteristics of the time series to allow time series prediction and anomaly detection. We suggest two variants for performance optimisation, Greedy SF and Greedy SAX. These algorithms are based on novel lookup methods for cluster candidate selection that use statistical features of time series and extracted SAX substrings. Furthermore, we enabled the clustering to process time series with different value ranges, which allows the compression of time series with various scales. To validate the end-to-end pipeline including compression and prediction, a performance evaluation is applied. To further analyse the applicability, a comprehensive benchmark against a pipeline with an autoencoder for compression and a stacked LSTM for prediction is performed.
A reprint is available as PDF.
Further publications by Nathalie Gocht.