A list of python packages for time series analysis | by Iurii Katser | Oct, 2024 | Medium

Iurii Katser
Oct 4, 2024

In this article, I discuss the main tasks that arise when working with time series and the Python libraries and packages best suited to each of them. Each library is listed with its number of GitHub stars as of 04.06.24 (stars indicate popularity, but not necessarily usefulness or quality).

Forecasting

Searching for a function that predicts values over the forecasting horizon that are as close as possible to the actual series values at those points in time.

Illustration of time series forecasting
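
As a minimal sketch of the task itself (plain Python, not tied to any library from the list), the example below builds a seasonal naive forecast and scores it on a held-out horizon. The function names, the toy series, and the season length of 4 are assumptions made for illustration.

```python
def seasonal_naive_forecast(history, horizon, season_length):
    """Repeat the last observed season to cover the forecasting horizon."""
    if len(history) < season_length:
        raise ValueError("history must contain at least one full season")
    last_season = history[-season_length:]
    return [last_season[i % season_length] for i in range(horizon)]


def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and forecast values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Toy series with a period of 4; the last 4 points are held out as the horizon.
series = [10, 12, 14, 12, 11, 13, 15, 13, 10, 12, 14, 12]
train, test = series[:-4], series[-4:]

forecast = seasonal_naive_forecast(train, horizon=len(test), season_length=4)
error = mean_absolute_error(test, forecast)
print(forecast, error)
```

A naive baseline like this is mainly useful as a reference point: a more complex model is worth its cost only if its error on the same horizon is lower.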

Classification

Assigning a time series to one of the predefined categories or classes based on the characteristics of the time series.

Illustration of time series classification
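
One of the simplest ways to illustrate the task is a 1-nearest-neighbour classifier with Euclidean distance over equal-length series. This is a hedged sketch in plain Python; the labels, training series, and helper names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_1nn(query, labelled_series):
    """Return the label of the training series closest to the query."""
    _, nearest_label = min(
        labelled_series, key=lambda item: euclidean(query, item[0])
    )
    return nearest_label


# Each training example is a (series, class label) pair.
training = [
    ([0, 1, 2, 3, 4], "rising"),
    ([4, 3, 2, 1, 0], "falling"),
    ([2, 2, 2, 2, 2], "flat"),
]

predicted = classify_1nn([0.2, 0.9, 2.1, 3.2, 3.9], training)
print(predicted)
```

Euclidean distance assumes the series are aligned in time; for shifted or stretched series, an elastic measure such as dynamic time warping is the usual replacement.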

Clustering

Dividing the original set of different time series into groups so that the differences between the characteristics of the time series within a group are minimal, and the differences between groups are maximal. Unlike classification, there are no predefined classes.

Illustration of time series clustering
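
The sketch below applies k-means (Lloyd's algorithm) directly to equal-length series, treating each series as a point. It is an illustration under simplifying assumptions: the initial centroids are passed in explicitly to keep the run deterministic, and all names and data are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length series."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_series(group):
    """Point-wise mean of a group of equal-length series."""
    return [sum(values) / len(values) for values in zip(*group)]


def kmeans(series_set, initial_centroids, iterations=10):
    """Alternate between assigning series to centroids and updating centroids."""
    centroids = [list(c) for c in initial_centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each series joins its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(s, centroids[k]))
            for s in series_set
        ]
        # Update step: each centroid becomes the mean of its members.
        for k in range(len(centroids)):
            members = [s for s, label in zip(series_set, labels) if label == k]
            if members:
                centroids[k] = mean_series(members)
    return labels, centroids


series_set = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.2, 2.9, 4.1],
    [4.0, 3.0, 2.0, 1.0],
    [3.9, 3.1, 2.2, 0.8],
    [0.9, 2.1, 3.2, 3.8],
]

labels, centroids = kmeans(series_set, [series_set[0], series_set[2]])
print(labels)
```

No class labels are supplied anywhere: the two groups (rising and falling series) emerge only from the distances between the series.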

Pattern detection

Searching for a specific pattern (a specific sequence) in a longer time series. This task is very similar to clustering and classification.

Illustration of pattern detection in time series
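
A common baseline for this task is a sliding-window search with z-normalized Euclidean distance, so that a match depends on shape rather than on scale or offset. The following is a brute-force sketch in plain Python; the function names and the toy data are assumptions.

```python
import math
import statistics


def z_normalize(window):
    """Rescale a window to zero mean and unit variance (all zeros if constant)."""
    mean = statistics.fmean(window)
    std = statistics.pstdev(window)
    if std == 0:
        return [0.0] * len(window)
    return [(v - mean) / std for v in window]


def find_best_match(series, pattern):
    """Return (start_index, distance) of the window most similar to the pattern."""
    m = len(pattern)
    target = z_normalize(pattern)
    best_start, best_distance = -1, math.inf
    for start in range(len(series) - m + 1):
        window = z_normalize(series[start:start + m])
        distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(window, target)))
        if distance < best_distance:
            best_start, best_distance = start, distance
    return best_start, best_distance


series = [5, 5, 6, 7, 8, 8, 3, 9, 3, 8, 7, 6, 5]
pattern = [1, 2, 1]  # a spike; its scale and offset differ from the series on purpose

start, distance = find_best_match(series, pattern)
print(start, distance)
```

The search cost grows with the product of the series length and the pattern length, which is why dedicated libraries rely on faster algorithms for long series.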

Aggregation (feature extraction)

Extraction of various statistical characteristics and parameters from a time series. We aggregate when we need to transform a time series into a classical tabular data format, where each row represents an independent data point.

Illustration of time series aggregation
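
To make the idea concrete, the sketch below turns each series into one row of summary statistics. The chosen features, the feature names, and the `sensor_a`/`sensor_b` identifiers are hypothetical; feature extraction libraries typically compute far larger feature sets.

```python
import statistics


def extract_features(series):
    """Summarize one series as a flat dictionary of statistical features."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    return {
        "mean": statistics.fmean(series),
        "std": statistics.pstdev(series),
        "min": min(series),
        "max": max(series),
        "mean_abs_change": statistics.fmean([abs(d) for d in diffs]),
        "net_change": series[-1] - series[0],
    }


# Each key identifies one time series; each value is its sequence of readings.
dataset = {
    "sensor_a": [1, 2, 3, 4, 5],
    "sensor_b": [3, 3, 3, 3, 3],
}

# One row per series: the tabular format expected by classical ML models.
table = [{"id": name, **extract_features(values)} for name, values in dataset.items()]
for row in table:
    print(row)
```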

Changepoint detection

Detecting the specific point at which a process changes (a changepoint), for example, where a collective anomaly begins or ends. Details are in my previous article. The problem has offline and online variants. In the offline setting (also known as segmentation), the whole series is available and the goal is to locate the changepoints optimally. In the online setting, data arrives point by point and the goal is to detect changepoints with minimal delay.

Illustration of time series changepoint detection
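
The difference between the two settings can be sketched with two small routines. The offline one sees the whole series and picks the split that minimizes the total squared error of two constant-mean segments; the online one is a one-sided CUSUM detector that reads points one at a time. Both are simplified illustrations limited to a single upward mean shift, and the names, `drift`, and `threshold` values are assumptions.

```python
def sse(segment):
    """Sum of squared deviations of a segment from its own mean."""
    mean = sum(segment) / len(segment)
    return sum((v - mean) ** 2 for v in segment)


def offline_changepoint(series, min_size=2):
    """Index that splits the series into two segments with minimal total error."""
    candidates = range(min_size, len(series) - min_size + 1)
    return min(candidates, key=lambda t: sse(series[:t]) + sse(series[t:]))


def online_cusum(stream, target_mean, drift=0.5, threshold=4.0):
    """Return the index at which an upward mean shift is signalled, or None."""
    cumulative = 0.0
    for index, value in enumerate(stream):
        # Accumulate only deviations above target_mean + drift.
        cumulative = max(0.0, cumulative + value - target_mean - drift)
        if cumulative > threshold:
            return index
    return None


# The mean shifts from about 1 to about 5 at index 6.
series = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 5.1, 4.9, 5.0, 5.2, 4.8, 5.1]

offline_index = offline_changepoint(series)
online_index = online_cusum(series, target_mean=1.0)
print(offline_index, online_index)
```

On this toy series the offline routine recovers the exact change index, while the online detector raises its alarm one point later: that lag is the detection delay the online setting tries to minimize.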

Anomaly detection (outlier detection)

Finding individual points (outliers) that do not fit the general distribution of the data, or assigning each point to a normal or abnormal class. The task can be formulated as a classification or clustering problem.

Illustration of time series outlier detection
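
A simple statistical baseline for point outliers is the modified z-score, which uses the median and the median absolute deviation (MAD) so that the outliers themselves barely distort the estimate of "normal". The sketch below is plain Python; the threshold of 3.5 is a commonly used default, and the function name and data are hypothetical.

```python
import statistics


def mad_outliers(series, threshold=3.5):
    """Return indices whose modified z-score exceeds the threshold."""
    median = statistics.median(series)
    mad = statistics.median([abs(v - median) for v in series])
    if mad == 0:
        # More than half of the points are identical; the score is undefined.
        return []
    # 0.6745 makes the score comparable to a z-score for normally distributed data.
    return [
        index
        for index, value in enumerate(series)
        if abs(0.6745 * (value - median) / mad) > threshold
    ]


series = [10.1, 9.8, 10.0, 10.2, 9.9, 25.0, 10.1, 10.0, 9.7, 10.3]

outlier_indices = mad_outliers(series)
print(outlier_indices)
```

This baseline ignores temporal order entirely, so it finds values that are globally unusual but misses points that are unusual only in context, such as a normal-looking value in the wrong season.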

Augmentation and data generation

Augmentation involves expanding the dataset by artificially creating or synthesizing new data to cover unexplored input space. There are a couple of good reviews (one, two) of time series augmentation methods.

Illustration of time series augmentation
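
Three basic transformations that appear in most reviews of the topic are jittering (adding noise), scaling (multiplying by a random factor), and window slicing (cropping a random sub-window). The sketch below implements them in plain Python with a seeded generator; the function names and parameter values are assumptions chosen for illustration.

```python
import random


def jitter(series, sigma, rng):
    """Add independent Gaussian noise to every point."""
    return [v + rng.gauss(0.0, sigma) for v in series]


def scale(series, sigma, rng):
    """Multiply the whole series by one random factor drawn around 1."""
    factor = rng.gauss(1.0, sigma)
    return [v * factor for v in series]


def window_slice(series, ratio, rng):
    """Crop a random contiguous window covering `ratio` of the series."""
    size = max(1, int(len(series) * ratio))
    start = rng.randint(0, len(series) - size)
    return series[start:start + size]


rng = random.Random(42)  # fixed seed so the run is reproducible
original = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]

augmented = [
    jitter(original, sigma=0.05, rng=rng),
    scale(original, sigma=0.1, rng=rng),
    window_slice(original, ratio=0.75, rng=rng),
]
for sample in augmented:
    print(sample)
```

Whether a transformation is valid depends on the task: scaling, for example, is harmless when only the shape matters but destroys information when the absolute level is what separates the classes.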

Conclusion

I created a convenient and visual representation of the listed libraries:

The file in the best quality (PDF) is available via the link.

There are quite a few libraries. Some of them lack functionality and need refinement because they are developed by independent teams. Some are limited in customization, and some have a rather inconvenient interface. Some libraries’ functionalities are duplicated, sometimes almost entirely. But despite all the shortcomings, most of the libraries greatly simplify the work and free up valuable time.

If you have any comments on the listed libraries or suggestions for expanding the list, I would be happy to read them in the comments.
