Source: A list of python packages for time series analysis | by Iurii Katser | Oct, 2024 | Medium
Iurii Katser
Oct 4, 2024
In this article, I discuss the main tasks that arise when working with time series and the Python libraries and packages best suited to each of them. Each library is listed with its GitHub star count as of 04.06.24 (stars indicate popularity, not necessarily usefulness or quality).
Forecasting
Finding a function whose predictions over the forecasting horizon are close to the actual values of the series at those moments in time.
- [17,9k stars] https://github.com/facebook/prophet
- [9,6k stars] https://github.com/statsmodels/statsmodels
- [7,5k stars] https://github.com/alan-turing-institute/sktime
- [7,4k stars] https://github.com/unit8co/darts
- [4,8k stars] https://github.com/facebookresearch/Kats
- [4,7k stars] https://github.com/thuml/Time-Series-Library
- [3,7k stars] https://github.com/jdb78/pytorch-forecasting
- [3,7k stars] https://github.com/Nixtla/statsforecast
- [3,3k stars] https://github.com/salesforce/Merlion
- [1,8k stars] https://github.com/linkedin/greykite
- [1k stars] https://github.com/JoaquinAmatRodrigo/skforecast
- [840 stars] https://github.com/etna-team/etna
- [610 stars] https://github.com/aimclub/FEDOT
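To make the task concrete, here is a minimal sketch in plain Python of the simplest possible forecaster and of one way to measure how close its predictions are to the actual values. It does not use the API of any library listed above; the function names, the toy series, and the choice of mean absolute error are my assumptions for illustration.

```python
def seasonal_naive_forecast(history, horizon, season_length):
    """Forecast by repeating the last observed seasonal cycle."""
    if len(history) < season_length:
        raise ValueError("history must cover at least one full season")
    last_cycle = history[-season_length:]
    return [last_cycle[step % season_length] for step in range(horizon)]


def mean_absolute_error(actual, predicted):
    """Average absolute gap between the forecast and the actual values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Toy series with a period of 4; the last 4 points are held out as the horizon.
series = [10, 20, 30, 20, 11, 21, 31, 21, 12, 22, 32, 22]
train, test = series[:-4], series[-4:]

forecast = seasonal_naive_forecast(train, horizon=4, season_length=4)
error = mean_absolute_error(test, forecast)
print(forecast, error)
```

A baseline like this is mainly useful as a reference point: a library model is worth its complexity only if it beats the naive forecast on held-out data.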
Classification
Assigning a time series to one of the predefined categories or classes based on the characteristics of the time series.
- [7,5k stars] https://github.com/alan-turing-institute/sktime
- [4,7k stars] https://github.com/thuml/Time-Series-Library
- [2,8k stars] https://github.com/tslearn-team/tslearn/
- [1,7k stars] https://github.com/johannfaouzi/pyts
- [1,5k stars] https://github.com/hfawaz/dl-4-tsc
- [840 stars] https://github.com/tinkoff-ai/etna
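As an illustration of the task, a common baseline is a one-nearest-neighbour classifier with dynamic time warping (DTW) as the distance. The sketch below is a plain-Python version under my own assumptions (function names, toy data, absolute difference as the local cost); it is not the implementation used by any of the libraries above.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance with an absolute-difference local cost."""
    inf = float("inf")
    # cost[i][j] is the best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[len(a)][len(b)]


def classify_1nn(query, labelled_series):
    """Return the label of the training series closest to the query."""
    nearest = min(labelled_series, key=lambda item: dtw_distance(query, item[0]))
    return nearest[1]


# Hypothetical training set: (series, class label) pairs.
training = [
    ([0, 1, 2, 3, 4], "rising"),
    ([4, 3, 2, 1, 0], "falling"),
    ([2, 2, 2, 2, 2], "flat"),
]
predicted = classify_1nn([0, 0, 1, 2, 3, 4], training)
print(predicted)
```

DTW lets the query differ in length and local timing from the training series, which is why it is a popular default distance for this task.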
Clustering
Dividing the original set of different time series into groups so that the differences between the characteristics of the time series within a group are minimal, and the differences between groups are maximal. Unlike classification, there are no predefined classes.
- [7,5k stars] https://github.com/alan-turing-institute/sktime
- [2,8k stars] https://github.com/tslearn-team/tslearn/
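The definition above can be sketched with a small k-means over whole series. This is a simplified plain-Python version under several assumptions of mine: all series have equal length, the distance is Euclidean, and the initial centroids are picked deterministically (first series, then the farthest one). The dedicated libraries offer time-series-specific distances instead.

```python
def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def nearest_centroid(series, centroids):
    """Index of the centroid closest to the given series."""
    return min(range(len(centroids)), key=lambda c: euclidean(series, centroids[c]))


def kmeans(series_list, k, iterations=10):
    """Group equal-length series into k clusters; returns one label per series."""
    # Deterministic init: start from the first series, then repeatedly add
    # the series that is farthest from the centroids chosen so far.
    centroids = [list(series_list[0])]
    while len(centroids) < k:
        farthest = max(
            series_list,
            key=lambda s: min(euclidean(s, c) for c in centroids),
        )
        centroids.append(list(farthest))

    labels = []
    for _ in range(iterations):
        labels = [nearest_centroid(s, centroids) for s in series_list]
        for c in range(k):
            members = [s for s, label in zip(series_list, labels) if label == c]
            if members:
                # New centroid is the pointwise mean of the cluster members.
                centroids[c] = [sum(vals) / len(members) for vals in zip(*members)]
    return labels


series_list = [
    [1, 2, 3, 4],
    [1, 2, 3, 5],
    [4, 3, 2, 1],
    [5, 3, 2, 1],
    [2, 3, 4, 5],
]
labels = kmeans(series_list, k=2)
print(labels)
```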
Pattern detection
Searching for a specific pattern (a specific sequence) in a longer time series. This task is very similar to clustering and classification.
- [3,1k stars] https://github.com/TDAmeritrade/stumpy
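A brute-force way to illustrate the task is to slide the pattern along the series and compare z-normalized windows, so that a match depends on shape rather than on scale or offset. The sketch below is plain Python with names and data that I made up; it is far slower than the algorithms in stumpy and is not how that library is implemented.

```python
from statistics import mean, pstdev


def z_normalize(window):
    """Rescale a window to zero mean and unit standard deviation."""
    mu, sigma = mean(window), pstdev(window)
    if sigma == 0:
        return [0.0] * len(window)  # a constant window has no shape
    return [(x - mu) / sigma for x in window]


def best_match(series, pattern):
    """Return (start index, distance) of the window most similar to the pattern."""
    m = len(pattern)
    target = z_normalize(pattern)
    best_index, best_distance = -1, float("inf")
    for start in range(len(series) - m + 1):
        window = z_normalize(series[start:start + m])
        distance = sum((w - t) ** 2 for w, t in zip(window, target)) ** 0.5
        if distance < best_distance:
            best_index, best_distance = start, distance
    return best_index, best_distance


series = [5, 5, 5, 6, 5, 5, 10, 30, 20, 10, 5, 5, 6, 5]
pattern = [1, 3, 2, 1]  # same shape as the burst in the series, at a different scale
index, distance = best_match(series, pattern)
print(index, distance)
```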
Aggregation (feature extraction)
Extraction of various statistical characteristics and parameters from a time series. We aggregate when we need to transform a time series into a classical tabular data format, where each row represents an independent data point.
- [8,2k stars] https://github.com/blue-yonder/tsfresh
- [4,8k stars] https://github.com/facebookresearch/Kats
- [800 stars] https://github.com/fraunhoferportugal/tsfel
- [370 stars] https://github.com/predict-idlab/tsflex
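To show what "transforming a time series into a tabular format" means in practice, here is a minimal plain-Python sketch that turns each series into one row of summary features. The feature set, the function name, and the sensor names are my assumptions; libraries such as tsfresh compute hundreds of features rather than these five.

```python
from statistics import mean, pstdev


def extract_features(series):
    """Summarize one series (length >= 2) as a flat dictionary of features."""
    n = len(series)
    mu = mean(series)
    # Slope of the least-squares line fitted against the time index 0..n-1.
    t_mean = (n - 1) / 2
    slope = sum((t - t_mean) * (x - mu) for t, x in enumerate(series)) / sum(
        (t - t_mean) ** 2 for t in range(n)
    )
    return {
        "mean": mu,
        "std": pstdev(series),
        "min": min(series),
        "max": max(series),
        "slope": slope,
    }


# Hypothetical input: one series per entity.
dataset = {
    "sensor_a": [1, 2, 3, 4, 5],
    "sensor_b": [5, 5, 5, 5, 5],
}

# One row per series: this table can be passed to any classical tabular model.
table = [{"id": name, **extract_features(values)} for name, values in dataset.items()]
for row in table:
    print(row)
```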
Changepoint detection
Detecting the point at which a process changes (a changepoint), for example, where a collective anomaly begins (or ends). Details are in my previous article. The problem comes in offline and online variants. In the offline variant (also known as segmentation), the whole series is available and the goal is to locate the changepoints as accurately as possible. In the online variant, data arrives as a stream and the goal is to detect changepoints with minimal delay.
- [1,5k stars] https://github.com/deepcharles/ruptures
- [17,9k stars] https://github.com/facebook/prophet
- [4,8k stars] https://github.com/facebookresearch/Kats
- [4,7k stars] https://github.com/thuml/Time-Series-Library
- [3,3k stars] https://github.com/salesforce/Merlion
- [2,1k stars] https://github.com/SeldonIO/alibi-detect
- [1,8k stars] https://github.com/linkedin/greykite
- [1,2k stars] https://github.com/linkedin/luminol
- [1k stars] https://github.com/arundo/adtk
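The difference between the offline and online settings can be sketched in plain Python. The offline function below looks at the whole series and picks the split that best separates two constant-mean segments; the online function is a one-sided CUSUM that processes values one at a time and raises an alarm once the accumulated upward deviation exceeds a threshold. Both are simplified illustrations under my own assumptions (a single shift in the mean, a known baseline, hand-picked `drift` and `threshold`), not the algorithms of any listed library.

```python
def segment_cost(values):
    """Sum of squared deviations from the segment mean."""
    if not values:
        return 0.0
    mu = sum(values) / len(values)
    return sum((v - mu) ** 2 for v in values)


def offline_single_changepoint(series):
    """Index that best splits the full series into two constant-mean segments."""
    return min(
        range(1, len(series)),
        key=lambda split: segment_cost(series[:split]) + segment_cost(series[split:]),
    )


def online_cusum(stream, baseline_mean, drift, threshold):
    """Index of the first alarm of a one-sided CUSUM, or None if no alarm."""
    cumulative = 0.0
    for index, value in enumerate(stream):
        # Accumulate only deviations above baseline_mean + drift.
        cumulative = max(0.0, cumulative + value - baseline_mean - drift)
        if cumulative > threshold:
            return index
    return None


# The mean shifts from about 1 to about 5 at index 5.
series = [1.0, 1.2, 0.9, 1.1, 1.0, 5.0, 5.2, 4.9, 5.1, 5.0]

offline_index = offline_single_changepoint(series)
online_index = online_cusum(series, baseline_mean=1.0, drift=0.5, threshold=5.0)
print(offline_index, online_index)
```

On this toy series the offline search returns the true changepoint, while the online detector fires one sample later; that lag is the detection delay that online methods try to minimize.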
Anomaly detection (outlier detection)
Finding individual points (outliers) that do not fit the general distribution of the data, or assigning each point to a normal or an abnormal class. The task can be formulated as a classification or clustering problem.
- [8k stars] https://github.com/yzhao062/pyod
- [1,3k stars] https://github.com/datamllab/tods
- [840 stars] https://github.com/tinkoff-ai/etna
- [750 stars] https://github.com/zillow/luminaire/
- [220 stars] https://github.com/selimfirat/pysad
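One simple way to flag points that do not fit the general distribution is a robust z-score based on the median and the median absolute deviation (MAD). The sketch below is plain Python; the function name, the toy data, and the threshold of 3.5 (a commonly used rule of thumb) are my assumptions, and the libraries above implement many more detectors than this one.

```python
from statistics import median


def mad_outliers(series, threshold=3.5):
    """Return indices of points whose robust z-score exceeds the threshold."""
    center = median(series)
    mad = median([abs(x - center) for x in series])
    if mad == 0:
        return []  # no spread to measure against; this sketch reports nothing
    # The factor 0.6745 makes the score comparable to an ordinary z-score
    # when the data is approximately normal.
    return [
        index
        for index, x in enumerate(series)
        if 0.6745 * abs(x - center) / mad > threshold
    ]


series = [10, 11, 9, 10, 12, 10, 50, 11, 9, 10]
outliers = mad_outliers(series)
print(outliers)
```

The median and MAD are used instead of the mean and standard deviation because a single extreme value barely moves them, so the outlier cannot hide itself by inflating the scale.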
Augmentation and data generation
Augmentation involves expanding the dataset by artificially creating or synthesizing new data to cover unexplored input space. There are a couple of good reviews (one, two) of time series augmentation methods.
- [4,8k stars] https://github.com/timeseriesAI/tsai
- [630 stars] https://github.com/ratschlab/RGAN
- [330 stars] https://github.com/arundo/tsaug
- [330 stars] https://github.com/TimeSynth/TimeSynth
- [320 stars] https://github.com/uchidalab/time_series_augmentation
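Three of the simplest augmentation transforms (jittering, magnitude scaling, and window slicing) can be sketched in a few lines of plain Python. The function names, parameter values, and toy series are my assumptions for illustration; the libraries above provide these and more elaborate transforms with their own interfaces.

```python
import random


def jitter(series, sigma, rng):
    """Add independent Gaussian noise to every point."""
    return [x + rng.gauss(0.0, sigma) for x in series]


def scale(series, sigma, rng):
    """Multiply the whole series by one random factor close to 1."""
    factor = rng.gauss(1.0, sigma)
    return [x * factor for x in series]


def window_slice(series, ratio, rng):
    """Keep a random contiguous window covering the given share of the series."""
    length = max(1, int(len(series) * ratio))
    start = rng.randint(0, len(series) - length)
    return series[start:start + length]


rng = random.Random(42)  # fixed seed so the synthetic samples are reproducible
original = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0, 1.0, 2.0, 3.0]

augmented = [
    jitter(original, sigma=0.05, rng=rng),
    scale(original, sigma=0.1, rng=rng),
    window_slice(original, ratio=0.8, rng=rng),
]
for sample in augmented:
    print(sample)
```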
Conclusion
I created a convenient and visual representation of the listed libraries:
There are quite a few libraries. Some lack functionality and need refinement because they are developed by independent teams. Some offer limited customization, and some have a rather inconvenient interface. Some duplicate each other's functionality, sometimes almost entirely. Despite these shortcomings, most of the libraries greatly simplify the work and free up valuable time.
If you have any comments on the listed libraries or suggestions for expanding the list, I would be happy to read them in the comments.