A Time Series is Worth 64 Words: Long-term Forecasting with Transformers

Yuqi Nie, Nam H. Nguyen, Phanwadee Sinthong, Jayant Kalagnanam

Abstract

We propose an efficient design of Transformer-based models for multivariate time series forecasting and self-supervised representation learning. It is based on two key components: (i) segmentation of time series into subseries-level patches that serve as input tokens to the Transformer; (ii) channel-independence, where each channel contains a single univariate time series and all channels share the same embedding and Transformer weights. The patching design provides a three-fold benefit: local semantic information is retained in the embedding; computation and memory usage of the attention maps are quadratically reduced for the same look-back window; and the model can attend to a longer history. Our channel-independent patch time series Transformer (PatchTST) significantly improves long-term forecasting accuracy compared with state-of-the-art (SOTA) Transformer-based models. We also apply our model to self-supervised pre-training tasks and attain excellent fine-tuning performance, outperforming supervised training on large datasets. Transferring masked pre-trained representations from one dataset to others also produces SOTA forecasting accuracy. Code is available at: https://github.com/yuqinie98/PatchTST.
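The patching step in the abstract can be sketched in a few lines: a look-back window of length L is cut into N = floor((L − P) / S) + 1 overlapping patches of length P with stride S, and each patch becomes one input token. The sketch below is illustrative, not the official implementation; the values P=16, S=8, and L=512 are assumptions chosen to mirror common PatchTST settings.

```python
import numpy as np

def make_patches(series, patch_len=16, stride=8):
    """Segment a univariate look-back window into subseries-level patches.

    Each row of the result is one patch, which would be linearly embedded
    and fed to the Transformer as a single token. Patch length and stride
    here are illustrative defaults, not prescribed by the paper.
    """
    L = len(series)
    n = (L - patch_len) // stride + 1  # N = floor((L - P) / S) + 1
    return np.stack(
        [series[i * stride : i * stride + patch_len] for i in range(n)]
    )

# A 512-step look-back window with P=16, S=8 yields (512-16)//8 + 1 = 63
# patches; the "64" in PatchTST/64 comes from padding the end of the
# window so that one extra patch is produced.
x = np.arange(512.0)
patches = make_patches(x)
print(patches.shape)  # (63, 16)
```

Note how the token count (63 here) is roughly L/S rather than L, which is why attention cost over patches shrinks quadratically relative to attending over raw time steps.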

Code Repositories

- romilbert/samformer (TensorFlow)
- arclab-mit/sw-driver-forecaster (PyTorch)
- yuqinie98/patchtst (official, PyTorch)
- timeseriesAI/tsai (PyTorch)
- thuml/iTransformer (PyTorch)

Benchmarks

| Benchmark | Methodology | MSE | MAE |
|---|---|---|---|
| time-series-forecasting-on-electricity-192 | PatchTST/64 | 0.147 | — |
| time-series-forecasting-on-electricity-336 | PatchTST/64 | 0.163 | — |
| time-series-forecasting-on-electricity-720 | PatchTST/64 | 0.197 | — |
| time-series-forecasting-on-electricity-96 | PatchTST/64 | 0.129 | — |
| time-series-forecasting-on-etth1-192-1 | PatchTST/64 | 0.413 | 0.429 |
| time-series-forecasting-on-etth1-192-2 | PatchTST/64 | 0.074 | 0.215 |
| time-series-forecasting-on-etth1-336-1 | PatchTST/64 | 0.422 | 0.44 |
| time-series-forecasting-on-etth1-336-2 | PatchTST/64 | 0.076 | 0.22 |
| time-series-forecasting-on-etth1-720-1 | PatchTST/64 | 0.447 | 0.468 |
| time-series-forecasting-on-etth1-720-2 | PatchTST/64 | 0.087 | 0.236 |
| time-series-forecasting-on-etth1-96-1 | PatchTST/64 | 0.37 | 0.4 |
| time-series-forecasting-on-etth1-96-2 | PatchTST/64 | 0.059 | 0.189 |
| time-series-forecasting-on-etth2-192-1 | PatchTST/64 | 0.341 | 0.382 |
| time-series-forecasting-on-etth2-192-2 | PatchTST/64 | 0.171 | 0.329 |
| time-series-forecasting-on-etth2-336-1 | PatchTST/64 | 0.329 | 0.384 |
| time-series-forecasting-on-etth2-336-2 | PatchTST/64 | 0.171 | 0.336 |
| time-series-forecasting-on-etth2-720-1 | PatchTST/64 | 0.379 | 0.422 |
| time-series-forecasting-on-etth2-720-2 | PatchTST/64 | 0.223 | 0.38 |
| time-series-forecasting-on-etth2-96-1 | PatchTST/64 | 0.274 | 0.337 |
| time-series-forecasting-on-etth2-96-2 | PatchTST/64 | 0.131 | 0.284 |
| time-series-forecasting-on-weather-192 | PatchTST/64 | 0.194 | — |
| time-series-forecasting-on-weather-336 | PatchTST/64 | 0.245 | — |
| time-series-forecasting-on-weather-720 | PatchTST/64 | 0.314 | — |
| time-series-forecasting-on-weather-96 | PatchTST/64 | 0.149 | — |
