Pure Transformers are Powerful Graph Learners

Jinwoo Kim, Tien Dat Nguyen, Seonwoo Min, Sungjun Cho, Moontae Lee, Honglak Lee, Seunghoon Hong


Abstract

We show that standard Transformers, without graph-specific modifications, can achieve promising results in graph learning both in theory and in practice. Given a graph, we simply treat all nodes and edges as independent tokens, augment them with token embeddings, and feed them to a Transformer. With an appropriate choice of token embeddings, we prove that this approach is theoretically at least as expressive as an invariant graph network (2-IGN) composed of equivariant linear layers, which is already more expressive than all message-passing Graph Neural Networks (GNNs). When trained on a large-scale graph dataset (PCQM4Mv2), our method, coined Tokenized Graph Transformer (TokenGT), achieves significantly better results than GNN baselines and competitive results compared to Transformer variants with sophisticated graph-specific inductive biases. Our implementation is available at https://github.com/jw9730/tokengt.
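The tokenization the abstract describes can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: the function name, the feature dimensions, and the use of plain random vectors as node identifiers are assumptions for demonstration (the paper uses orthonormal identifiers such as orthogonal random features or Laplacian eigenvectors, plus learned type embeddings rather than a scalar flag).

```python
import numpy as np

def tokenize_graph(node_feats, edges, d_id=4, seed=0):
    """Turn a graph into a token sequence for a plain Transformer.

    node_feats: (n, d) node feature matrix.
    edges: list of (u, v) index pairs.
    Each token is [features | identifier of endpoint u | identifier of
    endpoint v | type flag]. A node token uses its own identifier twice
    and a type flag of 0; an edge token uses both endpoints' identifiers
    and a type flag of 1.
    """
    rng = np.random.default_rng(seed)
    n, d = node_feats.shape
    # Random vectors stand in for the orthonormal node identifiers
    # described in the paper (an assumption of this sketch).
    ids = rng.standard_normal((n, d_id))
    node_tok = np.concatenate(
        [node_feats, ids, ids, np.zeros((n, 1))], axis=1)
    e = np.asarray(edges)
    edge_feats = np.zeros((len(edges), d))  # placeholder edge features
    edge_tok = np.concatenate(
        [edge_feats, ids[e[:, 0]], ids[e[:, 1]], np.ones((len(edges), 1))],
        axis=1)
    # Shape (n + m, d + 2*d_id + 1): a sequence any standard
    # Transformer encoder can consume with no architectural changes.
    return np.concatenate([node_tok, edge_tok], axis=0)
```

Because the graph structure is carried entirely by the identifier columns, the downstream model can be an off-the-shelf Transformer encoder; no attention masking or message-passing machinery is needed.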

Code Repositories

jw9730/tokengt (official, PyTorch)
luis-mueller/wl-transformers (PyTorch)

Benchmarks

Benchmark | Methodology | Metrics
graph-classification-on-dd | TokenGT | Accuracy: 73.950±3.361
graph-classification-on-imdb-b | TokenGT | Accuracy: 80.250±3.304
graph-classification-on-nci1 | TokenGT | Accuracy: 76.740±2.054
graph-classification-on-nci109 | TokenGT | Accuracy: 72.077±1.883
graph-regression-on-esr2 | TokenGT | R2: 0.641±0.000, RMSE: 0.529±0.641
graph-regression-on-f2 | TokenGT | R2: 0.872±0.000, RMSE: 0.363±0.872
graph-regression-on-kit | TokenGT | R2: 0.800±0.000, RMSE: 0.486±0.800
graph-regression-on-lipophilicity | TokenGT | R2: 0.545±0.024, RMSE: 0.852±0.023
graph-regression-on-parp1 | TokenGT | R2: 0.907±0.000, RMSE: 0.383±0.907
graph-regression-on-pcqm4mv2-lsc | TokenGT | Test MAE: 0.0919, Validation MAE: 0.0910
graph-regression-on-peptides-struct | TokenGT | MAE: 0.2489±0.0013
graph-regression-on-pgr | TokenGT | R2: 0.684±0.000, RMSE: 0.543±0.684
graph-regression-on-zinc-full | TokenGT | Test MAE: 0.047±0.010
molecular-property-prediction-on-esol | TokenGT | R2: 0.892±0.036, RMSE: 0.667±0.103
molecular-property-prediction-on-freesolv | TokenGT | R2: 0.930±0.018, RMSE: 1.038±0.125
