Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-regressive Visual Generation
Zhuoyan Luo, Fengyuan Shi, Yixiao Ge, Yujiu Yang, Limin Wang, Ying Shan

Abstract
We present Open-MAGVIT2, a family of auto-regressive image generation models ranging from 300M to 1.5B parameters. The Open-MAGVIT2 project produces an open-source replication of Google's MAGVIT-v2 tokenizer, a tokenizer with a super-large codebook (i.e., 2^18 codes), and achieves state-of-the-art reconstruction performance (1.17 rFID) on ImageNet 256 × 256. Furthermore, we explore its application in plain auto-regressive models and validate its scalability properties. To assist auto-regressive models in predicting over a super-large vocabulary, we factorize the vocabulary into two sub-vocabularies of different sizes by asymmetric token factorization, and further introduce "next sub-token prediction" to enhance sub-token interaction for better generation quality. We release all models and code to foster innovation and creativity in the field of auto-regressive visual generation.
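To make asymmetric token factorization concrete, the following is a minimal sketch of how one index into a 2^18-entry codebook could be split into two sub-tokens of unequal size and recombined. The abstract does not give the split, so the 6-bit/12-bit division and the names `factorize` and `combine` are illustrative assumptions, not the project's actual code.

```python
from typing import Tuple

TOTAL_BITS = 18                      # codebook size 2**18, as stated in the abstract
LOW_BITS = 12                        # ASSUMPTION: illustrative split, not taken from the source
HIGH_BITS = TOTAL_BITS - LOW_BITS    # 6 bits for the smaller sub-vocabulary


def factorize(token: int) -> Tuple[int, int]:
    """Split one codebook index into (high, low) sub-token indices."""
    if not 0 <= token < (1 << TOTAL_BITS):
        raise ValueError(f"token {token} is outside the 2**{TOTAL_BITS} codebook")
    high = token >> LOW_BITS               # index into a vocabulary of size 2**HIGH_BITS
    low = token & ((1 << LOW_BITS) - 1)    # index into a vocabulary of size 2**LOW_BITS
    return high, low


def combine(high: int, low: int) -> int:
    """Inverse of factorize: rebuild the original codebook index."""
    return (high << LOW_BITS) | low


if __name__ == "__main__":
    high, low = factorize(123456)
    print(high, low, combine(high, low))
```

Under this assumed split, a model would predict over two vocabularies of 64 and 4,096 entries instead of a single vocabulary of 262,144 entries, which is the motivation the abstract gives for factorizing.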
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| image-generation-on-imagenet-256x256 | Open-MAGVIT2-XL | FID: 2.33 |
| image-reconstruction-on-imagenet | Open-MAGVIT2 (16x16) | FID: 1.17, PSNR: 21.90 |