DFAC Framework: Factorizing the Value Function via Quantile Mixture for Multi-Agent Distributional Q-Learning

Wei-Fang Sun, Cheng-Kuang Lee, Chun-Yi Lee

Abstract

In fully cooperative multi-agent reinforcement learning (MARL) settings, environments are highly stochastic because each agent observes the environment only partially and the policies of the other agents change continuously. To address these issues, we integrate distributional RL with value function factorization by proposing the Distributional Value Function Factorization (DFAC) framework, which generalizes expected value function factorization methods to their DFAC variants. DFAC extends the individual utility functions from deterministic variables to random variables, and models the quantile function of the total return as a quantile mixture. To validate DFAC, we demonstrate its ability to factorize a simple two-step matrix game with stochastic rewards and run experiments on all Super Hard tasks of the StarCraft Multi-Agent Challenge (SMAC), showing that DFAC is able to outperform expected value function factorization baselines.
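The phrase "quantile mixture" in the abstract can be made concrete with a small sketch. A quantile mixture is a non-negatively weighted sum of quantile functions evaluated at the same quantile fractions. The Python below illustrates only that general idea; it is not the DFAC architecture or the code in j3soon/dfac. The function name quantile_mixture, the two agents, their quantile values, and the unit weights are all hypothetical, chosen for illustration.

```python
from statistics import fmean


def quantile_mixture(quantiles_per_agent, weights):
    """Combine per-agent quantile values into a quantile mixture.

    Each inner list holds one agent's return quantiles, sorted in
    non-decreasing order and evaluated at the same quantile fractions.
    Non-negative weights keep the result non-decreasing, so it is still
    a valid quantile function.
    """
    if len(quantiles_per_agent) != len(weights):
        raise ValueError("need exactly one weight per agent")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    n = len(quantiles_per_agent[0])
    if any(len(q) != n for q in quantiles_per_agent):
        raise ValueError("all agents must share the same quantile fractions")
    return [
        sum(w * q[i] for w, q in zip(weights, quantiles_per_agent))
        for i in range(n)
    ]


# Hypothetical quantile values for two agents (illustrative numbers only).
agent_a = [0.0, 1.0, 2.0, 3.0]
agent_b = [1.0, 1.0, 1.0, 5.0]
weights = [1.0, 1.0]  # assumption: unit weights, i.e. a plain sum

total = quantile_mixture([agent_a, agent_b], weights)

# With equally spaced quantile fractions, the mean of the quantile values
# approximates the expected return; the mixture's mean equals the weighted
# sum of the per-agent means.
mean_total = fmean(total)
mean_parts = sum(w * fmean(q) for w, q in zip(weights, [agent_a, agent_b]))
print(total, mean_total, mean_parts)
```

In this toy case the mixture's mean matches the weighted sum of the individual means, which is why a factorization of this kind can stay consistent with an expected-value factorization while also carrying information about the shape of the return distribution.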

Code Repositories

j3soon/dfac (official, tf, mentioned in GitHub)

Benchmarks

Benchmark | Methodology | Average Score | Median Win Rate
smac-on-smac-27m-vs-30m | DMIX | 19.43 | 85.45
smac-on-smac-27m-vs-30m | VDN | 18.45 | 63.12
smac-on-smac-27m-vs-30m | DIQL | 14.45 | 6.02
smac-on-smac-27m-vs-30m | QMIX | 19.41 | 84.77
smac-on-smac-27m-vs-30m | DDN | 19.71 | 91.48
smac-on-smac-27m-vs-30m | IQL | 14.01 | 2.27
smac-on-smac-3s5z-vs-3s6z-1 | DIQL | 17.52 | 62.22
smac-on-smac-3s5z-vs-3s6z-1 | QMIX | 20.16 | 67.22
smac-on-smac-3s5z-vs-3s6z-1 | DDN | 20.94 | 94.03
smac-on-smac-3s5z-vs-3s6z-1 | IQL | 16.54 | 29.83
smac-on-smac-3s5z-vs-3s6z-1 | VDN | 19.75 | 89.2
smac-on-smac-3s5z-vs-3s6z-1 | DMIX | 19.7 | 91.08
smac-on-smac-6h-vs-8z-1 | VDN | 15.41 | 0
smac-on-smac-6h-vs-8z-1 | DDN | 19.4 | 83.92
smac-on-smac-6h-vs-8z-1 | QMIX | 14.37 | 12.78
smac-on-smac-6h-vs-8z-1 | DMIX | 17.14 | 49.43
smac-on-smac-6h-vs-8z-1 | IQL | 13.78 | 0
smac-on-smac-6h-vs-8z-1 | DIQL | 14.94 | 0.00
smac-on-smac-corridor | DIQL | 19.68 | 91.62
smac-on-smac-corridor | VDN | 19.47 | 85.34
smac-on-smac-corridor | DDN | 20 | 95.4
smac-on-smac-corridor | QMIX | 15.07 | 37.61
smac-on-smac-corridor | DMIX | 19.66 | 90.45
smac-on-smac-corridor | IQL | 19.42 | 84.87
smac-on-smac-def-armored-parallel | DMIX | - | 90.0
smac-on-smac-def-armored-parallel | DDN | - | 0.0
smac-on-smac-def-armored-parallel | DIQL | - | 0.0
smac-on-smac-def-armored-sequential | DDN | - | 71.9
smac-on-smac-def-armored-sequential | DIQL | - | 53.1
smac-on-smac-def-armored-sequential | DMIX | - | 81.3
smac-on-smac-def-infantry-parallel | DMIX | - | 90.0
smac-on-smac-def-infantry-parallel | DDN | - | 20.0
smac-on-smac-def-infantry-sequential | DIQL | - | 93.8
smac-on-smac-def-infantry-sequential | DDN | - | 90.6
smac-on-smac-def-infantry-sequential | DMIX | - | 100
smac-on-smac-def-outnumbered-parallel | DIQL | - | 0.0
smac-on-smac-def-outnumbered-parallel | DMIX | - | 5.0
smac-on-smac-def-outnumbered-parallel | DDN | - | 0.0
smac-on-smac-def-outnumbered-sequential | DDN | - | 0.0
smac-on-smac-def-outnumbered-sequential | DMIX | - | 0.0
smac-on-smac-def-outnumbered-sequential | DIQL | - | 0.0
smac-on-smac-mmm2-1 | DIQL | 19.21 | 85.23
smac-on-smac-mmm2-1 | QMIX | 19.42 | 92.44
smac-on-smac-mmm2-1 | VDN | 19.36 | 89.2
smac-on-smac-mmm2-1 | IQL | 17.5 | 68.92
smac-on-smac-mmm2-1 | DDN | 20.9 | 97.22
smac-on-smac-mmm2-1 | DMIX | 19.87 | 95.11
smac-on-smac-off-complicated-parallel | DMIX | - | 0.0
smac-on-smac-off-complicated-parallel | DDN | - | 0.0
smac-on-smac-off-complicated-parallel | DIQL | - | 0.0
smac-on-smac-off-distant-parallel | DIQL | - | 0.0
smac-on-smac-off-distant-parallel | DDN | - | 0.0
smac-on-smac-off-distant-parallel | DMIX | - | 0.0
smac-on-smac-off-hard-parallel | DDN | - | 0.0
smac-on-smac-off-hard-parallel | DIQL | - | 0.0
smac-on-smac-off-hard-parallel | DMIX | - | 0.0
smac-on-smac-off-near-parallel | DIQL | - | 0.0
smac-on-smac-off-near-parallel | DDN | - | 0.0
smac-on-smac-off-near-parallel | DMIX | - | 0.0
smac-on-smac-off-superhard-parallel | DDN | - | 0.0
smac-on-smac-off-superhard-parallel | DIQL | - | 0.0
smac-on-smac-off-superhard-parallel | DMIX | - | 0.0

A dash marks rows for which no Average Score is listed.
