Multi-Pass Q-Networks for Deep Reinforcement Learning with Parameterised Action Spaces
Craig J. Bester; Steven D. James; George D. Konidaris

Abstract
Parameterised actions in reinforcement learning are composed of discrete actions with continuous action-parameters. This provides a framework for solving complex domains that require combining high-level actions with flexible control. The recent P-DQN algorithm extends deep Q-networks to learn over such action spaces. However, it treats all action-parameters as a single joint input to the Q-network, invalidating its theoretical foundations. We analyse the issues with this approach and propose a novel method, multi-pass deep Q-networks, or MP-DQN, to address them. We empirically demonstrate that MP-DQN significantly outperforms P-DQN and other previous algorithms in terms of data efficiency and converged policy performance on the Platform, Robot Soccer Goal, and Half Field Offense domains.
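To make the multi-pass idea from the abstract concrete, below is a minimal PyTorch sketch, our own illustration rather than the authors' implementation. Instead of feeding the joint action-parameter vector to the Q-network once (as in P-DQN), the network is evaluated once per discrete action with every other action's parameters zeroed, and only the matching Q-value is kept from each pass, so Q(s, k, x_k) cannot depend on the parameters of other actions. All class and argument names here are hypothetical.

```python
import torch
import torch.nn as nn

class MultiPassQNetwork(nn.Module):
    """Sketch of a multi-pass Q-network: one forward pass per discrete
    action, with the other actions' parameters zeroed so each Q-value
    depends only on its own action-parameters."""

    def __init__(self, state_dim: int, num_actions: int,
                 param_dims: list[int], hidden: int = 128):
        super().__init__()
        self.num_actions = num_actions
        # Offsets of each action's parameter slice in the joint vector.
        self.offsets = [0]
        for d in param_dims:
            self.offsets.append(self.offsets[-1] + d)
        self.net = nn.Sequential(
            nn.Linear(state_dim + sum(param_dims), hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_actions),
        )

    def forward(self, state: torch.Tensor,
                all_params: torch.Tensor) -> torch.Tensor:
        q_values = []
        for k in range(self.num_actions):
            # Zero all action-parameters except those of action k.
            masked = torch.zeros_like(all_params)
            lo, hi = self.offsets[k], self.offsets[k + 1]
            masked[:, lo:hi] = all_params[:, lo:hi]
            out = self.net(torch.cat([state, masked], dim=1))
            # Keep only the k-th Q-value from the k-th pass.
            q_values.append(out[:, k])
        return torch.stack(q_values, dim=1)  # (batch, num_actions)
```

A real implementation could batch the K passes into a single forward call for efficiency; the explicit loop is kept here for readability.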
Benchmarks
| Benchmark | Method | Metric |
|---|---|---|
| Control with parameterised actions on Half Field Offense | MP-DQN | Goal probability: 0.913 |
| Control with parameterised actions on Platform | MP-DQN | Return: 0.987 |
| Control with parameterised actions on Robot Soccer Goal | MP-DQN | Goal probability: 0.789 |