Multimodal Transformer Networks for End-to-End Video-Grounded Dialogue Systems