The Bottom-Up and Top-Down Attention for VQA model is available in the multimodal library.
You can either train it directly, which requires cloning the repository, or import it into your own code.