This repository contains the implementation of *Medical Visual Question Answering via Conditional Reasoning* [ACM MM 2020].
We evaluate our approach on the VQA-RAD dataset.

We propose a Question-Conditioned Reasoning (QCR) module and a Type-Conditioned Reasoning (TCR) module, which guide the importance selection over multimodal fusion features and learn separate reasoning skills for different question types. The detailed architecture is shown in the figure below.
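As a rough sketch of the idea (not the authors' released implementation): QCR can be read as a question-conditioned gating over the fused multimodal features, and TCR as routing open-ended and closed-ended questions to separate reasoning branches. All module names, dimensions, and the exact gating formulation below are illustrative assumptions.

```python
# Illustrative sketch only: module names, dimensions, and the exact gating
# formulation are assumptions, not the authors' released implementation.
import torch
import torch.nn as nn


class ConditionalReasoning(nn.Module):
    """Question-conditioned gating over fused features (QCR-style), with
    separate reasoning branches per question type (TCR-style)."""

    def __init__(self, fused_dim=1024, ques_dim=1024, hidden_dim=1024):
        super().__init__()
        self.hidden_dim = hidden_dim
        # QCR: the question embedding produces an importance gate over the fused features.
        self.gate = nn.Sequential(
            nn.Linear(ques_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, fused_dim),
            nn.Sigmoid(),
        )
        # TCR: one reasoning branch per question type (0 = closed-ended, 1 = open-ended).
        self.branches = nn.ModuleList(
            [nn.Sequential(nn.Linear(fused_dim, hidden_dim), nn.ReLU()) for _ in range(2)]
        )

    def forward(self, fused_feat, ques_feat, ques_type):
        # fused_feat: (B, fused_dim), ques_feat: (B, ques_dim), ques_type: (B,) in {0, 1}
        gated = fused_feat * self.gate(ques_feat)  # importance selection over fused features
        out = torch.zeros(gated.size(0), self.hidden_dim, device=gated.device)
        for t, branch in enumerate(self.branches):  # route each sample by its question type
            mask = ques_type == t
            if mask.any():
                out[mask] = branch(gated[mask])
        return out
```

The two branches here mirror the open-/closed-ended split reported in the results table; in a full model the branch outputs would feed the answer classifier.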
| Accuracy (%) | Overall | Open-ended | Closed-ended |
|---|---|---|---|
| Base | 66.1 | 49.2 | 77.2 |
| Our proposal | 71.6 | 60.0 | 79.3 |
Our proposal improves answer accuracy on both closed-ended and open-ended questions, with the largest gain on open-ended questions (49.2 → 60.0).
pip install -r requirements.txt
All pre-prepared data can be found in the `data` package.
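As a point of reference (an assumption about the data format, not a documented interface of this repository), the public VQA-RAD annotations are a JSON list of question/answer records, and grouping them by answer type is what the open/closed split above relies on:

```python
# Sketch under the assumption that the VQA-RAD annotations are a JSON list of
# records with "question", "answer", "image_name", and "answer_type" fields;
# adjust the path and keys to match the files actually shipped in data/.
import json
from collections import defaultdict


def group_by_answer_type(json_path):
    """Group VQA-RAD records into OPEN vs CLOSED questions."""
    with open(json_path, encoding="utf-8") as f:
        records = json.load(f)

    groups = defaultdict(list)
    for rec in records:
        answer_type = str(rec.get("answer_type", "UNKNOWN")).upper()
        groups[answer_type].append(
            {
                "image": rec.get("image_name"),
                "question": rec.get("question"),
                "answer": rec.get("answer"),
            }
        )
    return groups
```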
git clone https://github.com/Awenbocc/Med-VQA.git
cd ./Med-VQA
python main.py --gpu 0 --seed 88
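The `--seed` flag suggests runs are meant to be reproducible. A typical way to fix the relevant random seeds (a generic sketch, not necessarily what `main.py` does internally) is:

```python
# Generic seeding sketch -- not necessarily what main.py does internally.
import random

import numpy as np
import torch


def set_seed(seed=88):
    """Fix Python, NumPy, and PyTorch seeds for (mostly) reproducible runs."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # cuDNN can still introduce nondeterminism unless forced into deterministic mode.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```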
MIT License
Please cite the following paper in your publications if this work helps your research: