Expert Knowledge-Aware Image Difference Graph Representation Learning for Difference-Aware Medical Visual Question Answering
Date
2023-08-04
Author
Hu, Xinyue
Gu, Lin
An, Qiyuan
Zhang, Mengliang
Liu, Liangchen
Kobayashi, Kazuma
Harada, Tatsuya
Summers, Ronald M.
Zhu, Yingying
Abstract
To contribute to automating medical vision-language models, we
propose a novel Chest-Xray Difference Visual Question Answering
(VQA) task. Given a pair of main and reference images, this task
attempts to answer several questions about both the diseases and,
more importantly, the differences between them. This is consistent
with radiologists' diagnostic practice of comparing the current
image with the reference before concluding the report. We collect
a new dataset, MIMIC-Diff-VQA, comprising 700,703 QA pairs drawn
from 164,324 pairs of main and reference images. Compared to
existing medical VQA datasets, our questions are tailored to the
Assessment-Diagnosis-Intervention-Evaluation treatment procedure
used by clinical professionals. We also propose a novel expert
knowledge-aware graph representation learning model to address
this task. The proposed baseline model leverages expert knowledge,
including anatomical structure priors and semantic and spatial
knowledge, to construct a multi-relationship graph representing
the differences between the two images for the image difference
VQA task. The dataset and code are available at
https://github.com/Holipori/MIMIC-Diff-VQA. We believe this work
will further push forward medical vision-language modeling.
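
As an informal illustration of the multi-relationship graph mentioned in the abstract, the sketch below assembles detected anatomical regions into a graph with separate "spatial" and "semantic" edge types. This is a minimal sketch, not the authors' implementation: the Region fields, the boxes_overlap heuristic, and the KNOWLEDGE table of related findings are illustrative assumptions standing in for the expert knowledge sources described in the paper.

```python
# Minimal sketch of a multi-relationship graph over anatomical regions.
# Region names, coordinates, and the KNOWLEDGE table are placeholders.
from dataclasses import dataclass
import networkx as nx


@dataclass
class Region:
    name: str       # anatomical structure, e.g. "left lung"
    bbox: tuple     # (x1, y1, x2, y2) in image coordinates
    finding: str    # finding observed in this region


# Hypothetical expert-knowledge table: pairs of clinically related findings
# (a stand-in for the semantic knowledge source used by the model).
KNOWLEDGE = {("atelectasis", "pleural effusion")}


def boxes_overlap(b1, b2):
    """True if two (x1, y1, x2, y2) boxes intersect."""
    return not (b1[2] < b2[0] or b2[2] < b1[0] or b1[3] < b2[1] or b2[3] < b1[1])


def build_graph(regions):
    """Build a graph whose edges carry a 'spatial' or 'semantic' relation type."""
    g = nx.MultiDiGraph()
    for i, r in enumerate(regions):
        g.add_node(i, name=r.name, bbox=r.bbox, finding=r.finding)
    for i, a in enumerate(regions):
        for j, b in enumerate(regions):
            if i == j:
                continue
            # Spatial relation: regions whose bounding boxes intersect.
            if boxes_overlap(a.bbox, b.bbox):
                g.add_edge(i, j, relation="spatial")
            # Semantic relation: findings linked in the knowledge table.
            pair = (a.finding, b.finding)
            if pair in KNOWLEDGE or pair[::-1] in KNOWLEDGE:
                g.add_edge(i, j, relation="semantic")
    return g


if __name__ == "__main__":
    regions = [
        Region("left lung", (10, 20, 120, 200), "atelectasis"),
        Region("left costophrenic angle", (15, 150, 90, 210), "pleural effusion"),
    ]
    g = build_graph(regions)
    print(g.number_of_nodes(), "nodes,", g.number_of_edges(), "edges")
```

In a difference-VQA setting, such a graph would be built for both the main and the reference image, and the two graphs would then be compared by the learning model to answer questions about what has changed.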