Visual Explainable Artificial Intelligence for Graph-based Visual Question Answering and Scene Graph Curation
This study presents a novel visualization approach to explainable artificial intelligence for graph-based visual question answering (VQA) systems. The method focuses on identifying false answer predictions by the model and lets users directly correct mistakes in the input space, thus facilitating dataset curation. The decision-making process of the model is revealed by highlighting selected internal states of a graph neural network (GNN). The proposed system is built on top of the GraphVQA framework, which implements various GNN-based models for VQA trained on the GQA dataset. The authors evaluated their tool through demonstrated use cases, quantitative measures, and a user study with experts from the machine learning, visualization, and natural language processing domains. The findings highlight the effectiveness of the implemented features in supporting users in identifying incorrect predictions and their underlying issues. Moreover, the approach is easily extendable to similar models for graph-based question answering.
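The highlighted internal states can be pictured as per-layer node activations produced by message passing over the scene graph. The following toy sketch (not the GraphVQA implementation; all names and values are illustrative) records such per-layer states, which is the kind of signal a visualization could map onto the graph:

```python
# Illustrative only: a toy average-neighbor message-passing pass over a
# scene graph that records the node states after every layer. These
# per-layer states stand in for the "internal GNN states" a visual
# explanation tool might highlight.

def message_passing(node_feats, edges, num_layers=2):
    """Run simple message passing; return node states after each layer."""
    states = [dict(node_feats)]          # layer 0: the input features
    feats = dict(node_feats)
    for _ in range(num_layers):
        new_feats = {}
        for node, value in feats.items():
            # incoming messages from neighbors with an edge into this node
            neighbors = [feats[src] for src, dst in edges if dst == node]
            msg = sum(neighbors) / len(neighbors) if neighbors else 0.0
            new_feats[node] = 0.5 * (value + msg)  # mix self state and messages
        feats = new_feats
        states.append(dict(feats))
    return states

# Toy scene graph: "dog" --on--> "grass", "ball" --near--> "dog"
feats = {"dog": 1.0, "grass": 0.0, "ball": 0.5}
edges = [("dog", "grass"), ("ball", "dog")]
for i, state in enumerate(message_passing(feats, edges)):
    print(f"layer {i}: {state}")
```

In a real system the states would be learned feature vectors rather than scalars, but the idea of exposing each layer's node representations for inspection is the same.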
Project – Article – PDF – GitHub – Source code (DaRUS) – Model Parameters and Evaluation Data (DaRUS)
@article{vqa2024,
  author    = {Künzel, Sebastian and Munz-Körner, Tanja and Tilli, Pascal and Schäfer, Noel and Vidyapu, Sandeep and Vu, Ngoc Thang and Weiskopf, Daniel},
  title     = {Visual Explainable Artificial Intelligence for Graph-based Visual Question Answering and Scene Graph Curation},
  journal   = {Visual Computing for Industry, Biomedicine, and Art},
  publisher = {Springer},
  year      = {2025},
  volume    = {8},
  number    = {1},
  pages     = {9},
  doi       = {10.1186/s42492-025-00185-y}
}