TSG-Bench

Scene Graph Description Selection (SGDS)

Reasoning over scene graphs to answer a given question

The goal of this task is to accurately interpret a scene graph within a given context and identify the correct description among distractors. We formulate SGDS as a multiple-choice problem consisting of the graph-based context \( C^g_i = (G_1, \dots, G_{i-1}) \), a scene graph \( G_i \), and five candidate descriptions, exactly one of which is correct. The model must track nodes and edges across \( C^g_i \) and verify that every element of \( G_i \) is accurately represented in the description it selects. For SGDS, we use scene graphs that each represent a single action.
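To make the formulation concrete, the following is a minimal sketch of how one SGDS instance could be represented and rendered as a multiple-choice prompt. It is not the benchmark's actual data format or evaluation code. The triplet encoding of a scene graph, the names `SGDSInstance`, `build_prompt`, and `is_correct`, the A to E labeling, and the toy kitchen example are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumption: a scene graph is encoded as (subject, relation, object) triplets.
Triplet = Tuple[str, str, str]
SceneGraph = List[Triplet]

LABELS = "ABCDE"  # hypothetical labels for the five candidates


@dataclass
class SGDSInstance:
    context: List[SceneGraph]  # C^g_i = (G_1, ..., G_{i-1})
    target: SceneGraph         # G_i, a scene graph for a single action
    candidates: List[str]      # five candidate descriptions
    answer_index: int          # position of the one correct description

    def __post_init__(self) -> None:
        if len(self.candidates) != 5:
            raise ValueError("SGDS uses exactly five candidate descriptions")
        if not 0 <= self.answer_index < 5:
            raise ValueError("answer_index must point at one of the candidates")


def format_graph(graph: SceneGraph) -> str:
    """Render a scene graph as a one-line, human-readable edge list."""
    return "; ".join(f"{s} -[{r}]-> {o}" for s, r, o in graph)


def build_prompt(inst: SGDSInstance) -> str:
    """Lay out the context graphs, the target graph, and the five options."""
    lines = [
        f"G_{k}: {format_graph(g)}"
        for k, g in enumerate(inst.context, start=1)
    ]
    target_idx = len(inst.context) + 1
    lines.append(f"Target G_{target_idx}: {format_graph(inst.target)}")
    for label, text in zip(LABELS, inst.candidates):
        lines.append(f"{label}. {text}")
    return "\n".join(lines)


def is_correct(inst: SGDSInstance, predicted_label: str) -> bool:
    """Check a predicted option label (e.g. 'B') against the gold answer."""
    label = predicted_label.strip().upper()
    return len(label) == 1 and LABELS.find(label) == inst.answer_index


# Toy instance, invented for illustration only.
example = SGDSInstance(
    context=[[("person", "holds", "knife")]],
    target=[("person", "cuts", "onion"), ("onion", "on", "board")],
    candidates=[
        "The person washes the onion.",
        "The person cuts the onion on the board.",
        "The person puts the knife on the board.",
        "The person peels the onion.",
        "The person cuts the carrot on the board.",
    ],
    answer_index=1,
)

if __name__ == "__main__":
    print(build_prompt(example))
```

In this toy instance, choosing the correct option requires matching every node and edge of the target graph: the distractors each change an action or an object, so a description that captures only part of \( G_i \) is not sufficient.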

Scene Graph Understanding

Scene Graph Question Answering (SGQA)

Example

Scene Graph Description Selection (SGDS)

Example 1

Example 2