Single Action Scene Graph Generation (SA-SGG)

Generating a scene graph for a description that involves a single action, within a scenario

SA-SGG is the task of generating a scene graph for a description that involves a single action within a scenario. Formally, the task is to generate the triplets of the graph \( G_i=(V_i, E_i) \), given the description context \( C^d_i=(d_1, \dots, d_{i-1}) \), the current description \( d_i \), and the valid nodes and edges of the scenario, denoted \( V \) and \( E \), respectively.
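The formulation above can be sketched as a small data structure. This is only an illustrative sketch: the class name `SASGGInput`, the helper functions, the `(source, edge label, target)` triplet layout, and the kitchen-style node and edge names are all assumptions made for the example, not part of the task definition.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

# Assumed triplet layout: (source node, edge label, target node).
Triplet = Tuple[str, str, str]


@dataclass
class SASGGInput:
    """Hypothetical container for one SA-SGG sample."""
    context: List[str]     # C^d_i = (d_1, ..., d_{i-1})
    description: str       # d_i, assumed to involve a single action
    valid_nodes: Set[str]  # V, valid nodes of the scenario
    valid_edges: Set[str]  # E, valid edge labels of the scenario


def is_valid_graph(triplets: List[Triplet], sample: SASGGInput) -> bool:
    """Return True if every triplet uses only nodes in V and edge labels in E."""
    return all(
        src in sample.valid_nodes
        and edge in sample.valid_edges
        and dst in sample.valid_nodes
        for src, edge, dst in triplets
    )


def graph_from_triplets(triplets: List[Triplet]) -> Tuple[Set[str], Set[Triplet]]:
    """Collect G_i = (V_i, E_i): the nodes used and the labeled edges."""
    nodes = {node for src, _, dst in triplets for node in (src, dst)}
    edges = set(triplets)
    return nodes, edges


# Made-up sample, for illustration only.
sample = SASGGInput(
    context=["The person picks up a knife."],
    description="The person cuts the carrot with the knife.",
    valid_nodes={"person", "knife", "carrot"},
    valid_edges={"hold", "cut"},
)
predicted = [("person", "hold", "knife"), ("knife", "cut", "carrot")]
print(is_valid_graph(predicted, sample))
```

Restricting predictions to \( V \) and \( E \) in this way reflects the statement that valid nodes and edges are given as part of the input; how a model actually uses them is left open here.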

Example 1

Example 2

Multi Action Scene Graph Generation (MA-SGG)

Generating scene graphs by decomposing actions when a description involves multiple actions

MA-SGG aims to generate scene graphs by decomposing actions when given complex descriptions that involve multiple actions. The task formulation is identical to that of SA-SGG, except that an additional clue indicating the number of actions is provided and the complexity of \( d_i \) is greater than 1. MA-SGG is more challenging than SA-SGG because there is more information to process and, especially, to generate, and because target actions may be implicit in the description. Although the number of actions is given, the task still requires accurately decomposing, identifying, and ordering the valid actions in the description.
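A minimal sketch of how the MA-SGG input differs from SA-SGG is shown here. The class name `MASGGInput`, the field `num_actions`, the choice to represent the output as an ordered list with one graph per action, and the example content are assumptions for illustration; the task description only states that a clue about the number of actions is provided.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

# Assumed triplet layout: (source node, edge label, target node).
Triplet = Tuple[str, str, str]
Graph = List[Triplet]


@dataclass
class MASGGInput:
    """Hypothetical container for one MA-SGG sample."""
    context: List[str]     # C^d_i = (d_1, ..., d_{i-1})
    description: str       # d_i, involving more than one action
    num_actions: int       # the additional clue: number of actions in d_i
    valid_nodes: Set[str]  # V
    valid_edges: Set[str]  # E


def check_prediction(graphs: List[Graph], sample: MASGGInput) -> bool:
    """Check that there is one graph per action and all triplets stay within V and E.

    This checks structure only; whether the actions are correctly
    identified and ordered must be judged against reference graphs.
    """
    if len(graphs) != sample.num_actions:
        return False
    return all(
        src in sample.valid_nodes
        and edge in sample.valid_edges
        and dst in sample.valid_nodes
        for graph in graphs
        for src, edge, dst in graph
    )


# Made-up sample, for illustration only.
ma_sample = MASGGInput(
    context=[],
    description="The person picks up the knife and cuts the carrot.",
    num_actions=2,
    valid_nodes={"person", "knife", "carrot"},
    valid_edges={"hold", "cut"},
)
ma_predicted = [
    [("person", "hold", "knife")],   # graph for the first action
    [("knife", "cut", "carrot")],    # graph for the second action
]
print(check_prediction(ma_predicted, ma_sample))
```

The count check shows why the clue alone does not solve the task: a prediction can contain the right number of graphs and still split or order the actions incorrectly.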

Example 1

Example 2