# FlagOpen FlagEmbedding MLVU PlotQA Data
| Knowledge Sources | |
|---|---|
| Domains | Video Understanding, Benchmark Data, Question Answering |
| Last Updated | 2026-02-09 00:00 GMT |
## Overview
Benchmark dataset for plot-based question answering on long videos in the MLVU evaluation framework.
## Description

The MLVU PlotQA dataset is part of the MLVU (Multi-task Long Video Understanding) benchmark, containing multiple-choice questions that test understanding of video plots and narratives. Each entry includes a video reference, duration information, a question about the video content, four candidate answers, and the correct answer. Questions focus on plot elements such as character appearances, actions, events, and visual details that require comprehensive video understanding.

The dataset is stored as line-delimited JSON with 7,009 lines, one record per question. Answering each question requires watching and comprehending long-form video content, which tests a model's ability to track narrative elements, identify key visual details, and reason about plot progression over extended video sequences.
## Usage
Use this dataset for evaluating video understanding models on plot comprehension tasks, benchmarking multi-modal models on long-form video question answering, or training systems for narrative understanding in videos.
## Code Reference

### Source Location

- Repository: FlagOpen_FlagEmbedding
- File: research/MLVU/data/1_plotQA.json
### Data Structure

```python
{
    "video": str,             # Video filename (e.g., "movie101_66.mp4")
    "duration": int,          # Video duration in seconds
    "question": str,          # Question about the video plot
    "candidates": List[str],  # Four candidate answers
    "answer": str,            # Correct answer (one of the candidates)
    "question_type": str      # Always "plotQA" for this dataset
}
```
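As a sanity check when loading the data, the schema above can be validated programmatically. The sketch below is illustrative only: `validate_record` is a hypothetical helper, not part of the FlagEmbedding repository, and the sample values are taken from the example entry shown later on this page.

```python
from typing import Dict

def validate_record(record: Dict) -> bool:
    """Check that a record matches the documented PlotQA schema.

    Hypothetical helper for illustration; not part of FlagEmbedding.
    """
    required = {"video": str, "duration": int, "question": str,
                "candidates": list, "answer": str, "question_type": str}
    for key, typ in required.items():
        if not isinstance(record.get(key), typ):
            return False
    return (len(record["candidates"]) == 4
            and record["answer"] in record["candidates"]
            and record["question_type"] == "plotQA")

# Sample record using the example entry documented below
sample = {
    "video": "movie101_66.mp4",
    "duration": 246,
    "question": "What color is the main male character in the video?",
    "candidates": ["Yellow", "Red", "Green", "Blue"],
    "answer": "Yellow",
    "question_type": "plotQA",
}
print(validate_record(sample))  # True
```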
### Import

```python
import json

# Load the dataset (one JSON object per line)
with open("research/MLVU/data/1_plotQA.json", "r") as f:
    data = [json.loads(line) for line in f]
```
## I/O Contract

### Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| file_path | str | Yes | Path to the JSON data file |
### Outputs
| Field | Type | Description |
|---|---|---|
| video | str | Video filename identifier |
| duration | int | Length of video in seconds |
| question | str | Question text about the video plot |
| candidates | List[str] | List of 4 possible answers |
| answer | str | The correct answer string |
| question_type | str | Type identifier ("plotQA") |
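Since `answer` is stored as a string that also appears in `candidates`, multiple-choice evaluation code often presents the candidates as lettered options and maps the ground truth back to a letter. A minimal sketch of that mapping follows; both helper names are assumptions for illustration, not functions from the repository:

```python
from typing import List

def format_options(candidates: List[str]) -> str:
    """Render candidates as 'A. ...' lines for a multiple-choice prompt."""
    return "\n".join(f"{chr(ord('A') + i)}. {c}"
                     for i, c in enumerate(candidates))

def answer_letter(candidates: List[str], answer: str) -> str:
    """Map the ground-truth answer string back to its option letter."""
    return chr(ord('A') + candidates.index(answer))

candidates = ["Yellow", "Red", "Green", "Blue"]
print(format_options(candidates))
print(answer_letter(candidates, "Yellow"))  # A
```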
## Usage Examples

```python
import json
from typing import Dict, List

# Load the PlotQA dataset (one JSON object per line)
def load_plotqa_data(file_path: str) -> List[Dict]:
    with open(file_path, "r") as f:
        return [json.loads(line) for line in f]

data = load_plotqa_data("research/MLVU/data/1_plotQA.json")

# Inspect an example entry
example = data[0]
print(f"Video: {example['video']}")
print(f"Duration: {example['duration']}s")
print(f"Question: {example['question']}")
print(f"Candidates: {example['candidates']}")
print(f"Answer: {example['answer']}")
# Output:
# Video: movie101_66.mp4
# Duration: 246s
# Question: What color is the main male character in the video?
# Candidates: ['Yellow', 'Red', 'Green', 'Blue']
# Answer: Yellow

# Evaluate a model on multiple-choice accuracy
def evaluate_plotqa(model, data: List[Dict]) -> float:
    correct = 0
    total = len(data)
    for item in data:
        video_path = f"videos/{item['video']}"
        question = item['question']
        candidates = item['candidates']
        correct_answer = item['answer']
        # Model prediction (pseudo-code; depends on your model's API)
        predicted_answer = model.predict(video_path, question, candidates)
        if predicted_answer == correct_answer:
            correct += 1
    accuracy = correct / total
    return accuracy

# Filter by video duration
short_videos = [item for item in data if item['duration'] < 300]   # < 5 minutes
long_videos = [item for item in data if item['duration'] >= 600]   # >= 10 minutes
print(f"Short videos: {len(short_videos)}")
print(f"Long videos: {len(long_videos)}")

# Analyze question types
questions_about_color = [
    item for item in data
    if "color" in item['question'].lower()
]
print(f"Color questions: {len(questions_about_color)}")
```
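Because each question has exactly four candidates, random guessing yields a 25% baseline, so a quick way to sanity-check an evaluation loop is to run it with a trivial model. The sketch below is a self-contained version of that check; `FirstChoiceModel`, the compact `evaluate_plotqa`, and the second sample record are stand-ins for illustration, not part of the repository:

```python
from typing import Dict, List

class FirstChoiceModel:
    """Trivial stand-in model that always picks the first candidate."""
    def predict(self, video_path: str, question: str,
                candidates: List[str]) -> str:
        return candidates[0]

def evaluate_plotqa(model, data: List[Dict]) -> float:
    """Fraction of questions where the model's prediction matches the answer."""
    correct = sum(
        model.predict(f"videos/{item['video']}",
                      item['question'],
                      item['candidates']) == item['answer']
        for item in data
    )
    return correct / len(data)

# Two hand-made records (the second is entirely hypothetical)
sample_data = [
    {"video": "movie101_66.mp4",
     "question": "What color is the main male character in the video?",
     "candidates": ["Yellow", "Red", "Green", "Blue"], "answer": "Yellow"},
    {"video": "example.mp4", "question": "placeholder question",
     "candidates": ["A", "B", "C", "D"], "answer": "C"},
]
print(evaluate_plotqa(FirstChoiceModel(), sample_data))  # 0.5
```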