Implementation:Ucbepic Docetl Dataset Debate Baseline

From Leeroopedia


Knowledge Sources
Domains: Sample_Data, Data_Processing
Last Updated: 2026-02-08 00:00 GMT

Overview

JSON dataset providing baseline analysis output of theme evolution across U.S. presidential debate transcripts, generated without the reduce gleaning optimization in DocETL.

Description

This file contains the results of a theme evolution analysis pipeline run on presidential debate transcripts using a standard (non-gleaning) reduce operation. Each record contains a long-form analytical report examining how Democratic and Republican viewpoints on a specific political theme have evolved over multiple decades, along with the theme label. The reports cover topics such as Experience and Leadership, Race Relations, Family Values, Arms Control, Immigration, and many others. This dataset serves as a baseline for comparison against the gleaning-optimized variant to demonstrate the quality improvements that DocETL's reduce gleaning feature provides.

Usage

This dataset is stored in the example_data/debates directory and is used to demonstrate and compare DocETL pipeline output quality. It is specifically intended as a comparison baseline against the reduce gleaning variant to show how different pipeline configurations affect output quality and completeness.
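The comparison described above can be sketched in a few lines. This is a minimal illustration, not part of DocETL: it assumes the gleaning-optimized output uses the same report/theme schema as this file, and the helper names word_counts_by_theme and compare_report_lengths are hypothetical. Word count is only a rough proxy for completeness and says nothing about accuracy or quality on its own.

```python
from typing import Dict, List, Tuple


def word_counts_by_theme(records: List[dict]) -> Dict[str, int]:
    """Map each theme label to the whitespace-delimited word count of its report."""
    return {r["theme"]: len(r["report"].split()) for r in records}


def compare_report_lengths(
    baseline: List[dict], variant: List[dict]
) -> List[Tuple[str, int, int, int]]:
    """Compare report lengths for themes present in both outputs.

    Returns (theme, baseline_words, variant_words, difference) tuples,
    sorted by theme. Themes found in only one output are skipped.
    """
    base = word_counts_by_theme(baseline)
    other = word_counts_by_theme(variant)
    shared = sorted(base.keys() & other.keys())
    return [(t, base[t], other[t], other[t] - base[t]) for t in shared]
```

Passing the loaded baseline list and a loaded gleaning-variant list to compare_report_lengths yields one row per shared theme, which is enough to spot themes where the two configurations diverge most before reading the reports side by side.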

Code Reference

Source Location

Data Structure

[
  {
    "report": "# Analysis of Democratic and Republican Viewpoints on 'Experience and Leadership' (2000 - 2023)\n\n## Introduction\nThe theme of \"Experience and Leadership\" has been a focal point in American politics...",
    "theme": "Experience and Leadership"
  },
  {
    "report": "# Evolution of Democratic and Republican Viewpoints on \"Character and Experience\" (1992 - 2023)\n\n## Introduction\n...",
    "theme": "Character and Experience"
  }
]

I/O Contract

Schema

Field | Type | Description
report | string | Long-form analytical report (Markdown formatted) examining the evolution of Democratic and Republican viewpoints on the theme across multiple election cycles
theme | string | The political theme being analyzed (e.g., "Experience and Leadership", "Race Relations", "Family Values")
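A loaded file can be checked against this schema before further processing. The following is a minimal sketch using only the two fields listed above; the function name validate_records and the wording of the messages are illustrative assumptions, not part of DocETL.

```python
from typing import Any, List

# The two fields documented in the schema table.
REQUIRED_FIELDS = ("report", "theme")


def validate_records(data: Any) -> List[str]:
    """Return a list of problems found; an empty list means the data fits the schema."""
    if not isinstance(data, list):
        return ["top-level JSON value is not a list"]
    problems = []
    for i, record in enumerate(data):
        if not isinstance(record, dict):
            problems.append(f"record {i}: not an object")
            continue
        for field in REQUIRED_FIELDS:
            value = record.get(field)
            if not isinstance(value, str):
                problems.append(f"record {i}: '{field}' missing or not a string")
            elif not value.strip():
                problems.append(f"record {i}: '{field}' is empty")
    return problems
```

Returning a list of messages rather than raising on the first failure makes it easier to see every malformed record in one pass.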

Themes Covered

The dataset contains reports on the following political themes:

  • Experience and Leadership
  • Character and Experience
  • Illegal Immigration
  • Race Relations
  • Central America
  • Family Values
  • Veterans Affairs
  • Infrastructure Development
  • Education and Values
  • Cuba and Foreign Policy
  • Pardon and Amnesty for Draft Evaders
  • Human Rights and Morality in Foreign Policy
  • Campaign Character and Tonality
  • Soviet Union
  • Peaceful Transfer of Power and January 6
  • And additional themes

Usage Examples

import json

# Assumes the working directory is the one that contains example_data/.
with open("example_data/debates/theme_evolution_analysis_baseline.json", encoding="utf-8") as f:
    data = json.load(f)

# data is a list of theme analysis records with fields: report, theme
print(f"Total theme analyses: {len(data)}")
for record in data:
    print(f"Theme: {record['theme']}")
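Because each report is Markdown, its title and the period it covers can be pulled out of the first heading. The sketch below assumes every report opens with a level-one heading that ends in a "(YYYY - YYYY)" range, as the two excerpts under Data Structure do; other records may not follow that pattern, so both helpers return None rather than raising. The names report_title and report_year_range are hypothetical.

```python
import re
from typing import Optional, Tuple

# Matches a parenthesised year range such as "(2000 - 2023)".
_YEAR_RANGE = re.compile(r"\((\d{4})\s*-\s*(\d{4})\)")


def report_title(report: str) -> Optional[str]:
    """Return the text of the first level-one Markdown heading, or None if absent."""
    for line in report.splitlines():
        if line.startswith("# "):
            return line[2:].strip()
    return None


def report_year_range(report: str) -> Optional[Tuple[int, int]]:
    """Return (start_year, end_year) parsed from the title, or None if not found."""
    title = report_title(report)
    if title is None:
        return None
    match = _YEAR_RANGE.search(title)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

Applied to each record's report field, these helpers give a quick overview of which decades each theme analysis spans.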
