

Heuristic:Vibrantlabsai Ragas Reasoning Model Parameter Constraints

From Leeroopedia
Knowledge Sources
Domains LLM_Evaluation, Debugging
Last Updated 2026-02-12 10:00 GMT

Overview

Parameter constraint heuristic for OpenAI reasoning models (o-series, GPT-5+): force temperature to 1.0, remove top_p, and map max_tokens to max_completion_tokens.

Description

OpenAI reasoning models (o1, o3, etc.) and newer GPT-5+ models have strict API parameter constraints that differ from standard chat models. They require `temperature=1.0` (the only supported value), do not accept `top_p`, and use `max_completion_tokens` instead of `max_tokens`. Ragas auto-detects these models via pattern matching on the model name and transparently remaps parameters to avoid API errors.

Usage

Use this heuristic when:

  • Using o-series models (o1, o1-mini, o3, etc.) for evaluation — parameters are auto-remapped.
  • Using GPT-5+ models — the same auto-remapping applies.
  • Debugging API errors such as "temperature must be 1.0" or "top_p is not supported" — these indicate a reasoning model that needs the parameter constraints.
  • Wrapping via LangchainLLMWrapper — set `bypass_temperature=True` and `bypass_n=True` manually for reasoning models.

The Insight (Rule of Thumb)

  • Action: When using reasoning models, Ragas automatically enforces `temperature=1.0`, removes `top_p`, and maps `max_tokens` → `max_completion_tokens`.
  • Value: Temperature forced to exactly `1.0`; no other value accepted by the API.
  • Trade-off: You lose the ability to lower temperature for deterministic output. Reasoning models produce varied outputs regardless, as they use internal chain-of-thought.
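As an illustration, the three adjustments can be sketched as a small standalone function (`remap_reasoning_params` is a hypothetical name for this sketch, not the actual Ragas API):

```python
def remap_reasoning_params(kwargs: dict) -> dict:
    """Sketch of the three adjustments applied for reasoning models."""
    out = dict(kwargs)
    # max_tokens -> max_completion_tokens (the reasoning-model budget parameter)
    if "max_tokens" in out:
        out["max_completion_tokens"] = out.pop("max_tokens")
    # temperature is forced to the only value the API accepts
    out["temperature"] = 1.0
    # top_p is not supported and must not be sent at all
    out.pop("top_p", None)
    return out

result = remap_reasoning_params({"max_tokens": 512, "temperature": 0.0, "top_p": 0.9})
# temperature forced to 1.0, top_p dropped, max_completion_tokens = 512
```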

Reasoning

OpenAI reasoning models (o1, o3-mini, etc.) use internal chain-of-thought reasoning that is incompatible with temperature sampling. The API enforces `temperature=1.0` and rejects `top_p`. Additionally, these models use a different token budget parameter (`max_completion_tokens`) that includes both reasoning tokens and output tokens. Sending the wrong parameters results in an API error, so Ragas auto-detects the model via string pattern matching and remaps before the API call.
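The shared budget is why the mapping matters: `max_completion_tokens` covers hidden reasoning tokens and visible output tokens together. A quick illustration with made-up numbers (not real token counts):

```python
# Hypothetical budget illustration: max_completion_tokens covers BOTH
# hidden reasoning tokens and visible output tokens.
max_completion_tokens = 1000
reasoning_tokens = 700  # consumed internally by chain-of-thought
visible_output_budget = max_completion_tokens - reasoning_tokens
print(visible_output_budget)  # 300 tokens left for the visible answer
```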

The detection uses a pattern-based approach rather than a hardcoded list to be future-proof. It covers:

  • O-series: `o1` through `o9` (with variants like `o1-mini`, `o3-2025-01-31`)
  • GPT-5+: `gpt-5` through `gpt-19` (with variants)
  • Special: `codex-mini`

Code Evidence

Reasoning model detection from `src/ragas/llms/base.py:872-904`:

def is_reasoning_model(model_str: str) -> bool:
    """Check if model is a reasoning model requiring max_completion_tokens."""
    # O-series reasoning models (o1, o1-mini, o2, o3, ...)
    # TODO: Update to support o10+ when OpenAI releases models beyond o9
    if (
        len(model_str) >= 2
        and model_str[0] == "o"
        and model_str[1] in "123456789"
    ):
        if len(model_str) == 2 or model_str[2] in ("-", "_"):
            return True

    # GPT-5 and newer (gpt-5, gpt-6, ..., gpt-19)
    # TODO: Update to support gpt-20+ when OpenAI releases models beyond gpt-19
    if model_str.startswith("gpt-"):
        version_str = model_str[4:].split("-")[0].split("_")[0]
        try:
            version = int(version_str)
            if 5 <= version <= 19:
                return True
        except ValueError:
            pass

    if model_str == "codex-mini":
        return True
    return False
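Because the snippet is self-contained, its behavior can be checked directly. The block below reproduces the function verbatim so it runs standalone; the model names are illustrative, not an exhaustive list:

```python
def is_reasoning_model(model_str: str) -> bool:
    """Check if model is a reasoning model requiring max_completion_tokens."""
    # O-series reasoning models (o1, o1-mini, o2, o3, ...)
    if (
        len(model_str) >= 2
        and model_str[0] == "o"
        and model_str[1] in "123456789"
    ):
        if len(model_str) == 2 or model_str[2] in ("-", "_"):
            return True

    # GPT-5 and newer (gpt-5, gpt-6, ..., gpt-19)
    if model_str.startswith("gpt-"):
        version_str = model_str[4:].split("-")[0].split("_")[0]
        try:
            version = int(version_str)
            if 5 <= version <= 19:
                return True
        except ValueError:
            pass

    if model_str == "codex-mini":
        return True
    return False

print(is_reasoning_model("o1-mini"))        # True: "o" + digit + "-"
print(is_reasoning_model("o3-2025-01-31"))  # True: dated variant
print(is_reasoning_model("gpt-4o"))         # False: "4o" is not an integer version
print(is_reasoning_model("gpt-5"))          # True: 5 <= version <= 19
print(is_reasoning_model("omni"))           # False: second char is not a digit
```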

Parameter remapping from `src/ragas/llms/base.py:908-918`:

# If max_tokens is provided and model requires max_completion_tokens, map it
if requires_max_completion_tokens and "max_tokens" in mapped_args:
    mapped_args["max_completion_tokens"] = mapped_args.pop("max_tokens")

# GPT-5 and o-series models have strict parameter requirements:
# 1. Temperature must be exactly 1.0 (only supported value)
# 2. top_p parameter is not supported and must be removed
if requires_max_completion_tokens:
    mapped_args["temperature"] = 1.0
    mapped_args.pop("top_p", None)

LangchainLLMWrapper bypass flags from `src/ragas/llms/base.py:172-175`:

# Certain LLMs (e.g., OpenAI o1 series) do not support temperature
self.bypass_temperature = bypass_temperature
# Certain reasoning LLMs (e.g., OpenAI o1 series) do not support n parameter
self.bypass_n = bypass_n
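A minimal sketch of how such flags might gate sampling parameters when assembling a request (a hypothetical helper for illustration, not the actual LangchainLLMWrapper code):

```python
def build_sampling_kwargs(temperature: float, n: int,
                          bypass_temperature: bool = False,
                          bypass_n: bool = False) -> dict:
    """Assemble sampling kwargs, omitting parameters the model rejects."""
    kwargs = {}
    if not bypass_temperature:
        kwargs["temperature"] = temperature
    if not bypass_n:
        kwargs["n"] = n
    return kwargs

# Standard chat model: both parameters pass through
print(build_sampling_kwargs(0.2, 3))  # {'temperature': 0.2, 'n': 3}
# Reasoning model (e.g., o1): both are dropped to avoid API errors
print(build_sampling_kwargs(0.2, 3, bypass_temperature=True, bypass_n=True))  # {}
```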
