Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Implementation:Hiyouga LLaMA Factory V1 Rendering Plugin

From Leeroopedia


Knowledge Sources
Domains Machine Learning, Natural Language Processing
Last Updated 2026-02-06 19:00 GMT

Overview

RenderingPlugin is a plugin class that handles rendering chat messages into tokenized model inputs and parsing generated text back into structured messages, with a concrete implementation for the Qwen3 no-think template.

Description

The RenderingPlugin extends BasePlugin to provide two core methods: render_messages (converts structured messages into tokenized ModelInput with input IDs, labels, attention masks, and loss weights) and parse_messages (converts raw generated text back into Message objects). The module registers the qwen3_nothink plugin variant which formats messages using Qwen3-specific special tokens (<|im_start|>, <|im_end|>), handles tool definitions via <tools> XML tags, tool calls via <tool_call> tags, and reasoning blocks via <thinking> tags. The helper function _update_model_input tokenizes accumulated text segments and extends the model input arrays with appropriate labels and loss weights, using IGNORE_INDEX for tokens that should not contribute to the training loss.

Usage

Use RenderingPlugin when you need to render chat messages into model-ready tokenized inputs for training or inference with Qwen3 models using the no-think template. The plugin is typically invoked by the rendering system via its registered name qwen3_nothink rather than directly.

Code Reference

Source Location

Signature

class RenderingPlugin(BasePlugin):
    def render_messages(
        self,
        processor: Processor,
        messages: list[Message],
        tools: str | None = None,
        is_generate: bool = False,
    ) -> ModelInput: ...

    def parse_messages(self, generated_text: str) -> Message: ...

def _update_model_input(
    processor: Processor,
    input_ids: list[int],
    labels: list[int],
    loss_weights: list[int],
    temp_str: str,
    temp_weight: float,
) -> str: ...

@RenderingPlugin("qwen3_nothink").register("render_messages")
def render_qwen3_nothink_messages(
    processor: Processor,
    messages: list[Message],
    tools: str | None = None,
    is_generate: bool = False,
) -> ModelInput: ...

@RenderingPlugin("qwen3_nothink").register("parse_message")
def parse_qwen3_nothink_message(generated_text: str) -> Message: ...

Import

from llamafactory.v1.plugins.model_plugins.rendering import RenderingPlugin

I/O Contract

Inputs

Name Type Required Description
processor Processor Yes Tokenizer or processor instance used to encode text into token IDs
messages list[Message] Yes List of structured chat messages with role, content, and optional loss_weight
tools str or None No JSON string defining available tools for tool-calling scenarios
is_generate bool No If True, appends the assistant prompt prefix for generation mode
generated_text (parse) str Yes Raw generated text string to be parsed back into a Message

Outputs

Name Type Description
render_messages ModelInput Dictionary containing input_ids, attention_mask, labels, and loss_weights lists
parse_messages Message Structured message with role "assistant" and parsed content items (text, reasoning, tool_call)

Usage Examples

from llamafactory.v1.plugins.model_plugins.rendering import RenderingPlugin

# Render messages for training
plugin = RenderingPlugin("qwen3_nothink")
messages = [
    {"role": "user", "content": [{"type": "text", "value": "Hello!"}], "loss_weight": 0.0},
    {"role": "assistant", "content": [{"type": "text", "value": "Hi there!"}], "loss_weight": 1.0},
]
model_input = plugin.render_messages(processor=tokenizer, messages=messages)

# Render messages for generation (adds assistant prefix)
model_input = plugin.render_messages(processor=tokenizer, messages=messages, is_generate=True)

# Parse generated text back into a message
generated = "Let me think.\n<tool_call>\n{\"name\": \"search\", \"arguments\": {\"q\": \"test\"}}\n</tool_call>"
message = plugin.parse_messages(generated)
# message.content includes text and tool_call items

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment