Implementation:Hiyouga LLaMA Factory V1 Rendering Plugin

Knowledge Sources	Hiyouga_LLaMA_Factory
Domains	Machine Learning, Natural Language Processing
Last Updated	2026-02-06 19:00 GMT

Overview

RenderingPlugin is a plugin class that handles rendering chat messages into tokenized model inputs and parsing generated text back into structured messages, with a concrete implementation for the Qwen3 no-think template.

Description

The RenderingPlugin extends BasePlugin to provide two core methods: render_messages (converts structured messages into tokenized ModelInput with input IDs, labels, attention masks, and loss weights) and parse_messages (converts raw generated text back into Message objects). The module registers the qwen3_nothink plugin variant which formats messages using Qwen3-specific special tokens (<|im_start|>, <|im_end|>), handles tool definitions via <tools> XML tags, tool calls via <tool_call> tags, and reasoning blocks via <thinking> tags. The helper function _update_model_input tokenizes accumulated text segments and extends the model input arrays with appropriate labels and loss weights, using IGNORE_INDEX for tokens that should not contribute to the training loss.

Usage

Use RenderingPlugin when you need to render chat messages into model-ready tokenized inputs for training or inference with Qwen3 models using the no-think template. The plugin is typically invoked by the rendering system via its registered name qwen3_nothink rather than directly.

Code Reference

Source Location

Repository: Hiyouga_LLaMA_Factory
File: src/llamafactory/v1/plugins/model_plugins/rendering.py
Lines: 1-235

Signature

class RenderingPlugin(BasePlugin):
    def render_messages(
        self,
        processor: Processor,
        messages: list[Message],
        tools: str | None = None,
        is_generate: bool = False,
    ) -> ModelInput: ...

    def parse_messages(self, generated_text: str) -> Message: ...

def _update_model_input(
    processor: Processor,
    input_ids: list[int],
    labels: list[int],
    loss_weights: list[int],
    temp_str: str,
    temp_weight: float,
) -> str: ...

@RenderingPlugin("qwen3_nothink").register("render_messages")
def render_qwen3_nothink_messages(
    processor: Processor,
    messages: list[Message],
    tools: str | None = None,
    is_generate: bool = False,
) -> ModelInput: ...

@RenderingPlugin("qwen3_nothink").register("parse_message")
def parse_qwen3_nothink_message(generated_text: str) -> Message: ...

Import

from llamafactory.v1.plugins.model_plugins.rendering import RenderingPlugin

I/O Contract

Inputs

Name	Type	Required	Description
processor	Processor	Yes	Tokenizer or processor instance used to encode text into token IDs
messages	list[Message]	Yes	List of structured chat messages with role, content, and optional loss_weight
tools	str or None	No	JSON string defining available tools for tool-calling scenarios
is_generate	bool	No	If True, appends the assistant prompt prefix for generation mode
generated_text (parse)	str	Yes	Raw generated text string to be parsed back into a Message

Outputs

Name	Type	Description
render_messages	ModelInput	Dictionary containing input_ids, attention_mask, labels, and loss_weights lists
parse_messages	Message	Structured message with role "assistant" and parsed content items (text, reasoning, tool_call)

Usage Examples

from llamafactory.v1.plugins.model_plugins.rendering import RenderingPlugin

# Render messages for training
plugin = RenderingPlugin("qwen3_nothink")
messages = [
    {"role": "user", "content": [{"type": "text", "value": "Hello!"}], "loss_weight": 0.0},
    {"role": "assistant", "content": [{"type": "text", "value": "Hi there!"}], "loss_weight": 1.0},
]
model_input = plugin.render_messages(processor=tokenizer, messages=messages)

# Render messages for generation (adds assistant prefix)
model_input = plugin.render_messages(processor=tokenizer, messages=messages, is_generate=True)

# Parse generated text back into a message
generated = "Let me think.\n<tool_call>\n{\"name\": \"search\", \"arguments\": {\"q\": \"test\"}}\n</tool_call>"
message = plugin.parse_messages(generated)
# message.content includes text and tool_call items

Related Pages

Hiyouga_LLaMA_Factory_V1_Plugin_System - BasePlugin parent class providing the registration mechanism
Hiyouga_LLaMA_Factory_V1_Types - Type definitions for Message, ModelInput, Content, ToolCall

Page Connections

Double-click a node to navigate. Hold to expand connections.

Principle

Implementation

Heuristic

Environment