
Implementation:Hiyouga LLaMA Factory V1 Base Sampler

From Leeroopedia


Knowledge Sources
Domains: Machine Learning, Text Generation
Last Updated: 2026-02-06 19:00 GMT

Overview

BaseSampler is the base class for asynchronous text generation sampling that delegates to a configurable inference engine backend.

Description

The BaseSampler class provides the abstract sampling interface for the LLaMA-Factory v1 system. It initializes an inference engine based on the configured SampleBackend setting (currently supporting the HuggingFace backend) and exposes two async methods: generate for streaming token-by-token generation and batch_infer for batch inference over a dataset. Concrete samplers such as the CLI sampler extend this class to implement specific user-facing workflows while reusing the same inference backend.
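The delegation pattern described above can be sketched as follows. This is a minimal, self-contained illustration and not the LLaMA-Factory source: `SketchBackend`, `EchoEngine`, and `SketchSampler` are hypothetical stand-ins, and messages are modelled as plain dicts rather than the library's `Message` type.

```python
import asyncio
from collections.abc import AsyncGenerator
from enum import Enum


class SketchBackend(str, Enum):
    """Hypothetical stand-in for the SampleBackend setting."""

    HF = "hf"


class EchoEngine:
    """Toy engine (assumption): streams the last message back word by word."""

    async def generate(self, messages: list[dict]) -> AsyncGenerator[str, None]:
        for word in messages[-1]["content"].split():
            await asyncio.sleep(0)  # hand control back to the event loop between tokens
            yield word + " "

    async def batch_infer(self, dataset: list[list[dict]]) -> list[str]:
        results = []
        for messages in dataset:
            # Collect the full stream for each conversation in the dataset.
            results.append("".join([tok async for tok in self.generate(messages)]))
        return results


class SketchSampler:
    """Picks an engine from the backend setting and forwards both calls to it."""

    def __init__(self, backend: SketchBackend) -> None:
        if backend == SketchBackend.HF:
            self.engine = EchoEngine()
        else:
            raise ValueError(f"Unknown sample backend: {backend}")

    async def generate(self, messages: list[dict]) -> AsyncGenerator[str, None]:
        async for token in self.engine.generate(messages):
            yield token

    async def batch_infer(self, dataset: list[list[dict]]) -> list[str]:
        return await self.engine.batch_infer(dataset)
```

Keeping the backend choice in the constructor means subclasses never need to know which engine is running underneath; they only call `generate` or `batch_infer`.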

Usage

Use BaseSampler when building a new sampler implementation that needs to perform asynchronous text generation or batch inference. Subclass it to add custom pre- and post-processing logic around the core generation pipeline. The class is typically not instantiated directly; use a subclass such as CLISampler instead.
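As a rough illustration of wrapping pre- and post-processing around a streaming `generate` method, the sketch below overrides it in a subclass. `StubSampler` is a hypothetical stand-in for BaseSampler so that the example runs without a model; the dict-based message format and the uppercase transform are assumptions chosen only for demonstration.

```python
import asyncio
from collections.abc import AsyncGenerator


class StubSampler:
    """Hypothetical stand-in for BaseSampler: echoes the last message one character at a time."""

    async def generate(self, messages: list[dict]) -> AsyncGenerator[str, None]:
        for ch in messages[-1]["content"]:
            await asyncio.sleep(0)  # simulate waiting on the engine
            yield ch


class UppercaseSampler(StubSampler):
    """Adds a pre-processing and a post-processing step around the parent stream."""

    async def generate(self, messages: list[dict]) -> AsyncGenerator[str, None]:
        # Pre-processing: strip surrounding whitespace from every message.
        cleaned = [{**m, "content": m["content"].strip()} for m in messages]
        # Post-processing: transform each token as it streams through.
        async for token in super().generate(cleaned):
            yield token.upper()
```

Because the override is itself an async generator, callers still receive tokens incrementally rather than waiting for the whole response.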

Code Reference

Source Location

Signature

class BaseSampler:
    def __init__(
        self,
        args: SampleArguments,
        model_args: ModelArguments,
        model: HFModel,
        renderer: Renderer,
    ) -> None: ...

    async def generate(
        self, messages: list[Message], tools: str | None = None
    ) -> AsyncGenerator[str, None]: ...

    async def batch_infer(self, dataset: TorchDataset) -> list[Sample]: ...

Import

from llamafactory.v1.core.base_sampler import BaseSampler

I/O Contract

Inputs

| Name | Type | Required | Description |
|------|------|----------|-------------|
| args | SampleArguments | Yes | Sample configuration arguments including the sample backend setting. |
| model_args | ModelArguments | Yes | Model configuration arguments for initializing the inference engine. |
| model | HFModel | Yes | The HuggingFace model instance used for generation. |
| renderer | Renderer | Yes | The renderer used for template-based message formatting. |
| messages | list[Message] | Yes (generate) | List of conversation messages to generate a response for. |
| tools | str or None | No (generate) | Optional tools string for tool-augmented generation. |
| dataset | TorchDataset | Yes (batch_infer) | A PyTorch dataset for batch inference. |

Outputs

| Name | Type | Description |
|------|------|-------------|
| generate return | AsyncGenerator[str, None] | Asynchronous stream of generated tokens (strings). |
| batch_infer return | list[Sample] | List of inferred samples from the dataset. |

Usage Examples

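# Subclassing BaseSampler for a custom sampler
from llamafactory.v1.core.base_sampler import BaseSampler

class CustomSampler(BaseSampler):
    async def run_chat(self, messages):
        # Accumulate the streamed tokens into a single response string.
        response = ""
        async for token in self.generate(messages):
            response += token
        return response

# Instantiation: sample_args, model_args, model, and renderer are assumed
# to have been created beforehand (their construction is not shown here).
sampler = CustomSampler(
    args=sample_args,
    model_args=model_args,
    model=model,
    renderer=renderer,
)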
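Since `generate` returns an async generator, a coroutine that consumes it has to run inside an event loop. The sketch below shows one way to drive such a stream from synchronous code with `asyncio.run`. `fake_generate` is a hypothetical stand-in that yields fixed tokens so the example runs without a model; with a real sampler, the stream would come from `sampler.generate(messages)` instead.

```python
import asyncio
from collections.abc import AsyncGenerator


async def fake_generate(messages: list[dict]) -> AsyncGenerator[str, None]:
    """Hypothetical stand-in for a sampler's generate method: yields fixed tokens."""
    for token in ["Hello", ",", " world"]:
        await asyncio.sleep(0)  # simulate waiting on the engine
        yield token


async def collect(stream: AsyncGenerator[str, None]) -> str:
    """Gather every streamed token into one string."""
    parts = []
    async for token in stream:
        parts.append(token)
    return "".join(parts)


def run_chat_sync(messages: list[dict]) -> str:
    """Blocking entry point: starts an event loop and waits for the full response."""
    return asyncio.run(collect(fake_generate(messages)))
```

Appending to a list and joining once at the end avoids repeated string concatenation for long responses. Note that `asyncio.run` cannot be called from inside an already running event loop; in that case, `await collect(...)` directly.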
