Implementation:Hiyouga LLaMA Factory V1 Base Sampler
| Knowledge Sources | |
|---|---|
| Domains | Machine Learning, Text Generation |
| Last Updated | 2026-02-06 19:00 GMT |
Overview
BaseSampler is the base class for asynchronous text generation sampling that delegates to a configurable inference engine backend.
Description
The BaseSampler class provides the abstract sampling interface for the LLaMA-Factory v1 system. It initializes an inference engine based on the configured SampleBackend setting (currently supporting the HuggingFace backend) and exposes two async methods: generate for streaming token-by-token generation and batch_infer for batch inference over a dataset. Concrete samplers such as the CLI sampler extend this class to implement specific user-facing workflows while reusing the same inference backend.
Usage
Use BaseSampler when building a new sampler implementation that needs to perform asynchronous text generation or batch inference. Subclass it to add custom pre/post-processing logic around the core generation pipeline. This class is typically not instantiated directly but rather through its subclasses like CLISampler.
Code Reference
Source Location
- Repository: Hiyouga_LLaMA_Factory
- File: src/llamafactory/v1/core/base_sampler.py
- Lines: 1-67
Signature
class BaseSampler:
def __init__(
self,
args: SampleArguments,
model_args: ModelArguments,
model: HFModel,
renderer: Renderer,
) -> None: ...
async def generate(
self, messages: list[Message], tools: str | None = None
) -> AsyncGenerator[str, None]: ...
async def batch_infer(self, dataset: TorchDataset) -> list[Sample]: ...
Import
from llamafactory.v1.core.base_sampler import BaseSampler
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| args | SampleArguments | Yes | Sample configuration arguments including the sample backend setting. |
| model_args | ModelArguments | Yes | Model configuration arguments for initializing the inference engine. |
| model | HFModel | Yes | The HuggingFace model instance used for generation. |
| renderer | Renderer | Yes | The renderer used for template-based message formatting. |
| messages | list[Message] | Yes (generate) | List of conversation messages to generate a response for. |
| tools | str or None | No (generate) | Optional tools string for tool-augmented generation. |
| dataset | TorchDataset | Yes (batch_infer) | A PyTorch dataset for batch inference. |
Outputs
| Name | Type | Description |
|---|---|---|
| generate return | AsyncGenerator[str, None] | Asynchronous stream of generated tokens (strings). |
| batch_infer return | list[Sample] | List of inferred samples from the dataset. |
Usage Examples
# Subclassing BaseSampler for a custom sampler
from llamafactory.v1.core.base_sampler import BaseSampler
class CustomSampler(BaseSampler):
async def run_chat(self, messages):
response = ""
async for token in self.generate(messages):
response += token
return response
# Instantiation
sampler = CustomSampler(
args=sample_args,
model_args=model_args,
model=model,
renderer=renderer,
)
Related Pages
- Hiyouga_LLaMA_Factory_V1_Inference_Engine - The underlying inference engine that BaseSampler delegates to.
- Hiyouga_LLaMA_Factory_V1_Rendering - The Renderer class used for message formatting.
- Hiyouga_LLaMA_Factory_V1_Model_Engine - The ModelEngine that provides the model and renderer.