Principle:Infiniflow Ragflow LLM Configuration
| Knowledge Sources | |
|---|---|
| Domains | RAG, Conversational_AI, NLP |
| Last Updated | 2026-02-12 06:00 GMT |
Overview
A configuration pattern that selects and tunes the large language model used for generating RAG-powered responses.
Description
LLM Configuration selects which language model generates responses and how it behaves. RAGFlow supports 66+ LLM providers via a factory catalog. Key generation parameters include temperature (creativity vs determinism), top_p (nucleus sampling threshold), frequency_penalty (discourage repetition), presence_penalty (encourage topic diversity), and max_tokens (response length limit).
Usage
Set the LLM configuration when creating or updating a chat application. Different use cases benefit from different settings: a low temperature suits factual QA, while a higher temperature suits creative responses.
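As a rough illustration of how two use cases might be configured differently, the sketch below groups the five generation parameters named above into a small validated container. The `LLMSetting` class, the `to_payload` helper, the `model_name` key, the preset values, and the validation ranges are all assumptions made for this example; they are not taken from RAGFlow's API and should be checked against the provider in use.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class LLMSetting:
    """Hypothetical container for the generation parameters described above."""
    temperature: float
    top_p: float
    frequency_penalty: float
    presence_penalty: float
    max_tokens: int

    def __post_init__(self):
        # Assumed sanity checks; real providers may accept different ranges.
        if self.temperature < 0:
            raise ValueError("temperature must be non-negative")
        if not 0 < self.top_p <= 1:
            raise ValueError("top_p must be in (0, 1]")
        if self.max_tokens <= 0:
            raise ValueError("max_tokens must be positive")


# Illustrative presets only: low temperature for factual QA, higher for creative output.
FACTUAL_QA = LLMSetting(temperature=0.1, top_p=0.3,
                        frequency_penalty=0.7, presence_penalty=0.4,
                        max_tokens=512)
CREATIVE = LLMSetting(temperature=0.9, top_p=0.9,
                      frequency_penalty=0.2, presence_penalty=0.4,
                      max_tokens=1024)


def to_payload(model_name: str, setting: LLMSetting) -> dict:
    """Merge a model name and a setting into one flat dict (hypothetical shape)."""
    return {"model_name": model_name, **asdict(setting)}
```

Keeping presets as immutable values makes it easy to reuse one configuration across several chat applications and to reject an out-of-range value before it reaches the model provider.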
Theoretical Basis
LLM generation parameters control the sampling distribution:
- Temperature: Divides the logits by the temperature before softmax; lower values (e.g. 0.1) sharpen the distribution and produce more deterministic outputs, while higher values flatten it
- Top-p: Samples only from the smallest set of highest-probability tokens whose cumulative probability reaches p; all other tokens are excluded
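- Penalties: Lower the probability of tokens that already appear in the generated text; frequency_penalty typically grows with how often a token has appeared, while presence_penalty applies once a token has appeared at all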
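The three mechanisms above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code any particular provider runs: the function names are hypothetical, and the penalty formula (subtract `frequency_penalty` times the token count, plus `presence_penalty` if the token has appeared) follows the formulation commonly documented for OpenAI-style APIs, which other providers may not share.

```python
import math
from collections import Counter


def apply_penalties(logits, generated, frequency_penalty=0.0, presence_penalty=0.0):
    """Lower the logits of token ids that already occur in `generated`."""
    counts = Counter(generated)
    return [
        logit
        - frequency_penalty * counts[i]                     # scales with repetitions
        - presence_penalty * (1.0 if counts[i] > 0 else 0.0)  # flat, once seen
        for i, logit in enumerate(logits)
    ]


def softmax_with_temperature(logits, temperature=1.0):
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [logit / temperature for logit in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p=1.0):
    """Keep the smallest high-probability set whose mass reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    kept_mass = sum(probs[i] for i in kept)
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / kept_mass
    return filtered
```

A sampler would typically apply the penalties first, then temperature scaling, then the top-p filter, and finally draw a token from the filtered distribution, for example with `random.choices(range(len(filtered)), weights=filtered)`.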