Implementation:Neuml Txtai LiteLLM Pipeline

From Leeroopedia


Knowledge Sources
Domains Machine Learning, NLP, LLM, API Integration
Last Updated 2026-02-10 01:00 GMT

Overview

Concrete txtai backend for calling LLM APIs (OpenAI, Anthropic, Cohere, and others) through the unified interface of the LiteLLM library.

Description

The LiteLLM class is a generative model backend that extends the Generation base class and delegates inference to the litellm library. This lets txtai call more than 100 LLM providers (OpenAI, Anthropic, Azure, AWS Bedrock, Google Vertex, Cohere, Hugging Face Inference API, etc.) through a single unified interface. The class includes a static ismodel method that detects whether a given model path corresponds to a LiteLLM-supported provider; paths that resolve to Hugging Face Hub models are filtered out by the companion ishub check. Both streaming and non-streaming responses are supported.
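The detection rule described above can be sketched roughly as follows. This is not the txtai source: `is_litellm_model`, `resolve_provider`, `on_hub`, and the toy lookup tables are hypothetical stand-ins for the litellm provider lookup and the Hugging Face Hub lookup, chosen so the sketch runs without network access or API keys.

```python
def is_litellm_model(path, resolve_provider, on_hub):
    """
    Sketch of the detection rule: a path counts as a LiteLLM model when it is
    a string, is not a Hugging Face Hub model, and resolves to a provider.

    resolve_provider and on_hub are hypothetical callables, not txtai APIs.
    """
    # Non-string paths and Hub models are rejected before any provider lookup
    if not isinstance(path, str) or on_hub(path):
        return False

    try:
        return resolve_provider(path) is not None
    except Exception:
        # Any lookup failure is treated as "not a LiteLLM model"
        return False


# Toy stand-ins, for illustration only
KNOWN_PREFIXES = ("gpt-", "claude-", "anthropic/", "cohere/")
HUB_MODELS = {"google/flan-t5-base"}


def toy_provider(path):
    """Returns a provider label or raises when the path is unknown."""
    if path.startswith(KNOWN_PREFIXES):
        return path.split("/")[0]
    raise ValueError(f"Unknown provider for {path}")


def toy_hub(path):
    """Pretends to check whether the path exists on the Hugging Face Hub."""
    return path in HUB_MODELS


print(is_litellm_model("gpt-4", toy_provider, toy_hub))
print(is_litellm_model("google/flan-t5-base", toy_provider, toy_hub))
```

Passing the lookups in as callables keeps the rule itself easy to test; in a real setup they would presumably be backed by litellm and the Hub client.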

Usage

Use LiteLLM when you want to call cloud-hosted LLM APIs from within a txtai pipeline. It is automatically selected when the model path matches a known LiteLLM provider (e.g. "gpt-4", "claude-3-opus", "anthropic/claude-3-sonnet").

Code Reference

Source Location

  • Repository: Neuml_Txtai
  • File: src/python/txtai/pipeline/llm/litellm.py

Signature

class LiteLLM(Generation):
    @staticmethod
    def ismodel(path)
    @staticmethod
    def ishub(path)
    def __init__(self, path, template=None, **kwargs)
    def stream(self, texts, maxlength, stream, stop, **kwargs)

Import

from txtai.pipeline.llm.litellm import LiteLLM

I/O Contract

Inputs

Name | Type | Required | Description
path | str | Yes | Model identifier recognized by LiteLLM (e.g. "gpt-4", "anthropic/claude-3-sonnet", "cohere/command-r").
template | str | No | Prompt template string applied to text inputs.
kwargs | dict | No | Additional keyword arguments passed to litellm.completion. Common pipeline params (quantize, gpu, model, task) are automatically filtered out.
texts | list | Yes (stream) | List of prompts; each can be a string or a list of chat message dicts.
maxlength | int | Yes (stream) | Maximum number of tokens to generate (passed as max_tokens).
stream | bool | Yes (stream) | If True, streams the response.
stop | list | Yes (stream) | List of stop strings.
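To make the input contract concrete, here is a minimal sketch of how these inputs might be turned into arguments for a completion call. The helper names `filter_kwargs` and `build_request` are hypothetical, and wrapping a plain string as a single "user" message is an assumption for illustration; the actual txtai class may build its messages differently.

```python
# Pipeline parameters the I/O contract lists as filtered out
IGNORED = ("quantize", "gpu", "model", "task")


def filter_kwargs(kwargs):
    """Drops common pipeline parameters that the completion call does not accept."""
    return {k: v for k, v in kwargs.items() if k not in IGNORED}


def build_request(path, text, maxlength, stream, stop, **kwargs):
    """
    Builds a dict of completion arguments for one prompt (hypothetical helper).

    A string prompt is wrapped as a single chat message (the "user" role is an
    assumption); a list is taken to be chat message dicts and passed through.
    """
    messages = [{"role": "user", "content": text}] if isinstance(text, str) else text

    return {
        "model": path,
        "messages": messages,
        "max_tokens": maxlength,  # maxlength is passed as max_tokens
        "stream": stream,
        "stop": stop,
        **filter_kwargs(kwargs),
    }


request = build_request("gpt-4", "Hello", 64, False, ["\n\n"], temperature=0.2, gpu=True)
print(request)
```

Here `temperature` survives the filter while `gpu` is dropped, which mirrors the filtering described for kwargs.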

Outputs

Name | Type | Description
result | generator | Yields generated text chunks from the LLM API response.
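The generator output can be illustrated with a small sketch. It assumes responses shaped like OpenAI-style chat completion dicts, where streamed chunks carry a "delta" and complete responses carry a "message"; that shape and the `text_chunks` helper are assumptions for illustration, not the txtai implementation.

```python
def text_chunks(responses):
    """
    Yields text from an iterable of response dicts (hypothetical helper).

    Works for both cases: a stream of partial chunks, or a one-element list
    holding a complete, non-streamed response.
    """
    for response in responses:
        choice = response["choices"][0]

        # Streamed chunks use "delta", complete responses use "message"
        part = choice.get("delta") or choice.get("message") or {}

        text = part.get("content")
        if text:
            yield text


# Simulated streamed chunks (the final chunk has no content)
streamed = [
    {"choices": [{"delta": {"content": "Par"}}]},
    {"choices": [{"delta": {"content": "is"}}]},
    {"choices": [{"delta": {}}]},
]

# Simulated complete response, wrapped in a list so both paths share one loop
complete = [{"choices": [{"message": {"role": "assistant", "content": "Paris"}}]}]

print("".join(text_chunks(streamed)))
print("".join(text_chunks(complete)))
```

Wrapping the non-streamed response in a one-element list lets callers consume both modes through the same generator interface.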

Usage Examples

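from txtai.pipeline import LLM

# Provider API keys are read from the environment by litellm,
# e.g. OPENAI_API_KEY for OpenAI and ANTHROPIC_API_KEY for Anthropic

# Use OpenAI GPT-4 via LiteLLM
llm = LLM("gpt-4")
result = llm("What is the capital of France?")
print(result)

# Use Anthropic Claude via LiteLLM
llm = LLM("anthropic/claude-3-sonnet-20240229")
result = llm("Explain quantum computing in simple terms")
print(result)

# Chat-style input (reuses the Claude pipeline created above)
result = llm([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the theory of relativity."}
])
print(result)

# Streaming: with stream=True the call returns a generator of text chunks
for token in llm("Tell me about space exploration", stream=True):
    print(token, end="", flush=True)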
