Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Principle:Neuml Txtai REST API Design

From Leeroopedia
Revision as of 17:35, 16 February 2026 by Admin (talk | contribs) (Auto-imported from principles/Neuml_Txtai_REST_API_Design.md)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)


Overview

txtai exposes its AI capabilities through a RESTful HTTP API built on FastAPI. The API follows REST conventions for resource-oriented endpoints while also providing an OpenAI-compatible interface that allows txtai to serve as a drop-in replacement for OpenAI API calls. The endpoint design supports content negotiation, batch operations, and streaming responses.

Theoretical Foundation

REST API Design Principles

txtai's API adheres to standard REST conventions:

  • HTTP methods indicate intent: GET for read operations, POST for write operations and complex queries
  • URL paths identify resources: /search, /count, /transform map to embeddings operations
  • Query parameters for simple filters: single-value search uses GET /search?query=text&limit=10
  • Request bodies for complex data: batch operations use POST with JSON bodies
  • HTTP status codes communicate outcomes: 200 for success, 401 for unauthorized, 403 for read-only violations, 422 for validation errors

Resource-Oriented Endpoint Design

The txtai API organizes endpoints around the core abstractions:

Category Endpoints Purpose
Search GET /search, POST /batchsearch Query the embeddings index
Indexing POST /add, GET /index, GET /upsert Add documents and build indexes
Management POST /delete, POST /reindex, GET /count Manage index contents
Vectors GET /transform, POST /batchtransform Generate embeddings vectors
Analysis POST /explain, POST /batchexplain Token importance analysis
OpenAI POST /v1/chat/completions, POST /v1/embeddings OpenAI-compatible interface

Batch Operation Pattern

For operations that benefit from batching, txtai provides paired endpoints:

  • Single: GET /search?query=text -- simple query with URL parameters
  • Batch: POST /batchsearch -- multiple queries in a single request body

The batch pattern reduces HTTP overhead when processing multiple items and allows the backend to optimize batch processing (e.g., batched model inference).

Content Negotiation

txtai implements content negotiation through the HTTP Accept header. The EncodingAPIRoute class inspects each request's Accept header and selects an appropriate response encoder:

  • JSON (default): standard JSON serialization
  • MessagePack: binary serialization for higher throughput
  • Custom encodings: extensible through ResponseFactory

This is implemented via a custom APIRoute class that overrides FastAPI's default route handler:

class EncodingAPIRoute(APIRoute):
    def get_route_handler(self):
        async def handler(request):
            route = get_request_handler(
                ...,
                response_class=ResponseFactory.create(request),
                ...
            )
            return await route(request)
        return handler

The response class is determined per request, allowing different clients to receive the same data in their preferred format.

OpenAI API Compatibility

Design Philosophy

txtai provides endpoints that mirror the OpenAI API specification, allowing clients built for OpenAI's API to work with txtai without modification. This compatibility layer supports:

  • POST /v1/chat/completions -- maps to agents, pipelines, workflows, or embeddings search
  • POST /v1/embeddings -- generates embeddings vectors
  • POST /v1/audio/speech -- text-to-speech synthesis
  • POST /v1/audio/transcriptions -- speech-to-text transcription
  • POST /v1/audio/translations -- audio translation to English

Model Parameter as Router

The OpenAI-compatible /v1/chat/completions endpoint uses the model parameter to determine which txtai component handles the request:

model Value Routes To Description
Agent name app.agent(model, ...) Executes an LLM-driven agent
"embeddings" app.search(...) Runs an embeddings search, returns top result text
Pipeline name app.pipeline(model, ...) Executes a named pipeline
Workflow name app.workflow(model, ...) Executes a named workflow
anything else app.pipeline("llm", ...) Falls back to the default LLM pipeline

This design allows a single endpoint to expose the full range of txtai's capabilities through a familiar interface.

Streaming Responses

When stream: true is set in a chat completion request, txtai returns a Server-Sent Events (SSE) stream. Each chunk follows the OpenAI streaming format:

data: {"id": "uuid", "object": "chat.completion.chunk", "model": "agent-name", "choices": [{"delta": {"content": "chunk text"}}]}

The stream terminates with:

data: [DONE]

This enables real-time token-by-token output for LLM-based responses.

Error Handling Patterns

The API uses standard HTTP status codes with descriptive error messages:

Status Meaning Example Trigger
200 Success Search returns results
401 Unauthorized Missing or invalid authorization token
403 Forbidden Write operation on read-only index (writable != True)
422 Validation Error Mismatched array lengths in /addobject

Write operations (add, index, delete, reindex) catch ReadOnlyError and translate it to HTTP 403:

try:
    application.get().add(documents)
except ReadOnlyError as e:
    raise HTTPException(status_code=403, detail=e.args[0]) from e

Design Rationale

GET vs POST Selection

  • GET is used for idempotent operations with simple parameters: /search, /count, /index, /transform
  • POST is used for operations with complex request bodies or side effects: /add, /batchsearch, /delete

The choice of GET /index (rather than POST) for the index-building operation is notable -- it triggers index construction from previously batched documents. While this has side effects, it is designed as a command endpoint rather than a resource creation endpoint.

Why OpenAI Compatibility

Providing an OpenAI-compatible interface serves several purposes:

  • Ecosystem integration: tools built for OpenAI (LangChain, LlamaIndex, etc.) can work with txtai
  • Migration path: teams can switch from OpenAI to local models without changing client code
  • Standardization: the OpenAI API has become a de facto standard for LLM interaction

See Also

Implemented By

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment