Principle:Neuml Txtai REST API Design
Overview
txtai exposes its AI capabilities through a RESTful HTTP API built on FastAPI. The API follows REST conventions for resource-oriented endpoints while also providing an OpenAI-compatible interface that allows txtai to serve as a drop-in replacement for OpenAI API calls. The endpoint design supports content negotiation, batch operations, and streaming responses.
Theoretical Foundation
REST API Design Principles
txtai's API adheres to standard REST conventions:
- HTTP methods indicate intent:
GETfor read operations,POSTfor write operations and complex queries - URL paths identify resources:
/search,/count,/transformmap to embeddings operations - Query parameters for simple filters: single-value search uses
GET /search?query=text&limit=10 - Request bodies for complex data: batch operations use
POSTwith JSON bodies - HTTP status codes communicate outcomes: 200 for success, 401 for unauthorized, 403 for read-only violations, 422 for validation errors
Resource-Oriented Endpoint Design
The txtai API organizes endpoints around the core abstractions:
| Category | Endpoints | Purpose |
|---|---|---|
| Search | GET /search, POST /batchsearch |
Query the embeddings index |
| Indexing | POST /add, GET /index, GET /upsert |
Add documents and build indexes |
| Management | POST /delete, POST /reindex, GET /count |
Manage index contents |
| Vectors | GET /transform, POST /batchtransform |
Generate embeddings vectors |
| Analysis | POST /explain, POST /batchexplain |
Token importance analysis |
| OpenAI | POST /v1/chat/completions, POST /v1/embeddings |
OpenAI-compatible interface |
Batch Operation Pattern
For operations that benefit from batching, txtai provides paired endpoints:
- Single:
GET /search?query=text-- simple query with URL parameters - Batch:
POST /batchsearch-- multiple queries in a single request body
The batch pattern reduces HTTP overhead when processing multiple items and allows the backend to optimize batch processing (e.g., batched model inference).
Content Negotiation
txtai implements content negotiation through the HTTP Accept header. The EncodingAPIRoute class inspects each request's Accept header and selects an appropriate response encoder:
- JSON (default): standard JSON serialization
- MessagePack: binary serialization for higher throughput
- Custom encodings: extensible through
ResponseFactory
This is implemented via a custom APIRoute class that overrides FastAPI's default route handler:
class EncodingAPIRoute(APIRoute):
def get_route_handler(self):
async def handler(request):
route = get_request_handler(
...,
response_class=ResponseFactory.create(request),
...
)
return await route(request)
return handler
The response class is determined per request, allowing different clients to receive the same data in their preferred format.
OpenAI API Compatibility
Design Philosophy
txtai provides endpoints that mirror the OpenAI API specification, allowing clients built for OpenAI's API to work with txtai without modification. This compatibility layer supports:
POST /v1/chat/completions-- maps to agents, pipelines, workflows, or embeddings searchPOST /v1/embeddings-- generates embeddings vectorsPOST /v1/audio/speech-- text-to-speech synthesisPOST /v1/audio/transcriptions-- speech-to-text transcriptionPOST /v1/audio/translations-- audio translation to English
Model Parameter as Router
The OpenAI-compatible /v1/chat/completions endpoint uses the model parameter to determine which txtai component handles the request:
| model Value | Routes To | Description |
|---|---|---|
| Agent name | app.agent(model, ...) |
Executes an LLM-driven agent |
"embeddings" |
app.search(...) |
Runs an embeddings search, returns top result text |
| Pipeline name | app.pipeline(model, ...) |
Executes a named pipeline |
| Workflow name | app.workflow(model, ...) |
Executes a named workflow |
| anything else | app.pipeline("llm", ...) |
Falls back to the default LLM pipeline |
This design allows a single endpoint to expose the full range of txtai's capabilities through a familiar interface.
Streaming Responses
When stream: true is set in a chat completion request, txtai returns a Server-Sent Events (SSE) stream. Each chunk follows the OpenAI streaming format:
data: {"id": "uuid", "object": "chat.completion.chunk", "model": "agent-name", "choices": [{"delta": {"content": "chunk text"}}]}
The stream terminates with:
data: [DONE]
This enables real-time token-by-token output for LLM-based responses.
Error Handling Patterns
The API uses standard HTTP status codes with descriptive error messages:
| Status | Meaning | Example Trigger |
|---|---|---|
| 200 | Success | Search returns results |
| 401 | Unauthorized | Missing or invalid authorization token |
| 403 | Forbidden | Write operation on read-only index (writable != True)
|
| 422 | Validation Error | Mismatched array lengths in /addobject
|
Write operations (add, index, delete, reindex) catch ReadOnlyError and translate it to HTTP 403:
try:
application.get().add(documents)
except ReadOnlyError as e:
raise HTTPException(status_code=403, detail=e.args[0]) from e
Design Rationale
GET vs POST Selection
- GET is used for idempotent operations with simple parameters:
/search,/count,/index,/transform - POST is used for operations with complex request bodies or side effects:
/add,/batchsearch,/delete
The choice of GET /index (rather than POST) for the index-building operation is notable -- it triggers index construction from previously batched documents. While this has side effects, it is designed as a command endpoint rather than a resource creation endpoint.
Why OpenAI Compatibility
Providing an OpenAI-compatible interface serves several purposes:
- Ecosystem integration: tools built for OpenAI (LangChain, LlamaIndex, etc.) can work with txtai
- Migration path: teams can switch from OpenAI to local models without changing client code
- Standardization: the OpenAI API has become a de facto standard for LLM interaction
See Also
- Neuml_Txtai_REST_API_Endpoints - Implementation of the HTTP endpoint interfaces
- Neuml_Txtai_API_Server_Bootstrap - How routes are dynamically registered based on configuration
- Neuml_Txtai_API_Security - How endpoints are secured with token authentication