Implementation:Neuml Txtai REST API Endpoints
Overview
This page documents the HTTP endpoint interfaces exposed by txtai's REST API. The endpoints are defined in two primary router modules: the embeddings router for search, indexing, and vector operations, and the OpenAI router for OpenAI-compatible chat completions, embeddings, and audio endpoints. Both routers use the EncodingAPIRoute class for content-negotiated responses.
Interface Specification
Embeddings Router
Source: src/python/txtai/api/routers/embeddings.py (Lines 1-280)
GET /search
Finds documents most similar to the input query.
| Parameter | Location | Type | Required | Description |
|---|---|---|---|---|
| query | Query string | string | Yes | Input search query |
| limit | Query string | int | No | Maximum number of results (default: 10) |
| weights | Query string | float | No | Hybrid score weights |
| index | Query string | string | No | Index name for multi-index setups |
| parameters | Query string | JSON string | No | Named parameters for SQL placeholders |
| graph | Query string | bool | No | Return graph results if true |
Response: JSON array of {"id": value, "score": value} for index search, or array of dicts for database search.
curl "http://localhost:8000/search?query=machine+learning&limit=5"
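The parameters value travels as a JSON-encoded string inside the query string, which is easy to get wrong by hand. The following is a minimal sketch of building such a URL with the standard library. The build_search_url helper, the base URL, and the SQL query with a :x placeholder are illustrative assumptions, not part of txtai.

```python
import json
from urllib.parse import urlencode

BASE_URL = "http://localhost:8000"  # assumed address of a local txtai API


def build_search_url(query, limit=None, parameters=None, base_url=BASE_URL):
    """Build a GET /search URL (hypothetical helper, not part of txtai)."""
    params = {"query": query}
    if limit is not None:
        params["limit"] = limit
    if parameters is not None:
        # Named parameters are sent as a single JSON string
        params["parameters"] = json.dumps(parameters)
    return f"{base_url}/search?{urlencode(params)}"


# Illustrative SQL query; the :x placeholder is filled from `parameters`
url = build_search_url(
    "select id, text, score from txtai where similar(:x)",
    limit=5,
    parameters={"x": "machine learning"},
)
```

Calling build_search_url("machine learning", limit=5) reproduces the URL used in the curl example above.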
POST /batchsearch
Finds documents most similar to multiple input queries in a single request.
| Parameter | Location | Type | Required | Description |
|---|---|---|---|---|
| queries | Body | List[str] | Yes | List of search queries |
| limit | Body | int | No | Maximum results per query |
| weights | Body | float | No | Hybrid score weights |
| index | Body | string | No | Index name |
| parameters | Body | List[dict] | No | Parameters per query |
| graph | Body | bool | No | Return graph results (default: false) |
Response: JSON array of arrays, one result set per query.
import requests
response = requests.post("http://localhost:8000/batchsearch", json={
"queries": ["machine learning", "natural language processing"],
"limit": 5
})
POST /add
Adds a batch of documents for indexing. Requires writable: true in configuration.
| Parameter | Location | Type | Required | Description |
|---|---|---|---|---|
| documents | Body | List[dict] | Yes | List of {"id": value, "text": value, "tags": value} |
Response: null on success.
Errors: HTTP 403 if the index is read-only.
import requests
requests.post("http://localhost:8000/add", json=[
{"id": "0", "text": "First document"},
{"id": "1", "text": "Second document"}
])
GET /index
Builds an embeddings index for previously batched documents.
Parameters: None
Response: null on success.
Errors: HTTP 403 if the index is read-only.
curl "http://localhost:8000/index"
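Since /add only batches documents and /index builds the index from that batch, the two calls are normally issued together. The sketch below shows one way to chain them using only the standard library. The helper names and base URL are assumptions for illustration, and the send argument exists so the network call can be swapped out.

```python
import json
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:8000"  # assumed address of a local txtai API


def add_request(documents, base_url=BASE_URL):
    """POST /add request with the document list as the JSON body."""
    return Request(
        f"{base_url}/add",
        data=json.dumps(documents).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def index_request(base_url=BASE_URL):
    """GET /index request; the endpoint takes no parameters."""
    return Request(f"{base_url}/index", method="GET")


def add_and_index(documents, base_url=BASE_URL, send=urlopen):
    """Batch documents, then build the index.

    With the default urlopen, a read-only index surfaces as an
    urllib.error.HTTPError carrying status 403.
    """
    for request in (add_request(documents, base_url), index_request(base_url)):
        with send(request) as response:
            response.read()
```

A call such as add_and_index([{"id": "0", "text": "First document"}]) would then be followed by GET /count to confirm the new size.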
GET /upsert
Runs an embeddings upsert operation for previously batched documents (updates existing, adds new).
Parameters: None
Response: null on success.
POST /delete
Deletes documents from the embeddings index by ID.
| Parameter | Location | Type | Required | Description |
|---|---|---|---|---|
| ids | Body | List | Yes | List of document IDs to delete |
Response: JSON array of deleted IDs.
import requests
response = requests.post("http://localhost:8000/delete", json=["0", "1"])
GET /count
Returns the total number of elements in the embeddings index.
Parameters: None
Response: Integer count.
curl "http://localhost:8000/count"
POST /reindex
Recreates the embeddings index with a new configuration. Only works when content storage is enabled.
| Parameter | Location | Type | Required | Description |
|---|---|---|---|---|
| config | Body | dict | Yes | New embeddings configuration |
| function | Body | string | No | Optional function to prepare content |
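A request body for /reindex could be assembled as in the sketch below. The reindex_body helper is hypothetical and the model path is only an example value. Which configuration keys are accepted depends on the txtai embeddings configuration in use.

```python
import json


def reindex_body(config, function=None):
    """Build the JSON body for POST /reindex (hypothetical helper)."""
    body = {"config": config}
    if function is not None:
        # Optional name of a function that prepares content before indexing
        body["function"] = function
    return json.dumps(body)


# The model path below is an illustrative assumption
payload = reindex_body({"path": "sentence-transformers/all-MiniLM-L6-v2"})
```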
GET /transform
Transforms text into an embeddings vector.
| Parameter | Location | Type | Required | Description |
|---|---|---|---|---|
| text | Query string | string | Yes | Input text |
| category | Query string | string | No | Category for instruction-based embeddings |
| index | Query string | string | No | Index name |
Response: JSON array of floats (embeddings vector).
POST /batchtransform
Transforms a list of texts into embeddings vectors.
| Parameter | Location | Type | Required | Description |
|---|---|---|---|---|
| texts | Body | List[str] | Yes | List of texts to transform |
| category | Query string | string | No | Category for instruction-based embeddings |
| index | Query string | string | No | Index name |
Response: JSON array of arrays of floats.
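A common use of the returned vectors is comparing two texts directly. The sketch below computes cosine similarity over a parsed /batchtransform response. The sample vectors are made up for illustration and are far shorter than real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # similarity is undefined for zero vectors; treat as 0
    return dot / (norm_a * norm_b)


# Stand-in for a parsed /batchtransform response (values are made up)
vectors = [[1.0, 0.0], [1.0, 1.0]]
similarity = cosine_similarity(vectors[0], vectors[1])
```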
POST /explain
Explains the importance of each input token relative to a query.
| Parameter | Location | Type | Required | Description |
|---|---|---|---|---|
| query | Body | string | Yes | Query text |
| texts | Body | List[str] | No | Texts to explain against (runs search if omitted) |
| limit | Body | int | No | Result limit if texts is omitted |
Response: JSON array of dicts with token-level importance scores.
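To make the token scores easier to read, a client might rank them per result. The sketch below assumes each result dict carries a "tokens" list of [token, score] pairs. That shape is an assumption for illustration, as are the sample values and the top_tokens helper, so check an actual response before relying on it.

```python
def top_tokens(result, n=3):
    """Return the n highest-scoring (token, score) pairs from one result.

    Assumes result["tokens"] is a list of [token, score] pairs.
    """
    pairs = [(token, score) for token, score in result.get("tokens", [])]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)[:n]


# Request body for POST /explain (texts are illustrative)
body = {"query": "feel good story", "texts": ["Maine man wins $1M from lottery ticket"]}

# Made-up result illustrating the assumed response shape
sample = {"id": 0, "tokens": [["Maine", 0.01], ["wins", 0.12], ["lottery", 0.09]]}
ranked = top_tokens(sample, n=2)
```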
OpenAI-Compatible Router
Source: src/python/txtai/api/routers/openai.py (Lines 1-191)
POST /v1/chat/completions
Runs a chat completion request, routing to the appropriate txtai component based on the model parameter.
| Parameter | Location | Type | Required | Description |
|---|---|---|---|---|
| messages | Body | List[dict] | Yes | Messages array: [{"role": "user", "content": "text"}] |
| model | Body | string | Yes | Agent name, workflow name, pipeline name, or "embeddings" |
| max_completion_tokens | Body | int | No | Maximum tokens to generate |
| stream | Body | bool | No | Enable streaming response (default: false) |
Response (non-streaming):
{
"id": "uuid",
"object": "chat.completion",
"created": 1234567890,
"model": "model-name",
"choices": [{"id": 0, "message": {"role": "assistant", "content": "response text"}, "finish_reason": "stop"}]
}
Response (streaming): Server-Sent Events with chat.completion.chunk objects.
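A streaming client has to reassemble the text from individual events. The sketch below parses Server-Sent Event lines and joins the fragments. It assumes the chunks follow the OpenAI layout (text under choices[].delta.content, with a final "data: [DONE]" marker), which this page does not spell out, so treat both as assumptions.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from an iterable of SSE lines (str)."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # OpenAI-style end marker (assumed)
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            # delta.content follows the OpenAI chunk layout (assumed)
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Made-up event stream for illustration
sample = [
    'data: {"object": "chat.completion.chunk", "choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"object": "chat.completion.chunk", "choices": [{"delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
]
text = "".join(iter_stream_text(sample))
```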
Routing logic:
- If model matches an agent name, routes to the agent
- If model is "embeddings", runs a search and returns top result text
- If model matches a pipeline name (except "llm"), routes to that pipeline
- If model matches a workflow name, routes to that workflow
- Otherwise, sends all messages through the default LLM pipeline
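The precedence of these rules can be sketched as a small lookup function. This is an illustration of the order described in the list, not the code in openai.py. The resolve_target name and the (kind, name) return shape are assumptions.

```python
def resolve_target(model, agents, pipelines, workflows):
    """Return a (kind, name) pair following the routing order in the list."""
    if model in agents:
        return ("agent", model)
    if model == "embeddings":
        return ("embeddings", model)
    if model in pipelines and model != "llm":
        return ("pipeline", model)
    if model in workflows:
        return ("workflow", model)
    # Fallback: everything else goes to the default LLM pipeline
    return ("pipeline", "llm")


# Hypothetical component names for illustration
agents = {"helper"}
pipelines = {"llm", "summary"}
workflows = {"ingest"}
target = resolve_target("summary", agents, pipelines, workflows)
```

Because agents are checked first, an agent that shares its name with a pipeline or workflow takes priority.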
import requests
response = requests.post("http://localhost:8000/v1/chat/completions", json={
"messages": [{"role": "user", "content": "What is machine learning?"}],
"model": "my-agent"
})
POST /v1/embeddings
Creates embeddings vectors for input text, following the OpenAI embeddings API format.
| Parameter | Location | Type | Required | Description |
|---|---|---|---|---|
| input | Body | string or List[str] | Yes | Text(s) to embed |
| model | Body | string | Yes | Model name (informational, uses configured model) |
Response:
{
"object": "list",
"data": [{"object": "embedding", "embedding": [0.1, 0.2, ...], "index": 0}],
"model": "model-name"
}
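When several inputs are sent at once, the index field ties each vector back to its input. The sketch below pulls the vectors out of a parsed response in input order. The extract_vectors helper and the sample values are illustrative assumptions.

```python
def extract_vectors(response):
    """Return embedding vectors ordered by their "index" field."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


# Request body; the model value is informational per the table above
request_body = {"input": ["first text", "second text"], "model": "model-name"}

# Made-up response with entries deliberately out of order
sample = {
    "object": "list",
    "data": [
        {"object": "embedding", "embedding": [0.3, 0.4], "index": 1},
        {"object": "embedding", "embedding": [0.1, 0.2], "index": 0},
    ],
    "model": "model-name",
}
vectors = extract_vectors(sample)
```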
POST /v1/audio/speech
Generates speech audio from input text.
| Parameter | Location | Type | Required | Description |
|---|---|---|---|---|
| input | Body | string | Yes | Text to synthesize |
| voice | Body | string | Yes | Speaker name |
| response_format | Body | string | No | Audio format (default: "mp3") |
Response: Raw audio bytes.
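Because the response is binary rather than JSON, the client should write the bytes straight to a file. The following sketch uses the standard library. The helper names, base URL, and any speaker name passed in are assumptions, since valid voices depend on the configured text-to-speech model.

```python
import json
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:8000"  # assumed address of a local txtai API


def speech_request(text, voice, response_format="mp3", base_url=BASE_URL):
    """Build the POST /v1/audio/speech request."""
    body = {"input": text, "voice": voice, "response_format": response_format}
    return Request(
        f"{base_url}/v1/audio/speech",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_speech(text, voice, path, send=urlopen):
    """Send the request and write the returned audio bytes to `path`."""
    with send(speech_request(text, voice)) as response, open(path, "wb") as output:
        output.write(response.read())
```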
POST /v1/audio/transcriptions
Transcribes audio to text.
| Parameter | Location | Type | Required | Description |
|---|---|---|---|---|
| file | Form upload | file | Yes | Audio file to transcribe |
| language | Form | string | No | Language of input audio |
| response_format | Form | string | No | Output format: "json" (default) or "text" |
Response: {"text": "transcribed text"} or plain text.
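Unlike the other endpoints, this one expects multipart/form-data rather than JSON. The sketch below encodes the form fields and the file part by hand with the standard library to show what goes over the wire. An HTTP client library with built-in file upload support would normally do this for you. The multipart_body helper, file name, and placeholder bytes are assumptions.

```python
import uuid


def multipart_body(fields, filename, file_bytes, boundary=None):
    """Encode form fields plus one "file" part as multipart/form-data.

    Returns a (content_type, body_bytes) tuple.
    """
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        # Plain text form field
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode("utf-8")
        )
    # Binary file part, sent under the "file" field name
    header = (
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    parts.append(header.encode("utf-8") + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))  # closing delimiter
    return f"multipart/form-data; boundary={boundary}", b"".join(parts)


content_type, body = multipart_body(
    {"language": "en", "response_format": "json"},
    "speech.wav",
    b"RIFF",  # placeholder bytes, not real audio
    boundary="example-boundary",
)
```

The returned content_type goes into the Content-Type header and body into the POST payload for /v1/audio/transcriptions.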
Content Negotiation
Both routers use EncodingAPIRoute (defined in src/python/txtai/api/route.py) to support content-negotiated responses. The response format is determined by the Accept header on each request, processed through ResponseFactory.create(request).
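From the client side, content negotiation amounts to setting the Accept header and decoding by the returned Content-Type. The sketch below assumes the server offers MessagePack ("application/msgpack") alongside JSON. That media type, the helper names, and the use of the third-party msgpack package are assumptions not stated on this page.

```python
import json
from urllib.request import Request


def negotiated_request(url, media_type="application/json"):
    """Build a GET request that asks for a specific response encoding."""
    return Request(url, headers={"Accept": media_type}, method="GET")


def decode_body(content_type, body):
    """Decode response bytes based on the returned Content-Type header."""
    if content_type.startswith("application/msgpack"):
        import msgpack  # third-party package, only needed for this branch

        return msgpack.unpackb(body)
    return json.loads(body)


# Assumed media type; JSON is used when no Accept preference is sent
request = negotiated_request("http://localhost:8000/count", "application/msgpack")
```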
See Also
- Neuml_Txtai_REST_API_Design - Principle behind the REST API design patterns
- Neuml_Txtai_API_Create - How these routers are dynamically registered during startup
- Neuml_Txtai_Authorization_Init - How endpoints are protected by token authentication