Heuristic:Neuml Txtai Thread Safety Constraints
| Knowledge Sources | |
|---|---|
| Domains | Debugging, Infrastructure |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
Thread safety rules for txtai embeddings: reads are thread-safe but writes require external synchronization.
Description
txtai Embeddings indexes are designed to be thread-safe for read operations (search, count, info) but not thread-safe for write operations (index, upsert, delete). This is documented in the Embeddings class docstring. Database connections similarly require external thread locking. Scheduled workflows catch and log exceptions but continue running on schedule, which can mask concurrency issues.
Usage
Apply this constraint when deploying txtai in multi-threaded environments such as API servers, web applications, or worker pools. Use external locks (e.g., `threading.Lock()`) when calling write operations from multiple threads. Read operations can be called freely from concurrent threads.
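The locking pattern described above can be sketched as a thin wrapper. This is a minimal illustration, not txtai code: `FakeIndex` is a hypothetical stand-in so the example runs without txtai installed, and `SynchronizedIndex` is an assumed helper name. With a real index, the same wrapper shape would delegate to the corresponding write and read methods.

```python
import threading


class FakeIndex:
    """Hypothetical stand-in for an embeddings index (not part of txtai)."""

    def __init__(self):
        self.rows = {}

    def upsert(self, documents):
        # Multi-step update, standing in for a non-atomic index write
        for uid, text in documents:
            self.rows[uid] = text

    def delete(self, ids):
        return [uid for uid in ids if self.rows.pop(uid, None) is not None]

    def search(self, query, limit=3):
        # Snapshot the items first so iteration never sees a resize
        hits = [uid for uid, text in list(self.rows.items()) if query in text]
        return sorted(hits)[:limit]

    def count(self):
        return len(self.rows)


class SynchronizedIndex:
    """Serializes writes with one lock; reads are passed straight through."""

    def __init__(self, index):
        self.index = index
        self.lock = threading.Lock()

    def upsert(self, documents):
        with self.lock:
            self.index.upsert(documents)

    def delete(self, ids):
        with self.lock:
            return self.index.delete(ids)

    def search(self, query, limit=3):
        return self.index.search(query, limit)

    def count(self):
        return self.index.count()


def run_demo(workers=8, per_worker=50):
    """Starts several writer threads that all upsert through the wrapper."""
    index = SynchronizedIndex(FakeIndex())

    def writer(worker):
        for i in range(per_worker):
            index.upsert([(f"{worker}-{i}", f"document {i} from worker {worker}")])

    threads = [threading.Thread(target=writer, args=(w,)) for w in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return index
```

Keeping the lock inside the wrapper means callers cannot forget to acquire it, at the cost of one extra object between the application and the index.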
The Insight (Rule of Thumb)
- Action: Use external synchronization (locks, queues) for all write operations on Embeddings instances in multi-threaded environments.
- Value: Reads are fully concurrent. Writes must be serialized.
- Trade-off: No write lock overhead for read-heavy workloads (the common case for search APIs). Write-heavy workloads need explicit coordination.
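The Action bullet lists queues as an alternative to locks. One way to apply that, sketched here under assumptions, is to funnel every write through a single worker thread. `SingleWriter`, `apply_write`, and `fake_upsert` are hypothetical names introduced for illustration; in practice `apply_write` would be whatever write call the application makes on its index.

```python
import queue
import threading


class SingleWriter:
    """Funnels all write calls through one worker thread via a queue."""

    def __init__(self, apply_write):
        # apply_write is a hypothetical callable, e.g. a bound write method
        self.apply_write = apply_write
        self.tasks = queue.Queue()
        self.errors = []
        self.worker = threading.Thread(target=self.run, daemon=True)
        self.worker.start()

    def run(self):
        while True:
            batch = self.tasks.get()
            try:
                if batch is None:
                    return
                self.apply_write(batch)
            except Exception as error:
                # Record failures so they are not lost inside the worker
                self.errors.append(error)
            finally:
                self.tasks.task_done()

    def submit(self, batch):
        self.tasks.put(batch)

    def close(self):
        # The sentinel stops the worker once earlier batches are applied
        self.tasks.put(None)
        self.worker.join()


def run_writer_demo(producers=20):
    """Submits batches from many threads; only the worker applies them."""
    store = {}

    def fake_upsert(batch):
        # Stand-in for a real index write
        store.update(batch)

    writer = SingleWriter(fake_upsert)
    threads = [
        threading.Thread(target=writer.submit, args=({f"id{n}": n},))
        for n in range(producers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    writer.close()
    return store, writer.errors
```

Compared with a lock, the queue keeps request threads from blocking on a slow write, but callers no longer know when their write has been applied unless the design adds a completion signal.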
Additional rules:
- Use `torch.no_grad()` during inference to disable gradient tracking and reduce memory.
- Embeddings are streamed to disk during indexing to control memory usage for large datasets.
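- Scheduled workflows swallow exceptions (log + continue), so an error in one run, including a concurrency error, does not crash the scheduler. Check the logs, because failures will not surface any other way.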
Reasoning
The read-safe/write-unsafe design is standard for in-memory index structures (similar to Python's dict). Writes modify internal state (ANN index, database, scoring) in several steps that are not atomic, so interleaved writes from two threads can leave that state inconsistent. A server that handles every request on a single event loop serializes calls naturally, but FastAPI runs synchronous (`def`) endpoints in a thread pool, so those endpoints, along with thread pool executors and explicit threading, require care. The `torch.no_grad()` context manager matters for inference because gradient tracking retains intermediate activations for a backward pass that never runs; in transformers models this can consume 2-3x more memory than the forward pass alone.
Code Evidence
Thread safety docstring from `embeddings/base.py:32`:
Creates a new embeddings index. Embeddings indexes are thread-safe for read operations but writes must be synchronized.
Database thread locking note from `database/rdbms.py:34`:
# Load an existing database. Thread locking must be handled externally.
`torch.no_grad()` usage from `pipeline/tensors.py:43-52`:

    def context(self):
        """
        Defines a context used to wrap processing.

        Returns:
            processing context
        """

        return torch.no_grad()

Scheduled workflow error handling from `workflow/base.py:102-107`:

    try:
        for _ in self(elements):
            pass
    except Exception:
        logger.error(traceback.format_exc())

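Because the excerpt above only logs the traceback, a failing scheduled run is easy to miss. A minimal sketch of one way to keep such failures visible is a logging handler that counts ERROR records. The logger name `scheduler.sketch`, the `ErrorCounter` class, and the `run_once` helper are assumptions made for this example; attach the handler to whichever logger the scheduler actually writes to.

```python
import logging
import traceback


class ErrorCounter(logging.Handler):
    """Counts ERROR-level records so swallowed failures stay visible."""

    def __init__(self):
        super().__init__(level=logging.ERROR)
        self.count = 0
        self.last = None

    def emit(self, record):
        self.count += 1
        self.last = record.getMessage()


# Hypothetical logger name used only for this sketch
logger = logging.getLogger("scheduler.sketch")
logger.propagate = False
counter = ErrorCounter()
logger.addHandler(counter)


def run_once(job):
    # Same shape as the excerpt: log the traceback, then carry on
    try:
        job()
    except Exception:
        logger.error(traceback.format_exc())


def flaky():
    raise RuntimeError("index modified during write")


# Two failing runs and one clean run; the loop itself never raises
for job in (flaky, lambda: None, flaky):
    run_once(job)
```

A monitoring check can then alert when `counter.count` grows, instead of relying on someone reading the log file.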
Embeddings streamed to disk from `vectors/base.py:129-132`:
# Convert all documents to embedding arrays, stream embeddings to disk to control memory usage
    with self.spool(checkpoint, vectorsid) as output:
        stream = output.name
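
The general idea of streaming batches to disk instead of holding them all in memory can be illustrated with the standard library. This is a sketch of the technique only, not txtai's `spool` implementation: `spool_batches`, `read_batches`, and `fake_vectors` are hypothetical helpers, and plain Python lists stand in for embedding arrays.

```python
import os
import pickle
import tempfile


def spool_batches(batches):
    """Writes each batch to a temporary file; returns (path, row count)."""
    total = 0
    with tempfile.NamedTemporaryFile(mode="wb", suffix=".spool", delete=False) as output:
        for batch in batches:
            # Only one batch is held in memory at a time
            pickle.dump(batch, output)
            total += len(batch)
        return output.name, total


def read_batches(path):
    """Yields batches back from disk one at a time."""
    with open(path, "rb") as handle:
        while True:
            try:
                yield pickle.load(handle)
            except EOFError:
                return


def fake_vectors(count, size=4, batch=3):
    # Hypothetical generator producing placeholder vectors lazily
    for start in range(0, count, batch):
        stop = min(start + batch, count)
        yield [[float(i)] * size for i in range(start, stop)]


path, total = spool_batches(fake_vectors(10))
restored = [row for batch in read_batches(path) for row in batch]
os.remove(path)
```

Peak memory during the write phase is bounded by the batch size rather than the dataset size, which is the property the comment in the excerpt describes.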