Implementation:Vespa engine Vespa IndexingProcessor ErrorHandling
| Knowledge Sources | |
|---|---|
| Domains | Document_Processing, Indexing |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Concrete tool for classifying and handling errors during document indexing, provided by Vespa's document processing framework.
Description
The error handling paths within IndexingProcessor.process() implement structured exception-to-progress mapping. The processing loop wraps each document operation in a try-catch block that catches three specific exception types and maps each to the corresponding Progress result:
- InvalidInputException is caught and mapped to Progress.INVALID_INPUT.
- OverloadException is caught and mapped to Progress.OVERLOAD.
- TimeoutException is caught and mapped to Progress.TIMEOUT.
Each mapping uses the withReason() method to attach a diagnostic string that includes the document ID (op.getId()) and the exception message. This provides callers with enough context to identify the problematic document and understand the failure cause.
Any other exception, including the IllegalArgumentException that the loop itself throws for a null or unsupported operation, is not caught by this code. It propagates up the call stack, typically resulting in a 500-level error response to the feeding client.
The error handling follows fail-fast semantics: when any operation in the batch fails, the entire batch processing is aborted and the error progress is returned immediately. Operations that were already processed successfully are discarded, and none of the batch's results are committed.
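The mapping and the fail-fast behavior described above can be sketched without any Vespa dependency. Everything in the following block is a stand-in chosen for illustration: the Result record, the three exception classes, the handle() step, and the "bad"/"busy"/"slow" markers are assumptions, not Vespa types or behavior. It only shows the shape of the pattern: results are collected in a local list and committed only if every operation succeeds.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: none of these types are Vespa classes.
final class FailFastSketch {

    /** Hypothetical stand-in for a progress value that carries a reason string. */
    record Result(String name, String reason) {
        static Result done() { return new Result("DONE", ""); }
        static Result failed(String name, String id, Exception e) {
            // Same reason format as the text: document ID plus exception message.
            return new Result(name, "Document '" + id + "': " + e.getMessage());
        }
    }

    // Hypothetical stand-ins for the three exception types that are caught.
    static final class InvalidInput extends RuntimeException { InvalidInput(String m) { super(m); } }
    static final class Overload extends RuntimeException { Overload(String m) { super(m); } }
    static final class Timeout extends RuntimeException { Timeout(String m) { super(m); } }

    /** Hypothetical per-document step; fails for IDs that contain a marker word. */
    static String handle(String id) {
        if (id.contains("bad")) throw new InvalidInput("field 'x' is not declared");
        if (id.contains("busy")) throw new Overload("too many requests in flight");
        if (id.contains("slow")) throw new Timeout("deadline exceeded");
        return id + ":indexed";
    }

    /** Processes the batch; 'committed' is only touched when every ID succeeds. */
    static Result process(List<String> ids, List<String> committed) {
        List<String> out = new ArrayList<>();
        for (String id : ids) {
            try {
                out.add(handle(id));
            } catch (InvalidInput e) {
                return Result.failed("INVALID_INPUT", id, e); // abort, drop 'out'
            } catch (Overload e) {
                return Result.failed("OVERLOAD", id, e);
            } catch (Timeout e) {
                return Result.failed("TIMEOUT", id, e);
            }
        }
        committed.addAll(out);
        return Result.done();
    }

    public static void main(String[] args) {
        List<String> committed = new ArrayList<>();
        System.out.println(process(List.of("id:a", "id:bad", "id:c"), committed));
        System.out.println(committed);
    }
}
```

In this sketch the first document succeeds, the second fails, and the third is never attempted. The result of the first document is dropped together with the local list.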
Usage
This error handling logic is an integral part of IndexingProcessor.process() and is invoked automatically for every document operation. It is not intended to be called independently.
Use this implementation reference when:
- You need to understand how specific exceptions map to progress results in the indexing pipeline.
- You are debugging error responses received by document feeding clients.
- You want to understand the fail-fast behavior when a batch contains a problematic document.
- You are implementing custom document processors and need to follow the same error handling conventions.
Code Reference
Source Location
- Repository: Vespa
- File: docprocs/src/main/java/com/yahoo/docprocs/indexing/IndexingProcessor.java
- Lines: 115-139
Signature
// Error handling paths within:
@Override
public Progress process(Processing proc)
Import
import com.yahoo.docprocs.indexing.IndexingProcessor;
import com.yahoo.document.DocumentOperation;
import com.yahoo.docproc.DocumentProcessor.Progress;
Error Handling Code
for (var op : proc.getDocumentOperations()) {
    try {
        if (op instanceof DocumentPut dp) {
            processDocument(dp, out, deadline);
        } else if (op instanceof DocumentUpdate du) {
            processUpdate(du, out, deadline);
        } else if (op instanceof DocumentRemove dr) {
            processRemove(dr, out);
        } else if (op != null) {
            throw new IllegalArgumentException(
                    "Document class " + op.getClass().getName() + " not supported.");
        } else {
            throw new IllegalArgumentException("Expected document, got null.");
        }
    } catch (InvalidInputException e) {
        return Progress.INVALID_INPUT.withReason(
                "Document '" + op.getId() + "': " + e.getMessage());
    } catch (OverloadException e) {
        return Progress.OVERLOAD.withReason(
                "Document '" + op.getId() + "': " + e.getMessage());
    } catch (TimeoutException e) {
        return Progress.TIMEOUT.withReason(
                "Document '" + op.getId() + "': " + e.getMessage());
    }
}
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| proc | Processing | Yes | The processing context containing document operations. Each operation is processed within the try-catch block that implements error classification. |
| op | DocumentOperation | Yes | The individual document operation being processed. Its ID is included in error reason strings for diagnostic purposes. |
Outputs
| Name | Type | Description |
|---|---|---|
| Progress.DONE | Progress | Returned when all operations in the batch are processed successfully. |
| Progress.INVALID_INPUT | Progress | Returned when a document operation fails validation. Includes a reason string with the document ID and error message. Indicates a permanent failure that should not be retried. |
| Progress.OVERLOAD | Progress | Returned when processing fails due to resource constraints. Includes a reason string with the document ID and error message. Indicates a transient failure that should be retried with backoff. |
| Progress.TIMEOUT | Progress | Returned when processing exceeds the deadline. Includes a reason string with the document ID and error message. Indicates a transient failure that may be retried. |
Exception-to-Progress Mapping
| Exception Type | Progress Result | Retry Semantics | Typical Cause |
|---|---|---|---|
| InvalidInputException | Progress.INVALID_INPUT | Do not retry | Malformed document, undeclared fields, type mismatch |
| OverloadException | Progress.OVERLOAD | Retry with backoff | Resource exhaustion, downstream throttling |
| TimeoutException | Progress.TIMEOUT | Retry with same or longer timeout | Processing exceeded deadline, slow external services |
| IllegalArgumentException | Propagates (uncaught) | Do not retry | Null operation or unsupported operation type |
| RuntimeException | Propagates (uncaught) | Depends on cause | Bug in processing logic, infrastructure failure |
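A caller can turn the retry semantics in the table into a small policy function. The sketch below is one possible reading of those semantics, not part of Vespa: the Outcome enum, the 100 ms base delay, and the cap of six doublings are all assumptions picked for illustration.

```java
import java.time.Duration;
import java.util.Optional;

// Illustrative sketch only: a caller-side retry policy, not a Vespa API.
final class RetryPolicySketch {

    /** Hypothetical stand-in for the progress results listed in the table. */
    enum Outcome { DONE, INVALID_INPUT, OVERLOAD, TIMEOUT }

    // Assumed tuning values; choose them to fit the actual feeding client.
    static final Duration BASE = Duration.ofMillis(100);
    static final int MAX_DOUBLINGS = 6;

    /**
     * Returns the delay before the next attempt, or empty when the operation
     * should not be retried. 'attempt' counts retries already made, from 0.
     */
    static Optional<Duration> nextDelay(Outcome outcome, int attempt) {
        return switch (outcome) {
            // Success and permanent failures are never retried.
            case DONE, INVALID_INPUT -> Optional.empty();
            // Overload: exponential backoff, capped at BASE * 2^MAX_DOUBLINGS.
            case OVERLOAD -> Optional.of(
                    BASE.multipliedBy(1L << Math.min(attempt, MAX_DOUBLINGS)));
            // Timeout: retry after a fixed pause; a longer processing timeout
            // would be configured on the retried request itself.
            case TIMEOUT -> Optional.of(BASE);
        };
    }

    public static void main(String[] args) {
        for (int attempt = 0; attempt < 4; attempt++) {
            System.out.println(attempt + " -> " + nextDelay(Outcome.OVERLOAD, attempt));
        }
    }
}
```

The uncaught exceptions in the last two table rows never reach such a policy as a progress value, so they are left out of the enum.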
Usage Examples
// Example: handling the Progress result from IndexingProcessor.
// forwardToContentLayer, rejectDocument and the scheduleRetry* methods are
// hypothetical application helpers, not part of Vespa.
Processing processing = new Processing();
processing.getDocumentOperations().add(new DocumentPut(document));
Progress result = indexingProcessor.process(processing);

// Compare with equals() rather than ==: a result produced by withReason()
// carries a reason and need not be the same instance as the constant.
if (result.equals(Progress.DONE)) {
    // Success: forward processed operations downstream
    forwardToContentLayer(processing.getDocumentOperations());
} else if (result.equals(Progress.INVALID_INPUT)) {
    // Permanent failure: log and reject
    log.severe("Invalid input: " + result.getReason());
    rejectDocument(document.getId());
} else if (result.equals(Progress.OVERLOAD)) {
    // Transient failure: back off and retry
    log.warning("Overload: " + result.getReason());
    scheduleRetryWithBackoff(processing);
} else if (result.equals(Progress.TIMEOUT)) {
    // Transient failure: retry with possible adjustments
    log.warning("Timeout: " + result.getReason());
    scheduleRetryWithLongerTimeout(processing);
} else {
    // Any other progress value is not produced by the paths described here
    log.warning("Unexpected progress: " + result);
}