Implementation:Vespa engine Vespa IndexingProcessor ErrorHandling

Knowledge Sources	Vespa
Domains	Document_Processing, Indexing
Last Updated	2026-02-09 00:00 GMT

Overview

Concrete tool for classifying and handling errors during document indexing, provided by Vespa's document processing framework.

Description

The error handling paths within IndexingProcessor.process() implement structured exception-to-progress mapping. The processing loop wraps each document operation in a try-catch block that catches three specific exception types and maps each to the corresponding Progress result:

InvalidInputException is caught and mapped to Progress.INVALID_INPUT.
OverloadException is caught and mapped to Progress.OVERLOAD.
TimeoutException is caught and mapped to Progress.TIMEOUT.

Each mapping uses the withReason() method to attach a diagnostic string that includes the document ID (op.getId()) and the exception message. This provides callers with enough context to identify the problematic document and understand the failure cause.
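The mapping described above can be sketched without the Vespa libraries. In the example below, the exception classes and the Progress class are simplified stand-ins written for illustration (assumptions, not the real Vespa types), and classify() is a hypothetical helper that does not exist in IndexingProcessor. It only shows the shape of the pattern: catch a specific exception type, pick the matching Progress constant, and attach a reason containing the document ID.

```java
import java.util.Optional;

// Simplified stand-ins for the exception types named above (assumptions).
class InvalidInputException extends RuntimeException {
    InvalidInputException(String message) { super(message); }
}
class OverloadException extends RuntimeException {
    OverloadException(String message) { super(message); }
}
class TimeoutException extends RuntimeException {
    TimeoutException(String message) { super(message); }
}

// Simplified stand-in for DocumentProcessor.Progress (assumption).
final class Progress {
    static final Progress DONE = new Progress("done", null);
    static final Progress INVALID_INPUT = new Progress("invalid_input", null);
    static final Progress OVERLOAD = new Progress("overload", null);
    static final Progress TIMEOUT = new Progress("timeout", null);

    private final String name;
    private final String reason;

    private Progress(String name, String reason) {
        this.name = name;
        this.reason = reason;
    }

    // Returns a copy that carries the reason; the shared constant is unchanged.
    Progress withReason(String reason) { return new Progress(name, reason); }
    Optional<String> getReason() { return Optional.ofNullable(reason); }
    String name() { return name; }
}

class MappingSketch {
    // Hypothetical helper: runs one unit of work and converts the three
    // handled exception types into a Progress with a diagnostic reason.
    static Progress classify(String docId, Runnable work) {
        try {
            work.run();
            return Progress.DONE;
        } catch (InvalidInputException e) {
            return Progress.INVALID_INPUT.withReason("Document '" + docId + "': " + e.getMessage());
        } catch (OverloadException e) {
            return Progress.OVERLOAD.withReason("Document '" + docId + "': " + e.getMessage());
        } catch (TimeoutException e) {
            return Progress.TIMEOUT.withReason("Document '" + docId + "': " + e.getMessage());
        }
    }

    public static void main(String[] args) {
        Progress p = classify("id:ns:type::1", () -> {
            throw new InvalidInputException("bad field");
        });
        System.out.println(p.name() + ": " + p.getReason().orElse(""));
    }
}
```

Any exception type not listed in a catch clause passes straight through classify(), which mirrors how uncaught exceptions leave the real processing loop.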

Any other RuntimeException, including the IllegalArgumentException thrown for a null or unsupported operation, is not caught by this code and propagates up the call stack, typically resulting in a 500-level error response to the feeding client.

The error handling follows fail-fast semantics: when any operation in the batch fails with one of the handled exceptions, processing of the batch stops and the error progress is returned immediately. Results of operations that had already succeeded are discarded, so nothing from the batch is committed.
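A minimal sketch of the fail-fast behavior follows. It assumes results are staged in a separate output list and written back to the batch only after the loop completes; the source shows an out list but not the commit step, so that part is an assumption. FailFastSketch, RejectedException, processOne() and processBatch() are hypothetical names, and plain strings stand in for document operations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

class FailFastSketch {
    // Hypothetical stand-in for one of the handled exception types.
    static class RejectedException extends RuntimeException {
        RejectedException(String message) { super(message); }
    }

    // Hypothetical per-operation step: upper-cases the input, rejects blank input.
    static String processOne(String op) {
        if (op.isBlank()) {
            throw new RejectedException("empty operation");
        }
        return op.toUpperCase(Locale.ROOT);
    }

    // Returns an empty Optional on success, or the failure reason on the
    // first rejected operation. The batch is modified only on success.
    static Optional<String> processBatch(List<String> batch) {
        List<String> out = new ArrayList<>(batch.size());
        for (String op : batch) {
            try {
                out.add(processOne(op));
            } catch (RejectedException e) {
                // Fail fast: staged results in 'out' are dropped here.
                return Optional.of("Operation '" + op + "': " + e.getMessage());
            }
        }
        // Commit step (assumed): replace the batch contents with the results.
        batch.clear();
        batch.addAll(out);
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> batch = new ArrayList<>(List.of("a", " ", "c"));
        System.out.println(processBatch(batch)); // failure reason for the blank entry
        System.out.println(batch);               // batch left as it was
    }
}
```

Because the write-back happens only after the loop, a failure on the second operation leaves the first operation's result unobservable to the caller.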

Usage

This error handling logic is an integral part of IndexingProcessor.process() and is invoked automatically for every document operation. It is not intended to be called independently.

Use this implementation reference when:

You need to understand how specific exceptions map to progress results in the indexing pipeline.
You are debugging error responses received by document feeding clients.
You want to understand the fail-fast behavior when a batch contains a problematic document.
You are implementing custom document processors and need to follow the same error handling conventions.

Code Reference

Source Location

Repository: Vespa
File: docprocs/src/main/java/com/yahoo/docprocs/indexing/IndexingProcessor.java
Lines: 115-139

Signature

// Error handling paths within:
@Override
public Progress process(Processing proc)

Import

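import com.yahoo.docprocs.indexing.IndexingProcessor;
import com.yahoo.docproc.DocumentProcessor.Progress;
import com.yahoo.docproc.Processing;
import com.yahoo.document.DocumentOperation;
import com.yahoo.document.DocumentPut;
import com.yahoo.document.DocumentRemove;
import com.yahoo.document.DocumentUpdate;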

Error Handling Code

for (var op : proc.getDocumentOperations()) {
    try {
        if (op instanceof DocumentPut dp) {
            processDocument(dp, out, deadline);
        } else if (op instanceof DocumentUpdate du) {
            processUpdate(du, out, deadline);
        } else if (op instanceof DocumentRemove dr) {
            processRemove(dr, out);
        } else if (op != null) {
            throw new IllegalArgumentException(
                "Document class " + op.getClass().getName() + " not supported.");
        } else {
            throw new IllegalArgumentException("Expected document, got null.");
        }
    } catch (InvalidInputException e) {
        return Progress.INVALID_INPUT.withReason(
            "Document '" + op.getId() + "': " + e.getMessage());
    } catch (OverloadException e) {
        return Progress.OVERLOAD.withReason(
            "Document '" + op.getId() + "': " + e.getMessage());
    } catch (TimeoutException e) {
        return Progress.TIMEOUT.withReason(
            "Document '" + op.getId() + "': " + e.getMessage());
    }
}

I/O Contract

Inputs

Name	Type	Required	Description
proc	`Processing`	Yes	The processing context containing document operations. Each operation is processed within the try-catch block that implements error classification.
op	`DocumentOperation`	Yes	The individual document operation being processed. Its ID is included in error reason strings for diagnostic purposes.

Outputs

Name	Type	Description
Progress.DONE	`Progress`	Returned when all operations in the batch are processed successfully.
Progress.INVALID_INPUT	`Progress`	Returned when a document operation fails validation. Includes a reason string with the document ID and error message. Indicates a permanent failure that should not be retried.
Progress.OVERLOAD	`Progress`	Returned when processing fails due to resource constraints. Includes a reason string with the document ID and error message. Indicates a transient failure that should be retried with backoff.
Progress.TIMEOUT	`Progress`	Returned when processing exceeds the deadline. Includes a reason string with the document ID and error message. Indicates a transient failure that may be retried.

Exception-to-Progress Mapping

Exception Type	Progress Result	Retry Semantics	Typical Cause
`InvalidInputException`	`Progress.INVALID_INPUT`	Do not retry	Malformed document, undeclared fields, type mismatch
`OverloadException`	`Progress.OVERLOAD`	Retry with backoff	Resource exhaustion, downstream throttling
`TimeoutException`	`Progress.TIMEOUT`	Retry with same or longer timeout	Processing exceeded deadline, slow external services
`IllegalArgumentException`	Propagates (uncaught)	Do not retry	Null operation or unsupported operation type
Any other `RuntimeException`	Propagates (uncaught)	Depends on cause	Bug in processing logic, infrastructure failure
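The retry semantics in the table can be expressed as a small client-side policy. The sketch below is illustrative only: RetryPolicySketch, the Outcome enum, and the backoff parameters are hypothetical and not part of Vespa. It treats OVERLOAD and TIMEOUT as retryable and computes a capped exponential backoff delay.

```java
import java.time.Duration;

class RetryPolicySketch {
    // Hypothetical stand-in for the progress results listed in the table.
    enum Outcome { DONE, INVALID_INPUT, OVERLOAD, TIMEOUT }

    // Only the transient failures are worth retrying.
    static boolean isRetryable(Outcome outcome) {
        return outcome == Outcome.OVERLOAD || outcome == Outcome.TIMEOUT;
    }

    // Capped exponential backoff: base * 2^attempt, never more than max.
    // 'attempt' is zero-based; the exponent is clamped to avoid overflow.
    static Duration backoff(int attempt, Duration base, Duration max) {
        int exponent = Math.max(0, Math.min(attempt, 20));
        Duration delay = base.multipliedBy(1L << exponent);
        return delay.compareTo(max) > 0 ? max : delay;
    }

    public static void main(String[] args) {
        Duration base = Duration.ofMillis(100);
        Duration max = Duration.ofSeconds(5);
        for (int attempt = 0; attempt < 4; attempt++) {
            System.out.println("attempt " + attempt + " -> " + backoff(attempt, base, max).toMillis() + " ms");
        }
    }
}
```

A production client would usually add jitter to the delay and a maximum attempt count; both are left out here to keep the sketch short.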

Usage Examples

// Example: Handling the Progress result from IndexingProcessor
// Guarded patterns in a switch require Java 21 or later.

Processing processing = new Processing();
processing.getDocumentOperations().add(new DocumentPut(document));

Progress result = indexingProcessor.process(processing);

// equals() is used for every case: a Progress carrying a reason is not
// guaranteed to be the same instance as the shared constant, so == is unreliable.
switch (result) {
    case Progress p when p.equals(Progress.DONE):
        // Success: forward processed operations downstream
        forwardToContentLayer(processing.getDocumentOperations());
        break;

    case Progress p when p.equals(Progress.INVALID_INPUT):
        // Permanent failure: log and reject
        log.severe("Invalid input: " + p.getReason());
        rejectDocument(document.getId());
        break;

    case Progress p when p.equals(Progress.OVERLOAD):
        // Transient failure: back off and retry
        log.warning("Overload: " + p.getReason());
        scheduleRetryWithBackoff(processing);
        break;

    case Progress p when p.equals(Progress.TIMEOUT):
        // Transient failure: retry with possible adjustments
        log.warning("Timeout: " + p.getReason());
        scheduleRetryWithLongerTimeout(processing);
        break;

    default:
        // A pattern switch must be exhaustive; treat anything else as unexpected
        log.severe("Unexpected progress: " + result);
        break;
}

Related Pages

Implements Principle
