Implementation:Vespa engine Vespa IndexingProcessor ErrorHandling

From Leeroopedia


Knowledge Sources
Domains Document_Processing, Indexing
Last Updated 2026-02-09 00:00 GMT

Overview

Concrete tool for classifying and handling errors during document indexing, provided by Vespa's document processing framework.

Description

The error handling paths within IndexingProcessor.process() implement structured exception-to-progress mapping. The processing loop wraps each document operation in a try-catch block that catches three specific exception types and maps each to the corresponding Progress result:

  • InvalidInputException is caught and mapped to Progress.INVALID_INPUT.
  • OverloadException is caught and mapped to Progress.OVERLOAD.
  • TimeoutException is caught and mapped to Progress.TIMEOUT.

Each mapping uses the withReason() method to attach a diagnostic string that includes the document ID (op.getId()) and the exception message. This provides callers with enough context to identify the problematic document and understand the failure cause.
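The mapping described above can be sketched in isolation. The block below is a minimal illustration, not the Vespa source: the Progress record, the three exception classes, and the handle and reasonFor helpers are all hypothetical stand-ins written for this example, and only the shape of the reason string follows the description.

```java
// Stand-in types only: these are simplified, hypothetical versions of the
// classes named above, not the Vespa implementations.
final class ProgressMappingSketch {

    /** A progress value: a name plus an optional diagnostic reason. */
    record Progress(String name, String reason) {
        static final Progress DONE = new Progress("done", null);
        static final Progress INVALID_INPUT = new Progress("invalid_input", null);
        static final Progress OVERLOAD = new Progress("overload", null);
        static final Progress TIMEOUT = new Progress("timeout", null);

        /** Returns a copy of this progress carrying the given reason. */
        Progress withReason(String reason) {
            return new Progress(name, reason);
        }
    }

    static class InvalidInputException extends RuntimeException {
        InvalidInputException(String message) { super(message); }
    }

    static class OverloadException extends RuntimeException {
        OverloadException(String message) { super(message); }
    }

    static class TimeoutException extends RuntimeException {
        TimeoutException(String message) { super(message); }
    }

    /** Builds the reason string in the "Document '<id>': <message>" shape. */
    static String reasonFor(String docId, Exception e) {
        return "Document '" + docId + "': " + e.getMessage();
    }

    /** Runs one unit of work and maps the three handled exceptions to a progress value. */
    static Progress handle(String docId, Runnable work) {
        try {
            work.run();
            return Progress.DONE;
        } catch (InvalidInputException e) {
            return Progress.INVALID_INPUT.withReason(reasonFor(docId, e));
        } catch (OverloadException e) {
            return Progress.OVERLOAD.withReason(reasonFor(docId, e));
        } catch (TimeoutException e) {
            return Progress.TIMEOUT.withReason(reasonFor(docId, e));
        }
        // Any other exception type is not caught and propagates to the caller.
    }

    public static void main(String[] args) {
        Progress p = handle("id:music:song::1", () -> {
            throw new OverloadException("queue full");
        });
        System.out.println(p.name() + ": " + p.reason());
    }
}
```

Keeping the reason format in one helper means all three branches produce diagnostics of the same shape, which makes them easier to search for in feed logs.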

Any other exception, including the IllegalArgumentException thrown for a null or unsupported operation, is not caught by this code and propagates up the call stack, typically resulting in a 500-level error response to the feeding client.

The error handling follows fail-fast semantics: the first operation that throws one of the three handled exceptions causes process() to return the corresponding error progress immediately from inside the loop. Remaining operations in the batch are never attempted, operations that were already processed successfully are discarded, and none of the batch's results are committed.
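The fail-fast behavior can be illustrated with a small, self-contained sketch. Everything here is hypothetical and simplified: operations are plain strings, BatchResult and processBatch are names invented for the example, and IllegalStateException stands in for the handled exception types. The point it shows is that output accumulates in a local list that is only handed back when every operation succeeds.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for the batch loop; not the Vespa implementation.
final class FailFastBatchSketch {

    /** Outcome of a batch: a status string and the operations to commit. */
    record BatchResult(String status, List<String> committed) {}

    /**
     * Applies step to each operation in order. The first failure ends the
     * batch: the partially filled output list is dropped and later operations
     * are never attempted.
     */
    static BatchResult processBatch(List<String> batch, UnaryOperator<String> step) {
        List<String> out = new ArrayList<>(batch.size());
        for (String op : batch) {
            try {
                out.add(step.apply(op));
            } catch (IllegalStateException e) { // stand-in for a handled exception type
                return new BatchResult("failed: Document '" + op + "': " + e.getMessage(), List.of());
            }
        }
        // Only reached when every operation succeeded.
        return new BatchResult("done", List.copyOf(out));
    }

    public static void main(String[] args) {
        UnaryOperator<String> step = op -> {
            if (op.equals("b")) throw new IllegalStateException("rejected");
            return op.toUpperCase();
        };
        System.out.println(processBatch(List.of("a", "b", "c"), step));
        System.out.println(processBatch(List.of("a", "c"), step));
    }
}
```

In the first call of main, "a" is processed successfully but its result is not returned, and "c" is never passed to the step function.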

Usage

This error handling logic is an integral part of IndexingProcessor.process() and is invoked automatically for every document operation. It is not intended to be called independently.

Use this implementation reference when:

  • You need to understand how specific exceptions map to progress results in the indexing pipeline.
  • You are debugging error responses received by document feeding clients.
  • You want to understand the fail-fast behavior when a batch contains a problematic document.
  • You are implementing custom document processors and need to follow the same error handling conventions.

Code Reference

Source Location

  • Repository: Vespa
  • File: docprocs/src/main/java/com/yahoo/docprocs/indexing/IndexingProcessor.java
  • Lines: 115-139

Signature

// Error handling paths within:
@Override
public Progress process(Processing proc)

Import

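import com.yahoo.docprocs.indexing.IndexingProcessor;
import com.yahoo.docproc.DocumentProcessor.Progress;
import com.yahoo.docproc.Processing;
import com.yahoo.document.DocumentOperation;
import com.yahoo.document.DocumentPut;
import com.yahoo.document.DocumentRemove;
import com.yahoo.document.DocumentUpdate;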

Error Handling Code

for (var op : proc.getDocumentOperations()) {
    try {
        if (op instanceof DocumentPut dp) {
            processDocument(dp, out, deadline);
        } else if (op instanceof DocumentUpdate du) {
            processUpdate(du, out, deadline);
        } else if (op instanceof DocumentRemove dr) {
            processRemove(dr, out);
        } else if (op != null) {
            throw new IllegalArgumentException(
                "Document class " + op.getClass().getName() + " not supported.");
        } else {
            throw new IllegalArgumentException("Expected document, got null.");
        }
    } catch (InvalidInputException e) {
        return Progress.INVALID_INPUT.withReason(
            "Document '" + op.getId() + "': " + e.getMessage());
    } catch (OverloadException e) {
        return Progress.OVERLOAD.withReason(
            "Document '" + op.getId() + "': " + e.getMessage());
    } catch (TimeoutException e) {
        return Progress.TIMEOUT.withReason(
            "Document '" + op.getId() + "': " + e.getMessage());
    }
}

I/O Contract

Inputs

Name | Type | Required | Description
proc | Processing | Yes | The processing context containing document operations. Each operation is processed within the try-catch block that implements error classification.
op | DocumentOperation | Yes | The individual document operation being processed. Its ID is included in error reason strings for diagnostic purposes.

Outputs

Name | Type | Description
Progress.DONE | Progress | Returned when all operations in the batch are processed successfully.
Progress.INVALID_INPUT | Progress | Returned when a document operation fails validation. Includes a reason string with the document ID and error message. Indicates a permanent failure that should not be retried.
Progress.OVERLOAD | Progress | Returned when processing fails due to resource constraints. Includes a reason string with the document ID and error message. Indicates a transient failure that should be retried with backoff.
Progress.TIMEOUT | Progress | Returned when processing exceeds the deadline. Includes a reason string with the document ID and error message. Indicates a transient failure that may be retried.

Exception-to-Progress Mapping

Exception Type | Progress Result | Retry Semantics | Typical Cause
InvalidInputException | Progress.INVALID_INPUT | Do not retry | Malformed document, undeclared fields, type mismatch
OverloadException | Progress.OVERLOAD | Retry with backoff | Resource exhaustion, downstream throttling
TimeoutException | Progress.TIMEOUT | Retry with same or longer timeout | Processing exceeded deadline, slow external services
IllegalArgumentException | Propagates (uncaught) | Do not retry | Null operation or unsupported operation type
RuntimeException | Propagates (uncaught) | Depends on cause | Bug in processing logic, infrastructure failure
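A feeding client could turn the retry semantics in this table into a small decision function. The sketch below is one possible shape under stated assumptions: the Outcome and Action enums, the method names, and the backoff parameters (100 ms base, 10 s cap, doubling per attempt) are all invented for illustration and are not part of Vespa.

```java
import java.time.Duration;

// Hypothetical client-side policy derived from the table above.
final class RetryPolicySketch {

    enum Outcome { DONE, INVALID_INPUT, OVERLOAD, TIMEOUT }

    enum Action { COMPLETE, REJECT, RETRY_WITH_BACKOFF, RETRY_WITH_SAME_OR_LONGER_TIMEOUT }

    // Assumed tuning values; pick ones that suit the actual feed rate.
    static final long BASE_DELAY_MS = 100;
    static final long MAX_DELAY_MS = 10_000;

    /** Maps a progress outcome to the action suggested by the table. */
    static Action actionFor(Outcome outcome) {
        return switch (outcome) {
            case DONE -> Action.COMPLETE;
            case INVALID_INPUT -> Action.REJECT;
            case OVERLOAD -> Action.RETRY_WITH_BACKOFF;
            case TIMEOUT -> Action.RETRY_WITH_SAME_OR_LONGER_TIMEOUT;
        };
    }

    /** Exponential backoff: the base delay doubles per attempt, up to the cap. */
    static Duration backoffDelay(int attempt) {
        int exponent = Math.max(0, Math.min(attempt, 20)); // clamp so the shift cannot overflow
        long millis = Math.min(MAX_DELAY_MS, BASE_DELAY_MS << exponent);
        return Duration.ofMillis(millis);
    }

    public static void main(String[] args) {
        for (int attempt = 0; attempt < 8; attempt++) {
            System.out.println(actionFor(Outcome.OVERLOAD) + " after " + backoffDelay(attempt));
        }
    }
}
```

Exceptions that propagate uncaught never reach this function, since no Progress value is returned for them; they need separate handling at the call site.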

Usage Examples

// Example: handling the Progress result from IndexingProcessor.
// forwardToContentLayer, rejectDocument, and the scheduleRetry* methods are
// placeholders for application-specific logic.

Processing processing = new Processing();
processing.getDocumentOperations().add(new DocumentPut(document));

Progress result = indexingProcessor.process(processing);

// A result carrying a reason may not be the same instance as the constant,
// so compare with equals() rather than ==.
if (result.equals(Progress.DONE)) {
    // Success: forward processed operations downstream
    forwardToContentLayer(processing.getDocumentOperations());
} else if (result.equals(Progress.INVALID_INPUT)) {
    // Permanent failure: log and reject
    log.severe("Invalid input: " + result.getReason());
    rejectDocument(document.getId());
} else if (result.equals(Progress.OVERLOAD)) {
    // Transient failure: back off and retry
    log.warning("Overload: " + result.getReason());
    scheduleRetryWithBackoff(processing);
} else if (result.equals(Progress.TIMEOUT)) {
    // Transient failure: retry with possible adjustments
    log.warning("Timeout: " + result.getReason());
    scheduleRetryWithLongerTimeout(processing);
} else {
    // Any other progress value is unexpected here
    log.warning("Unexpected progress: " + result);
}

Related Pages

Implements Principle
