Implementation:Treeverse LakeFS ImportStatus

From Leeroopedia


Knowledge Sources
Domains Data_Import, REST_API
Last Updated 2026-02-08 00:00 GMT

Overview

A lakeFS REST API endpoint for querying the current status and progress of an asynchronous import operation.

Description

The importStatus endpoint allows clients to poll the progress of a previously initiated import job. Given the import job ID (obtained from importStart), it returns a status object that includes:

  • Whether the import has completed
  • The number of objects ingested so far
  • The last update timestamp for detecting stalls
  • On successful completion: the resulting commit object and metarange ID
  • On failure: an error object with details

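This endpoint is designed to be called repeatedly in a polling loop until the completed field becomes true. Because completed is true after both success and failure, check the error field before reading commit.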
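
The status fields described above can be reduced to three states. The following is a minimal sketch, not part of the lakeFS SDK: the function name classify_import_status and the state labels are hypothetical, and it assumes the response has already been decoded from JSON into a dict.

```python
from typing import Any, Mapping


def classify_import_status(status: Mapping[str, Any]) -> str:
    """Map a decoded ImportStatus payload to 'failed', 'succeeded', or 'in_progress'.

    Hypothetical helper: the labels are illustrative, not lakeFS terminology.
    """
    # A failed import is also reported as completed, so check error first.
    if status.get("error"):
        return "failed"
    if status.get("completed"):
        return "succeeded"
    return "in_progress"
```

A polling loop would keep calling the endpoint while the result is "in_progress" and only read commit when it is "succeeded".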

Usage

Use this endpoint when:

  • Polling for completion after calling importStart
  • Building progress indicators for import operations in CLIs, dashboards, or web UIs
  • Implementing timeout and cancellation logic in automated import pipelines
  • Retrieving the commit reference created by a successful import for subsequent operations (tagging, branching, verification)
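
For the timeout case in the list above, one possible shape is a wait function with a deadline. This is a sketch under assumptions: wait_for_import, ImportTimeout, and the fetch_status callable are hypothetical names, and fetch_status stands in for whatever performs the GET request and returns the decoded status dict. The sleep and clock parameters are injected only to make the function testable without real delays.

```python
import time
from typing import Any, Callable, Mapping


class ImportTimeout(Exception):
    """Raised when the import does not complete before the deadline (hypothetical)."""


def wait_for_import(
    fetch_status: Callable[[], Mapping[str, Any]],
    timeout_s: float = 600.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Mapping[str, Any]:
    """Poll fetch_status until the import completes, fails, or times out."""
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        # Failure also sets completed=true, so inspect error first.
        if status.get("error"):
            raise RuntimeError(f"Import failed: {status['error']}")
        if status.get("completed"):
            return status
        if clock() >= deadline:
            raise ImportTimeout(f"Import not completed after {timeout_s} seconds")
        sleep(interval_s)
```

A caller that catches ImportTimeout can then decide whether to keep waiting or to cancel the import.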

Code Reference

Source Location

  • Repository: lakeFS
  • File: api/swagger.yml (lines 5523-5551)

Signature

/repositories/{repository}/branches/{branch}/import:
  get:
    tags:
      - import
    operationId: importStatus
    summary: get import status
    parameters:
      - in: query
        name: id
        description: Unique identifier of the import process
        schema:
          type: string
        required: true
    responses:
      200:
        description: import status
        content:
          application/json:
            schema:
              $ref: "#/components/schemas/ImportStatus"
      400:
        $ref: "#/components/responses/BadRequest"
      401:
        $ref: "#/components/responses/Unauthorized"
      404:
        $ref: "#/components/responses/NotFound"
      429:
        description: too many requests

Import

# High-level lakeFS Python SDK
import lakefs
# Used for sleeping between polls
import time

client = lakefs.Client(
    host="http://localhost:8000",
    username="access_key",
    password="secret_key"
)
repo = lakefs.Repository("my-repo", client=client)
branch = repo.branch("main")

I/O Contract

Inputs

Name | Type | Required | Description
repository | string (path) | Yes | Repository name
branch | string (path) | Yes | Branch name where the import is running
id | string (query) | Yes | Unique identifier of the import process, returned by importStart

Outputs

Name | Type | Description
completed | boolean | Whether the import has finished (true on both success and failure)
update_time | date-time | Timestamp of the last status update; compare across polls to detect stalls
ingested_objects | int64 | Number of objects processed so far; monotonically increasing during the import
metarange_id | string | ID of the constructed metarange; set upon successful completion
commit | Commit | The commit object created by the import; set upon successful completion. Contains id, message, creation_date, metadata, etc.
error | Error | Error details if the import failed; null on success
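
The output fields can be mirrored in a typed structure on the client side. The following sketch is illustrative only: the ImportStatus dataclass and parse_import_status function are hypothetical and not part of the lakeFS SDK. It assumes update_time arrives as an ISO 8601 string with whole seconds and a Z suffix, as in the examples on this page, and that the error object carries a message field.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class ImportStatus:
    """Client-side view of the importStatus response (hypothetical type)."""
    completed: bool
    update_time: datetime
    ingested_objects: int = 0
    metarange_id: Optional[str] = None
    commit_id: Optional[str] = None
    error_message: Optional[str] = None


def parse_import_status(payload: Mapping[str, Any]) -> ImportStatus:
    """Convert a decoded JSON payload into an ImportStatus."""
    # datetime.fromisoformat rejects a trailing 'Z' before Python 3.11.
    raw_time = payload["update_time"].replace("Z", "+00:00")
    commit = payload.get("commit") or {}
    error = payload.get("error") or {}
    return ImportStatus(
        completed=bool(payload["completed"]),
        update_time=datetime.fromisoformat(raw_time),
        ingested_objects=int(payload.get("ingested_objects") or 0),
        metarange_id=payload.get("metarange_id"),
        commit_id=commit.get("id"),
        # Assumption: the Error object exposes its text under 'message'.
        error_message=error.get("message"),
    )
```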

HTTP Status Codes:

Code | Description
200 | Success -- returns the current ImportStatus object
400 | Bad request -- invalid or missing import ID
401 | Unauthorized -- missing or invalid credentials
404 | Not found -- repository, branch, or import ID does not exist
429 | Too many requests -- rate limited
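
A polling client typically treats these codes differently: 200 is parsed, 429 is retried after a pause, and 400, 401, and 404 end the loop. The sketch below illustrates one such policy. The function names, the action labels, and the exponential backoff parameters are all assumptions chosen for illustration, not values defined by lakeFS.

```python
def action_for_status_code(code: int) -> str:
    """Decide what a polling client does with an HTTP status code (hypothetical policy)."""
    if code == 200:
        return "parse"       # body is an ImportStatus object
    if code == 429:
        return "retry"       # rate limited: wait, then poll again
    if code in (400, 401, 404):
        return "fail"        # retrying the same request will not help
    return "unexpected"      # code not listed for this endpoint


def backoff_delay(attempt: int, base_s: float = 2.0, cap_s: float = 60.0) -> float:
    """Exponential backoff in seconds for the given zero-based retry attempt."""
    # Doubles on each attempt and is capped so waits stay bounded.
    return min(cap_s, base_s * (2 ** attempt))
```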

Usage Examples

Poll Import Status with curl

# Poll for import status using the job ID from importStart
IMPORT_ID="c7a300b8-4a20-4e3b-a3b5-2ef4f2e7d0a1"
REPO="my-repo"
BRANCH="main"

curl -s \
  "http://localhost:8000/api/v1/repositories/${REPO}/branches/${BRANCH}/import?id=${IMPORT_ID}" \
  -u "access_key:secret_key"

# Response (in progress):
# {
#   "completed": false,
#   "update_time": "2024-01-15T10:30:45Z",
#   "ingested_objects": 42850
# }

# Response (completed):
# {
#   "completed": true,
#   "update_time": "2024-01-15T10:35:12Z",
#   "ingested_objects": 128000,
#   "metarange_id": "480e19972a6fbe98ab8e81ae5efdfd1a29037587e91244e87abd4adefffdb01c",
#   "commit": {
#     "id": "a1b2c3d4e5f6...",
#     "message": "Import production collections from S3",
#     "creation_date": 1705312512
#   }
# }

Polling Loop in Python

import requests
import time

LAKEFS_URL = "http://localhost:8000/api/v1"
AUTH = ("access_key", "secret_key")
REPO = "my-repo"
BRANCH = "main"
IMPORT_ID = "c7a300b8-4a20-4e3b-a3b5-2ef4f2e7d0a1"

polling_interval = 2  # seconds
previous_update_time = None

while True:
    time.sleep(polling_interval)
    resp = requests.get(
        f"{LAKEFS_URL}/repositories/{REPO}/branches/{BRANCH}/import",
        params={"id": IMPORT_ID},
        auth=AUTH,
        timeout=30,  # seconds; without it a hung connection blocks the loop forever
    )
    resp.raise_for_status()
    status = resp.json()

    # Check for errors
    if status.get("error"):
        raise RuntimeError(f"Import failed: {status['error']}")

    # Detect stalls
    current_update_time = status["update_time"]
    if current_update_time == previous_update_time:
        print("WARNING: Import may be stalled")
    previous_update_time = current_update_time

    # Log progress
    ingested = status.get("ingested_objects", 0)
    print(f"Import progress: {ingested} objects ingested")

    # Check completion
    if status["completed"]:
        commit = status["commit"]
        print(f"Import completed. Commit ID: {commit['id']}")
        break
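
Comparing update_time between two consecutive polls can raise false alarms if the server refreshes the timestamp less often than the client polls. An alternative sketch compares the timestamp against the current time with a tolerance. The function name is_stalled and the five-minute default are assumptions; choose a threshold that suits the size of the import.

```python
from datetime import datetime, timedelta, timezone


def is_stalled(
    update_time: str,
    now: datetime,
    max_silence: timedelta = timedelta(minutes=5),
) -> bool:
    """Return True if update_time is older than max_silence relative to now.

    Hypothetical helper: assumes update_time is an ISO 8601 string ending in 'Z'
    and that now is a timezone-aware datetime.
    """
    # Normalize the 'Z' suffix for datetime.fromisoformat on Python < 3.11.
    last_update = datetime.fromisoformat(update_time.replace("Z", "+00:00"))
    return now - last_update > max_silence


# Usage inside a polling loop (status is the decoded response):
# if is_stalled(status["update_time"], datetime.now(timezone.utc)):
#     print("WARNING: Import may be stalled")
```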

Polling Loop in Bash

#!/bin/bash
IMPORT_ID="c7a300b8-4a20-4e3b-a3b5-2ef4f2e7d0a1"
REPO="my-repo"
BRANCH="main"

while true; do
    sleep 2
    STATUS=$(curl -s \
        "http://localhost:8000/api/v1/repositories/${REPO}/branches/${BRANCH}/import?id=${IMPORT_ID}" \
        -u "access_key:secret_key")

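    COMPLETED=$(echo "$STATUS" | jq -r '.completed')
    INGESTED=$(echo "$STATUS" | jq -r '.ingested_objects // 0')
    # A failed import also sets completed=true, so check for an error first
    ERROR=$(echo "$STATUS" | jq -c '.error // empty')
    echo "Progress: ${INGESTED} objects ingested"

    if [ -n "$ERROR" ]; then
        echo "Import failed: ${ERROR}" >&2
        exit 1
    fi

    if [ "$COMPLETED" = "true" ]; then
        COMMIT_ID=$(echo "$STATUS" | jq -r '.commit.id')
        echo "Import completed. Commit: ${COMMIT_ID}"
        break
    fi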
done
