Implementation:Treeverse LakeFS UploadObject
| Knowledge Sources | |
|---|---|
| Domains | Data_Version_Control, REST_API |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Concrete tool, provided by the lakeFS REST API, for uploading data objects to a branch in a lakeFS repository.
Description
The uploadObject endpoint writes an object to a specified branch in a repository. The object is placed in a staged (uncommitted) state and becomes part of the version history only after a subsequent commit. The request body can be sent as either multipart/form-data or application/octet-stream. Two conditional headers guard against lost updates under concurrent writes: If-None-Match, set to *, creates the object only if it does not already exist, and If-Match updates it only if the existing object's ETag matches the supplied value.
Usage
Use this API when:
- Ingesting new data files into a versioned branch for subsequent commit.
- Updating existing objects on a branch with new content.
- Uploading model artifacts, configuration files, or metadata alongside data.
- Performing conditional writes to prevent accidental overwrites in concurrent pipelines.
Code Reference
Source Location
- Repository: lakeFS
- File: api/swagger.yml (lines 5700-5787)
Signature
/repositories/{repository}/branches/{branch}/objects:
  post:
    operationId: uploadObject
    summary: upload object
    parameters:
      - in: path
        name: repository
        required: true
        schema:
          type: string
      - in: path
        name: branch
        required: true
        schema:
          type: string
      - in: query
        name: path
        required: true
        schema:
          type: string
      - in: query
        name: storageClass
        schema:
          type: string
        deprecated: true
      - in: query
        name: force
        schema:
          type: boolean
          default: false
      - in: header
        name: If-None-Match
        schema:
          type: string
        description: "Set to '*' to atomically create an object only if it does not exist"
      - in: header
        name: If-Match
        schema:
          type: string
        description: "ETag of the object to conditionally update"
    requestBody:
      content:
        multipart/form-data:
          schema:
            type: object
            properties:
              content:
                type: string
                format: binary
        application/octet-stream:
          schema:
            type: string
            format: binary
    responses:
      201:
        description: object metadata
        content:
          application/json:
            schema:
              $ref: "#/components/schemas/ObjectStats"
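The path and query parameters in the signature above map onto a request URL. The following is a minimal sketch of that mapping using only the Python standard library. The helper name build_upload_url is hypothetical, and the /api/v1 prefix is taken from the curl examples on this page rather than from the signature itself. No request is sent.

```python
from urllib.parse import quote, urlencode


def build_upload_url(host, repository, branch, path, force=False):
    """Return the uploadObject URL for a repository, branch, and object path.

    Hypothetical helper for illustration; it is not part of any lakeFS SDK.
    """
    # Path parameters are percent-encoded so each stays a single URL segment.
    endpoint = "{}/api/v1/repositories/{}/branches/{}/objects".format(
        host.rstrip("/"), quote(repository, safe=""), quote(branch, safe="")
    )
    # 'path' is the only required query parameter. 'force' defaults to false,
    # so it is only added when explicitly enabled.
    query = {"path": path}
    if force:
        query["force"] = "true"
    return endpoint + "?" + urlencode(query)


url = build_upload_url(
    "http://localhost:8000",
    "my-data-repo",
    "experiment-v2",
    "data/customers/records.csv",
)
print(url)
```

The object path is passed through urlencode, so slashes inside it are percent-encoded in the query string instead of being read as URL structure.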
Import
import lakefs

client = lakefs.Client(
    host="http://localhost:8000",
    username="access_key_id",
    password="secret_access_key",
)
repo = lakefs.Repository("my-repo", client=client)
branch = repo.branch("main")
obj = branch.object("data/my-file.csv").upload(data=b"col1,col2\nval1,val2")
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| repository (path param) | string | Yes | Repository name. |
| branch (path param) | string | Yes | Branch name to upload the object to. |
| path (query param) | string | Yes | Object path within the branch namespace. |
| storageClass (query param) | string | No | (Deprecated) Storage class for the object. |
| force (query param) | boolean | No | Force upload even if object exists. Defaults to false. |
| If-None-Match (header) | string | No | Set to * for conditional create (only if object does not exist). |
| If-Match (header) | string | No | ETag value for conditional update (only if existing object matches). |
| content (body) | binary | Yes | The object data to upload. |
Outputs
| Name | Type | Description |
|---|---|---|
| path | string | Logical path of the uploaded object. |
| path_type | string | Type of path (object or common_prefix). |
| physical_address | string | Physical location of the object in the underlying object store. |
| physical_address_expiry | integer | Expiry time for the physical address (if pre-signed). |
| checksum | string | Checksum (ETag) of the uploaded object. |
| mtime | integer (int64) | Modification time as Unix epoch seconds. |
| size_bytes | integer (int64) | Size of the uploaded object in bytes. |
| metadata | map[string]string | User-defined metadata key-value pairs. |
| content_type | string | MIME content type of the object. |
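A client that calls the endpoint directly receives these fields as a JSON body. The sketch below shows one way to load such a body into a typed record and convert mtime to a timezone-aware datetime. The dataclass, the parse_object_stats helper, and the sample payload are all hypothetical. The choice of which fields get defaults is an assumption for illustration, not something the table above specifies.

```python
import json
from dataclasses import dataclass, field, fields
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass
class ObjectStats:
    """Typed view of the response fields listed in the Outputs table."""

    path: str
    path_type: str
    physical_address: str
    checksum: str
    mtime: int
    # Assumption: the remaining fields may be absent from a response.
    size_bytes: Optional[int] = None
    physical_address_expiry: Optional[int] = None
    metadata: Dict[str, str] = field(default_factory=dict)
    content_type: Optional[str] = None

    @property
    def modified_at(self) -> datetime:
        # mtime is Unix epoch seconds; interpret it as UTC.
        return datetime.fromtimestamp(self.mtime, tz=timezone.utc)


def parse_object_stats(body: str) -> ObjectStats:
    """Parse a JSON response body, ignoring keys this sketch does not model."""
    raw = json.loads(body)
    known = {f.name for f in fields(ObjectStats)}
    return ObjectStats(**{k: v for k, v in raw.items() if k in known})


# Hypothetical payload for illustration only; values are made up.
sample = json.dumps({
    "path": "data/customers/records.csv",
    "path_type": "object",
    "physical_address": "s3://example-bucket/data/abc123",
    "checksum": "0123456789abcdef",
    "mtime": 1700000000,
    "size_bytes": 61,
    "content_type": "text/csv",
})
stats = parse_object_stats(sample)
print(stats.path, stats.size_bytes, stats.modified_at.isoformat())
```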
Usage Examples
Upload a File Using the Python SDK
import lakefs
client = lakefs.Client(
    host="http://localhost:8000",
    username="AKIAIOSFODNN7EXAMPLE",
    password="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
)
repo = lakefs.Repository("my-data-repo", client=client)
branch = repo.branch("experiment-v2")
# Upload CSV content from an in-memory bytes literal
obj = branch.object("data/customers/records.csv").upload(
    data=b"id,name,email\n1,Alice,alice@example.com\n2,Bob,bob@example.com",
    content_type="text/csv",
)
# upload() returns the object handle; stat() fetches its metadata, including size_bytes
print(f"Uploaded: {obj.path}, size: {obj.stat().size_bytes} bytes")
Upload a File Using curl
curl -X POST "http://localhost:8000/api/v1/repositories/my-data-repo/branches/experiment-v2/objects?path=data/customers/records.csv" \
  -H "Content-Type: application/octet-stream" \
  -u "AKIAIOSFODNN7EXAMPLE:wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY" \
  --data-binary @records.csv
Conditional Upload (Create Only If Not Exists)
curl -X POST "http://localhost:8000/api/v1/repositories/my-data-repo/branches/main/objects?path=data/config.json" \
  -H "Content-Type: application/octet-stream" \
  -H "If-None-Match: *" \
  -u "AKIAIOSFODNN7EXAMPLE:wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY" \
  --data-binary @config.json
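The same conditional create can be prepared from Python without an SDK. The sketch below mirrors the curl command above using only the standard library: it builds the request with the If-None-Match header and HTTP Basic credentials (which is what curl's -u flag sends) but does not send it. The helper name build_create_only_request and the inline JSON payload are hypothetical. How the server reports a failed precondition is not covered on this page, so error handling is left out.

```python
import base64
import urllib.request


def build_create_only_request(url, data, access_key_id, secret_access_key):
    """Build a POST that creates the object only if it does not already exist.

    Hypothetical helper for illustration; it only constructs the request.
    """
    credentials = f"{access_key_id}:{secret_access_key}".encode()
    token = base64.b64encode(credentials).decode()
    return urllib.request.Request(
        url,
        data=data,
        method="POST",
        headers={
            "Content-Type": "application/octet-stream",
            # '*' asks for an atomic create-if-absent, as in the curl example.
            "If-None-Match": "*",
            "Authorization": f"Basic {token}",
        },
    )


req = build_create_only_request(
    "http://localhost:8000/api/v1/repositories/my-data-repo/branches/main/objects?path=data/config.json",
    b'{"enabled": true}',  # made-up payload standing in for config.json
    "AKIAIOSFODNN7EXAMPLE",
    "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
)
print(req.get_method(), req.full_url)
# Sending is left to the caller, for example: urllib.request.urlopen(req)
```

Keeping request construction separate from sending makes the headers easy to inspect or unit-test before any data reaches the server.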
Related Pages
Implements Principle
Requires Environment
Uses Heuristic