Implementation:Treeverse LakeFS S3 GetObject

Knowledge Sources	lakeFS, lakeFS S3 Gateway Docs
Domains	S3_Compatibility, REST_API
Last Updated	2026-02-08 00:00 GMT

Overview

Wrapper for standard S3 read operations (GetObject, HeadObject, ListObjectsV2, presigned URLs) via the lakeFS S3 gateway.

Description

This implementation wraps the S3 read operations that the lakeFS S3 gateway translates into lakeFS object retrieval operations. It covers the following read-path interactions:

GetObject -- Retrieve object content and metadata
HeadObject / StatObject -- Retrieve metadata without body
ListObjectsV2 -- List objects under a branch prefix
Presigned URL generation -- Create time-limited download URLs

The gateway supports conditional requests (If-Match, If-None-Match based on ETag), range requests for partial reads, and presigned URLs for delegated access.
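
As a rough illustration of the range and conditional reads described above, the sketch below wraps `get_object` with its `Range` and `IfNoneMatch` parameters. The helper names `read_range` and `get_if_changed` are hypothetical, `s3` is assumed to be a boto3 S3 client pointed at the gateway, and the 304 handling assumes the client reports "not modified" as an exception whose `response` carries the HTTP status, as botocore's `ClientError` does.

```python
def read_range(s3, repo, key, start, end):
    """Read bytes start..end (inclusive) of an object via a Range request."""
    resp = s3.get_object(Bucket=repo, Key=key, Range=f"bytes={start}-{end}")
    return resp["Body"].read()


def get_if_changed(s3, repo, key, known_etag):
    """Return the object body, or None if its ETag still equals known_etag."""
    try:
        resp = s3.get_object(Bucket=repo, Key=key, IfNoneMatch=known_etag)
    except Exception as err:
        # Assumption: the error exposes the HTTP status the way
        # botocore.exceptions.ClientError does.
        meta = getattr(err, "response", {}).get("ResponseMetadata", {})
        if meta.get("HTTPStatusCode") == 304:
            return None  # not modified; a cached copy is still current
        raise
    return resp["Body"].read()
```

With a real client, `read_range(s3, 'my-repo', 'main/data/file.csv', 0, 1023)` would request the first 1024 bytes, and `get_if_changed` could be called with the `ETag` value returned by an earlier `head_object` call.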

Usage

Use this implementation when:

Reading data from lakeFS through S3-compatible tools
Checking object existence or metadata without downloading content
Listing objects on a specific branch
Generating presigned URLs for temporary read access

Code Reference

Source Location

File: esti/s3_gateway_test.go
Lines: L516-691 (TestS3ReadObject)
File: esti/presign_test.go
Lines: L1-209 (presigned URL tests)

Signature

// Each snippet is independent; ctx, repo, minioClient, and s3Client are
// assumed to be initialized by the caller.

// Minio client: GetObject
res, err := minioClient.GetObject(ctx, repo, "main/exists", minio.GetObjectOptions{})
if err != nil {
    return err
}
defer res.Close() // close only after the error check
info, err := res.Stat()         // object metadata
content, err := io.ReadAll(res) // object body

// Minio client: StatObject (HeadObject)
info, err := minioClient.StatObject(ctx, repo, "main/exists", minio.StatObjectOptions{})
// info.Size, info.ETag, info.ContentType, info.UserMetadata

// Minio client: Presigned GET URL, valid for 60 seconds
preSignedURL, err := minioClient.Presign(ctx, http.MethodGet, repo, "main/exists",
    60*time.Second, url.Values{})

// AWS SDK v2: GetObject
output, err := s3Client.GetObject(ctx, &s3.GetObjectInput{
    Bucket: aws.String(repo),
    Key:    aws.String("main/data/file.csv"),
})
if err != nil {
    return err // output is nil on error, so do not touch output.Body
}
defer output.Body.Close()
content, err := io.ReadAll(output.Body)

// AWS SDK v2: HeadObject (metadata only, no body)
head, err := s3Client.HeadObject(ctx, &s3.HeadObjectInput{
    Bucket: aws.String(repo),
    Key:    aws.String("main/data/file.csv"),
})

// AWS SDK v2: ListObjectsV2 (the prefix starts with the branch name)
list, err := s3Client.ListObjectsV2(ctx, &s3.ListObjectsV2Input{
    Bucket: aws.String(repo),
    Prefix: aws.String("main/data/"),
})

Import

import boto3

s3 = boto3.client('s3',
    endpoint_url='http://localhost:8000',
    aws_access_key_id='AKIAIOSFDNN7EXAMPLEQ',
    aws_secret_access_key='wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY',
)

I/O Contract

Inputs

Parameter	Type	Required	Description
`Bucket`	string	Yes	lakeFS repository name
`Key`	string	Yes	Object key in format `{branch}/{path}`
`If-Match`	string	No	Return object only if ETag matches (conditional read)
`If-None-Match`	string	No	Return object only if ETag does not match (conditional read)
`Range`	string	No	Byte range for partial read (e.g., `bytes=0-1023`)
`Prefix`	string	No	Key prefix for ListObjectsV2 (e.g., `main/data/`)
`MaxKeys`	integer	No	Maximum number of keys to return in list (default 1000)
`ContinuationToken`	string	No	Pagination token for ListObjectsV2
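
To make the `Key`, `Prefix`, `MaxKeys`, and `ContinuationToken` inputs concrete, here is a minimal sketch of manual pagination. The helpers `object_key` and `list_keys` are hypothetical names introduced for illustration, and `s3` is assumed to be a boto3 S3 client configured for the gateway; the response fields `Contents`, `IsTruncated`, and `NextContinuationToken` are the standard ListObjectsV2 ones.

```python
def object_key(branch, path):
    """Build a gateway key of the form {branch}/{path}."""
    return f"{branch}/{path.lstrip('/')}"


def list_keys(s3, repo, branch, prefix="", page_size=1000):
    """Yield every key under {branch}/{prefix}, following continuation tokens."""
    kwargs = {
        "Bucket": repo,
        "Prefix": object_key(branch, prefix),
        "MaxKeys": page_size,
    }
    while True:
        page = s3.list_objects_v2(**kwargs)
        for obj in page.get("Contents", []):
            yield obj["Key"]
        if not page.get("IsTruncated"):
            break
        # Ask for the next page, starting where this one ended.
        kwargs["ContinuationToken"] = page["NextContinuationToken"]
```

In practice boto3's built-in paginator does the same loop; spelling it out only shows how the token ties one page to the next.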

Outputs

Output	Type	Description
Body	byte stream	Object content (GetObject only)
ETag	string	Entity tag (a content checksum, typically the MD5 hash of the content)
Content-Type	string	MIME type of the object
Content-Length	integer	Size in bytes
User metadata	map[string]string	User-defined metadata key-value pairs
Contents (list)	array	Array of object summaries (ListObjectsV2)
Presigned URL	string	Time-limited URL for direct download

Usage Examples

Python boto3: Get object content

import boto3

s3 = boto3.client('s3',
    endpoint_url='http://localhost:8000',
    aws_access_key_id='AKIAIOSFDNN7EXAMPLEQ',
    aws_secret_access_key='wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY',
)

# Download an object from the main branch
response = s3.get_object(Bucket='my-repo', Key='main/data/file.csv')
content = response['Body'].read().decode('utf-8')
print(content)

Python boto3: Head object (check existence and metadata)

# Check if an object exists and get its metadata
# (reuses the `s3` client created in the previous example)
try:
    head = s3.head_object(Bucket='my-repo', Key='main/data/file.csv')
    print(f"Size: {head['ContentLength']} bytes")
    print(f"ETag: {head['ETag']}")
    print(f"Content-Type: {head['ContentType']}")
    if 'Metadata' in head:
        print(f"User metadata: {head['Metadata']}")
except s3.exceptions.ClientError as e:
    if e.response['Error']['Code'] == '404':
        print("Object does not exist")
    else:
        raise  # do not swallow permission or connectivity errors

Python boto3: List objects on a branch

# List all objects under main/data/
paginator = s3.get_paginator('list_objects_v2')
for page in paginator.paginate(Bucket='my-repo', Prefix='main/data/'):
    for obj in page.get('Contents', []):
        print(f"{obj['Key']}  ({obj['Size']} bytes)")

Python boto3: Generate presigned URL

# Generate a presigned URL valid for 1 hour
url = s3.generate_presigned_url(
    'get_object',
    Params={'Bucket': 'my-repo', 'Key': 'main/data/file.csv'},
    ExpiresIn=3600
)
print(f"Download URL: {url}")
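
Because a presigned URL carries its own signature, the recipient can fetch it with any HTTP client and no lakeFS credentials. The sketch below uses only the standard library; `download_presigned` is a hypothetical helper, and the `opener` parameter exists only so the function can be exercised without a network.

```python
import shutil
import urllib.request


def download_presigned(url, dest_path, opener=urllib.request.urlopen):
    """Stream the object behind a presigned URL to dest_path."""
    with opener(url) as resp, open(dest_path, "wb") as out:
        shutil.copyfileobj(resp, out)  # copy in chunks, not all in memory
    return dest_path
```

For example, `download_presigned(url, 'file.csv')` would save the object referenced by the URL generated above, provided the URL has not yet expired.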

AWS CLI: Download and list

# Download a file
aws --endpoint-url http://localhost:8000 s3 cp \
    s3://my-repo/main/data/file.csv ./file.csv

# List objects on a branch
aws --endpoint-url http://localhost:8000 s3 ls s3://my-repo/main/data/

# Get object metadata
aws --endpoint-url http://localhost:8000 s3api head-object \
    --bucket my-repo --key main/data/file.csv
