
Implementation:Treeverse LakeFS S3 Client Setup

From Leeroopedia


Knowledge Sources
Domains S3_Compatibility, REST_API
Last Updated 2026-02-08 00:00 GMT

Overview

Wrappers around AWS SDK and Minio client initialization, configured to communicate with the lakeFS S3 gateway.

Description

This implementation wraps the initialization of standard S3 client libraries (AWS SDK for Go v2, Minio Go client) so that they point at the lakeFS S3 gateway instead of AWS S3. The critical configuration parameters are:

  • Endpoint: The lakeFS S3 gateway URL (same host as the lakeFS API, typically port 8000)
  • Force path-style: Must be true (lakeFS does not support virtual-hosted-style bucket addressing)
  • Credentials: lakeFS access key ID and secret access key

External dependencies:

  • github.com/minio/minio-go/v7 -- Minio Go client library
  • github.com/aws/aws-sdk-go-v2/service/s3 -- AWS SDK for Go v2

Usage

Use this implementation when:

  • Setting up an S3 client in application code (Python boto3, Go AWS SDK, Java AWS SDK)
  • Configuring a Minio client to interact with lakeFS
  • Initializing Spark, Hive, or Presto with S3A filesystem settings pointing at lakeFS

Code Reference

Source Location

  • File: esti/s3_gateway_test.go
  • Lines: L47-71
  • Functions: newMinioClient (L47), createS3Client (L65)

Signature

// newMinioClient creates a Minio client configured for the lakeFS S3 gateway.
// getCredentials selects the signing method (V2 or V4).
func newMinioClient(t *testing.T, getCredentials GetCredentials) *minio.Client {
    accessKeyID := viper.GetString("access_key_id")
    secretAccessKey := viper.GetString("secret_access_key")
    endpoint := viper.GetString("s3_endpoint")
    endpointSecure := viper.GetBool("s3_endpoint_secure")
    creds := getCredentials(accessKeyID, secretAccessKey, "")
    clt, err := minio.New(endpoint, &minio.Options{
        Creds:  creds,
        Secure: endpointSecure,
    })
    if err != nil {
        t.Fatalf("minio.New: %s", err)
    }
    return clt
}

// createS3Client creates an AWS SDK v2 S3 client configured for lakeFS.
func createS3Client(endpoint string, t *testing.T) *s3.Client {
    accessKeyID := viper.GetString("access_key_id")
    secretAccessKey := viper.GetString("secret_access_key")
    s3Client, err := testutil.SetupTestS3Client(endpoint, accessKeyID, secretAccessKey, true)
    require.NoError(t, err, "failed creating s3 client")
    return s3Client
}

Import

import boto3


I/O Contract

Inputs

Parameter | Type | Required | Description
endpoint_url | string | Yes | The lakeFS S3 gateway URL (e.g., http://localhost:8000)
aws_access_key_id | string | Yes | lakeFS access key ID
aws_secret_access_key | string | Yes | lakeFS secret access key
force_path_style | boolean | Yes | Must be true for lakeFS
region | string | No | Any valid region string (lakeFS ignores this); defaults to us-east-1
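To see why path-style addressing is required: lakeFS serves every repository from a single gateway host, so the bucket name must appear in the URL path rather than in the hostname. A minimal sketch of the two URL shapes (pure string construction, no S3 library involved):

```python
# Path-style: bucket appears in the URL path -- what lakeFS expects.
def path_style_url(endpoint: str, bucket: str, key: str) -> str:
    return f"{endpoint.rstrip('/')}/{bucket}/{key}"

# Virtual-hosted-style: bucket appears in the hostname -- not supported by lakeFS.
def virtual_hosted_url(endpoint: str, bucket: str, key: str) -> str:
    scheme, host = endpoint.split("://", 1)
    return f"{scheme}://{bucket}.{host.rstrip('/')}/{key}"

print(path_style_url("http://localhost:8000", "my-repo", "main/data.csv"))
# http://localhost:8000/my-repo/main/data.csv
```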

Outputs

Output | Type | Description
S3 client instance | Client object | Configured S3 client ready for use with lakeFS

Usage Examples

Python boto3: Client setup

import boto3

s3 = boto3.client(
    's3',
    endpoint_url='http://localhost:8000',
    aws_access_key_id='AKIAIOSFDNN7EXAMPLEQ',
    aws_secret_access_key='wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY',
)

# Verify the connection by listing repositories (buckets)
response = s3.list_buckets()
for bucket in response['Buckets']:
    print(bucket['Name'])

Python boto3: Resource-style setup

import boto3

s3_resource = boto3.resource(
    's3',
    endpoint_url='http://localhost:8000',
    aws_access_key_id='AKIAIOSFDNN7EXAMPLEQ',
    aws_secret_access_key='wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY',
)

bucket = s3_resource.Bucket('my-repo')
for obj in bucket.objects.filter(Prefix='main/'):
    print(obj.key)
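Since the gateway exposes the repository as the bucket and the branch as the leading key component, a small hypothetical helper (not part of any lakeFS SDK) can make that convention explicit:

```python
# Hypothetical helper: build an S3 object key that addresses `path` on
# `branch` through the lakeFS S3 gateway (repository = bucket,
# branch = first key component).
def lakefs_key(branch: str, path: str) -> str:
    return f"{branch}/{path.lstrip('/')}"

key = lakefs_key("main", "data/2024/records.parquet")
print(key)  # main/data/2024/records.parquet
# s3_resource.Object("my-repo", key) would then address that path on branch 'main'
```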

Spark: S3A filesystem configuration

from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .config("spark.hadoop.fs.s3a.endpoint", "http://localhost:8000") \
    .config("spark.hadoop.fs.s3a.access.key", "AKIAIOSFDNN7EXAMPLEQ") \
    .config("spark.hadoop.fs.s3a.secret.key", "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY") \
    .config("spark.hadoop.fs.s3a.path.style.access", "true") \
    .getOrCreate()

# Now Spark can read from lakeFS branches
df = spark.read.parquet("s3a://my-repo/main/data/")

AWS CLI: Configuration

# Configure AWS CLI profile for lakeFS
aws configure --profile lakefs
# AWS Access Key ID: AKIAIOSFDNN7EXAMPLEQ
# AWS Secret Access Key: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
# Default region name: us-east-1

# Use the profile with the lakeFS endpoint
aws --endpoint-url http://localhost:8000 --profile lakefs s3 ls s3://my-repo/main/

Related Pages

Implements Principle

Requires Environment
