Implementation: Treeverse LakeFS S3 Client Setup
| Knowledge Sources | |
|---|---|
| Domains | S3_Compatibility, REST_API |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Wrapper for AWS SDK and Minio client initialization configured to communicate with the lakeFS S3 gateway.
Description
This implementation wraps the initialization of standard S3 client libraries (AWS SDK for Go v2, Minio Go client) so that they point at the lakeFS S3 gateway instead of AWS S3. The critical configuration parameters are:
- Endpoint: The lakeFS S3 gateway URL (same host as the lakeFS API, typically port 8000)
- Force path-style: must be `true` (lakeFS does not support virtual-hosted-style bucket addressing)
- Credentials: lakeFS access key ID and secret access key
External dependencies:
- `github.com/minio/minio-go/v7` -- Minio Go client library
- `github.com/aws/aws-sdk-go-v2/service/s3` -- AWS SDK for Go v2
Usage
Use this implementation when:
- Setting up an S3 client in application code (Python boto3, Go AWS SDK, Java AWS SDK)
- Configuring a Minio client to interact with lakeFS
- Initializing Spark, Hive, or Presto with S3A filesystem settings pointing at lakeFS
Code Reference
Source Location
- File: `esti/s3_gateway_test.go`
- Lines: L47-71
- Functions: `newMinioClient` (L47), `createS3Client` (L65)
Signature
```go
// newMinioClient creates a Minio client configured for the lakeFS S3 gateway.
// getCredentials selects the signing method (V2 or V4).
func newMinioClient(t *testing.T, getCredentials GetCredentials) *minio.Client {
	accessKeyID := viper.GetString("access_key_id")
	secretAccessKey := viper.GetString("secret_access_key")
	endpoint := viper.GetString("s3_endpoint")
	endpointSecure := viper.GetBool("s3_endpoint_secure")
	creds := getCredentials(accessKeyID, secretAccessKey, "")
	clt, err := minio.New(endpoint, &minio.Options{
		Creds:  creds,
		Secure: endpointSecure,
	})
	if err != nil {
		t.Fatalf("minio.New: %s", err)
	}
	return clt
}

// createS3Client creates an AWS SDK v2 S3 client configured for lakeFS.
func createS3Client(endpoint string, t *testing.T) *s3.Client {
	accessKeyID := viper.GetString("access_key_id")
	secretAccessKey := viper.GetString("secret_access_key")
	s3Client, err := testutil.SetupTestS3Client(endpoint, accessKeyID, secretAccessKey, true)
	require.NoError(t, err, "failed creating s3 client")
	return s3Client
}
```
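Both Go helpers pull the endpoint and credentials from configuration (via viper) rather than hard-coding them. A Python sketch of the same pattern, reading from environment variables (the variable names here are illustrative, not defined by lakeFS):

```python
import os

def lakefs_client_kwargs():
    """Collect lakeFS S3 gateway settings from the environment.

    Mirrors the Go helpers above, which read the same keys through viper.
    LAKEFS_S3_ENDPOINT, LAKEFS_ACCESS_KEY_ID, and LAKEFS_SECRET_ACCESS_KEY
    are assumed names for this sketch.
    """
    return {
        "endpoint_url": os.environ.get("LAKEFS_S3_ENDPOINT", "http://localhost:8000"),
        "aws_access_key_id": os.environ["LAKEFS_ACCESS_KEY_ID"],
        "aws_secret_access_key": os.environ["LAKEFS_SECRET_ACCESS_KEY"],
    }
```

The returned dict can be splatted straight into `boto3.client("s3", **lakefs_client_kwargs())`.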
Import
```python
import boto3
```
I/O Contract
Inputs
| Parameter | Type | Required | Description |
|---|---|---|---|
| `endpoint_url` | string | Yes | The lakeFS S3 gateway URL (e.g., `http://localhost:8000`) |
| `aws_access_key_id` | string | Yes | lakeFS access key ID |
| `aws_secret_access_key` | string | Yes | lakeFS secret access key |
| `force_path_style` | boolean | Yes | Must be `true` for lakeFS |
| `region` | string | No | Any valid region string (lakeFS ignores this); defaults to `us-east-1` |
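The contract in the table above can be expressed as a small validation helper (an illustrative sketch, not part of lakeFS or boto3):

```python
def validate_s3_params(params):
    """Check a parameter dict against the I/O contract above.

    Required keys must be present and non-empty, force_path_style must
    be True, and region defaults to us-east-1 when omitted.
    """
    required = ("endpoint_url", "aws_access_key_id", "aws_secret_access_key")
    missing = [k for k in required if not params.get(k)]
    if missing:
        raise ValueError(f"missing required parameters: {missing}")
    if params.get("force_path_style") is not True:
        raise ValueError("force_path_style must be True for lakeFS")
    params.setdefault("region", "us-east-1")
    return params
```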
Outputs
| Output | Type | Description |
|---|---|---|
| S3 client instance | Client object | Configured S3 client ready for use with lakeFS |
Usage Examples
Python boto3: Client setup
```python
import boto3
from botocore.config import Config

s3 = boto3.client(
    's3',
    endpoint_url='http://localhost:8000',
    aws_access_key_id='AKIAIOSFDNN7EXAMPLEQ',
    aws_secret_access_key='wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY',
    config=Config(s3={'addressing_style': 'path'}),  # lakeFS requires path-style
)

# Verify the connection by listing repositories (buckets)
response = s3.list_buckets()
for bucket in response['Buckets']:
    print(bucket['Name'])
```
Python boto3: Resource-style setup
```python
import boto3

s3_resource = boto3.resource(
    's3',
    endpoint_url='http://localhost:8000',
    aws_access_key_id='AKIAIOSFDNN7EXAMPLEQ',
    aws_secret_access_key='wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY',
)

bucket = s3_resource.Bucket('my-repo')
for obj in bucket.objects.filter(Prefix='main/'):
    print(obj.key)
```
Spark: S3A filesystem configuration
```python
from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .config("spark.hadoop.fs.s3a.endpoint", "http://localhost:8000") \
    .config("spark.hadoop.fs.s3a.access.key", "AKIAIOSFDNN7EXAMPLEQ") \
    .config("spark.hadoop.fs.s3a.secret.key", "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY") \
    .config("spark.hadoop.fs.s3a.path.style.access", "true") \
    .getOrCreate()

# Now Spark can read from lakeFS branches
df = spark.read.parquet("s3a://my-repo/main/data/")
```
AWS CLI: Configuration
```shell
# Configure AWS CLI profile for lakeFS
aws configure --profile lakefs
# AWS Access Key ID: AKIAIOSFDNN7EXAMPLEQ
# AWS Secret Access Key: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
# Default region name: us-east-1

# Use the profile with the lakeFS endpoint
aws --endpoint-url http://localhost:8000 --profile lakefs s3 ls s3://my-repo/main/
```
Related Pages
Implements Principle
Requires Environment