Environment: Lance Cloud Storage Credentials
| Field | Value |
|---|---|
| Domains | Infrastructure, Cloud_Storage |
| Last Updated | 2026-02-08 19:00 GMT |
Overview
Cloud storage credential and configuration environment for Lance datasets on AWS S3, Google Cloud Storage, Azure Blob Storage, and other object stores.
Description
Lance supports reading and writing datasets to multiple cloud storage backends via the `object_store` crate (v0.12.3) and `opendal` (v0.55). Each backend requires specific environment variables or storage options for authentication and configuration. Lance also supports S3-compatible services (MinIO, LocalStack), the Hugging Face Hub, Tencent Cloud COS, and Alibaba Cloud OSS.
Usage
This environment is required whenever a Lance dataset URI points to a cloud storage location (e.g., `s3://bucket/path`, `gs://bucket/path`, `az://container/path`). All read, write, scan, index, and optimization operations on cloud-hosted datasets require these credentials. Local filesystem datasets (`file:///path` or plain paths) do not require this environment.
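In practice, whether these credentials are needed comes down to the scheme of the dataset URI. The helper below illustrates that distinction; it is an illustrative sketch, not part of the Lance API:

```python
from urllib.parse import urlparse

# URI schemes from the examples above that require cloud credentials.
CLOUD_SCHEMES = {"s3", "gs", "az"}

def needs_cloud_credentials(uri: str) -> bool:
    """Return True when a Lance dataset URI points at a cloud object store."""
    return urlparse(uri).scheme in CLOUD_SCHEMES

print(needs_cloud_credentials("s3://bucket/path"))      # True
print(needs_cloud_credentials("file:///data/x.lance"))  # False
print(needs_cloud_credentials("/data/x.lance"))         # False
```

Plain paths and `file://` URIs fall through to the local filesystem backend and skip the credential chain entirely.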
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| OS | Linux, macOS, or Windows | All platforms supported |
| Network | Internet access to cloud provider | Required for cloud storage operations |
| Disk | Minimal | Cloud storage is remote; local cache optional |
Dependencies
Rust Feature Flags
Lance cloud storage backends are controlled by Cargo feature flags (all except `dynamodb` are enabled by default):
- `aws` — Amazon S3 support
- `gcp` — Google Cloud Storage support
- `azure` — Azure Blob Storage support
- `oss` — Alibaba Cloud OSS support
- `tencent` — Tencent Cloud COS support
- `huggingface` — Hugging Face Hub support
- `dynamodb` — DynamoDB-based commit locking for S3
System Packages
- `libssl-dev` — Required for TLS connections to cloud providers
Credentials
IMPORTANT: Never store actual secret values in code or documentation.
AWS S3
- `AWS_ACCESS_KEY_ID` — AWS access key for authentication
- `AWS_SECRET_ACCESS_KEY` — AWS secret key for authentication
- `AWS_SESSION_TOKEN` — Optional session token for temporary credentials
- `AWS_PROFILE` — AWS profile name for SSO or named profiles
- `AWS_DEFAULT_REGION` — AWS region (e.g., `us-east-1`)
- `AWS_ENDPOINT` — Custom S3-compatible endpoint URL
Google Cloud Storage
- `GOOGLE_SERVICE_ACCOUNT` — Path to service account JSON file
- `GOOGLE_APPLICATION_CREDENTIALS` — Path to application credentials JSON
- `HTTP1_ONLY` — Set to `false` to enable HTTP/2 (defaults to HTTP/1.1)
Azure Blob Storage
- `AZURE_STORAGE_ACCOUNT_NAME` — Storage account name
- `AZURE_STORAGE_ACCOUNT_KEY` — Storage account key
- `AZURE_STORAGE_ALLOW_HTTP` — Allow non-TLS HTTP connections
- `AZURE_STORAGE_USE_HTTP` — Use HTTP instead of HTTPS
Storage Options (Programmatic)
These can be passed as key-value pairs in Lance API calls:
- `allow_http` — Allow non-TLS connections
- `download_retry_count` — Number of times to retry a failed download (default: 3)
- `allow_invalid_certificates` — Skip certificate validation
- `connect_timeout` — Connection timeout (default: 5s)
- `request_timeout` — Request timeout (default: 30s)
- `client_max_retries` — S3 client retry count (default: 10)
- `client_retry_timeout` — S3 client retry timeout (default: 180s)
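In pylance, these options are passed as a mapping of string keys to string values (e.g. via the `storage_options` argument to `lance.dataset`). The sketch below only builds the mapping; the values and the commented-out call are illustrative, with defaults matching the list above:

```python
# Storage options are plain string key/value pairs; durations use "Ns" form.
storage_options = {
    "allow_http": "true",            # permit non-TLS endpoints (e.g. LocalStack)
    "download_retry_count": "3",
    "connect_timeout": "5s",
    "request_timeout": "30s",
    "client_max_retries": "10",
    "client_retry_timeout": "180s",
}
# import lance
# ds = lance.dataset("s3://bucket/path", storage_options=storage_options)
print(len(storage_options))  # 6
```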
Quick Install
# AWS S3 - set credentials
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export AWS_DEFAULT_REGION="us-east-1"
# Google Cloud Storage - set service account
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"
# Azure Blob Storage - set account credentials
export AZURE_STORAGE_ACCOUNT_NAME="your-account"
export AZURE_STORAGE_ACCOUNT_KEY="your-key"
# For local development with LocalStack (S3-compatible)
cd test_data && docker compose up -d
export AWS_ENDPOINT="http://localhost:4566"
export AWS_ACCESS_KEY_ID="ACCESS_KEY"
export AWS_SECRET_ACCESS_KEY="SECRET_KEY"
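A quick sanity check that the expected variables are actually set can save a confusing `NoCredentialProviders` error later. This helper is illustrative, not part of Lance:

```python
import os

# Variables the AWS credential chain looks for; illustrative helper only.
REQUIRED_AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")

def missing_aws_vars(env=None):
    """Return the names of required AWS variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_AWS_VARS if not env.get(name)]

# Checking an explicit dict instead of the live environment:
print(missing_aws_vars({"AWS_ACCESS_KEY_ID": "ACCESS_KEY"}))
# ['AWS_SECRET_ACCESS_KEY']
```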
Code Evidence
Default feature flags enabling cloud storage from `rust/lance/Cargo.toml`:
[features]
default = ["aws", "azure", "gcp", "oss", "huggingface", "tencent"]
S3 retry defaults from storage options configuration:
// client_max_retries default: 10
// client_retry_timeout default: 180s
// connect_timeout default: 5s
// request_timeout default: 30s
Docker Compose test services from `docker-compose.yml:1-17`:
services:
  localstack:
    image: localstack/localstack:4.0
    environment:
      - SERVICES=s3,dynamodb,kms
      - AWS_ACCESS_KEY_ID=ACCESS_KEY
      - AWS_SECRET_ACCESS_KEY=SECRET_KEY
    ports:
      - "4566:4566"
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `NoCredentialProviders: no valid providers in chain` | AWS credentials not configured | Set `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` |
| `Connection refused` on S3 operations | LocalStack not running or wrong endpoint | Run `docker compose up -d` in `test_data/` and set `AWS_ENDPOINT` |
| `InvalidSignature` | Credentials mismatch or clock skew | Verify credentials and system clock synchronization |
| `SSL certificate problem` | Self-signed cert in development | Set `allow_invalid_certificates` storage option |
Compatibility Notes
- DynamoDB Commit Lock: For concurrent S3 writers, enable the `dynamodb` feature and configure a DynamoDB table for commit coordination.
- S3-Compatible Services: MinIO, LocalStack, and other S3-compatible services work via the `AWS_ENDPOINT` environment variable.
- HTTP/2 for GCS: Google Cloud Storage defaults to HTTP/1.1. Set `HTTP1_ONLY=false` to enable HTTP/2.
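For the DynamoDB commit lock, the table is selected through the dataset URI itself. A hedged sketch of the URI shape (the bucket and table names are placeholders; the `s3+ddb` scheme and `ddbTableName` query parameter follow Lance's convention, so check the Lance docs for your version):

```python
# Placeholder names; using this URI requires the `dynamodb` feature,
# AWS credentials, and an existing DynamoDB table.
bucket = "my-bucket"
table = "lance-commit-lock"
uri = f"s3+ddb://{bucket}/dataset.lance?ddbTableName={table}"
print(uri)  # s3+ddb://my-bucket/dataset.lance?ddbTableName=lance-commit-lock
# import lance
# ds = lance.dataset(uri)
```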