Environment:Run llama Llama index Fsspec Remote Storage
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, Storage |
| Last Updated | 2026-02-11 19:00 GMT |
Overview
Fsspec-based remote filesystem environment for persisting LlamaIndex storage contexts and ingestion pipeline state to cloud storage (S3, GCS, Azure Blob).
Description
LlamaIndex uses fsspec (filesystem specification) as an abstraction layer for file I/O operations. This allows `StorageContext.persist()` and `IngestionPipeline.persist()` to write to remote filesystems (S3, GCS, Azure Blob Storage, etc.) transparently. The fsspec dependency is included in the core package, but protocol-specific implementations (like `s3fs` for S3) must be installed separately.
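As a rough illustration of the `fs` parameter described above, the sketch below wraps the persist and load calls for a storage context. It assumes `llama-index-core` is installed. The helper names `persist_index` and `load_index` are hypothetical, and `fs` can be any fsspec-compatible filesystem object (for example `s3fs.S3FileSystem()`).

```python
def persist_index(index, fs, persist_dir):
    """Write the index's storage context to `persist_dir` on `fs`."""
    # `fs` is any fsspec filesystem object; when it is omitted,
    # persist() writes to the local filesystem instead.
    index.storage_context.persist(persist_dir=persist_dir, fs=fs)


def load_index(fs, persist_dir):
    """Rebuild an index from a storage context persisted on `fs`."""
    # Imported lazily so this module can be imported without llama-index.
    from llama_index.core import StorageContext, load_index_from_storage

    storage_context = StorageContext.from_defaults(persist_dir=persist_dir, fs=fs)
    return load_index_from_storage(storage_context)
```

With S3, a call might look like `persist_index(index, s3fs.S3FileSystem(), "my-bucket/index")`, where the bucket name is a placeholder.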
Usage
Use this environment when you need to persist or load index data, storage contexts, or pipeline state to or from remote cloud storage instead of the local filesystem. It is required when deploying LlamaIndex in cloud environments or when sharing indexes across machines.
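One way to share ingestion pipeline state across machines is to load any previously persisted state before a run and persist again afterwards. The sketch below assumes a `pipeline` object exposing `load`, `run`, and `persist` in the way `IngestionPipeline` does. The function name `run_with_remote_state` is hypothetical.

```python
def run_with_remote_state(pipeline, documents, fs, persist_dir):
    """Run a pipeline, reusing and then updating state stored on `fs`."""
    # Reuse the cache/docstore from an earlier run if one was persisted.
    if fs.exists(persist_dir):
        pipeline.load(persist_dir=persist_dir, fs=fs)

    nodes = pipeline.run(documents=documents)

    # Write the updated state back so other machines can pick it up.
    pipeline.persist(persist_dir=persist_dir, fs=fs)
    return nodes
```

The `fs.exists` guard keeps the first run from failing when nothing has been persisted yet.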
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| Network | Access to target cloud storage | S3, GCS, or Azure endpoints |
| Account | Cloud provider credentials | Provider-specific auth |
Dependencies
Python Packages
- `fsspec` >= 2023.5.0 (included in llama-index-core)
- `s3fs` (for Amazon S3)
- `gcsfs` (for Google Cloud Storage)
- `adlfs` (for Azure Blob/Data Lake)
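Because the protocol-specific packages are optional, it can help to check which ones are importable before attempting a remote persist. This is a minimal sketch using only the standard library. The function name `missing_backends` and the `BACKEND_PACKAGES` mapping are hypothetical.

```python
from importlib.util import find_spec

# Backend label -> package that provides the fsspec implementation.
BACKEND_PACKAGES = {"s3": "s3fs", "gcs": "gcsfs", "azure": "adlfs"}


def missing_backends(packages=None):
    """Return the package names that cannot be imported in this environment."""
    if packages is None:
        packages = BACKEND_PACKAGES.values()
    # find_spec returns None when a top-level module is not installed.
    return [name for name in packages if find_spec(name) is None]
```

Calling `missing_backends(["s3fs"])` returns `["s3fs"]` when the package is absent and an empty list otherwise.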
Credentials
Credentials depend on the target storage backend:
- `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY`: For S3 storage via s3fs
- `GOOGLE_APPLICATION_CREDENTIALS`: For GCS storage via gcsfs
- `AZURE_STORAGE_CONNECTION_STRING`: For Azure Blob via adlfs
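A simple preflight check can report which of the environment variables listed above are unset for a given backend. The sketch below is illustrative only: the names `REQUIRED_ENV` and `missing_credentials` are hypothetical, and it checks only these variables, not any other authentication mechanism a provider may support.

```python
import os

# Environment variables from the list above, keyed by backend label.
REQUIRED_ENV = {
    "s3": ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"),
    "gcs": ("GOOGLE_APPLICATION_CREDENTIALS",),
    "azure": ("AZURE_STORAGE_CONNECTION_STRING",),
}


def missing_credentials(backend, env=None):
    """Return the variables for `backend` that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV[backend] if not env.get(name)]
```

An empty result means every listed variable for that backend is set. A non-empty result names the ones to configure.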
Quick Install
# For S3 support
pip install s3fs
# For Google Cloud Storage
pip install gcsfs
# For Azure Blob Storage
pip install adlfs
Code Evidence
Fsspec dependency from `pyproject.toml:60`:
"fsspec>=2023.5.0",
StorageContext uses fsspec for persist/load operations in `storage/storage_context.py`, enabling transparent remote filesystem support through the `fs` parameter.
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `ModuleNotFoundError: No module named 's3fs'` | S3 filesystem not installed | `pip install s3fs` |
| `NoCredentialsError` | AWS credentials not configured | Set `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`, or configure another credential source such as a shared credentials file or an IAM role |
| `FileNotFoundError` on remote path | Bucket/container does not exist | Create the target bucket/container first |
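The missing-bucket case in the table above can be caught early with an explicit existence check before persisting. This is a sketch that assumes `fs` exposes an `exists` method, as fsspec filesystems do. The helper name `require_remote_root` is hypothetical.

```python
def require_remote_root(fs, root):
    """Raise a descriptive error if `root` does not exist on `fs`."""
    if not fs.exists(root):
        raise FileNotFoundError(
            f"Remote root {root!r} does not exist; "
            "create the bucket/container before persisting."
        )
    return root
```

Running this check first turns a failure deep inside a persist call into an error that names the missing target.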
Compatibility Notes
- Local Fallback: All persist/load operations default to local filesystem when no fsspec filesystem is specified.
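- Protocol Detection: fsspec itself can infer the protocol from a URL scheme (e.g., `s3://`, `gs://`, `abfs://`). Per the Local Fallback note above, persist/load calls still use the local filesystem unless a filesystem object is passed through the `fs` parameter.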
- Windows: A comment in the ingestion pipeline source (pipeline.py:326) states that it "doesn't support Windows here" for certain filesystem path operations.
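The protocol detection and local fallback notes above can be combined by letting fsspec resolve a location string into a filesystem object and a path. This is a minimal sketch that assumes `fsspec` is installed. The wrapper name `resolve_fs` is hypothetical, while `fsspec.core.url_to_fs` is part of fsspec.

```python
from fsspec.core import url_to_fs


def resolve_fs(location):
    """Return (filesystem, path) for a URL or a plain local path."""
    # url_to_fs selects the implementation from the URL scheme; a location
    # without a scheme resolves to the local filesystem.
    fs, path = url_to_fs(location)
    return fs, path
```

The returned pair can then be passed on as `persist(persist_dir=path, fs=fs)`. Resolving an `s3://` location this way still requires `s3fs` to be installed.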