Implementation:Spotify Luigi CascadingClient
Overview
CascadingClient is a filesystem client abstraction in the luigi.contrib.target module that implements a fallback chain pattern for filesystem operations. When a filesystem method is called, it tries each configured client in order. If a client raises an exception (other than a FileSystemException), the call cascades to the next client in the chain. This enables transparent failover between different filesystem backends (e.g., HDFS to local filesystem).
Source Location
| Property | Value |
|---|---|
| Source File | luigi/contrib/target.py |
| Lines of Code | 75 |
| Module | luigi.contrib.target |
| Domain | File_System, Abstraction |
Import Statement
from luigi.contrib.target import CascadingClient
Class: CascadingClient
CascadingClient
A filesystem client that cascades failing method calls through an ordered list of client backends.
Class Constants
| Constant | Type | Description |
|---|---|---|
| ALL_METHOD_NAMES | list | The default set of filesystem method names that are proxied through the cascade chain. |
The full list of methods in ALL_METHOD_NAMES:
['exists', 'rename', 'remove', 'chmod', 'chown', 'count', 'copy', 'get', 'put', 'mkdir', 'list', 'listdir', 'getmerge', 'isdir', 'rename_dont_move', 'touchz']
Constructor
CascadingClient.__init__(self, clients, method_names=None)
| Parameter | Type | Default | Description |
|---|---|---|---|
| clients | list | (required) | An ordered list of filesystem client objects. Methods are tried on each client in order. |
| method_names | list or None | None (uses ALL_METHOD_NAMES) | Optional list of method names to create on this instance. If None, all methods from ALL_METHOD_NAMES are created. |
During construction, for each method name in the list, a bound method is dynamically created on the instance using types.MethodType. Each generated method delegates to _chained_call().
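The binding step described above can be sketched in isolation. This is a minimal illustration, not Luigi's code: `MiniCascade`, `DEFAULT_METHOD_NAMES`, and `_dispatch` are hypothetical stand-ins, and `_dispatch` only echoes its arguments instead of delegating to real clients.

```python
import types


class MiniCascade:
    """Hypothetical stand-in used only to illustrate per-instance method binding."""

    DEFAULT_METHOD_NAMES = ['exists', 'remove']

    def __init__(self, method_names=None):
        if method_names is None:
            method_names = self.DEFAULT_METHOD_NAMES
        for name in method_names:
            func = self._make_method(name)
            # Bind the plain function to this instance so `self` is supplied automatically.
            setattr(self, name, types.MethodType(func, self))

    @classmethod
    def _make_method(cls, method_name):
        # Each call creates a new function that remembers its own method_name.
        def new_method(self, *args, **kwargs):
            return self._dispatch(method_name, *args, **kwargs)
        return new_method

    def _dispatch(self, method_name, *args, **kwargs):
        # Placeholder for the real delegation: report what was requested.
        return (method_name, args, kwargs)


full = MiniCascade()
limited = MiniCascade(method_names=['exists'])
```

Because the methods are attached to the instance rather than the class, `limited` exposes only `exists`; accessing `limited.remove` would raise `AttributeError`.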
Methods
| Method | Signature | Description |
|---|---|---|
| _chained_call | _chained_call(self, method_name, *args, **kwargs) | Iterates through self.clients and calls getattr(client, method_name)(*args, **kwargs) on each. Returns the result from the first client that succeeds. Re-raises FileSystemException immediately, because semantic errors must propagate. For any other exception, logs a warning and tries the next client; if the last client also fails, its exception is re-raised. |
| _make_method | _make_method(cls, method_name) (classmethod) | Factory that creates a function which delegates to _chained_call with the given method name. |
Dynamically Generated Methods
After construction, the instance has the following methods (assuming default method_names):
| Method | Signature | Description |
|---|---|---|
| exists | exists(self, *args, **kwargs) | Check if a path exists. |
| rename | rename(self, *args, **kwargs) | Rename a path. |
| remove | remove(self, *args, **kwargs) | Remove a path. |
| chmod | chmod(self, *args, **kwargs) | Change file permissions. |
| chown | chown(self, *args, **kwargs) | Change file ownership. |
| count | count(self, *args, **kwargs) | Count items at a path. |
| copy | copy(self, *args, **kwargs) | Copy a path. |
| get | get(self, *args, **kwargs) | Get/download a file. |
| put | put(self, *args, **kwargs) | Put/upload a file. |
| mkdir | mkdir(self, *args, **kwargs) | Create a directory. |
| list | list(self, *args, **kwargs) | List directory contents. |
| listdir | listdir(self, *args, **kwargs) | List directory entries. |
| getmerge | getmerge(self, *args, **kwargs) | Merge and download files. |
| isdir | isdir(self, *args, **kwargs) | Check if path is a directory. |
| rename_dont_move | rename_dont_move(self, *args, **kwargs) | Rename without moving into destination directory. |
| touchz | touchz(self, *args, **kwargs) | Create an empty file. |
Cascading Behavior
The cascading logic in _chained_call follows these rules:
- For each client in order, attempt the method call.
- If the call succeeds, return the result immediately.
- If a FileSystemException is raised, re-raise it immediately. These are semantic errors (e.g., "file not found") that should not be silently retried.
- If any other exception is raised:
  - If there are more clients to try, log a warning and continue to the next client.
  - If this was the last client, re-raise the exception.
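The rules above can be sketched as a standalone function. This is an illustrative approximation under stated assumptions, not the library's implementation: `SemanticError` stands in for luigi.target.FileSystemException, `chained_call` and the three client classes are hypothetical, and the sketch catches `Exception`, which may be narrower than what Luigi itself catches.

```python
import logging

logger = logging.getLogger(__name__)


class SemanticError(Exception):
    """Hypothetical stand-in for luigi.target.FileSystemException."""


def chained_call(clients, method_name, *args, **kwargs):
    """Try method_name on each client in order, following the cascading rules."""
    for i, client in enumerate(clients):
        try:
            # Success: return the first result obtained.
            return getattr(client, method_name)(*args, **kwargs)
        except SemanticError:
            # Semantic errors propagate immediately; no fallback is attempted.
            raise
        except Exception:
            if i + 1 >= len(clients):
                # Last client failed too: surface its exception.
                raise
            logger.warning('%s failed to %s, falling back to %s',
                           type(client).__name__, method_name,
                           type(clients[i + 1]).__name__)


class BrokenClient:
    """Hypothetical client whose backend is unreachable."""
    def exists(self, path):
        raise ConnectionError('backend unreachable')


class StrictClient:
    """Hypothetical client that reports a semantic error."""
    def exists(self, path):
        raise SemanticError('not a directory')


class WorkingClient:
    """Hypothetical client that always succeeds."""
    def exists(self, path):
        return True
```

With these stand-ins, `chained_call([BrokenClient(), WorkingClient()], 'exists', '/x')` falls through to the second client, while putting `StrictClient()` first raises `SemanticError` without consulting the fallback.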
Usage Example
```python
from luigi.contrib.hdfs import HdfsClient
from luigi.contrib.target import CascadingClient
from luigi.local_target import LocalFileSystem

# Create a cascading client that tries HDFS first, then falls back to the local filesystem
hdfs_client = HdfsClient()
local_client = LocalFileSystem()
client = CascadingClient([hdfs_client, local_client])

# Tries hdfs_client.exists() first; if that raises anything other than
# FileSystemException, tries local_client.exists()
if client.exists('/data/output/result.csv'):
    print("File found")

# Only proxy specific methods
limited_client = CascadingClient(
    [hdfs_client, local_client],
    method_names=['exists', 'get', 'put']
)
```
Dependencies
- Python standard library: logging, types.MethodType
- Luigi core: luigi.target.FileSystemException
See Also
- luigi.target.FileSystem - Base filesystem interface
- luigi.target.FileSystemException - Exception type that short-circuits cascading