Implementation:Kubeflow Pipelines Kubernetes PVC Sample
| Knowledge Sources | |
|---|---|
| Domains | Pipeline_Sample, Kubernetes, Storage |
| Last Updated | 2026-02-13 14:00 GMT |
Overview
Sample pipeline demonstrating the full lifecycle (create, mount, delete) of a Kubernetes PersistentVolumeClaim (PVC) within a KFP v2 pipeline, using the `kfp-kubernetes` extension package.
Description
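This sample (62 lines) creates, mounts, and deletes a PVC within a single pipeline. It defines two components: `make_data` writes a file to a volume mount, and `read_data` reads it back. The pipeline creates a 5Gi PVC with ReadWriteOnce access via `kubernetes.CreatePVC` and mounts it into both tasks with `kubernetes.mount_pvc`, at `/data` for `make_data` and at `/reused_data` for `read_data`. Because the two tasks exchange data through the volume rather than through component inputs and outputs, there is no data dependency between them from which KFP could infer their order. The pipeline therefore sets the order explicitly with `.after()`: `read_data` runs after `make_data`, and `kubernetes.DeletePVC` cleans up after `read_data`.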
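
The write-then-read handoff described above can be exercised without a cluster. The following is a minimal local sketch, not part of the sample: a temporary directory stands in for the PVC, and `write_data` and `load_data` are hypothetical helpers that mirror the bodies of `make_data` and `read_data` with the mount path passed in as an argument.

```python
import tempfile
from pathlib import Path


def write_data(mount_path: str) -> None:
    """Mirror of make_data's body, with the mount path passed in."""
    with open(Path(mount_path) / "file.txt", "w") as f:
        f.write("Hello from make_data")


def load_data(mount_path: str) -> str:
    """Mirror of read_data's body; returns the text instead of printing it."""
    with open(Path(mount_path) / "file.txt", "r") as f:
        return f.read()


# The temporary directory plays the role of the PVC. Both steps see the same
# storage; in the real pipeline each task mounts it at its own path.
with tempfile.TemporaryDirectory() as volume:
    write_data(volume)          # step 1: producer writes to the volume
    result = load_data(volume)  # step 2: consumer reads, strictly after step 1

# Leaving the `with` block removes the directory, loosely analogous to DeletePVC.
print(result)
```

The sketch only checks the file logic. It says nothing about scheduling, storage classes, or access modes, which still need a real cluster to verify.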
Usage
Reference this sample when building pipelines that require volume-based data sharing between steps, especially for large datasets where artifact-based passing is impractical.
Code Reference
Source Location
- Repository: Kubeflow_Pipelines
- File: samples/core/kubernetes_pvc/kubernetes_pvc.py
- Lines: 1-62
Signature
@dsl.component
def make_data():
    """Writes data to /data/file.txt on the mounted PVC."""

@dsl.component
def read_data():
    """Reads data from /reused_data/file.txt on the mounted PVC."""

@dsl.pipeline
def my_pipeline():
    """Pipeline: CreatePVC -> make_data -> read_data -> DeletePVC."""
Import
from kfp import dsl
from kfp import kubernetes
I/O Contract
Inputs
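| Name | Type | Required | Description |
|---|---|---|---|
| `size` | string | Yes | Storage size passed to `kubernetes.CreatePVC`; hardcoded to "5Gi" in the sample |
| `access_modes` | list of strings | Yes | PVC access modes passed to `kubernetes.CreatePVC`; hardcoded to ["ReadWriteOnce"] in the sample |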
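
The size string uses Kubernetes quantity notation, where `Gi` is a binary (power-of-two) suffix, so "5Gi" means 5 × 2^30 bytes. As a rough illustration, the sketch below converts such a string to bytes, for example to check that a dataset will fit before picking a size. `quantity_to_bytes` is a hypothetical helper, not part of `kfp` or `kfp-kubernetes`, and it handles only whole numbers with a binary suffix rather than the full quantity grammar.

```python
# Binary (power-of-two) suffixes used by Kubernetes resource quantities.
BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def quantity_to_bytes(quantity: str) -> int:
    """Convert a whole-number binary quantity such as "5Gi" to bytes.

    Hypothetical helper: decimal suffixes (k, M, G, ...) and fractional
    values are deliberately not supported here.
    """
    for suffix, factor in BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    raise ValueError(f"unsupported quantity: {quantity!r}")


pvc_bytes = quantity_to_bytes("5Gi")
print(pvc_bytes)  # 5368709120
```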
Outputs
| Name | Type | Description |
|---|---|---|
| Compiled YAML | file | Pipeline IR YAML for submission to KFP |
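
The listing on this page does not show how the YAML is produced. One way to do it, sketched here under stated assumptions, is the KFP SDK compiler. `compile_pipeline` is a hypothetical wrapper and the output file name `kubernetes_pvc.yaml` is an assumption. Only `compiler.Compiler().compile(pipeline_func=..., package_path=...)` is the SDK call.

```python
from typing import Callable


def compile_pipeline(pipeline_func: Callable,
                     package_path: str = "kubernetes_pvc.yaml") -> str:
    """Compile a @dsl.pipeline function to IR YAML and return the output path."""
    # Imported inside the function so the helper can be defined even where the
    # kfp SDK is not installed; calling it does require `kfp`.
    from kfp import compiler

    compiler.Compiler().compile(
        pipeline_func=pipeline_func,
        package_path=package_path,
    )
    return package_path
```

Calling `compile_pipeline(my_pipeline)` in an environment with both `kfp` and `kfp-kubernetes` installed should write the IR YAML to the working directory. Compilation runs locally and does not need a cluster.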
Usage Examples
Full Pipeline Code
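from kfp import dsl
from kfp import kubernetes  # provided by the kfp-kubernetes package


@dsl.component
def make_data():
    # Write to the path where the pipeline mounts the PVC for this task.
    with open("/data/file.txt", "w") as f:
        f.write("Hello from make_data")


@dsl.component
def read_data():
    # Same volume, mounted at a different path for this task.
    with open("/reused_data/file.txt", "r") as f:
        print(f.read())


@dsl.pipeline
def my_pipeline():
    # Create the PVC; its generated name is exposed as the "name" output.
    pvc = kubernetes.CreatePVC(
        pvc_name_suffix="-my-pvc",
        size="5Gi",
        access_modes=["ReadWriteOnce"],
    )

    task1 = make_data()
    kubernetes.mount_pvc(task1, pvc_name=pvc.outputs["name"], mount_path="/data")

    # No data dependency links the two tasks, so order them explicitly.
    task2 = read_data().after(task1)
    kubernetes.mount_pvc(task2, pvc_name=pvc.outputs["name"], mount_path="/reused_data")

    # Delete the PVC only after the last consumer has finished.
    kubernetes.DeletePVC(pvc_name=pvc.outputs["name"]).after(task2)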