Implementation:Kubeflow Pipelines Dsl Component Decorator
| Knowledge Sources | |
|---|---|
| Domains | |
| Last Updated | 2026-02-13 00:00 GMT |
Overview
The KFP SDK's concrete tool for defining pipeline components, either as decorated Python functions or as container specifications. The @dsl.component and @dsl.container_component decorators are the primary entry points for authoring Kubeflow pipeline steps.
Description
The @dsl.component decorator transforms a Python function into a lightweight KFP component. The function body is serialized and executed inside a container at runtime. Dependencies beyond the base image are specified via the packages_to_install parameter.
The @dsl.container_component decorator defines a component that runs a custom container image. Instead of serializing the function body, the decorated function returns a dsl.ContainerSpec that specifies the image, command, and arguments.
Both decorators produce component specifications with typed I/O that the KFP compiler uses to build the pipeline graph (Argo Workflow or KFP v2 IR).
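Because the body of a lightweight component is serialized and run elsewhere, names defined at module level in the authoring file are not available at runtime. The sketch below is only an analogy, not KFP's actual serialization mechanism: it executes function source text in an empty namespace to mimic that isolation. The names run_isolated and to_upper_json are hypothetical.

```python
# Analogy only: run function source in a fresh namespace, roughly the way a
# serialized component body runs without the author's module-level names.

BROKEN_SOURCE = '''
def to_upper_json(message: str) -> str:
    # Relies on a module-level "import json" that does not travel with the body.
    return json.dumps({"message": message.upper()})
'''

SELF_CONTAINED_SOURCE = '''
def to_upper_json(message: str) -> str:
    import json  # imported inside the body, so it travels with the function
    return json.dumps({"message": message.upper()})
'''


def run_isolated(source: str, func_name: str, *args):
    """Define the function in an empty namespace and call it."""
    namespace = {}
    exec(source, namespace)
    return namespace[func_name](*args)


try:
    run_isolated(BROKEN_SOURCE, "to_upper_json", "hi")
    broken_error = None
except NameError as exc:
    # The isolated body cannot see "json".
    broken_error = exc

result = run_isolated(SELF_CONTAINED_SOURCE, "to_upper_json", "hi")
```

The practical consequence is that imports used by a @dsl.component function belong inside the function body, and third-party packages they rely on belong in packages_to_install.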
Usage
Import when defining pipeline steps:
from kfp import dsl
from kfp.dsl import Input, InputPath, Output, OutputPath, Dataset, Model, component
- Use @dsl.component for pure Python logic that can be expressed in a self-contained function.
- Use @dsl.container_component for container-based logic requiring custom images or non-Python tooling.
Code Reference
Source Location
- Repository: kubeflow/pipelines
- File: samples/tutorials/Data passing in python components/Data passing in python components - Files.py (lines 40-128 for component examples)
- File: samples/core/sequential/sequential.py (lines 20-35 for container components)
Signature
@dsl.component(
    base_image: str = 'python:3.9',
    packages_to_install: List[str] = None,
    kfp_package_path: str = None,
)
def component_func(
    # Input parameters
    param: str,
    input_artifact: Input[Dataset],
    input_path: InputPath('Dataset'),
    # Output parameters
    output_artifact: Output[Dataset],
    output_path: OutputPath('Dataset'),
) -> Optional[ReturnType]:
    ...

@dsl.container_component
def container_func(
    param: str,
    output: dsl.OutputPath(str),
) -> dsl.ContainerSpec:
    return dsl.ContainerSpec(
        image='...',
        command=['...'],
        args=['...'],
    )
Import
from kfp import dsl
from kfp.dsl import Input, InputPath, Output, OutputPath, Dataset, Model, component
I/O Contract
Inputs
| Name | Type | Description |
|---|---|---|
| base_image | str (decorator param) | Base container image for the component. Defaults to 'python:3.9'. |
| packages_to_install | List[str] (decorator param) | Python packages to pip-install at runtime before executing the function body. |
| kfp_package_path | str (decorator param) | Optional override path to the KFP SDK package. |
| Function parameters | Typed via hints | Each function parameter becomes a component input. Primitives (str, int, float, bool) are passed inline. Artifacts are passed via Input[T] or InputPath('T'). |
Outputs
| Name | Type | Description |
|---|---|---|
| ComponentSpec | Pipeline graph node | At compile time, the decorator produces a component specification embedded in the pipeline IR (Intermediate Representation). |
| Return value | Typed via return hint | If the function has a return type annotation, the returned value becomes a named output parameter. |
| Artifacts | Output[T] / OutputPath | At runtime, artifacts are written to the path provided by the Output[T].path handle or the OutputPath('T') string, and are automatically tracked by the metadata store. |
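The contract above is driven by type hints: parameters become inputs and the return annotation becomes an output. As a rough illustration of that idea, and not of how the KFP decorator is actually implemented, the sketch below reads the same information from a plain function with the standard inspect module. The helper describe_io and the function train are hypothetical names.

```python
import inspect


def describe_io(func):
    """Return (inputs, output) type names read from a function's annotations."""
    signature = inspect.signature(func)
    # Every parameter is treated as an input, keyed by name.
    inputs = {
        name: param.annotation.__name__
        for name, param in signature.parameters.items()
    }
    returned = signature.return_annotation
    # A missing return annotation means there is no returned output.
    output = None if returned is inspect.Signature.empty else returned.__name__
    return inputs, output


def train(learning_rate: float, epochs: int, tag: str) -> str:
    return f"{tag}: lr={learning_rate}, epochs={epochs}"


inputs, output = describe_io(train)
```

Here all three parameters are primitives, so under the contract above they would be passed inline, and the str return hint would surface as an output parameter.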
Usage Examples
Example 1: Python Component with Typed Artifacts
From the Data passing tutorial -- a preprocessing component that writes to both an artifact handle and an output path.
from kfp.dsl import component, Output, OutputPath, Dataset

@component
def preprocess(
    message: str,
    output_dataset_one: Output[Dataset],
    output_dataset_two_path: OutputPath('Dataset'),
):
    with open(output_dataset_one.path, 'w') as f:
        f.write(message)
    with open(output_dataset_two_path, 'w') as f:
        f.write(message)
- output_dataset_one is an Output[Dataset] handle: the runtime provides a .path attribute pointing to a writable location.
- output_dataset_two_path is an OutputPath('Dataset'): the runtime passes the writable path directly as a string.
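The difference between the two output styles can be exercised locally without KFP. The sketch below copies the body of preprocess into an undecorated function (preprocess_body, a hypothetical name) and calls it with stand-ins: a SimpleNamespace with a .path attribute in place of the artifact handle, and a plain string in place of the output path. It assumes nothing about the real runtime beyond what is described above.

```python
import os
import tempfile
from types import SimpleNamespace


def preprocess_body(message, output_dataset_one, output_dataset_two_path):
    """Same logic as the preprocess example, without the KFP decorator."""
    with open(output_dataset_one.path, 'w') as f:
        f.write(message)
    with open(output_dataset_two_path, 'w') as f:
        f.write(message)


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the Output[Dataset] handle: only .path is needed here.
    handle = SimpleNamespace(path=os.path.join(tmp, 'one.txt'))
    # Stand-in for OutputPath('Dataset'): just a string path.
    path_two = os.path.join(tmp, 'two.txt')

    preprocess_body('hello', handle, path_two)

    with open(handle.path) as f:
        content_one = f.read()
    with open(path_two) as f:
        content_two = f.read()
```

Both files end up with the same content; only the way the function receives the destination differs.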
Example 2: Container Component
From the sequential pipeline sample -- a container component that downloads a file from GCS.
from kfp import dsl

@dsl.container_component
def gcs_download_op(url: str, output: dsl.OutputPath(str)):
    return dsl.ContainerSpec(
        image='google/cloud-sdk:279.0.0',
        command=['sh', '-c'],
        args=['gsutil cat $0 | tee $1', url, output],
    )
- The function returns a dsl.ContainerSpec rather than executing Python logic directly.
- The output parameter is an OutputPath(str): the runtime substitutes the actual output file path at execution time.
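The $0 and $1 in the args string are ordinary shell positional parameters: with sh -c, the first argument after the script string becomes $0 and the next becomes $1, so url lands in $0 and the output path in $1. The sketch below reproduces that argument layout locally on a POSIX system, substituting cat on a temporary file for gsutil cat so it runs without GCS access. The helper run_like_container is a hypothetical name.

```python
import os
import subprocess
import tempfile


def run_like_container(script: str, *positional: str) -> None:
    """Run `sh -c script arg0 arg1 ...`, i.e. command followed by args."""
    command = ["sh", "-c"]
    args = [script, *positional]
    # tee also echoes to stdout; discard that so only the file is inspected.
    subprocess.run(command + args, check=True, stdout=subprocess.DEVNULL)


with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "source.txt")   # plays the role of url
    output = os.path.join(tmp, "output.txt")   # plays the role of output
    with open(source, "w") as f:
        f.write("hello from source\n")

    # `cat` stands in for `gsutil cat`; $0 is source, $1 is output.
    run_like_container('cat "$0" | tee "$1"', source, output)

    with open(output) as f:
        copied = f.read()
```

Quoting "$0" and "$1" in the script is a defensive choice for paths containing spaces; the original sample leaves them unquoted.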