
Implementation:Kubeflow Pipelines Dsl Component Decorator

Last Updated 2026-02-13 00:00 GMT

Overview

Concrete tool from the KFP SDK for defining pipeline components as decorated Python functions or container specs. The @dsl.component and @dsl.container_component decorators are the primary entry points for authoring Kubeflow pipeline steps.

Description

The @dsl.component decorator transforms a Python function into a lightweight KFP component. The function body is serialized and executed inside a container at runtime. Dependencies beyond the base image are specified via the packages_to_install parameter.

The @dsl.container_component decorator defines a component that runs a custom container image. Instead of serializing the function body, the decorated function returns a dsl.ContainerSpec that specifies the image, command, and arguments.

Both decorators produce component specifications with typed I/O that the KFP compiler uses to build the pipeline graph (Argo Workflow or KFP v2 IR).

Usage

Import when defining pipeline steps:

from kfp import dsl
from kfp.dsl import Input, InputPath, Output, OutputPath, Dataset, Model, component
  • Use @dsl.component for pure Python logic that can be expressed in a self-contained function.
  • Use @dsl.container_component for container-based logic requiring custom images or non-Python tooling.

Code Reference

Source Location

  • Repository: kubeflow/pipelines
  • File: samples/tutorials/Data passing in python components/Data passing in python components - Files.py (lines 40--128 for component examples)
  • File: samples/core/sequential/sequential.py (lines 20--35 for container components)

Signature

@dsl.component(
    base_image: str = 'python:3.9',
    packages_to_install: Optional[List[str]] = None,
    kfp_package_path: Optional[str] = None,
)
def component_func(
    # Input parameters
    param: str,
    input_artifact: Input[Dataset],
    input_path: InputPath('Dataset'),
    # Output parameters
    output_artifact: Output[Dataset],
    output_path: OutputPath('Dataset'),
) -> Optional[ReturnType]:
    ...


@dsl.container_component
def container_func(
    param: str,
    output: dsl.OutputPath(str),
) -> dsl.ContainerSpec:
    return dsl.ContainerSpec(
        image='...',
        command=['...'],
        args=['...'],
    )

Import

from kfp import dsl
from kfp.dsl import Input, InputPath, Output, OutputPath, Dataset, Model, component

I/O Contract

Inputs

Input Contract
Name Type Description
base_image str (decorator param) Base container image for the component. Defaults to 'python:3.9'.
packages_to_install List[str] (decorator param) Python packages to pip-install at runtime before executing the function body.
kfp_package_path str (decorator param) Optional override path to the KFP SDK package.
Function parameters Typed via hints Each function parameter becomes a component input. Primitives (str, int, float, bool) are passed inline. Artifacts are passed via Input[T] or InputPath('T').

Outputs

Output Contract
Name Type Description
ComponentSpec Pipeline graph node At compile time, the decorator produces a component specification embedded in the pipeline IR (Intermediate Representation).
Return value Typed via return hint If the function has a return type annotation, the returned value becomes a named output parameter.
Artifacts Output[T] / OutputPath At runtime, artifacts are written to the path provided by the Output[T].path handle or the OutputPath('T') string, and are automatically tracked by the metadata store.

Usage Examples

Example 1: Python Component with Typed Artifacts

From the Data passing tutorial -- a preprocessing component that writes to both an artifact handle and an output path.

from kfp.dsl import component, Output, OutputPath, Dataset


@component
def preprocess(
    message: str,
    output_dataset_one: Output[Dataset],
    output_dataset_two_path: OutputPath('Dataset'),
):
    with open(output_dataset_one.path, 'w') as f:
        f.write(message)

    with open(output_dataset_two_path, 'w') as f:
        f.write(message)
  • output_dataset_one is an Output[Dataset] handle -- the runtime provides a .path attribute pointing to a writable location.
  • output_dataset_two_path is an OutputPath('Dataset') -- the runtime passes the writable path directly as a string.

Example 2: Container Component

From the sequential pipeline sample -- a container component that downloads a file from GCS.

from kfp import dsl


@dsl.container_component
def gcs_download_op(url: str, output: dsl.OutputPath(str)):
    return dsl.ContainerSpec(
        image='google/cloud-sdk:279.0.0',
        command=['sh', '-c'],
        args=['gsutil cat $0 | tee $1', url, output],
    )
  • The function returns a dsl.ContainerSpec rather than executing Python logic directly.
  • The output parameter is an OutputPath(str) -- the runtime substitutes the actual output file path at execution time.
