Implementation: Polars DataFrame Write Multi Format
| Knowledge Sources | |
|---|---|
| Domains | Data_Engineering, Data_Serialization, Storage_Optimization |
| Last Updated | 2026-02-09 10:00 GMT |
Overview
Concrete APIs for writing Polars DataFrames to CSV, Parquet, JSON, Excel, IPC, and database targets, including partitioned writes and streaming sinks.
Description
The DataFrame Write Multi Format APIs serialize DataFrames to various output formats and destinations. Each write_* method on DataFrame handles eager serialization, while sink_* methods on LazyFrame provide streaming output for large datasets. Parquet writes support Hive-style partitioning for optimized downstream query performance.
Usage
Import polars and call the appropriate write method on a DataFrame after all transformations are complete. For database writes, install the ADBC driver for the target database. For streaming writes of large datasets, use LazyFrame sink_* methods instead of materializing with .collect() first.
Code Reference
Source Location
- Repository: polars
- Files:
- docs/source/src/python/user-guide/io/csv.py (Lines: 13-14)
- docs/source/src/python/user-guide/io/parquet.py (Lines: 13-14)
Signature
# CSV write
DataFrame.write_csv(file: str | Path | IOBase | None = None) -> str | None
# Parquet write (with optional partitioning)
DataFrame.write_parquet(
    file: str | Path,
    partition_by: list[str] | None = None,
    compression: str = "zstd",
) -> None
# JSON write
DataFrame.write_json(file: str | Path | IOBase | None = None) -> str | None
# NDJSON write
DataFrame.write_ndjson(file: str | Path | IOBase | None = None) -> str | None
# Excel write
DataFrame.write_excel(
    workbook: str | Path | Workbook | None = None,
    worksheet: str | None = None,
) -> Workbook
# IPC/Arrow write
DataFrame.write_ipc(file: str | Path) -> None
# Database write
DataFrame.write_database(
    table_name: str,
    connection: str,
    if_table_exists: str = "fail",
    engine: str = "sqlalchemy",
) -> int
# Streaming sinks (LazyFrame): execute the plan and write incrementally
LazyFrame.sink_parquet(path: str | Path) -> None
LazyFrame.sink_ipc(path: str | Path) -> None
LazyFrame.sink_csv(path: str | Path) -> None
Import
import polars as pl
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| file | str, Path, or IOBase | Varies | Output target; optional for write_csv, write_json, and write_ndjson, which return the serialized string when it is omitted |
| partition_by | list[str] | No | Column names for Hive-style partitioned writes (Parquet only) |
| compression | str | No | Compression codec: "zstd", "snappy", "lz4", "gzip", or "uncompressed" (Parquet) |
| worksheet | str | No | Name of the Excel worksheet to write to |
| table_name | str | Yes (database) | Target database table name |
| connection | str | Yes (database) | Database connection URI (e.g., "postgresql://user:pass@host/db") |
| engine | str | No | Database engine: "sqlalchemy" (default) or "adbc" (Arrow Database Connectivity) |
| if_table_exists | str | No | Behavior when table exists: "fail" (default), "append", or "replace" |
Outputs
| Name | Type | Description |
|---|---|---|
| None | None | Most write methods return None on success (file written to disk) |
| str | str | write_csv, write_json, and write_ndjson return the serialized string if no file path is provided |
| Workbook | xlsxwriter.Workbook | write_excel returns the Workbook object for further customization |
| int | int | write_database returns the number of rows written |
| None | None | sink_* methods execute the lazy query, write the output to disk, and return None |
Usage Examples
import polars as pl
df = pl.DataFrame({
"foo": [1, 2, 3],
"bar": ["a", "b", "c"],
"category": ["x", "x", "y"],
})
# Write to CSV
df.write_csv("output.csv")
# Write to Parquet with default Zstd compression
df.write_parquet("output.parquet")
# Hive-partitioned Parquet write
df.write_parquet("output/", partition_by=["category"])
# Creates: output/category=x/part-0.parquet
# output/category=y/part-0.parquet
# Write to JSON
df.write_json("output.json")
# Write to Excel with named worksheet
df.write_excel("output.xlsx", worksheet="Sheet1")
# Write to database via ADBC
df.write_database(
table_name="my_table",
connection="postgresql://user:pass@host/db",
engine="adbc",
)
# Streaming sink for large datasets (LazyFrame)
lf = pl.scan_csv("large_file.csv")
lf.filter(pl.col("value") > 0).sink_parquet("filtered_output.parquet")