Implementation:Cohere ai Cohere python FinetuneDatasetMetrics Model
| Knowledge Sources | |
|---|---|
| Domains | SDK, Fine-tuning |
| Last Updated | 2026-02-15 14:00 GMT |
Overview
FinetuneDatasetMetrics is a Pydantic model representing statistical metrics about a fine-tuning dataset, including token counts, example counts, and byte sizes for training and evaluation splits.
Description
The FinetuneDatasetMetrics class provides quantitative information about a dataset used for fine-tuning a Cohere model. All fields are optional, as metrics may not be fully computed at all stages of the fine-tuning pipeline. The metrics cover:
- trainable_token_count: The number of tokens from valid examples that can actually be used for training
- total_examples: The total number of examples in the dataset
- train_examples: The number of examples allocated to the training split
- train_size_bytes: The total byte size of training examples
- eval_examples: The number of examples allocated to the evaluation split
- eval_size_bytes: The total byte size of evaluation examples
These metrics are useful for understanding dataset composition, estimating training costs, and verifying that train/eval splits are properly balanced.
The class extends UncheckedBaseModel and is auto-generated by the Fern API definition toolchain.
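Because every field may be None, derived statistics such as the training share or the average tokens per example need a guard before dividing. The following is a minimal sketch of that idea, not part of the SDK: `split_summary` is a hypothetical helper, and a `SimpleNamespace` stands in for a real FinetuneDatasetMetrics instance so the example runs without the SDK installed.

```python
from types import SimpleNamespace
from typing import Any, Dict, Optional


def split_summary(metrics: Any) -> Dict[str, Optional[float]]:
    """Derive split statistics from a metrics object, tolerating None fields."""
    total = metrics.total_examples
    train = metrics.train_examples
    tokens = metrics.trainable_token_count

    # A ratio is only meaningful when both operands are present and the
    # denominator is non-zero; otherwise report None instead of raising.
    have_total = total is not None and total > 0
    train_fraction = train / total if have_total and train is not None else None
    tokens_per_example = tokens / total if have_total and tokens is not None else None
    return {"train_fraction": train_fraction, "tokens_per_example": tokens_per_example}


# Hypothetical stand-in with the same attribute names as FinetuneDatasetMetrics.
partial = SimpleNamespace(
    trainable_token_count=None,  # not computed yet
    total_examples=1000,
    train_examples=800,
)
summary = split_summary(partial)
print(summary)
```

Returning None for an unavailable statistic keeps the caller's handling uniform with the model itself, where None already means "not computed".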
Usage
Use FinetuneDatasetMetrics when inspecting the details of a fine-tuning job or dataset to understand the volume and distribution of training data. This model is typically accessed as a nested field within fine-tuning job or dataset response objects.
Code Reference
Source Location
- Repository: Cohere Python SDK
- File: src/cohere/types/finetune_dataset_metrics.py
Signature
class FinetuneDatasetMetrics(UncheckedBaseModel):
    trainable_token_count: typing.Optional[int] = None
    total_examples: typing.Optional[int] = None
    train_examples: typing.Optional[int] = None
    train_size_bytes: typing.Optional[int] = None
    eval_examples: typing.Optional[int] = None
    eval_size_bytes: typing.Optional[int] = None
Import
from cohere.types import FinetuneDatasetMetrics
I/O Contract
Fields
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| trainable_token_count | Optional[int] | No | None | The number of tokens of valid examples that can be used for training |
| total_examples | Optional[int] | No | None | The overall number of examples in the dataset |
| train_examples | Optional[int] | No | None | The number of training examples |
| train_size_bytes | Optional[int] | No | None | The size in bytes of all training examples |
| eval_examples | Optional[int] | No | None | The number of evaluation examples |
| eval_size_bytes | Optional[int] | No | None | The size in bytes of all evaluation examples |
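Since each field defaults to None, a caller can check which metrics have not been populated yet before using them. The sketch below illustrates this with `MetricsStandIn`, a hypothetical dataclass that mirrors the fields and defaults in the table above (it is not the SDK class), and `missing_fields`, an illustrative helper.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class MetricsStandIn:
    """Hypothetical stand-in mirroring the documented fields and defaults."""

    trainable_token_count: Optional[int] = None
    total_examples: Optional[int] = None
    train_examples: Optional[int] = None
    train_size_bytes: Optional[int] = None
    eval_examples: Optional[int] = None
    eval_size_bytes: Optional[int] = None


def missing_fields(metrics: MetricsStandIn) -> List[str]:
    """Return the names of fields that are still None, in declaration order."""
    return [f.name for f in fields(metrics) if getattr(metrics, f.name) is None]


empty = MetricsStandIn()
partial = MetricsStandIn(total_examples=1000, train_examples=800)
print(missing_fields(empty))    # all six field names
print(missing_fields(partial))  # every field except the two that were set
```

The check uses `is None` rather than truthiness so that a legitimate value of 0 is not reported as missing.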
Usage Examples
Inspecting Fine-tune Dataset Metrics
import cohere

co = cohere.Client()

# Get fine-tuned model details
finetuned_model = co.finetuning.get_finetuned_model(id="my-finetune-id")

# Access dataset metrics from the fine-tuning job.
# NOTE: the exact attribute path to the metrics object may vary by SDK
# version; verify it against the response type you receive.
metrics = finetuned_model.settings.dataset_metrics

if metrics is not None:
    print(f"Total examples: {metrics.total_examples}")
    print(f"Training examples: {metrics.train_examples}")
    print(f"Evaluation examples: {metrics.eval_examples}")
    print(f"Trainable token count: {metrics.trainable_token_count}")

    # Size fields are optional, so check for None before dividing
    if metrics.train_size_bytes is not None:
        print(f"Training data size: {metrics.train_size_bytes / 1024:.1f} KB")
    if metrics.eval_size_bytes is not None:
        print(f"Evaluation data size: {metrics.eval_size_bytes / 1024:.1f} KB")
Constructing Metrics Directly
from cohere.types import FinetuneDatasetMetrics

metrics = FinetuneDatasetMetrics(
    trainable_token_count=150000,
    total_examples=1000,
    train_examples=800,
    train_size_bytes=2048000,
    eval_examples=200,
    eval_size_bytes=512000,
)

# Fraction of all examples assigned to the training split
train_fraction = metrics.train_examples / metrics.total_examples
print(f"Train/eval split: {train_fraction:.0%} / {1 - train_fraction:.0%}")
print(f"Avg tokens per example: {metrics.trainable_token_count / metrics.total_examples:.0f}")