Implementation:Cohere ai Cohere python DatasetsClient Create
| Field | Value |
|---|---|
| Type | Implementation |
| Source | Cohere Python SDK |
| Domain | Data Ingestion, Fine-tuning, Dataset Management |
| Last Updated | 2026-02-15 |
| Implements | Principle:Cohere_ai_Cohere_python_Dataset_Upload |
Overview
Concrete method for uploading datasets to Cohere's managed storage and validating them.
Description
DatasetsClient.create() uploads a data file with metadata and type classification. Supports training data and optional eval data in a single call. The wait() utility polls DatasetsClient.get() at configurable intervals until the dataset status indicates validation is complete.
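The polling behaviour described above can be sketched in isolation. This is a minimal illustration, not the SDK's implementation: `wait_for_validation`, `fetch_status`, the `PENDING_STATUSES` set, and the status strings are all hypothetical names chosen for the example, and the callable stands in for repeated DatasetsClient.get() calls.

```python
import time
from typing import Callable

# Hypothetical in-progress statuses; the SDK's actual status names may differ.
PENDING_STATUSES = {"unknown", "queued", "processing"}


def wait_for_validation(
    fetch_status: Callable[[], str],
    interval: float = 1.0,
    timeout: float = 60.0,
) -> str:
    """Poll fetch_status() until it returns a status that is not pending."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status not in PENDING_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout} seconds")
        # Sleep between polls so the service is not queried in a tight loop.
        time.sleep(interval)


# Stand-in for repeated get() calls: two pending results, then a final one.
_responses = iter(["queued", "processing", "validated"])
final_status = wait_for_validation(lambda: next(_responses), interval=0.01)
print(final_status)
```

The interval and timeout are parameters so that a caller can trade responsiveness against request volume; a real caller would pass a function that fetches the dataset and returns its current status.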
Code Reference
src/cohere/datasets/client.py, lines L112-200 (create)
src/cohere/datasets/client.py, lines L229-258 (get)
src/cohere/utils.py, lines L93-116 (wait)
Signature
def create(
    self, *, name: str, type: DatasetType, data: core.File,
    keep_original_file: typing.Optional[bool] = None,
    skip_malformed_input: typing.Optional[bool] = None,
    keep_fields: typing.Optional[typing.Union[str, typing.Sequence[str]]] = None,
    optional_fields: typing.Optional[typing.Union[str, typing.Sequence[str]]] = None,
    text_separator: typing.Optional[str] = None,
    csv_delimiter: typing.Optional[str] = None,
    eval_data: typing.Optional[core.File] = None,
    request_options: typing.Optional[RequestOptions] = None,
) -> DatasetsCreateResponse:
Import
Access via client.datasets.create() and client.wait()
Inputs
| Parameter | Type | Required | Description |
|---|---|---|---|
| name | str | Yes | Name for the dataset |
| type | DatasetType | Yes | Dataset type (e.g., "chat-finetune-input") |
| data | File | Yes | Data file in JSONL or CSV format |
| eval_data | Optional[File] | No | Optional evaluation data file |
| skip_malformed_input | Optional[bool] | No | Whether to skip malformed input rows |
Outputs
DatasetsCreateResponse with dataset ID; after wait(): DatasetsGetResponse with validation status.
Example
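from cohere import Client

# Assumes the API key is supplied through the environment (CO_API_KEY).
client = Client()

# Open both files in a context manager so they are closed after the upload.
with open("training_data.jsonl", "rb") as train_file, open("eval_data.jsonl", "rb") as eval_file:
    dataset = client.datasets.create(
        name="my-finetune-data",
        type="chat-finetune-input",
        data=train_file,
        eval_data=eval_file,
    )

# wait() polls datasets.get() until validation completes.
validated = client.wait(dataset)
# Assumes DatasetsGetResponse wraps the record in a `dataset` attribute.
print(f"Dataset {validated.dataset.id} validated: {validated.dataset.validation_status}")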