Implementation:Cohere ai Cohere python EmbedJob Model

Knowledge Sources	Cohere Python SDK
Domains	SDK, Embeddings, Batch Processing
Last Updated	2026-02-15 14:00 GMT

Overview

EmbedJob is a Pydantic model that holds the metadata and status of a batch embedding job on the Cohere platform. Batch embed jobs are used for asynchronous, large-scale embedding operations.

Description

The EmbedJob class models the state of a batch embedding job that processes an entire dataset asynchronously. Instead of embedding texts one request at a time, batch embed jobs allow users to submit a dataset and receive embeddings for all entries once processing completes.

Each embed job tracks:

job_id: The unique identifier for the job
name: An optional human-readable name
status: The current processing state (one of "processing", "complete", "cancelling", "cancelled", or "failed")
created_at: The datetime when the job was created
input_dataset_id: The ID of the dataset being embedded
output_dataset_id: The ID of the resulting dataset containing embeddings (populated upon completion)
model: The embedding model used (e.g., "embed-english-v3.0")
truncate: The truncation strategy applied ("START" or "END")
meta: Optional API metadata

The class extends UncheckedBaseModel and is auto-generated by the Fern API definition toolchain.

Usage

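Use EmbedJob when working with the Cohere batch embed jobs API to monitor and retrieve results from large-scale embedding operations. This model is returned by the status-checking endpoint and, wrapped in a list response, by the listing endpoint. Job creation returns a smaller response that carries the new job_id (CreateEmbedJobResponse in recent SDK versions); pass that ID to the status-checking endpoint to obtain the full EmbedJob.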
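
The monitoring step can be sketched as a small polling helper with a timeout. This is a minimal sketch, not part of the SDK: wait_for_embed_job, TERMINAL_STATUSES, and fake_fetch are hypothetical names, and the choice of "complete", "cancelled", and "failed" as terminal states is an assumption read off the status list above. The helper takes any callable that returns an object with a status attribute, so it runs here against a stand-in rather than the real API.

```python
import time
from types import SimpleNamespace
from typing import Any, Callable

# Assumption: these are the statuses after which a job no longer changes.
TERMINAL_STATUSES = {"complete", "cancelled", "failed"}


def wait_for_embed_job(
    fetch: Callable[[str], Any],
    job_id: str,
    poll_interval: float = 10.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call fetch(job_id) until the job is in a terminal status or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch(job_id)
        if job.status in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"embed job {job_id} still {job.status!r} after {timeout}s")
        sleep(poll_interval)


# Hypothetical stand-in for the real fetch call: "processing" twice, then "complete".
_states = iter(["processing", "processing", "complete"])


def fake_fetch(job_id: str) -> SimpleNamespace:
    status = next(_states)
    output_id = "out-123" if status == "complete" else None
    return SimpleNamespace(job_id=job_id, status=status, output_dataset_id=output_id)


result = wait_for_embed_job(fake_fetch, "job-1", poll_interval=0.0, sleep=lambda _: None)
print(result.status, result.output_dataset_id)
```

With a real client, the fetch argument could be something like lambda job_id: co.embed_jobs.get(id=job_id), matching the call used in the usage examples on this page.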

Code Reference

Source Location

Repository: Cohere Python SDK
File: src/cohere/types/embed_job.py

Signature

class EmbedJob(UncheckedBaseModel):
    job_id: str
    name: typing.Optional[str] = None
    status: EmbedJobStatus  # "processing" | "complete" | "cancelling" | "cancelled" | "failed"
    created_at: dt.datetime
    input_dataset_id: str
    output_dataset_id: typing.Optional[str] = None
    model: str
    truncate: EmbedJobTruncate  # "START" | "END"
    meta: typing.Optional[ApiMeta] = None

Import

from cohere.types import EmbedJob

I/O Contract

Fields

Field	Type	Required	Default	Description
`job_id`	`str`	Yes	--	ID of the embed job
`name`	`Optional[str]`	No	`None`	The name of the embed job
`status`	`EmbedJobStatus`	Yes	--	The status of the embed job: `"processing"`, `"complete"`, `"cancelling"`, `"cancelled"`, or `"failed"`
`created_at`	`datetime`	Yes	--	The creation date of the embed job
`input_dataset_id`	`str`	Yes	--	ID of the input dataset
`output_dataset_id`	`Optional[str]`	No	`None`	ID of the resulting output dataset (available when job is complete)
`model`	`str`	Yes	--	ID of the model used to embed
`truncate`	`EmbedJobTruncate`	Yes	--	The truncation option used: `"START"` or `"END"`
`meta`	`Optional[ApiMeta]`	No	`None`	API metadata including token counts and warnings
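
As an illustration of the contract in the table above, the following sketch checks a raw job payload (a plain dict) against the required fields and the allowed status and truncate values. It is a hypothetical helper written for this page, not SDK behavior: contract_problems and the sample payload are invented for the example, and the SDK's own parsing may be more lenient or more strict than this check.

```python
from typing import Any, Dict, List

# Copied from the fields table: the fields marked as required.
REQUIRED_FIELDS = ("job_id", "status", "created_at", "input_dataset_id", "model", "truncate")
ALLOWED_STATUSES = {"processing", "complete", "cancelling", "cancelled", "failed"}
ALLOWED_TRUNCATE = {"START", "END"}


def contract_problems(payload: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the payload fits the table."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_FIELDS
        if payload.get(name) is None
    ]
    status = payload.get("status")
    if status is not None and status not in ALLOWED_STATUSES:
        problems.append(f"unexpected status: {status!r}")
    truncate = payload.get("truncate")
    if truncate is not None and truncate not in ALLOWED_TRUNCATE:
        problems.append(f"unexpected truncate: {truncate!r}")
    return problems


# Made-up sample payload; the IDs are placeholders.
payload = {
    "job_id": "job-1",
    "status": "complete",
    "created_at": "2024-01-01T00:00:00Z",
    "input_dataset_id": "in-1",
    "output_dataset_id": "out-1",
    "model": "embed-english-v3.0",
    "truncate": "END",
}
print(contract_problems(payload))
print(contract_problems({"job_id": "job-2", "status": "paused"}))
```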

Usage Examples

Creating and Monitoring an Embed Job

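import time

import cohere

co = cohere.Client()  # reads the API key from the CO_API_KEY environment variable

# Create an embed job. In recent SDK versions the create call returns a
# response carrying job_id (and meta), not a full EmbedJob.
created = co.embed_jobs.create(
    model="embed-english-v3.0",
    dataset_id="my-dataset-id",
    input_type="search_document",
    truncate="END",
)
print(f"Job ID: {created.job_id}")

# Fetch the full EmbedJob to read its status and creation time
job = co.embed_jobs.get(id=created.job_id)
print(f"Status: {job.status}")
print(f"Created at: {job.created_at}")

# Poll until the job leaves its in-progress states
while job.status in ("processing", "cancelling"):
    time.sleep(10)
    job = co.embed_jobs.get(id=job.job_id)
    print(f"Status: {job.status}")

if job.status == "complete":
    print(f"Output dataset ID: {job.output_dataset_id}")
elif job.status == "failed":
    print("Embed job failed")
else:
    print(f"Embed job ended with status: {job.status}")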

Listing Embed Jobs

import cohere

co = cohere.Client()

# List all embed jobs
jobs = co.embed_jobs.list()

# embed_jobs is optional on the list response, so fall back to an empty list
for job in jobs.embed_jobs or []:
    print(f"Job: {job.job_id} | Model: {job.model} | Status: {job.status}")
    if job.name:
        print(f"  Name: {job.name}")
    print(f"  Input dataset: {job.input_dataset_id}")
    print(f"  Truncate: {job.truncate}")
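
Once a list of jobs is in hand, it is often useful to summarize it. The sketch below counts jobs per status and collects the output dataset IDs of completed jobs. It is an illustration only: summarize_by_status and completed_outputs are hypothetical helpers, and the SimpleNamespace objects are stand-ins that mimic the status and output_dataset_id attributes of EmbedJob so the example runs without calling the API.

```python
from collections import Counter
from types import SimpleNamespace
from typing import Any, Iterable, List


def summarize_by_status(jobs: Iterable[Any]) -> Counter:
    """Count how many jobs are in each status."""
    return Counter(job.status for job in jobs)


def completed_outputs(jobs: Iterable[Any]) -> List[str]:
    """Collect output dataset IDs from jobs that completed and have one set."""
    return [
        job.output_dataset_id
        for job in jobs
        if job.status == "complete" and job.output_dataset_id
    ]


# Stand-in objects with made-up IDs; real code would use the listed EmbedJob instances.
sample_jobs = [
    SimpleNamespace(job_id="a", status="complete", output_dataset_id="out-a"),
    SimpleNamespace(job_id="b", status="processing", output_dataset_id=None),
    SimpleNamespace(job_id="c", status="failed", output_dataset_id=None),
    SimpleNamespace(job_id="d", status="complete", output_dataset_id="out-d"),
]

print(summarize_by_status(sample_jobs))
print(completed_outputs(sample_jobs))
```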

Related Pages

Environment:Cohere_ai_Cohere_python_Python_SDK_Runtime
