Principle:Togethercomputer Together python Image Generation Request
| Knowledge Sources | |
|---|---|
| Domains | Computer_Vision, Image_Generation, API_Client |
| Last Updated | 2026-02-15 16:00 GMT |
Overview
A mechanism for generating images from text descriptions by sending structured requests to hosted diffusion models via the Together AI API.
Description
Image Generation Request is the core operation in the Together Python SDK's image workflow. It sends a text prompt along with generation parameters to a hosted diffusion model and receives generated image data in response.
The request supports configuring several aspects of the generation:
- Output dimensions: The `height` and `width` parameters control the pixel resolution of generated images, defaulting to 1024x1024.
- Batch generation: The `n` parameter controls how many images to generate per request, defaulting to 1.
- Reproducibility: The `seed` parameter enables deterministic generation; the same seed with the same parameters produces the same image.
- Negative prompts: The `negative_prompt` parameter steers generation away from unwanted visual elements.
- Model-specific parameters: Additional keyword arguments such as `steps` (number of denoising iterations) and `image_base64` (a reference image for image-to-image generation) can be passed via `**kwargs`.
The SDK provides both synchronous (Images.generate) and asynchronous (AsyncImages.generate) variants with identical parameter signatures.
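A minimal sketch of how the two variants might be called. The `client` arguments are assumed to be already-initialized Together clients whose `images` attribute exposes `generate`; the helper names `generate_sync` and `generate_async` are hypothetical and exist only for illustration.

```python
def generate_sync(client, prompt, model, **params):
    """Call the synchronous variant.

    `client.images` is assumed to be the SDK's Images resource; `params`
    carries optional settings such as n, seed, height, width,
    negative_prompt, or steps.
    """
    return client.images.generate(prompt=prompt, model=model, **params)


async def generate_async(async_client, prompt, model, **params):
    """Call the asynchronous variant with the same keyword arguments.

    `async_client.images` is assumed to be the SDK's AsyncImages resource,
    whose `generate` is a coroutine and must be awaited.
    """
    return await async_client.images.generate(prompt=prompt, model=model, **params)
```

With the real SDK, `client` would typically be a `Together()` instance and `async_client` an `AsyncTogether()` instance; that pairing comes from general SDK usage rather than from this page, so treat it as an assumption. Because both helpers forward the same keyword arguments, switching between them only changes whether the call is awaited.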
Usage
Use this principle when you need to generate images from text descriptions using Together AI's hosted models. It applies to any text-to-image or image-to-image generation scenario. The client must be initialized before making generation requests.
Theoretical Basis
Text-to-image generation uses diffusion models that iteratively denoise random noise conditioned on text embeddings. The process follows this pattern:
- Noise initialization: Random Gaussian noise is generated as the starting point. The `seed` parameter controls the initial noise state, enabling reproducibility.
- Text conditioning: The text prompt is encoded into embeddings by a text encoder (typically CLIP). These embeddings condition every step of the denoising process.
- Iterative denoising: Over a number of steps (controlled by the `steps` parameter), the model progressively removes noise while being guided by the text embeddings. More steps generally improve quality at the cost of generation time.
- Negative conditioning: If a negative prompt is provided, its embeddings are used to steer generation away from unwanted concepts via classifier-free guidance.
- Output rendering: The final denoised latent representation is decoded into pixel space at the requested `height` and `width` dimensions.
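The pattern above can be illustrated with a toy numeric sketch. It is not how the hosted models are implemented: the "noise predictions" here are simple offsets from fixed vectors, and every name (`initial_noise`, `guided_prediction`, `toy_denoise`, `target`, `avoid`) is hypothetical. It only shows three ideas: a seed fixes the starting noise, classifier-free guidance combines a conditional prediction with a negative one, and more update steps move the sample closer to the guided result.

```python
import random


def initial_noise(seed, size):
    """Draw Gaussian noise from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


def guided_prediction(cond, neg, scale):
    """Classifier-free guidance: start from the negative (or unconditional)
    prediction and move `scale` times the way toward the conditional one."""
    return [n + scale * (c - n) for c, n in zip(cond, neg)]


def toy_denoise(seed, target, avoid, steps, scale=1.5, step_size=0.5):
    """Toy update loop; `target` and `avoid` stand in for prompt and negative prompt."""
    x = initial_noise(seed, len(target))
    for _ in range(steps):
        # Stand-in for the prompt-conditioned noise prediction.
        cond = [xi - t for xi, t in zip(x, target)]
        # Stand-in for the negative-prompt noise prediction.
        neg = [xi - a for xi, a in zip(x, avoid)]
        eps = guided_prediction(cond, neg, scale)
        # Remove a fraction of the predicted noise at each step.
        x = [xi - step_size * e for xi, e in zip(x, eps)]
    return x
```

In this toy, a `scale` above 1 pushes the result past `target` and away from `avoid`, which mirrors how a negative prompt is described as steering generation away from unwanted concepts. Calling `toy_denoise` twice with the same seed and arguments returns the same list, mirroring the role of `seed`.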
Pseudo-code:
function generateImage(prompt, model, params):
    request = ImageRequest(
        prompt = prompt,
        model = model,
        seed = params.seed,
        n = params.n OR 1,
        height = params.height OR 1024,
        width = params.width OR 1024,
        negative_prompt = params.negative_prompt,
        steps = params.steps,
    )
    payload = request.serialize(exclude_none=True)
    response = POST("images/generations", payload)
    return ImageResponse(
        id = response.id,
        model = response.model,
        object = "list",
        data = [ImageChoicesData(index, b64_json, url) for each result]
    )
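The request-building and response-handling halves of the pseudo-code can be sketched in plain Python. This is an illustration under stated assumptions, not the SDK's own classes: `ImageRequest` here is a standalone dataclass that only mirrors the pseudo-code's field names and defaults, the response is modeled as a plain dictionary, and `decode_images` is a hypothetical helper. The network call itself is left out.

```python
import base64
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ImageRequest:
    """Illustrative stand-in mirroring the pseudo-code's request fields."""
    prompt: str
    model: str
    seed: Optional[int] = None
    n: int = 1
    height: int = 1024
    width: int = 1024
    negative_prompt: Optional[str] = None
    steps: Optional[int] = None

    def serialize(self) -> dict:
        """Return the payload with unset (None) fields dropped."""
        return {k: v for k, v in asdict(self).items() if v is not None}


def decode_images(response: dict) -> list:
    """Return raw image bytes for each result that carries `b64_json`.

    Results that only provide a `url` are skipped; fetching them is out of scope.
    """
    return [
        base64.b64decode(item["b64_json"])
        for item in response.get("data", [])
        if item.get("b64_json")
    ]
```

Dropping `None` fields keeps the payload limited to what the caller actually set, so defaults for optional parameters such as `steps` are left to the server, which is what `exclude_none=True` expresses in the pseudo-code.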