Principle:Helicone Helicone Model Author Metadata
| Knowledge Sources | |
|---|---|
| Domains | Model Registry, Type System, LLM Metadata |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
Model and author metadata is a structured type system that captures the intrinsic specifications of an LLM (context length, modality, tokenizer) and the organizational identity of its author, decoupled from any specific provider or deployment configuration.
Description
In a multi-model, multi-provider LLM gateway, each model has two distinct categories of metadata. The first is the model's intrinsic specification: its name, author, description, context window size, maximum output token count, creation date, supported input/output modalities, and tokenizer family. These properties are inherent to the model itself and do not change regardless of which provider serves it.
The second category is the author's organizational metadata: how many models the author has, whether the author is fully supported, and optional fields like name, slug, description, website, and API URL. This metadata enables the frontend to group models by author and display author-level information.
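As a rough illustration of the author-level grouping described above, the sketch below groups a flat model list by author and derives a per-author model count. The `ModelSummary` and `AuthorGroup` shapes, the `groupByAuthor` helper, and all sample names are hypothetical and not taken from the registry itself.

```typescript
// Hypothetical, trimmed-down shapes: only the fields needed for grouping.
interface ModelSummary {
  name: string;
  author: string;
}

interface AuthorGroup {
  author: string;
  modelCount: number;
  models: string[];
}

// Group models by author, preserving first-seen author order,
// and derive each author's model count from the grouped list.
function groupByAuthor(models: ModelSummary[]): AuthorGroup[] {
  const groups = new Map<string, AuthorGroup>();
  for (const model of models) {
    let group = groups.get(model.author);
    if (group === undefined) {
      group = { author: model.author, modelCount: 0, models: [] };
      groups.set(model.author, group);
    }
    group.models.push(model.name);
    group.modelCount = group.models.length;
  }
  return Array.from(groups.values());
}

// Placeholder data for illustration only.
const sampleModels: ModelSummary[] = [
  { name: "example-model-a", author: "example-author-1" },
  { name: "example-model-b", author: "example-author-2" },
  { name: "example-model-c", author: "example-author-1" },
];

const grouped = groupByAuthor(sampleModels);
```

A frontend could render one section per `AuthorGroup` and fill in the optional author fields (website, description) where they are available.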
By defining these as separate typed interfaces, the system enforces a clean separation between "what the model is" (ModelConfig) and "who made it" (AuthorMetadata). This separation matters because the same model (e.g., Llama 3) can appear across many providers (Groq, DeepInfra, Bedrock), and its intrinsic specification should be defined once at the author/model level, not duplicated per provider endpoint.
Usage
Use these type definitions when:
- Defining a new model's intrinsic properties (context length, modality, tokenizer)
- Adding a new author to the registry
- Building UI that displays model cards, model comparison tables, or author groupings
- Validating that a model definition conforms to the expected schema
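For the validation use case, a runtime check could look roughly like the sketch below. The `validateModelConfig` function is hypothetical, and the specific rules (positive integers, a parseable date string) are assumptions layered on top of the schema, not rules confirmed by the source.

```typescript
// Returns a list of human-readable problems; an empty list means the value passed.
// The rules here are illustrative assumptions, not the registry's actual checks.
function validateModelConfig(value: unknown): string[] {
  if (typeof value !== "object" || value === null) {
    return ["model config must be an object"];
  }
  const cfg = value as Record<string, unknown>;
  const errors: string[] = [];

  // Fields the schema lists as strings (or string-like identifiers).
  for (const field of ["name", "author", "description", "tokenizer"]) {
    if (typeof cfg[field] !== "string") {
      errors.push(`${field} must be a string`);
    }
  }

  // Fields the schema lists as integers; positivity is an added assumption.
  for (const field of ["contextLength", "maxOutputTokens"]) {
    const n = cfg[field];
    if (typeof n !== "number" || !Number.isInteger(n) || n <= 0) {
      errors.push(`${field} must be a positive integer`);
    }
  }

  // The schema calls for an ISO 8601 datetime; this only checks parseability.
  const created = cfg["created"];
  if (typeof created !== "string" || Number.isNaN(Date.parse(created))) {
    errors.push("created must be a parseable date string");
  }

  const modality = cfg["modality"] as { inputs?: unknown; outputs?: unknown } | null | undefined;
  if (!modality || !Array.isArray(modality.inputs) || !Array.isArray(modality.outputs)) {
    errors.push("modality must have inputs and outputs arrays");
  }
  return errors;
}

// Placeholder definitions for illustration only.
const validErrors = validateModelConfig({
  name: "example-model",
  author: "example-author",
  description: "An example model.",
  contextLength: 8192,
  maxOutputTokens: 2048,
  created: "2024-01-01T00:00:00.000Z",
  modality: { inputs: ["text"], outputs: ["text"] },
  tokenizer: "example-tokenizer",
});

const invalidErrors = validateModelConfig({ name: "example-model", contextLength: 1.5 });
```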
Theoretical Basis
This principle applies normalization, a concept from relational database design, to type definitions. Just as a relational schema avoids repeating a customer's name in every order row, the model registry avoids repeating a model's context length and tokenizer in every provider endpoint configuration. ModelConfig is the normalized record for the model, and endpoint configurations reference it by model name.
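The normalization idea can be sketched as a lookup: endpoint records carry only provider-specific data plus the model name, and the intrinsic specification is resolved from a single registry entry. The `EndpointConfig` shape, the `resolveEndpoint` helper, and every name and value below are hypothetical.

```typescript
// Intrinsic specification, stored once per model (subset of fields for brevity).
interface ModelSpec {
  contextLength: number;
  tokenizer: string;
}

// Hypothetical endpoint shape: provider-specific data plus a reference by model name.
interface EndpointConfig {
  provider: string;
  model: string; // key into modelRegistry
  providerModelId: string;
}

// Placeholder registry with a single model.
const modelRegistry: Record<string, ModelSpec> = {
  "example-model": { contextLength: 8192, tokenizer: "example-tokenizer" },
};

// Two providers serve the same model; neither repeats its specification.
const endpoints: EndpointConfig[] = [
  { provider: "provider-a", model: "example-model", providerModelId: "a/example-model" },
  { provider: "provider-b", model: "example-model", providerModelId: "b-example-model-v1" },
];

// Join an endpoint with the model specification it references.
function resolveEndpoint(endpoint: EndpointConfig): EndpointConfig & ModelSpec {
  const spec = modelRegistry[endpoint.model];
  if (spec === undefined) {
    throw new Error(`Unknown model: ${endpoint.model}`);
  }
  return { ...endpoint, ...spec };
}

const resolved = endpoints.map(resolveEndpoint);
```

Changing the context length in `modelRegistry` updates every resolved endpoint at once, which is the practical payoff of keeping the specification in one place.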
AuthorMetadata follows the Aggregate Root pattern from domain-driven design: authors are the top-level organizational unit, and models belong to authors. The AuthorName union type, derived from a const array, ensures that only recognized authors can be referenced, so a reference to an unknown author fails type checking instead of surfacing at runtime.
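One common way to derive a union type from a const array in TypeScript is shown below. The author identifiers are placeholders, since the registry's actual list is not given here, and the `isAuthorName` guard is a hypothetical addition for handling untyped input.

```typescript
// Placeholder author identifiers; the real list is not shown in this document.
const AUTHORS = ["example-author-1", "example-author-2"] as const;

// Union of the array's element types: "example-author-1" | "example-author-2".
type AuthorName = (typeof AUTHORS)[number];

interface ModelRef {
  name: string;
  author: AuthorName;
}

// Accepted: the author is a member of the union.
const knownAuthorModel: ModelRef = { name: "example-model", author: "example-author-1" };

// Uncommenting the next line should fail type checking, because
// "unknown-author" is not a member of the AuthorName union:
// const bad: ModelRef = { name: "example-model", author: "unknown-author" };

// Hypothetical runtime guard for untyped input such as parsed JSON.
function isAuthorName(value: string): value is AuthorName {
  return (AUTHORS as readonly string[]).indexOf(value) !== -1;
}
```

Adding an author then means appending one string to the array; the union type and the guard both pick it up without further changes.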
The Modality interface uses a Capability Declaration pattern: rather than describing what the model cannot do, it explicitly declares which input and output formats the model supports. This positive declaration enables the UI to filter models by capability (e.g., "show me all models that accept image input").
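A capability filter over positive declarations can be sketched as follows. The members of `InputModality` and `OutputModality` are assumed for illustration, and the `acceptsInput` helper and catalog entries are hypothetical.

```typescript
// Assumed modality values; the actual union members are not listed in this document.
type InputModality = "text" | "image" | "audio";
type OutputModality = "text" | "image";

interface Modality {
  inputs: InputModality[];
  outputs: OutputModality[];
}

interface ModelEntry {
  name: string;
  modality: Modality;
}

// Keep only the models that explicitly declare support for the given input.
function acceptsInput(models: ModelEntry[], input: InputModality): ModelEntry[] {
  return models.filter((m) => m.modality.inputs.indexOf(input) !== -1);
}

// Placeholder catalog for illustration only.
const catalog: ModelEntry[] = [
  { name: "example-text-only", modality: { inputs: ["text"], outputs: ["text"] } },
  { name: "example-vision", modality: { inputs: ["text", "image"], outputs: ["text"] } },
];

const imageCapable = acceptsInput(catalog, "image").map((m) => m.name);
```

Because support is declared positively, a model with no entry for a modality is simply excluded; no separate "unsupported" list has to be maintained.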
Schema:
    ModelConfig:
        name: string
        author: AuthorName
        description: string
        contextLength: integer
        maxOutputTokens: integer
        created: ISO 8601 datetime
        modality: { inputs: InputModality[], outputs: OutputModality[] }
        tokenizer: Tokenizer

    AuthorMetadata:
        modelCount: integer
        supported: boolean
        name?: string
        slug?: string
        description?: string
        website?: string
        apiUrl?: string