Principle:Ucbepic Docetl LLM Powered Document Ranking
| Knowledge Sources | |
|---|---|
| Domains | LLM_Data_Processing, Information_Retrieval |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Prompt-guided document ranking orders documents by criteria given in a natural language prompt. It uses a two-phase approach: an initial ordering (via embeddings or LLM Likert ratings) followed by sliding-window LLM refinement.
Theoretical Basis
Ordering a collection of documents by subjective or complex criteria, such as "most relevant to climate policy" or "most actionable for a product manager", cannot be done by sorting on numeric fields; it requires the kind of semantic judgment an LLM provides. However, having an LLM compare every pair of documents takes n(n-1)/2 calls, which is O(n²) and prohibitively expensive for large collections. DocETL's rank operation draws on ideas from the human-powered sort literature to achieve high-quality rankings with a bounded LLM call budget.
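To make the cost argument concrete, the short sketch below counts the unordered pairs an exhaustive pairwise comparison would need. The function names and the `call_budget` argument are illustrative assumptions, not part of DocETL.

```python
def pairwise_comparisons(n: int) -> int:
    """Number of unordered document pairs among n documents: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def exceeds_budget(n: int, call_budget: int) -> bool:
    """True if comparing every pair would need more calls than the budget allows.

    `call_budget` is a hypothetical name for the cap on LLM calls.
    """
    return pairwise_comparisons(n) > call_budget


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, pairwise_comparisons(n))
```

For 1,000 documents this is 499,500 comparisons, which is why a bounded-budget strategy is attractive.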
The operation proceeds in two phases. The initial ordering phase produces a coarse ranking using one of three methods: (1) embedding similarity to the ranking criteria, which is fast and cheap but imprecise; (2) Likert-scale LLM ratings, where each document is rated 1-7 against the criteria in parallel batches, giving a more nuanced initial ordering; or (3) calibrated embedding sort, which uses a small LLM-ranked sample to calibrate the embedding-based ordering.

The refinement phase then applies a sliding-window approach: windows of configurable size move across the ranking, and within each window the LLM selects the top-K items ("picky windows"). Selected items are promoted to the front of the window, so repeated windows progressively refine the ranking. The total number of LLM calls in the refinement phase is bounded by a configurable budget parameter.
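The refinement phase described above can be sketched as follows. This is a minimal illustration under stated assumptions, not DocETL's implementation: the sweep direction (from the bottom of the ranking toward the top), the `step` size, and all function and parameter names are hypothetical, and the LLM call is replaced by a deterministic stand-in picker that reads hidden scores.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A picker receives a window of documents and k, and returns the indices
# (within the window) of the k items it prefers, best first.
# In a real system this would be an LLM call; here it is a stand-in.
Picker = Callable[[Sequence[str], int], List[int]]


def refine_with_picky_windows(
    ranking: Sequence[str],
    pick_top: Picker,
    window_size: int = 4,
    top_k: int = 2,
    step: int = 2,
    call_budget: int = 10,
) -> Tuple[List[str], int]:
    """Sweep windows from the bottom of the ranking to the top (an assumption),
    promoting each window's picked items to the front of that window.
    Stops when the top window has been processed or the budget is spent."""
    docs = list(ranking)
    calls = 0
    if len(docs) < 2:
        return docs, calls
    start = max(len(docs) - window_size, 0)
    while calls < call_budget:
        window = docs[start:start + window_size]
        picked = pick_top(window, top_k)
        calls += 1
        picked_set = set(picked)
        chosen = [window[i] for i in picked]
        rest = [d for i, d in enumerate(window) if i not in picked_set]
        docs[start:start + window_size] = chosen + rest
        if start == 0:
            break
        start = max(start - step, 0)
    return docs, calls


def make_score_picker(scores: Dict[str, float]) -> Picker:
    """Stand-in for the LLM: picks the k highest-scoring items in the window."""
    def pick(window: Sequence[str], k: int) -> List[int]:
        order = sorted(range(len(window)), key=lambda i: scores[window[i]], reverse=True)
        return order[:k]
    return pick


if __name__ == "__main__":
    scores = {"a": 1, "b": 5, "c": 3, "d": 9, "e": 7, "f": 2}
    result, used = refine_with_picky_windows(list("abcdef"), make_score_picker(scores))
    print(result, used)
```

In this toy run the two strongest documents reach the top after two calls, while the tail stays only roughly ordered. That matches the intent of spending calls on the top positions rather than on a full sort.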
This two-phase design achieves a favorable trade-off: the initial ordering places most documents approximately correctly at low cost, while the sliding window refinement uses expensive LLM calls only where they have the most impact -- disambiguating items that are close in quality. The approach is particularly effective when only the top-K items matter, as the refinement can terminate early once the top positions are stable.
Key Design Decisions
| Decision | Choice | Rationale |
|---|---|---|
| Initial ordering | Three strategies: embedding similarity, Likert LLM ratings, or calibrated embedding | Provides a cost-quality spectrum; embedding is cheapest, Likert is most accurate, calibrated embedding balances both |
| Refinement approach | Sliding picky windows with bounded LLM call budget | Concentrates expensive LLM calls where they have the most impact; budget parameter gives direct cost control |
| Direction support | Configurable ascending or descending ordering | Supports both "best first" and "worst first" use cases with the same underlying algorithm |
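As an illustration of the "Initial ordering" and "Direction support" rows, the sketch below orders documents by cosine similarity between their embeddings and an embedding of the ranking criteria, with a direction flag. The vectors are hand-written toy values; in practice they would come from an embedding model. The function names and the `"asc"`/`"desc"` values are assumptions made for this example and are not taken from DocETL's configuration.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def initial_order(
    doc_vectors: Dict[str, Sequence[float]],
    criteria_vector: Sequence[float],
    direction: str = "desc",
) -> List[str]:
    """Return document ids sorted by similarity to the criteria.

    direction="desc" puts the most similar document first ("best first");
    direction="asc" puts the least similar first ("worst first").
    """
    if direction not in ("asc", "desc"):
        raise ValueError("direction must be 'asc' or 'desc'")
    return sorted(
        doc_vectors,
        key=lambda doc_id: cosine(doc_vectors[doc_id], criteria_vector),
        reverse=(direction == "desc"),
    )


if __name__ == "__main__":
    vectors = {"x": [1.0, 0.0], "y": [0.0, 1.0], "z": [1.0, 1.0]}
    print(initial_order(vectors, [1.0, 0.0], "desc"))
    print(initial_order(vectors, [1.0, 0.0], "asc"))
```

The same sort serves both directions; only the `reverse` flag changes, which mirrors the table's point that one underlying algorithm supports both use cases.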