# Principle: shiyu-coder Kronos Web UI Prediction Service
| Knowledge Sources | |
|---|---|
| Domains | Web_Application, Financial_Prediction, Serving |
| Last Updated | 2026-02-09 14:00 GMT |
## Overview
Architectural pattern for serving time-series forecasting models through a REST API with interactive visualization, enabling browser-based access to the Kronos prediction pipeline.
## Description
The Web UI Prediction Service principle describes the pattern of exposing a trained machine learning model (specifically a financial time-series forecaster) through a web service layer. This addresses the gap between a trained model artifact and end-user accessibility: rather than requiring Python scripting knowledge, users interact with the model through HTTP endpoints and receive interactive chart visualizations.
The pattern involves:
- Model lifecycle management — Loading/unloading model variants on demand via API
- Data ingestion — Accepting financial OHLCV data from local files (CSV/Feather) with validation
- Prediction orchestration — Coordinating tokenizer encoding, autoregressive generation, and result denormalization
- Visualization — Converting raw predictions into interactive candlestick charts for comparison against ground truth
- Result persistence — Saving prediction outputs for later analysis and reproducibility
This pattern is distinct from batch inference (which processes whole datasets programmatically): it targets interactive, single-request exploration of model behavior.
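The model-lifecycle item above can be sketched as a small registry that keeps at most one variant resident in memory. This is a minimal illustration, not the service's real API: `ModelRegistry` and the loader callable are hypothetical names, and real model objects would replace the strings returned here.

```python
class ModelRegistry:
    """Holds the currently loaded model variant in memory (stateful server)."""

    def __init__(self, loader):
        # loader: callable (variant, device) -> model object; injected so the
        # registry stays independent of any particular model framework.
        self._loader = loader
        self.model = None
        self.variant = None

    def load(self, variant, device="cpu"):
        # Skip the expensive reload when the requested variant is already resident.
        if self.variant != variant:
            self.model = self._loader(variant, device)
            self.variant = variant
        return self.model

    def unload(self):
        # Release the reference so memory can be reclaimed between sessions.
        self.model = None
        self.variant = None
```

A load/unload HTTP endpoint would then be a thin wrapper around these two methods, with the registry created once at server startup.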
## Usage
Apply this principle when building a user-facing interface for a time-series forecasting model. It is appropriate when the target users need to explore predictions interactively without writing code, and when visual comparison between predicted and actual data is a primary use case.
## Theoretical Basis
The Web UI Prediction Service follows a standard Model Serving architecture:
```python
# Abstract serving pattern (NOT implementation code)

# 1. Load model into memory (stateful server)
model = load_model(variant, device)

# 2. Accept prediction request with parameters
request = {data, lookback, pred_len, temperature, ...}

# 3. Preprocess → Predict → Postprocess
preprocessed = validate_and_slice(request.data, request.lookback)
raw_prediction = model.predict(preprocessed, request.pred_len)
formatted_output = attach_timestamps_and_format(raw_prediction)

# 4. Visualize and persist
chart = create_comparison_chart(historical, predicted, actual)
save_results(formatted_output, chart)
```
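Two of the helpers named in the abstract pattern can be made concrete. The sketch below is illustrative only: `validate_and_slice` and `future_timestamps` are hypothetical implementations, and the real pipeline's normalization/denormalization steps are omitted.

```python
from datetime import datetime, timedelta

def validate_and_slice(rows, lookback):
    """Keep only the most recent `lookback` rows as model context."""
    if lookback <= 0:
        raise ValueError("lookback must be positive")
    if len(rows) < lookback:
        raise ValueError(f"need at least {lookback} rows, got {len(rows)}")
    return rows[-lookback:]

def future_timestamps(last_ts, pred_len, interval):
    """Extend the time axis past the last observation so predictions can be
    timestamped and plotted alongside the historical series."""
    return [last_ts + interval * (i + 1) for i in range(pred_len)]
```

Failing fast on short input keeps malformed requests from reaching the model, where a shape error would surface as a much less readable stack trace.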
The key design decisions are:
- Stateful model server — The model is loaded once and held in memory across requests, avoiding repeated loading overhead
- Request-scoped data — Each prediction request specifies its own data window, enabling exploration of different time periods
- Dual-mode time selection — Users can predict from the latest data or from a custom start date
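The dual-mode time selection in the last bullet reduces to index arithmetic over a sorted timestamp axis. The sketch below assumes sorted timestamps; `select_window` and its return convention are illustrative, not the service's real API.

```python
import bisect

def select_window(timestamps, lookback, pred_len, start=None):
    """Pick context and ground-truth index ranges for a prediction request.

    start=None -> predict forward from the latest `lookback` rows
                  (no ground truth exists yet, so the target range is empty).
    start=ts   -> context is the `lookback` rows before `start`; the next
                  `pred_len` rows serve as ground truth for comparison.
    """
    if start is None:
        ctx_end = len(timestamps)
    else:
        ctx_end = bisect.bisect_left(timestamps, start)  # first row >= start
    if ctx_end < lookback:
        raise ValueError("not enough history before the requested start")
    context = range(ctx_end - lookback, ctx_end)
    target = range(ctx_end, min(ctx_end + pred_len, len(timestamps)))
    return context, target
```

Returning index ranges rather than copied data keeps the selection logic independent of whether the series lives in a list, a NumPy array, or a DataFrame.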