Implementation:EvolvingLMMs Lab Lmms eval TUI App Component
| Knowledge Sources | |
|---|---|
| Domains | Web_UI, Evaluation, Frontend |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
The TUI App Component (/tmp/kapso_repo_sslb_59s/lmms_eval/tui/web/src/App.tsx) is the main React component for the LMMs-Eval web interface. This 913-line TypeScript/React implementation provides a complete browser-based UI for configuring and running evaluations with real-time output streaming.
The interface features model/task selection, parameter configuration, command preview, and live log monitoring.
File Location
/tmp/kapso_repo_sslb_59s/lmms_eval/tui/web/src/App.tsx
Key Components
Custom Syntax Highlighting
const SHELL_KEYWORDS = new Set([
'export', 'python', 'python3', 'uv', 'pip', 'node', 'npm', 'git',
'cd', 'ls', 'echo', 'rm', 'mkdir', 'touch', 'alias', 'source', 'env'
])
const ANSI_COLORS: Record<string, string> = {
'30': 'text-neutral-900',
'31': 'text-red-600',
'32': 'text-green-600',
// ... more color mappings
}
function highlightShell(code: string) {
const tokens: any[] = []
let remaining = code
let i = 0
while (remaining.length > 0) {
// Match comments
let match = remaining.match(/^#.*/)
if (match) {
tokens.push(<span key={i++} className="text-neutral-400 italic">{match[0]}</span>)
remaining = remaining.slice(match[0].length)
continue
}
// Match strings
match = remaining.match(/^(['"])(?:(?!\1)[^\\]|\\.)*\1/)
if (match) {
tokens.push(<span key={i++} className="text-neutral-900 bg-neutral-100/50 rounded-[1px]">{match[0]}</span>)
remaining = remaining.slice(match[0].length)
continue
}
// Match variables
match = remaining.match(/^(\$[a-zA-Z_][a-zA-Z0-9_]*|\$\{[^}]+\})/)
if (match) {
tokens.push(<span key={i++} className="text-neutral-800 font-medium">{match[0]}</span>)
remaining = remaining.slice(match[0].length)
continue
}
// Match flags
match = remaining.match(/^(-+[a-zA-Z0-9_-]+)/)
if (match) {
tokens.push(<span key={i++} className="text-neutral-500 font-medium">{match[0]}</span>)
remaining = remaining.slice(match[0].length)
continue
}
// Match operators
match = remaining.match(/^[=&|;>]/)
if (match) {
tokens.push(<span key={i++} className="text-neutral-400 font-bold px-[1px]">{match[0]}</span>)
remaining = remaining.slice(match[0].length)
continue
}
// Match whitespace
match = remaining.match(/^\s+/)
if (match) {
tokens.push(<span key={i++}>{match[0]}</span>)
remaining = remaining.slice(match[0].length)
continue
}
// Match words
match = remaining.match(/^[^\s#$'"=&|;>-]+/)
if (match) {
const word = match[0]
if (SHELL_KEYWORDS.has(word)) {
tokens.push(<span key={i++} className="text-neutral-700 font-bold">{word}</span>)
} else {
tokens.push(<span key={i++} className="text-neutral-600">{word}</span>)
}
remaining = remaining.slice(word.length)
continue
}
tokens.push(<span key={i++}>{remaining[0]}</span>)
remaining = remaining.slice(1)
}
return tokens
}The shell highlighter uses regex-based tokenization to identify:
- Comments (
#...) - Strings (single/double quoted)
- Variables (
$VAR,${VAR}) - Flags (
--flag,-f) - Operators (
=,&,|,;,>) - Keywords (export, python, etc.)
function highlightLog(line: string) {
const ansiRegex = /(?:\x1b)?\[([0-9;]+)m/g
const parts: any[] = []
let lastIndex = 0
let currentStyle = 'text-neutral-600'
let isBold = false
let i = 0
let match
while ((match = ansiRegex.exec(line)) !== null) {
if (match.index > lastIndex) {
const text = line.slice(lastIndex, match.index)
const className = `${currentStyle}${isBold ? ' font-semibold' : ''}`
parts.push(<span key={i++} className={className}>{text}</span>)
}
const codes = match[1].split(';')
for (const code of codes) {
if (code === '0') {
currentStyle = 'text-neutral-600'
isBold = false
} else if (code === '1') {
isBold = true
} else if (ANSI_COLORS[code]) {
currentStyle = ANSI_COLORS[code]
}
}
lastIndex = ansiRegex.lastIndex
}
if (lastIndex < line.length) {
const text = line.slice(lastIndex)
const className = `${currentStyle}${isBold ? ' font-semibold' : ''}`
parts.push(<span key={i++} className={className}>{text}</span>)
}
return parts.length > 0 ? parts : line
}The ANSI highlighter parses escape sequences:
- Extracts color codes from
\x1b[...mpatterns - Tracks current style and bold state
- Applies corresponding CSS classes
- Handles reset codes (0) and bold (1)
ShellEditor Component
interface ShellEditorProps {
value: string
onChange: (value: string) => void
placeholder?: string
className?: string
}
function ShellEditor({ value, onChange, placeholder, className = '' }: ShellEditorProps) {
const textareaRef = useRef<HTMLTextAreaElement>(null)
const preRef = useRef<HTMLPreElement>(null)
const handleScroll = () => {
if (textareaRef.current && preRef.current) {
preRef.current.scrollTop = textareaRef.current.scrollTop
preRef.current.scrollLeft = textareaRef.current.scrollLeft
}
}
return (
<div className={`relative group bg-white border border-neutral-200 transition-colors focus-within:border-black overflow-hidden ${className}`}>
<pre
ref={preRef}
className="absolute inset-0 px-3 py-2 text-xs font-mono leading-relaxed whitespace-pre pointer-events-none overflow-hidden text-transparent"
style={{ fontFamily: 'monospace' }}
aria-hidden="true"
>
{value ? highlightShell(value) : <span className="text-neutral-300 italic">{placeholder}</span>}
<br />
</pre>
<textarea
ref={textareaRef}
value={value}
onChange={e => onChange(e.target.value)}
onScroll={handleScroll}
placeholder={placeholder}
className="relative z-10 w-full h-full bg-transparent text-transparent caret-black px-3 py-2 text-xs font-mono leading-relaxed resize-none focus:outline-none whitespace-pre overflow-auto scrollbar-thin scrollbar-thumb-neutral-200 scrollbar-track-transparent"
style={{ fontFamily: 'monospace' }}
spellCheck={false}
autoCapitalize="off"
autoComplete="off"
/>
</div>
)
}ShellEditor implementation:
- Overlay pattern:
for highlighting,<textarea>for input - Textarea is transparent with visible caret
- Pre element renders highlighted version
- Synchronized scrolling via
onScrollhandler - Both elements have identical padding, font, and sizing
Custom Select Component
interface SelectProps {
value: string
onChange: (value: string) => void
options: { value: string; label: string }[]
placeholder?: string
}
function Select({ value, onChange, options, placeholder }: SelectProps) {
const [open, setOpen] = useState(false)
const [search, setSearch] = useState('')
const ref = useRef<HTMLDivElement>(null)
useEffect(() => {
const handleClickOutside = (e: MouseEvent) => {
if (ref.current && !ref.current.contains(e.target as Node)) {
setOpen(false)
}
}
document.addEventListener('mousedown', handleClickOutside)
return () => document.removeEventListener('mousedown', handleClickOutside)
}, [])
useEffect(() => {
if (open) {
setSearch('')
}
}, [open])
const selectedOption = options.find(o => o.value === value)
const filteredOptions = options.filter(o =>
o.label.toLowerCase().includes(search.toLowerCase()) ||
o.value.toLowerCase().includes(search.toLowerCase())
)
return (
<div ref={ref} className="relative">
<button
type="button"
onClick={() => setOpen(!open)}
className="w-full flex items-center justify-between bg-white border border-neutral-200 px-3 py-2 text-xs font-mono focus:border-black focus:outline-none transition-colors text-left text-neutral-600 hover:border-neutral-300"
>
<span className={selectedOption ? 'text-neutral-600' : 'text-neutral-400'}>
{selectedOption?.label || placeholder || 'Select...'}
</span>
<span className={`text-[10px] text-neutral-400 transition-transform ${open ? 'rotate-180' : ''}`}>▼</span>
</button>
{open && (
<div className="absolute z-50 left-0 right-0 mt-1 bg-white border border-neutral-200 shadow-lg max-h-60 overflow-hidden flex flex-col">
<div className="p-2 border-b border-neutral-100">
<input
autoFocus
value={search}
onChange={e => setSearch(e.target.value)}
onClick={e => e.stopPropagation()}
placeholder="Search..."
className="w-full text-xs font-mono px-2 py-1 bg-neutral-50 border border-neutral-200 text-neutral-600 focus:border-black focus:outline-none"
/>
</div>
<div className="overflow-auto">
{filteredOptions.length > 0 ? (
filteredOptions.map(option => (
<div
key={option.value}
onClick={() => {
onChange(option.value)
setOpen(false)
}}
className={`px-3 py-2 text-xs font-mono cursor-pointer transition-colors ${
option.value === value
? 'bg-black text-white'
: 'text-neutral-600 hover:bg-neutral-50'
}`}
>
{option.label}
</div>
))
) : (
<div className="px-3 py-2 text-xs text-neutral-400 italic">No matches found</div>
)}
</div>
</div>
)}
</div>
)
}Select component features:
- Click-outside detection to close dropdown
- Searchable via input field
- Filters by label or value
- Auto-focus search on open
- Keyboard-accessible
- Visual indication of selected item
Task Tree Structure
type TaskNode =
| { type: 'group', id: string, label: string, children: TaskInfo[] }
| { type: 'leaf', task: TaskInfo }
const visibleNodes = useMemo(() => {
const nodes: TaskNode[] = []
const allGroups = tasks.filter(t => t.group)
const allLeaves = tasks.filter(t => !t.group)
const filteredLeaves = allLeaves.filter(t =>
t.id.toLowerCase().includes(taskFilter.toLowerCase()) ||
t.name.toLowerCase().includes(taskFilter.toLowerCase())
)
const groupChildrenMap = new Map<string, TaskInfo[]>()
const assignedLeafIds = new Set<string>()
for (const group of allGroups) {
const children = filteredLeaves.filter(leaf =>
leaf.id.startsWith(`${group.id}_`) || leaf.id.startsWith(`${group.id}-`)
)
if (children.length > 0) {
groupChildrenMap.set(group.id, children)
children.forEach(c => assignedLeafIds.add(c.id))
nodes.push({
type: 'group',
id: group.id,
label: group.id,
children: children
})
}
}
const topLevelLeaves = filteredLeaves.filter(leaf => !assignedLeafIds.has(leaf.id))
topLevelLeaves.forEach(leaf => {
nodes.push({ type: 'leaf', task: leaf })
})
nodes.sort((a, b) => {
const idA = a.type === 'group' ? a.id : a.task.id
const idB = b.type === 'group' ? b.id : b.task.id
return idA.localeCompare(idB)
})
return nodes
}, [tasks, taskFilter])Task tree computation:
1. Separate groups from leaf tasks
2. Filter leaves by search term
3. Assign leaves to groups based on prefix matching (group_task or group-task)
4. Track assigned leaves to avoid duplication
5. Add ungrouped leaves as top-level nodes
6. Sort alphabetically
This is memoized to avoid recalculation on every render.
Main App Component
export default function App() {
const [version, setVersion] = useState('...')
const [gitInfo, setGitInfo] = useState<GitInfo>({ branch: '', commit: '' })
const [sysInfo, setSysInfo] = useState<SysInfo>({ hostname: '', cwd: '' })
const [models, setModels] = useState<ModelInfo[]>([])
const [tasks, setTasks] = useState<TaskInfo[]>([])
const [model, setModel] = useState('openai_compatible')
const [modelArgs, setModelArgs] = useState('model_version=allenai/molmo-2-8b:free')
const [envVars, setEnvVars] = useState(
'export HF_HOME=${HF_HOME:-~/.cache/huggingface}\n' +
'export OPENAI_API_KEY=${OPENROUTER_API_KEY}\n' +
'export OPENAI_API_BASE=https://openrouter.ai/api/v1'
)
const [selectedTasks, setSelectedTasks] = useState<Set<string>>(new Set(['mme']))
const [taskFilter, setTaskFilter] = useState('')
const [batchSize, setBatchSize] = useState('1')
const [limit, setLimit] = useState('5')
const [device, setDevice] = useState('')
const [outputPath, setOutputPath] = useState('./logs/openrouter_test/')
const [verbosity, setVerbosity] = useState('DEBUG')
const [status, setStatus] = useState<Status>('ready')
const [jobId, setJobId] = useState<string | null>(null)
const [output, setOutput] = useState<string[]>(['Ready to evaluate'])
const [command, setCommand] = useState('')
const [collapsedGroups, setCollapsedGroups] = useState<Set<string>>(new Set())
const [configExpanded, setConfigExpanded] = useState(true)
const [tasksExpanded, setTasksExpanded] = useState(true)
const [envVarsExpanded, setEnvVarsExpanded] = useState(true)
const [logsMaximized, setLogsMaximized] = useState(false)
const outputRef = useRef<HTMLDivElement>(null)
// ... rest of component
}State organization:
- Server data: version, git info, system info, models, tasks
- Configuration: model, args, tasks, env vars, batch size, limit, device, output path, verbosity
- Runtime: status, job ID, output lines, command preview
- UI state: collapsed groups, expanded sections, maximized logs
Data Fetching
useEffect(() => {
fetch(`${API_BASE}/health`)
.then(r => r.json())
.then(d => {
setVersion(d.version)
if (d.git) setGitInfo(d.git)
if (d.system) setSysInfo(d.system)
})
.catch(() => setVersion('error'))
fetch(`${API_BASE}/models`)
.then(r => r.json())
.then(setModels)
.catch(() => setModels([]))
fetch(`${API_BASE}/tasks`)
.then(r => r.json())
.then(setTasks)
.catch(() => setTasks([]))
}, [])On component mount:
- Fetch health info (version, git, system)
- Fetch available models
- Fetch available tasks
- Handle errors gracefully (empty arrays, error message)
Command Preview
useEffect(() => {
const config: Config = {
model,
model_args: modelArgs,
tasks: Array.from(selectedTasks),
env_vars: envVars,
batch_size: parseInt(batchSize) || 1,
limit: limit ? parseInt(limit) : null,
output_path: outputPath,
log_samples: true,
verbosity,
device: device || null,
}
fetch(`${API_BASE}/eval/preview`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(config),
})
.then(r => r.json())
.then(d => setCommand(d.command))
.catch(() => setCommand('# Error generating command'))
}, [model, modelArgs, selectedTasks, envVars, batchSize, limit, device, outputPath, verbosity])Real-time command preview:
- Triggers on any configuration change
- Sends current config to
/eval/preview - Updates command display
- Provides immediate feedback
Auto-Scroll Logs
useEffect(() => {
if (outputRef.current) {
outputRef.current.scrollTop = outputRef.current.scrollHeight
}
}, [output])Automatically scrolls log output to bottom when new lines are added.
Job Execution
const startEval = async () => {
if (selectedTasks.size === 0) {
setOutput(['Error: No tasks selected'])
return
}
setStatus('running')
setOutput(['Starting evaluation...'])
const config: Config = {
model,
model_args: modelArgs,
tasks: Array.from(selectedTasks),
env_vars: envVars,
batch_size: parseInt(batchSize) || 1,
limit: limit ? parseInt(limit) : null,
output_path: outputPath,
log_samples: true,
verbosity,
device: device || null,
}
try {
const res = await fetch(`${API_BASE}/eval/start`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(config),
})
const data = await res.json()
setJobId(data.job_id)
const eventSource = new EventSource(`${API_BASE}/eval/${data.job_id}/stream`)
eventSource.onmessage = (event) => {
const d = JSON.parse(event.data)
if (d.type === 'output') {
setOutput(prev => [...prev, d.line])
} else if (d.type === 'done') {
setStatus(d.exit_code === 0 ? 'completed' : 'error')
setOutput(prev => [...prev, '', `Evaluation ${d.exit_code === 0 ? 'completed' : 'failed'} (exit: ${d.exit_code})`])
eventSource.close()
} else if (d.type === 'stopped') {
setStatus('stopped')
setOutput(prev => [...prev, '', 'Evaluation stopped'])
eventSource.close()
} else if (d.type === 'error') {
setOutput(prev => [...prev, `Error: ${d.message}`])
setStatus('error')
eventSource.close()
}
}
eventSource.onerror = () => {
setStatus('error')
setOutput(prev => [...prev, 'Connection error'])
eventSource.close()
}
} catch (e) {
setOutput([`Failed to start: ${e}`])
setStatus('error')
}
}Job execution flow:
1. Validate tasks are selected
2. Set status to running, clear output
3. Build config object
4. POST to /eval/start, receive job ID
5. Open EventSource to /eval/{job_id}/stream
6. Handle event types:
*output: Append log line *done: Set completed/error status, close connection *stopped: Set stopped status, close connection *error: Log error, close connection
7. Handle connection errors
Job Termination
const stopEval = async () => {
if (!jobId) return
try {
await fetch(`${API_BASE}/eval/${jobId}/stop`, { method: 'POST' })
setStatus('stopped')
} catch (e) {
setOutput(prev => [...prev, `Failed to stop: ${e}`])
}
}Simple termination: POST to stop endpoint, update status.
Layout Structure
return (
<div className="flex flex-col h-screen bg-white text-neutral-900 font-light selection:bg-black selection:text-white">
<header className="relative h-14 flex items-center justify-between px-6 border-b border-neutral-200 bg-white/80 backdrop-blur-md z-10">
{/* Version, git info, status badge */}
</header>
<div className="flex flex-1 overflow-hidden">
<div className="w-full md:w-[400px] lg:w-[450px] xl:w-[500px] 2xl:w-[550px] min-w-[320px] max-w-[600px] bg-white border-r border-neutral-200 flex flex-col overflow-y-auto scrollbar-thin flex-shrink-0">
{/* Configuration sidebar */}
</div>
<div className="flex-1 flex flex-col bg-neutral-50/30 min-w-0">
<div className="px-6 py-4 border-b border-neutral-200 bg-white flex gap-3 justify-start">
{/* Start/Stop buttons */}
</div>
{!logsMaximized && (
<div className="h-1/3 border-b border-neutral-200 flex flex-col bg-white transition-all duration-300">
{/* Command preview */}
</div>
)}
<div className="flex-1 flex flex-col bg-white transition-all duration-300 min-h-0">
{/* Log output */}
</div>
</div>
</div>
</div>
)Layout structure:
- Full-height flexbox container
- Fixed header with version/status
- Two-column layout:
* Left: Fixed-width sidebar (responsive breakpoints) * Right: Flexible main panel
- Main panel split:
* Top: Command preview (1/3 height, hideable) * Bottom: Log output (flexible height)
Responsive Behavior
Sidebar Width:
- Base: full width
- md (768px): 400px
- lg (1024px): 450px
- xl (1280px): 500px
- 2xl (1536px): 550px
- Min: 320px, Max: 600px
Maximize Logs: Hides command preview panel to give full height to logs
Collapsible Sections: All major sections can be collapsed to save space
Visual Design
Color Palette:
- Background: white (
bg-white) - Text: neutral grays (
text-neutral-600,text-neutral-900) - Accents: black (
border-black,bg-black) - Status colors: green (completed), red (error)
Typography:
- Base: system font, light weight
- Code/logs: monospace font
- Labels: uppercase, wide tracking, small size
Spacing:
- Consistent padding: 6 units (
px-6,py-6) - Form elements: 4-unit gaps (
space-y-4) - Borders: 1px neutral (
border-neutral-200)
Interactions:
- Hover states: darker borders, background changes
- Focus: black borders (
focus:border-black) - Transitions: smooth 200ms (
transition-colors) - Selection: black background, white text
Performance Optimizations
useMemo: Task tree computation memoized to avoid recalculation on every render
Event Delegation: Click handlers on parent elements rather than individual items
Stable Keys: List items use stable keys (task IDs) for efficient reconciliation
Conditional Rendering: Collapsed sections not rendered (rather than hidden with CSS)
Related Principles
- Principle:EvolvingLMMs_Lab_Lmms_eval_TUI_Web_Interface
- Principle:EvolvingLMMs_Lab_Lmms_eval_Task_Selection
- Principle:EvolvingLMMs_Lab_Lmms_eval_Model_Type_Selection
Related Implementations
- Implementation:EvolvingLMMs_Lab_Lmms_eval_TUI_Server - Backend API consumed by this component
- Implementation:EvolvingLMMs_Lab_Lmms_eval_TUI_Web_Dependencies - NPM dependencies