Implementation:Ggml org Llama cpp Android MainActivity

From Leeroopedia
Knowledge Sources
Domains Android, UI
Last Updated 2026-02-15 00:00 GMT

Overview

Main Android activity that provides a chat UI for interacting with a locally loaded GGUF language model on an Android device.

Description

`MainActivity` extends `AppCompatActivity` and manages a RecyclerView-based chat interface that shows user and assistant messages. On launch, it obtains the Arm AI Chat `InferenceEngine` via `AiChat.getInferenceEngine()`. The floating action button (FAB) has two roles: before a model is loaded it prompts the user to select a GGUF model file (through the `OpenDocument` contract), and afterwards it sends the user's input to the engine. When a model is selected, the activity parses the GGUF metadata, copies the file to internal storage, loads it into the engine, and sets a system prompt. Generated tokens are then streamed into the chat through a Kotlin Flow.
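The "copies the file to internal storage" step can be sketched with plain JVM I/O. This is a minimal illustration, not the activity's actual code: `copyModelToDir` is a hypothetical helper, and it assumes the caller already has an open stream for the selected document (on Android, typically from `contentResolver.openInputStream(uri)`) and a target directory such as `filesDir`.

```kotlin
import java.io.File
import java.io.InputStream

// Hypothetical helper (not part of the llama.cpp example app):
// copies a model stream into destDir and returns the resulting file.
fun copyModelToDir(source: InputStream, destDir: File, fileName: String): File {
    destDir.mkdirs() // make sure the target directory exists
    val target = File(destDir, fileName)
    // `use` closes both streams even if the copy throws
    source.use { input ->
        target.outputStream().use { output ->
            input.copyTo(output)
        }
    }
    return target
}
```

Copying first gives the native engine a regular file path to open, which a content `Uri` from the document picker does not provide directly.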

Usage

Use this as the main entry point and reference implementation for the llama.cpp Android example app, demonstrating end-to-end on-device LLM inference with model loading, chat templating, and streaming token generation.

Code Reference

Source Location

  • Repository: Ggml_org_Llama_cpp
  • File: examples/llama.android/app/src/main/java/com/example/llama/MainActivity.kt
  • Lines: 1-275

Signature

class MainActivity : AppCompatActivity() {
    private lateinit var engine: InferenceEngine
    private var generationJob: Job?

    override fun onCreate(savedInstanceState: Bundle?)
    private fun handleUserInput()
    private fun loadModel(uri: Uri)
}

Import

import android.net.Uri
import android.os.Bundle
import androidx.appcompat.app.AppCompatActivity
import androidx.lifecycle.lifecycleScope
import com.arm.aichat.AiChat
import com.arm.aichat.InferenceEngine
import com.arm.aichat.gguf.GgufMetadata
import com.arm.aichat.gguf.GgufMetadataReader
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.flow.onCompletion
import kotlinx.coroutines.launch

I/O Contract

Inputs

Name | Type | Required | Description
GGUF file | Uri | Yes | User-selected GGUF model file, chosen via the Android document picker
user input | String | Yes | User chat message entered in the EditText field
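To give a sense of what reading a GGUF file involves, the sketch below reads only the fixed-size header that precedes the metadata key-value pairs. It is an illustration under stated assumptions, not the behavior of `GgufMetadataReader`: `GgufHeader` and `readGgufHeader` are hypothetical names, and the layout assumed is little-endian GGUF version 2 or later (4-byte magic `GGUF`, 32-bit version, 64-bit tensor count, 64-bit metadata entry count).

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Hypothetical holder for the fixed-size GGUF header fields.
data class GgufHeader(val version: Int, val tensorCount: Long, val metadataKvCount: Long)

// Reads the first 24 bytes of a GGUF file.
// Assumes little-endian GGUF v2 or later; v1 files use a different layout.
fun readGgufHeader(bytes: ByteArray): GgufHeader {
    require(bytes.size >= 24) { "GGUF header needs at least 24 bytes" }
    val buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN)
    val magic = ByteArray(4).also { buf.get(it) }
    require(String(magic, Charsets.US_ASCII) == "GGUF") { "Not a GGUF file" }
    val version = buf.int          // uint32 format version
    val tensorCount = buf.long     // uint64 number of tensors
    val metadataKvCount = buf.long // uint64 number of metadata key-value pairs
    return GgufHeader(version, tensorCount, metadataKvCount)
}
```

A check like this can reject a wrongly chosen file before the more expensive copy and load steps run.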

Outputs

Name | Type | Description
chat messages | RecyclerView | Streamed assistant response tokens displayed in a chat interface
GGUF metadata | TextView | Model metadata displayed in the header
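Streaming tokens into a chat list usually means growing the last assistant message rather than adding a new row per token. The sketch below shows that bookkeeping in isolation. `ChatMessage` and `appendToken` are hypothetical names chosen for illustration; the app's own message class and adapter may be structured differently.

```kotlin
// Hypothetical message model; the real app's class and field names may differ.
data class ChatMessage(val content: String, val isUser: Boolean)

// Appends a streamed token to the trailing assistant message,
// or starts a new assistant message if the last one is from the user.
fun appendToken(messages: List<ChatMessage>, token: String): List<ChatMessage> {
    val last = messages.lastOrNull()
    return if (last != null && !last.isUser) {
        messages.dropLast(1) + last.copy(content = last.content + token)
    } else {
        messages + ChatMessage(token, isUser = false)
    }
}
```

Returning a new list each time keeps the function easy to test. A RecyclerView adapter would then only need to be told that the last item changed.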

Usage Examples

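// Initialize the inference engine off the main thread
lifecycleScope.launch(Dispatchers.Default) {
    engine = AiChat.getInferenceEngine(applicationContext)
}

// Send the user prompt and collect streamed tokens.
// collect is a suspending call, so it must run inside a coroutine;
// keeping the returned Job in generationJob is one way to track it.
generationJob = lifecycleScope.launch {
    engine.sendUserPrompt(userMessage)
        .onCompletion { /* handle completion, e.g. re-enable the input field */ }
        .collect { token -> appendToChat(token) } // appendToChat: placeholder for the UI update
}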
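The collection pattern can be tried without a device or a model by swapping the engine for a fake token source. The sketch below assumes `kotlinx.coroutines` is on the classpath. `fakeTokenStream` and `collectReply` are hypothetical helpers, and the fake stream only stands in for the flow returned by `sendUserPrompt`.

```kotlin
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.flow.onCompletion

// Hypothetical stand-in for the engine: emits a fixed reply token by token.
fun fakeTokenStream(reply: List<String>): Flow<String> = flow {
    for (token in reply) emit(token)
}

// Collects the stream into one string, the way tokens accumulate in a chat bubble.
// onDone receives null on normal completion, or the failure/cancellation cause.
suspend fun collectReply(tokens: Flow<String>, onDone: (Throwable?) -> Unit): String {
    val text = StringBuilder()
    tokens
        .onCompletion { cause -> onDone(cause) }
        .collect { token -> text.append(token) }
    return text.toString()
}
```

Calling `collectReply(fakeTokenStream(listOf("Hel", "lo")), onDone)` from a coroutine returns the concatenated reply once the flow completes, and `onDone` is invoked with `null` in that case.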
