Implementation:Ggml org Llama cpp Android AI Chat JNI

From Leeroopedia
Knowledge Sources
Domains Android, JNI
Last Updated 2026-02-15 00:00 GMT

Overview

JNI native implementation that exposes llama.cpp model loading, context initialization, prompt processing, and token generation to the Android Kotlin layer.

Description

Manages global state for the llama model, context, batch, chat templates, and sampler, and provides JNI functions mapped to `InferenceEngineImpl` methods:

  • `init` loads backends from the native library directory.
  • `load` loads a GGUF model file.
  • `prepare` initializes the context with a thread count bounded between 2 and 4, with a headroom of 2 (`N_THREADS_MIN`, `N_THREADS_MAX`, `N_THREADS_HEADROOM`).
  • `processSystemPrompt` and `processUserPrompt` apply chat templates and decode the resulting tokens.
  • `generateTokens` performs autoregressive sampling and returns one token string per call.

Prompt decoding uses a batch size of 512 (`BATCH_SIZE`), the default context size is 8192 tokens (`DEFAULT_CONTEXT_SIZE`), and the default sampler temperature is 0.3 (`DEFAULT_SAMPLER_TEMP`).
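The batch size of 512 noted above suggests that a prompt longer than one batch is decoded in several chunks. The following is a minimal sketch of that chunking, not code from `ai_chat.cpp`: the helper `split_into_batches` is hypothetical and only computes index ranges, leaving out the actual decode call.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Value taken from the constants listed on this page.
constexpr int BATCH_SIZE = 512;

// Hypothetical helper (not part of ai_chat.cpp): returns the [begin, end)
// index ranges needed to feed n_tokens prompt tokens in chunks of at most
// batch_size tokens each.
std::vector<std::pair<std::size_t, std::size_t>>
split_into_batches(std::size_t n_tokens, std::size_t batch_size = BATCH_SIZE) {
    std::vector<std::pair<std::size_t, std::size_t>> ranges;
    if (batch_size == 0) {
        return ranges;  // guard: a zero batch size would never advance
    }
    for (std::size_t begin = 0; begin < n_tokens; begin += batch_size) {
        const std::size_t end = std::min(begin + batch_size, n_tokens);
        ranges.emplace_back(begin, end);
    }
    return ranges;
}
```

Under this assumption, a 1200-token prompt would be split into the ranges [0, 512), [512, 1024), and [1024, 1200).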

Usage

Use this native bridge when building Android applications that require on-device LLM inference. It translates Kotlin API calls into llama.cpp C++ operations and adds Android-specific handling such as thread-count selection and Android logging integration.
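The thread handling mentioned above can be sketched with the three thread constants from the Signature section. This is an illustration under assumptions: the helper `pick_thread_count` is hypothetical, and the formula (online cores minus headroom, clamped to the minimum and maximum) is one plausible reading of those constants rather than a confirmed copy of the source.

```cpp
#include <algorithm>

// Values taken from the constants listed on this page.
constexpr int N_THREADS_MIN      = 2;
constexpr int N_THREADS_MAX      = 4;
constexpr int N_THREADS_HEADROOM = 2;

// Hypothetical helper (not part of ai_chat.cpp): reserves HEADROOM cores for
// the rest of the system, then clamps the result to [MIN, MAX].
// Assumption: this is how the three constants combine.
int pick_thread_count(int online_cores) {
    const int available = online_cores - N_THREADS_HEADROOM;
    return std::max(N_THREADS_MIN, std::min(N_THREADS_MAX, available));
}
```

With this formula an 8-core device would use 4 threads, a 5-core device 3, and anything with 4 or fewer cores would fall back to the minimum of 2. On a device, the core count could come from a call such as `sysconf(_SC_NPROCESSORS_ONLN)`; whether the source uses that call is also an assumption.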

Code Reference

Source Location

  • Repository: Ggml_org_Llama_cpp
  • File: examples/llama.android/lib/src/main/cpp/ai_chat.cpp
  • Lines: 1-565

Signature

// Constants
constexpr int   N_THREADS_MIN        = 2;
constexpr int   N_THREADS_MAX        = 4;
constexpr int   N_THREADS_HEADROOM   = 2;
constexpr int   DEFAULT_CONTEXT_SIZE = 8192;
constexpr int   BATCH_SIZE           = 512;
constexpr float DEFAULT_SAMPLER_TEMP = 0.3f;

// JNI Functions
JNIEXPORT void   JNICALL Java_com_arm_aichat_internal_InferenceEngineImpl_init(JNIEnv*, jobject, jstring nativeLibDir);
JNIEXPORT jint   JNICALL Java_com_arm_aichat_internal_InferenceEngineImpl_load(JNIEnv*, jobject, jstring jmodel_path);
JNIEXPORT void   JNICALL Java_com_arm_aichat_internal_InferenceEngineImpl_prepare(JNIEnv*, jobject);
JNIEXPORT jstring JNICALL Java_com_arm_aichat_internal_InferenceEngineImpl_systemInfo(JNIEnv*, jobject);
JNIEXPORT void   JNICALL Java_com_arm_aichat_internal_InferenceEngineImpl_processSystemPrompt(JNIEnv*, jobject, jstring);
JNIEXPORT void   JNICALL Java_com_arm_aichat_internal_InferenceEngineImpl_processUserPrompt(JNIEnv*, jobject, jstring);
JNIEXPORT jstring JNICALL Java_com_arm_aichat_internal_InferenceEngineImpl_generateTokens(JNIEnv*, jobject);
JNIEXPORT void   JNICALL Java_com_arm_aichat_internal_InferenceEngineImpl_cleanUp(JNIEnv*, jobject);
JNIEXPORT void   JNICALL Java_com_arm_aichat_internal_InferenceEngineImpl_destroy(JNIEnv*, jobject);
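The long function names above follow the standard JNI naming rule: `Java_`, then the fully qualified class name with separators turned into underscores, then `_` and the method name. The sketch below reproduces that rule for ASCII names only, to show how a method on `com.arm.aichat.internal.InferenceEngineImpl` maps to its native symbol. The helpers `jni_mangle` and `jni_symbol` are hypothetical and exist only for illustration; overloaded methods and non-ASCII characters are not handled.

```cpp
#include <string>

// Escapes one name according to the JNI short-name rules (ASCII subset):
// '.' or '/' becomes '_', while '_' , ';' and '[' become "_1", "_2", "_3".
std::string jni_mangle(const std::string& name) {
    std::string out;
    for (char c : name) {
        switch (c) {
            case '.':
            case '/': out += '_';  break;
            case '_': out += "_1"; break;
            case ';': out += "_2"; break;
            case '[': out += "_3"; break;
            default:  out += c;    break;
        }
    }
    return out;
}

// Builds the native symbol name for a non-overloaded method.
std::string jni_symbol(const std::string& fq_class, const std::string& method) {
    return "Java_" + jni_mangle(fq_class) + "_" + jni_mangle(method);
}
```

For example, `jni_symbol("com.arm.aichat.internal.InferenceEngineImpl", "load")` yields `Java_com_arm_aichat_internal_InferenceEngineImpl_load`, matching the signature listed above.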

Import

#include <android/log.h>
#include <jni.h>
#include <sampling.h>
#include "logging.h"
#include "chat.h"
#include "common.h"
#include "llama.h"

I/O Contract

Inputs

Name           Type     Required  Description
nativeLibDir   jstring  Yes       Path to the native library directory used for backend loading (for init)
jmodel_path    jstring  Yes       Path to the GGUF model file on the device (for load)
system_prompt  jstring  No        System prompt to apply via the chat template (for processSystemPrompt)
user_prompt    jstring  Yes       User prompt to process and generate a response for (for processUserPrompt)

Outputs

Name         Type     Description
return_code  jint     0 on success, non-zero on failure (for load)
token        jstring  Next generated token string (for generateTokens), or empty string at end of sequence
system_info  jstring  System information string describing backend capabilities (for systemInfo)

Usage Examples

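// Illustrative call order only: on a device the JVM invokes these functions
// when Kotlin calls the matching InferenceEngineImpl methods.

// 1. Initialize backends
Java_com_arm_aichat_internal_InferenceEngineImpl_init(env, obj, nativeLibDir);

// 2. Load a GGUF model (0 means success)
jint result = Java_com_arm_aichat_internal_InferenceEngineImpl_load(env, obj, modelPath);
if (result != 0) {
    // handle the load failure before continuing
}

// 3. Prepare context
Java_com_arm_aichat_internal_InferenceEngineImpl_prepare(env, obj);

// 4. Process prompts (the system prompt is optional)
Java_com_arm_aichat_internal_InferenceEngineImpl_processSystemPrompt(env, obj, systemPrompt);
Java_com_arm_aichat_internal_InferenceEngineImpl_processUserPrompt(env, obj, userPrompt);

// 5. Generate tokens in a loop.
// This loop assumes nullptr marks the end of generation; the I/O contract
// lists an empty string at end of sequence, so confirm the sentinel in ai_chat.cpp.
jstring token;
while ((token = Java_com_arm_aichat_internal_InferenceEngineImpl_generateTokens(env, obj)) != nullptr) {
    // process token
    env->DeleteLocalRef(token);  // release the local reference inside long loops
}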
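Because the JNI entry points only run inside an Android process, the generation loop can be easier to reason about with a stand-in token source. The sketch below is an assumption-laden illustration, not part of the library: `TokenSource` and `collect_response` are hypothetical names, `std::nullopt` plays the role of the end-of-generation sentinel, and the `max_tokens` cap is an added safety limit.

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <string>

// Hypothetical stand-in for generateTokens: each call yields the next token
// string, or std::nullopt once generation has finished.
using TokenSource = std::function<std::optional<std::string>()>;

// Concatenates tokens until the source signals completion or max_tokens
// tokens have been consumed, whichever comes first.
std::string collect_response(const TokenSource& next_token, std::size_t max_tokens) {
    std::string response;
    for (std::size_t i = 0; i < max_tokens; ++i) {
        const std::optional<std::string> token = next_token();
        if (!token) {
            break;  // end-of-generation sentinel
        }
        response += *token;
    }
    return response;
}
```

Empty token strings are appended as-is in this sketch, so a source that emits an empty string mid-stream does not end the loop early.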
