Implementation:Mlc ai Mlc llm LLMEngine ObjC

Overview

The file ios/MLCSwift/Sources/ObjC/include/LLMEngine.h is an Objective-C header that defines the JSONFFIEngine interface. This class serves as the bridge between the Swift-based iOS frontend and the C++ JSON FFI Engine that powers MLC LLM inference on Apple devices. It exposes the core lifecycle and request-handling methods needed to run LLM inference from an iOS application.

Location

Repository: Mlc_ai_Mlc_llm
File: ios/MLCSwift/Sources/ObjC/include/LLMEngine.h
Lines: 32

Purpose

This header is intended to be used as a bridging header that exposes Objective-C interfaces to Swift. As stated in the file's comment:

// Use this file to import your target's public headers that you would like to expose to Swift.
// LLM Chat Module
//
// Exposed interface of Object-C, enables swift binding.

The file imports both Foundation and UIKit frameworks, indicating it is designed exclusively for iOS (UIKit-based) applications.

JSONFFIEngine Interface

The JSONFFIEngine class inherits from NSObject and provides the following methods:

@interface JSONFFIEngine : NSObject

- (void)initBackgroundEngine:(void (^)(NSString*))streamCallback;
- (void)reload:(NSString*)engineConfig;
- (void)unload;
- (void)reset;
- (void)chatCompletion:(NSString*)requestJSON requestID:(NSString*)requestID;
- (void)abort:(NSString*)requestID;
- (void)runBackgroundLoop;
- (void)runBackgroundStreamBackLoop;
- (void)exitBackgroundLoop;

@end

Method Details

Method	Description
`initBackgroundEngine:`	Initializes the background inference engine. Accepts a callback block that receives an `NSString*` parameter -- this callback is invoked whenever the engine produces streamed output tokens.
`reload:`	Loads or reloads the engine with the specified configuration. The `engineConfig` parameter is a JSON string describing the model, quantization, and runtime settings.
`unload`	Unloads the currently loaded model and releases associated resources.
`reset`	Resets the engine state, clearing any in-progress conversations or cached state.
`chatCompletion:requestID:`	Submits a chat completion request. The `requestJSON` parameter is a JSON string following the chat completion request format. The `requestID` is a unique identifier for tracking and cancelling the request.
`abort:`	Aborts an in-progress request identified by `requestID`.
`runBackgroundLoop`	Starts the main background processing loop that drives the inference engine. This is typically called on a dedicated background thread.
`runBackgroundStreamBackLoop`	Starts the background loop that streams results back to the caller via the callback registered in `initBackgroundEngine:`.
`exitBackgroundLoop`	Signals the background loops to terminate, allowing for a clean shutdown of the engine.

Architecture

The JSONFFIEngine follows a producer-consumer pattern with two background loops:

runBackgroundLoop -- Acts as the producer, driving the C++ inference engine to generate tokens.
runBackgroundStreamBackLoop -- Acts as the consumer, forwarding generated tokens back to the application layer through the stream callback.

This two-loop design decouples the compute-intensive inference from the I/O of streaming results, which is critical for maintaining responsive UI on iOS devices.

Integration Notes

The interface communicates entirely through JSON strings, keeping the Objective-C layer thin and avoiding complex type bridging between C++ and Swift.
The requestID parameter on chatCompletion:requestID: and abort: enables multiplexed request handling, allowing the engine to process and cancel individual requests independently.
This header is part of the MLCSwift package and is placed in the ObjC include directory so that it is automatically available as a bridging header for Swift code in the same target.

Page Connections

Double-click a node to navigate. Hold to expand connections.

Principle

Implementation

Heuristic

Environment