Implementation:Mlc ai Mlc llm LLMEngine ObjC
Overview
The file ios/MLCSwift/Sources/ObjC/include/LLMEngine.h is an Objective-C header that defines the JSONFFIEngine interface. This class serves as the bridge between the Swift-based iOS frontend and the C++ JSON FFI Engine that powers MLC LLM inference on Apple devices. It exposes the core lifecycle and request-handling methods needed to run LLM inference from an iOS application.
Location
- Repository: Mlc_ai_Mlc_llm
- File:
ios/MLCSwift/Sources/ObjC/include/LLMEngine.h - Lines: 32
Purpose
This header is intended to be used as a bridging header that exposes Objective-C interfaces to Swift. As stated in the file's comment:
// Use this file to import your target's public headers that you would like to expose to Swift.
// LLM Chat Module
//
// Exposed interface of Object-C, enables swift binding.
The file imports both Foundation and UIKit frameworks, indicating it is designed exclusively for iOS (UIKit-based) applications.
JSONFFIEngine Interface
The JSONFFIEngine class inherits from NSObject and provides the following methods:
@interface JSONFFIEngine : NSObject
- (void)initBackgroundEngine:(void (^)(NSString*))streamCallback;
- (void)reload:(NSString*)engineConfig;
- (void)unload;
- (void)reset;
- (void)chatCompletion:(NSString*)requestJSON requestID:(NSString*)requestID;
- (void)abort:(NSString*)requestID;
- (void)runBackgroundLoop;
- (void)runBackgroundStreamBackLoop;
- (void)exitBackgroundLoop;
@end
Method Details
| Method | Description |
|---|---|
initBackgroundEngine: |
Initializes the background inference engine. Accepts a callback block that receives an NSString* parameter -- this callback is invoked whenever the engine produces streamed output tokens.
|
reload: |
Loads or reloads the engine with the specified configuration. The engineConfig parameter is a JSON string describing the model, quantization, and runtime settings.
|
unload |
Unloads the currently loaded model and releases associated resources. |
reset |
Resets the engine state, clearing any in-progress conversations or cached state. |
chatCompletion:requestID: |
Submits a chat completion request. The requestJSON parameter is a JSON string following the chat completion request format. The requestID is a unique identifier for tracking and cancelling the request.
|
abort: |
Aborts an in-progress request identified by requestID.
|
runBackgroundLoop |
Starts the main background processing loop that drives the inference engine. This is typically called on a dedicated background thread. |
runBackgroundStreamBackLoop |
Starts the background loop that streams results back to the caller via the callback registered in initBackgroundEngine:.
|
exitBackgroundLoop |
Signals the background loops to terminate, allowing for a clean shutdown of the engine. |
Architecture
The JSONFFIEngine follows a producer-consumer pattern with two background loops:
runBackgroundLoop-- Acts as the producer, driving the C++ inference engine to generate tokens.runBackgroundStreamBackLoop-- Acts as the consumer, forwarding generated tokens back to the application layer through the stream callback.
This two-loop design decouples the compute-intensive inference from the I/O of streaming results, which is critical for maintaining responsive UI on iOS devices.
Integration Notes
- The interface communicates entirely through JSON strings, keeping the Objective-C layer thin and avoiding complex type bridging between C++ and Swift.
- The
requestIDparameter onchatCompletion:requestID:andabort:enables multiplexed request handling, allowing the engine to process and cancel individual requests independently. - This header is part of the MLCSwift package and is placed in the ObjC include directory so that it is automatically available as a bridging header for Swift code in the same target.