Implementation:Alibaba MNN RapidJSON Encodings

From Leeroopedia


Knowledge Sources
Domains: JSON_Parsing, Encoding
Last Updated: 2026-02-10 12:00 GMT

Overview

3rd_party/rapidjson/encodings.h (716 lines) provides character encoding support for the RapidJSON library vendored in Alibaba MNN. It defines class templates for UTF-8, UTF-16, UTF-32 (the latter two also in explicit little-endian and big-endian variants), and ASCII. Each encoding class implements the Encode, Decode, and Validate operations required by RapidJSON's Encoding concept.

Usage note: This is a vendored dependency that MNN uses internally to parse JSON configuration (model configs, LLM configs). End users do not include it directly.

Encoding Concept

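All encoding classes conform to the following concept. The listing is pseudocode that describes the required interface, not compilable C++:

concept Encoding {
    typename Ch;  // Character type (one code unit, not necessarily a whole codepoint)

    enum { supportUnicode = 1 }; // or 0 if the encoding cannot represent all of Unicode

    // Encodes one Unicode codepoint and writes the resulting code unit(s) to os.
    template<typename OutputStream>
    static void Encode(OutputStream& os, unsigned codepoint);

    // Reads one codepoint from is; returns false if the input is malformed.
    template<typename InputStream>
    static bool Decode(InputStream& is, unsigned* codepoint);

    // Copies one encoded codepoint from is to os; returns false if it is malformed.
    template<typename InputStream, typename OutputStream>
    static bool Validate(InputStream& is, OutputStream& os);
};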
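
To make the concept concrete, the following is a minimal, self-contained sketch of a type that satisfies it, plus a generic helper that depends only on the concept. It does not use RapidJSON: StringInput, StringOutput, ByteEncoding, and CountCodepoints are hypothetical names invented for this illustration, and the streams only imitate the Peek/Take/Put style.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical minimal streams in the Peek/Take/Put style (illustration only).
struct StringInput {
    const char* p;
    explicit StringInput(const char* s) : p(s) {}
    char Peek() const { return *p; }
    char Take() { return *p++; }
};

struct StringOutput {
    std::string s;
    void Put(char c) { s.push_back(c); }
};

// Hypothetical single-byte encoding: one code unit is one codepoint (0..0xFF).
struct ByteEncoding {
    typedef char Ch;
    enum { supportUnicode = 0 };

    template <typename OutputStream>
    static void Encode(OutputStream& os, unsigned codepoint) {
        assert(codepoint <= 0xFF);
        os.Put(static_cast<Ch>(codepoint));
    }

    template <typename InputStream>
    static bool Decode(InputStream& is, unsigned* codepoint) {
        *codepoint = static_cast<unsigned char>(is.Take());
        return true;
    }

    template <typename InputStream, typename OutputStream>
    static bool Validate(InputStream& is, OutputStream& os) {
        os.Put(is.Take());  // every byte is valid in this toy encoding
        return true;
    }
};

// Generic code depends only on the concept, not on a concrete encoding.
template <typename Encoding>
std::size_t CountCodepoints(const char* text) {
    StringInput is(text);
    std::size_t count = 0;
    unsigned codepoint = 0;
    while (is.Peek() != '\0' && Encoding::Decode(is, &codepoint)) {
        ++count;
    }
    return count;
}
```

Because all three operations are static member templates, an encoding is chosen at compile time through a template argument, so generic code such as CountCodepoints<ByteEncoding>("abc") carries no virtual dispatch.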

Key Classes

UTF8

UTF-8 encoding with configurable character type (defaults to char):

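template<typename CharType = char>
struct UTF8 {
    typedef CharType Ch;          // one UTF-8 code unit (a byte)
    enum { supportUnicode = 1 };

    // Writes the 1- to 4-byte UTF-8 sequence for codepoint.
    template<typename OutputStream>
    static void Encode(OutputStream& os, unsigned codepoint);

    // Reads one UTF-8 sequence; returns false on malformed input.
    template<typename InputStream>
    static bool Decode(InputStream& is, unsigned* codepoint);
};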

UTF8<> is RapidJSON's default encoding, and it is the one MNN's JSON parsing relies on, since MNN configuration files are expected to be UTF-8.
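
The bit manipulation behind UTF-8 encoding and decoding can be sketched as follows. This is a simplified, standalone illustration that works on std::string rather than on RapidJSON streams; EncodeUtf8 and DecodeUtf8 are hypothetical helpers, not functions from the header, and the real implementation is organized differently.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: encode one codepoint as a 1- to 4-byte UTF-8 sequence.
inline std::string EncodeUtf8(unsigned cp) {
    assert(cp <= 0x10FFFF);  // beyond the Unicode range
    std::string out;
    if (cp <= 0x7F) {
        out.push_back(static_cast<char>(cp));
    } else if (cp <= 0x7FF) {
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp <= 0xFFFF) {
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {
        out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
    return out;
}

// Hypothetical helper: decode the first codepoint of `in`.
// Returns false for truncated, overlong, surrogate, or out-of-range input.
inline bool DecodeUtf8(const std::string& in, unsigned* cp) {
    if (in.empty()) return false;
    const unsigned char b0 = static_cast<unsigned char>(in[0]);
    std::size_t len = 0;
    unsigned v = 0;
    if (b0 < 0x80)                { len = 1; v = b0; }
    else if ((b0 & 0xE0) == 0xC0) { len = 2; v = b0 & 0x1F; }
    else if ((b0 & 0xF0) == 0xE0) { len = 3; v = b0 & 0x0F; }
    else if ((b0 & 0xF8) == 0xF0) { len = 4; v = b0 & 0x07; }
    else return false;  // continuation byte or invalid lead byte
    if (in.size() < len) return false;
    for (std::size_t i = 1; i < len; ++i) {
        const unsigned char b = static_cast<unsigned char>(in[i]);
        if ((b & 0xC0) != 0x80) return false;  // not a continuation byte
        v = (v << 6) | (b & 0x3F);
    }
    // Smallest codepoint that legitimately needs a sequence of each length.
    static const unsigned kMin[5] = {0, 0, 0x80, 0x800, 0x10000};
    if (v < kMin[len]) return false;               // overlong form
    if (v >= 0xD800 && v <= 0xDFFF) return false;  // UTF-16 surrogate
    if (v > 0x10FFFF) return false;
    *cp = v;
    return true;
}
```

For example, EncodeUtf8(0x20AC) yields the three bytes E2 82 AC, while DecodeUtf8 rejects the overlong sequence C0 80.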

UTF16

UTF-16 encoding, which represents codepoints above U+FFFF as surrogate pairs. The code unit type defaults to wchar_t and must be at least 2 bytes wide:

template<typename CharType = wchar_t>
struct UTF16 {
    typedef CharType Ch;
    enum { supportUnicode = 1 };
};

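Endian-specific variants UTF16LE and UTF16BE are also provided. They derive from UTF16 and fix the byte order used when code units are read from or written to a byte stream.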
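
The surrogate-pair arithmetic and the effect of byte order can be sketched as follows. This is a standalone illustration under simplifying assumptions (std::vector and std::string instead of RapidJSON streams); EncodeUtf16, DecodeUtf16, and ToBytes are hypothetical helper names, not part of the header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: encode one codepoint as one or two UTF-16 code units.
inline std::vector<std::uint16_t> EncodeUtf16(unsigned cp) {
    assert(cp <= 0x10FFFF);
    std::vector<std::uint16_t> units;
    if (cp <= 0xFFFF) {
        assert(cp < 0xD800 || cp > 0xDFFF);  // surrogates are not encodable codepoints
        units.push_back(static_cast<std::uint16_t>(cp));
    } else {
        const unsigned v = cp - 0x10000;  // 20 bits, split into two 10-bit halves
        units.push_back(static_cast<std::uint16_t>(0xD800 | (v >> 10)));
        units.push_back(static_cast<std::uint16_t>(0xDC00 | (v & 0x3FF)));
    }
    return units;
}

// Hypothetical helper: decode the first codepoint; false on a malformed pair.
inline bool DecodeUtf16(const std::vector<std::uint16_t>& units, unsigned* cp) {
    if (units.empty()) return false;
    const unsigned hi = units[0];
    if (hi < 0xD800 || hi > 0xDFFF) { *cp = hi; return true; }
    if (hi > 0xDBFF || units.size() < 2) return false;  // stray low surrogate or truncated
    const unsigned lo = units[1];
    if (lo < 0xDC00 || lo > 0xDFFF) return false;
    *cp = (((hi - 0xD800) << 10) | (lo - 0xDC00)) + 0x10000;
    return true;
}

// Hypothetical helper: serialize code units to bytes in the chosen byte order,
// which is the only thing that differs between the LE and BE variants.
inline std::string ToBytes(const std::vector<std::uint16_t>& units, bool littleEndian) {
    std::string bytes;
    for (std::size_t i = 0; i < units.size(); ++i) {
        const char lo = static_cast<char>(units[i] & 0xFF);
        const char hi = static_cast<char>(units[i] >> 8);
        bytes.push_back(littleEndian ? lo : hi);
        bytes.push_back(littleEndian ? hi : lo);
    }
    return bytes;
}
```

U+1F600 becomes the pair D83D DE00, and U+20AC serializes as AC 20 in little-endian order and 20 AC in big-endian order.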

UTF32

UTF-32 encoding, in which each code unit holds one complete Unicode codepoint (UTF32LE and UTF32BE are the endian-specific variants):

template<typename CharType = unsigned>
struct UTF32 {
    typedef CharType Ch;
    enum { supportUnicode = 1 };
};

ASCII

7-bit ASCII encoding (no Unicode support):

template<typename CharType = char>
struct ASCII {
    typedef CharType Ch;
    enum { supportUnicode = 0 };
};
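
A rough sketch of what supportUnicode = 0 implies in practice: decoding fails for any byte above 0x7F, and a writer targeting such an encoding has to fall back to JSON \uXXXX escapes for non-ASCII codepoints. AsciiSketch, AppendHex4, and AppendAsciiSafe are hypothetical names for this standalone illustration and operate on std::string rather than on RapidJSON streams.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a 7-bit encoding with the same flag value.
struct AsciiSketch {
    typedef char Ch;
    enum { supportUnicode = 0 };

    static void Encode(std::string& out, unsigned codepoint) {
        assert(codepoint <= 0x7F);  // only 7-bit values are representable
        out.push_back(static_cast<Ch>(codepoint));
    }

    // Reads one byte at `pos`; the byte is consumed even when it is rejected.
    static bool Decode(const std::string& in, std::size_t& pos, unsigned* codepoint) {
        if (pos >= in.size()) return false;
        const unsigned char c = static_cast<unsigned char>(in[pos++]);
        *codepoint = c;
        return c <= 0x7F;
    }
};

// Appends "\uXXXX" for a 16-bit value using uppercase hex digits.
inline void AppendHex4(std::string& out, unsigned u) {
    static const char kHex[] = "0123456789ABCDEF";
    out += "\\u";
    out.push_back(kHex[(u >> 12) & 0xF]);
    out.push_back(kHex[(u >> 8) & 0xF]);
    out.push_back(kHex[(u >> 4) & 0xF]);
    out.push_back(kHex[u & 0xF]);
}

// Writes `cp` so that the output stays pure ASCII: literal when possible,
// otherwise one escape, or a surrogate pair of escapes above U+FFFF.
inline void AppendAsciiSafe(std::string& out, unsigned cp) {
    assert(cp <= 0x10FFFF);
    if (cp <= 0x7F) {
        AsciiSketch::Encode(out, cp);
    } else if (cp <= 0xFFFF) {
        AppendHex4(out, cp);
    } else {
        const unsigned v = cp - 0x10000;
        AppendHex4(out, 0xD800 | (v >> 10));
        AppendHex4(out, 0xDC00 | (v & 0x3FF));
    }
}
```

The compile-time flag lets generic writer code pick between emitting a codepoint directly and escaping it, without inspecting the encoding at run time.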

Transcoder

The header also provides a Transcoder template for converting between encodings:

template<typename SourceEncoding, typename TargetEncoding>
struct Transcoder {
    template<typename InputStream, typename OutputStream>
    static RAPIDJSON_FORCEINLINE bool Transcode(InputStream& is, OutputStream& os);

    template<typename InputStream, typename OutputStream>
    static RAPIDJSON_FORCEINLINE bool Validate(InputStream& is, OutputStream& os);
};

A partial specialization handles the case where the source and target encodings are the same. It copies code units straight through instead of decoding and re-encoding them.
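
The decode-then-encode structure and the same-encoding shortcut can be sketched as follows. This is a standalone approximation, not the header's code: SketchTranscoder, Utf8Source, Utf16Target, and Utf8ToUtf16 are hypothetical names, and containers with an explicit position stand in for RapidJSON's input and output streams.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical source side: decodes one UTF-8 sequence starting at `pos`.
struct Utf8Source {
    static bool Decode(const std::string& in, std::size_t& pos, unsigned* cp) {
        if (pos >= in.size()) return false;
        const unsigned char b0 = static_cast<unsigned char>(in[pos]);
        std::size_t len = 0;
        unsigned v = 0;
        if (b0 < 0x80)                { len = 1; v = b0; }
        else if ((b0 & 0xE0) == 0xC0) { len = 2; v = b0 & 0x1F; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; v = b0 & 0x0F; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; v = b0 & 0x07; }
        else return false;
        if (in.size() - pos < len) return false;
        for (std::size_t i = 1; i < len; ++i) {
            const unsigned char b = static_cast<unsigned char>(in[pos + i]);
            if ((b & 0xC0) != 0x80) return false;
            v = (v << 6) | (b & 0x3F);
        }
        static const unsigned kMin[5] = {0, 0, 0x80, 0x800, 0x10000};
        if (v < kMin[len] || v > 0x10FFFF || (v >= 0xD800 && v <= 0xDFFF)) return false;
        pos += len;
        *cp = v;
        return true;
    }
};

// Hypothetical target side: appends one or two UTF-16 code units.
struct Utf16Target {
    static void Encode(std::vector<std::uint16_t>& out, unsigned cp) {
        if (cp <= 0xFFFF) {
            out.push_back(static_cast<std::uint16_t>(cp));
        } else {
            const unsigned v = cp - 0x10000;
            out.push_back(static_cast<std::uint16_t>(0xD800 | (v >> 10)));
            out.push_back(static_cast<std::uint16_t>(0xDC00 | (v & 0x3FF)));
        }
    }
};

// Primary template: decode one codepoint with Source, re-encode with Target.
template <typename Source, typename Target>
struct SketchTranscoder {
    template <typename In, typename Out>
    static bool Transcode(const In& in, std::size_t& pos, Out& out) {
        unsigned cp = 0;
        if (!Source::Decode(in, pos, &cp)) return false;
        Target::Encode(out, cp);
        return true;
    }
};

// Same-encoding specialization: copy one code unit, no decoding at all.
template <typename Encoding>
struct SketchTranscoder<Encoding, Encoding> {
    template <typename Container>
    static bool Transcode(const Container& in, std::size_t& pos, Container& out) {
        if (pos >= in.size()) return false;
        out.push_back(in[pos++]);
        return true;
    }
};

// Convenience loop: transcode a whole UTF-8 string, stopping at the first error.
inline bool Utf8ToUtf16(const std::string& in, std::vector<std::uint16_t>& out) {
    std::size_t pos = 0;
    while (pos < in.size()) {
        if (!SketchTranscoder<Utf8Source, Utf16Target>::Transcode(in, pos, out)) return false;
    }
    return true;
}
```

Note the difference in granularity in this sketch: the primary template advances by one codepoint per call, whereas the pass-through specialization advances by one code unit and performs no validation.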

Compiler Compatibility

Compiler diagnostics are suppressed for MSVC (C4244 conversion warnings, C4702 unreachable code) and for GCC (-Weffc++, -Woverflow).
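
The general push/disable/pop pattern behind such suppression looks roughly like the sketch below. It uses raw pragmas for clarity; RapidJSON wraps the equivalent in its own macros, and LowByte is a hypothetical function that merely stands in for the kind of narrowing code the guarded region would contain.

```cpp
// Save the current warning state, then disable the selected diagnostics.
#if defined(_MSC_VER) && !defined(__clang__)
#pragma warning(push)
#pragma warning(disable : 4244)  // conversion, possible loss of data
#pragma warning(disable : 4702)  // unreachable code
#elif defined(__GNUC__)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Weffc++"
#pragma GCC diagnostic ignored "-Woverflow"
#endif

// Hypothetical example of code inside the guarded region: it narrows an
// unsigned value to a single byte, as encoders routinely do.
inline unsigned char LowByte(unsigned value) {
    return static_cast<unsigned char>(value & 0xFF);
}

// Restore the warning state so that code including this header is unaffected.
#if defined(_MSC_VER) && !defined(__clang__)
#pragma warning(pop)
#elif defined(__GNUC__)
#pragma GCC diagnostic pop
#endif
```

Pairing every push with a pop keeps the suppression local to the header, so projects that include it keep their own warning settings.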

License

MIT License. Copyright (C) 2015 THL A29 Limited (Tencent) and Milo Yip.
