Implementation:InternLM Lmdeploy APIClient

From Leeroopedia


Knowledge Sources
Domains: LLM_Serving, Client_SDK
Last Updated: 2026-02-07 15:00 GMT

Overview

A concrete tool for connecting to LMDeploy API servers through the Python client that ships with the LMDeploy library.

Description

The APIClient class provides a Python interface for consuming LMDeploy's OpenAI-compatible HTTP API. It wraps the requests library and supports both streaming and non-streaming chat completions and text completions. As an alternative, users can use the standard OpenAI Python SDK by pointing base_url to the LMDeploy server.
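To make "OpenAI-compatible HTTP API" concrete, the sketch below assembles the URL, headers, and JSON body that a chat-completion request would carry. It illustrates the wire format only and is not APIClient's actual implementation: the helper name build_chat_request is hypothetical, and the /v1/chat/completions path and Bearer authorization header are assumed from the OpenAI convention.

```python
import json
from typing import Dict, List, Optional, Tuple


def build_chat_request(
    api_server_url: str,
    model: str,
    messages: List[Dict[str, str]],
    api_key: Optional[str] = None,
    stream: bool = False,
) -> Tuple[str, Dict[str, str], str]:
    """Return (url, headers, body) for an OpenAI-style chat completion call.

    Hypothetical helper for illustration; the path and header names follow
    the OpenAI convention and are assumptions about the server.
    """
    url = api_server_url.rstrip('/') + '/v1/chat/completions'
    headers = {'Content-Type': 'application/json'}
    if api_key is not None:
        # Assumed: the key travels as a standard Bearer token.
        headers['Authorization'] = f'Bearer {api_key}'
    body = json.dumps({'model': model, 'messages': messages, 'stream': stream})
    return url, headers, body


url, headers, body = build_chat_request(
    'http://localhost:23333',
    model='internlm2_5-7b-chat',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(url)
```

With the requests library installed, the three values could be sent as requests.post(url, headers=headers, data=body, stream=False); APIClient exists so that application code does not have to do this by hand.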

Usage

Import this class when building Python applications that need to communicate with a running LMDeploy API server. Use the OpenAI SDK alternative for drop-in compatibility with existing OpenAI-based code.

Code Reference

Source Location

  • Repository: lmdeploy
  • File: lmdeploy/serve/openai/api_client.py
  • Lines: L38-58 (__init__), L90-173 (chat_completions_v1), L175-196 (completions_v1)

Signature

class APIClient:
    def __init__(self, api_server_url: str, api_key: Optional[str] = None):
        ...

    def chat_completions_v1(self,
                            model: str,
                            messages: Union[str, List[Dict]],
                            temperature: float = 0.7,
                            top_p: float = 1.0,
                            n: int = 1,
                            max_tokens: Optional[int] = None,
                            stop: Optional[Union[str, List[str]]] = None,
                            stream: bool = False,
                            **kwargs) -> Iterator[dict]:
        ...

    def completions_v1(self,
                       model: str,
                       prompt: str,
                       **kwargs) -> Iterator[dict]:
        ...

Import

from lmdeploy.serve.openai.api_client import APIClient

I/O Contract

Inputs

Name | Type | Required | Description
api_server_url | str | Yes | URL of running LMDeploy server (e.g., 'http://localhost:23333')
api_key | str | No | Authentication key
model | str | Yes | Model name (from /v1/models)
messages | str or List[Dict] | Yes | OpenAI-format messages
temperature | float | No | Sampling temperature (default: 0.7)
stream | bool | No | Enable streaming (default: False)

Outputs

Name | Type | Description
response | Iterator[dict] | OpenAI-format JSON response dicts; for a non-streaming chat completion the reply text is at choices[0].message.content
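Because the returned dicts differ in shape between call types, a small accessor can keep calling code uniform. The sketch below is a hypothetical helper (extract_text is not part of LMDeploy). It assumes the standard OpenAI field layout: message for non-streaming chat, delta for streaming chat chunks, and text for text completions. The sample dicts are hand-written, not captured server output.

```python
from typing import Any, Dict


def extract_text(response: Dict[str, Any]) -> str:
    """Return the generated text from one OpenAI-format response dict.

    Field names are assumed from the OpenAI response format.
    """
    choice = response['choices'][0]
    if 'message' in choice:  # non-streaming chat completion
        return choice['message'].get('content') or ''
    if 'delta' in choice:  # streaming chat completion chunk
        return choice['delta'].get('content') or ''
    return choice.get('text') or ''  # text completion


# Hand-written sample payloads, one per response shape.
chat = {'choices': [{'message': {'role': 'assistant', 'content': 'Hi'}}]}
chunk = {'choices': [{'delta': {'content': 'H'}}]}
completion = {'choices': [{'text': 'Once upon a time'}]}

print(extract_text(chat), extract_text(chunk), extract_text(completion))
```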

Usage Examples

Native APIClient

from lmdeploy.serve.openai.api_client import APIClient

client = APIClient('http://localhost:23333')

# Non-streaming: the iterator is expected to yield one complete response dict
for response in client.chat_completions_v1(
    model='internlm2_5-7b-chat',
    messages=[{"role": "user", "content": "Hello!"}],
    stream=False
):
    print(response['choices'][0]['message']['content'])
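A streaming call can be consumed in much the same way. The sketch below is a minimal illustration under two assumptions: that streamed chunks carry text under choices[0].delta.content (the OpenAI chunk convention), and that a stand-in FakeClient is acceptable for demonstration so the code runs without a server. Both stream_reply and FakeClient are hypothetical names.

```python
from typing import Any, Dict, Iterator, List


def stream_reply(client: Any, model: str,
                 messages: List[Dict[str, str]]) -> Iterator[str]:
    """Yield text fragments from a streaming chat completion."""
    for chunk in client.chat_completions_v1(
            model=model, messages=messages, stream=True):
        # Assumed chunk layout: choices[0].delta.content
        delta = chunk['choices'][0].get('delta', {})
        piece = delta.get('content')
        if piece:
            yield piece


class FakeClient:
    """Stand-in for APIClient so the sketch runs without a server."""

    def chat_completions_v1(self, model, messages, stream=False, **kwargs):
        for word in ('Hel', 'lo', '!'):
            yield {'choices': [{'delta': {'content': word}}]}


reply = ''.join(
    stream_reply(FakeClient(), 'internlm2_5-7b-chat',
                 [{'role': 'user', 'content': 'Hello!'}]))
print(reply)
```

Against a running server, APIClient('http://localhost:23333') would take the place of FakeClient(), and each fragment could be printed as it arrives instead of being joined.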

OpenAI SDK Alternative

from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:23333/v1',
    api_key='your-key'
)

response = client.chat.completions.create(
    model='internlm2_5-7b-chat',
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
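The OpenAI SDK route can stream as well, by passing stream=True to chat.completions.create and reading delta.content from each chunk. The sketch below wraps that loop in a hypothetical helper, stream_with_openai_sdk, and exercises it with a SimpleNamespace stand-in so it runs without the openai package or a server. Whether a given LMDeploy deployment emits chunks in exactly this shape is an assumption.

```python
from types import SimpleNamespace
from typing import Any, Dict, List


def stream_with_openai_sdk(client: Any, model: str,
                           messages: List[Dict[str, str]]) -> str:
    """Collect a streamed chat completion into one string."""
    stream = client.chat.completions.create(
        model=model, messages=messages, stream=True)
    parts = []
    for chunk in stream:
        if not chunk.choices:  # some chunks may carry no choices
            continue
        piece = chunk.choices[0].delta.content
        if piece:
            parts.append(piece)
    return ''.join(parts)


def _fake_create(model, messages, stream=False):
    """Mimic the SDK's chunk objects for demonstration only."""
    for word in ('Hi', ' there'):
        delta = SimpleNamespace(content=word)
        yield SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


fake_client = SimpleNamespace(
    chat=SimpleNamespace(completions=SimpleNamespace(create=_fake_create)))

text = stream_with_openai_sdk(
    fake_client, 'internlm2_5-7b-chat',
    [{'role': 'user', 'content': 'Hello!'}])
print(text)
```

In real use, the OpenAI(...) client constructed above would be passed in place of fake_client.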

Related Pages

Implements Principle
