Implementation:InternLM Lmdeploy APIClient
| Knowledge Sources | |
|---|---|
| Domains | LLM_Serving, Client_SDK |
| Last Updated | 2026-02-07 15:00 GMT |
Overview
APIClient is the Python client that ships with the LMDeploy library for connecting to a running LMDeploy API server.
Description
The APIClient class provides a Python interface for consuming LMDeploy's OpenAI-compatible HTTP API. It wraps the requests library and supports both streaming and non-streaming chat completions and text completions. As an alternative, users can use the standard OpenAI Python SDK by pointing base_url to the LMDeploy server.
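To make the "OpenAI-compatible HTTP API" concrete, the sketch below builds the kind of request a client would send for a non-streaming chat completion. It uses only the standard library rather than requests, and it is not APIClient's own code: the helper name build_chat_request is hypothetical, the /v1/chat/completions path follows the OpenAI convention, and bearer-token authentication is an assumption that should be checked against your server configuration.

```python
import json
import urllib.request
from typing import Dict, List, Optional


def build_chat_request(api_server_url: str,
                       model: str,
                       messages: List[Dict[str, str]],
                       api_key: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST to the chat completions endpoint."""
    body = {"model": model, "messages": messages, "stream": False}
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumption: bearer-token auth, following the OpenAI API convention.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        url=f"{api_server_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_chat_request(
    "http://localhost:23333",
    model="internlm2_5-7b-chat",
    messages=[{"role": "user", "content": "Hello!"}],
)
# Sending the request needs a running server, so it is left commented out:
# with urllib.request.urlopen(request) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

In practice APIClient or the OpenAI SDK hides these details; the sketch is only meant to show what travels over the wire.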
Usage
Import this class when building Python applications that need to communicate with a running LMDeploy API server. Use the OpenAI SDK alternative for drop-in compatibility with existing OpenAI-based code.
Code Reference
Source Location
- Repository: lmdeploy
- File: lmdeploy/serve/openai/api_client.py
- Lines: L38-58 (init), L90-173 (chat_completions_v1), L175-196 (completions_v1)
Signature
class APIClient:
    def __init__(self, api_server_url: str, api_key: Optional[str] = None):
        ...

    def chat_completions_v1(self,
                            model: str,
                            messages: Union[str, List[Dict]],
                            temperature: float = 0.7,
                            top_p: float = 1.0,
                            n: int = 1,
                            max_tokens: Optional[int] = None,
                            stop: Optional[Union[str, List[str]]] = None,
                            stream: bool = False,
                            **kwargs) -> Iterator[dict]:
        ...

    def completions_v1(self,
                       model: str,
                       prompt: str,
                       **kwargs) -> Iterator[dict]:
        ...
Import
from lmdeploy.serve.openai.api_client import APIClient
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| api_server_url | str | Yes | URL of running LMDeploy server (e.g., 'http://localhost:23333') |
| api_key | str | No | API key sent with each request, for servers that require authentication (default: None) |
| model | str | Yes | Model name (from /v1/models) |
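| messages | str or List[Dict] | Yes (chat_completions_v1) | A plain prompt string or a list of OpenAI-format message dicts |
| prompt | str | Yes (completions_v1) | Prompt text for a text completion |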
| temperature | float | No | Sampling temperature (default: 0.7) |
| stream | bool | No | Enable streaming (default: False) |
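The model row says the name comes from /v1/models. A minimal sketch of reading that endpoint with the standard library follows. The helper names model_ids and fetch_model_ids are hypothetical, the response layout (a "data" list of objects with an "id" field) is assumed from the OpenAI models API, and the sample payload is hand-written rather than real server output. LMDeploy's documentation also shows an available_models property on APIClient for the same purpose; check that it exists in your installed version.

```python
import json
import urllib.request
from typing import Dict, List


def model_ids(payload: Dict) -> List[str]:
    """Extract model names from an OpenAI-style /v1/models response body."""
    return [entry["id"] for entry in payload.get("data", [])]


def fetch_model_ids(api_server_url: str) -> List[str]:
    """GET /v1/models from a running server and return the model names."""
    url = f"{api_server_url.rstrip('/')}/v1/models"
    with urllib.request.urlopen(url) as resp:
        return model_ids(json.load(resp))


# Offline illustration with a hand-written payload (not real server output).
sample = {
    "object": "list",
    "data": [{"id": "internlm2_5-7b-chat", "object": "model"}],
}
names = model_ids(sample)
```

With a server running, fetch_model_ids('http://localhost:23333') would return the names accepted by the model parameter.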
Outputs
| Name | Type | Description |
|---|---|---|
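| response | Iterator[dict] | OpenAI-format JSON response dicts. With stream=False the iterator yields one dict with the reply at choices[0].message.content; with stream=True it yields incremental chunks, which in the OpenAI format carry text at choices[0].delta.content |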
Usage Examples
Native APIClient
from lmdeploy.serve.openai.api_client import APIClient

client = APIClient('http://localhost:23333')

# Non-streaming
for response in client.chat_completions_v1(
    model='internlm2_5-7b-chat',
    messages=[{"role": "user", "content": "Hello!"}],
    stream=False
):
    print(response['choices'][0]['message']['content'])
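The example above covers only stream=False. A streaming call could be sketched as follows. The helper stream_reply is hypothetical and accepts any object that exposes chat_completions_v1 with the signature listed earlier. Reading text from choices[0]['delta']['content'] assumes that streamed chunks follow the OpenAI chunk format; verify this against your server's output.

```python
from typing import Iterator


def stream_reply(client, model: str, prompt: str) -> Iterator[str]:
    """Yield text fragments from a streaming chat completion.

    `client` is any object exposing chat_completions_v1, e.g. an APIClient.
    """
    chunks = client.chat_completions_v1(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in chunks:
        choices = chunk.get("choices") or []
        if not choices:
            continue
        # Assumption: streaming chunks use the OpenAI "delta" layout.
        text = (choices[0].get("delta") or {}).get("content")
        if text:
            yield text


def demo() -> None:
    """Requires lmdeploy and a server listening on localhost:23333."""
    from lmdeploy.serve.openai.api_client import APIClient

    client = APIClient("http://localhost:23333")
    for fragment in stream_reply(client, "internlm2_5-7b-chat", "Hello!"):
        print(fragment, end="", flush=True)
    print()
```

Calling demo() prints the reply as fragments arrive instead of waiting for the full response.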
OpenAI SDK Alternative
from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:23333/v1',
    api_key='your-key'
)

response = client.chat.completions.create(
    model='internlm2_5-7b-chat',
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
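The OpenAI SDK can also stream from the same server. The sketch below uses the SDK's stream=True option, where each chunk exposes text at chunk.choices[0].delta.content. The helper stream_with_openai is a hypothetical name, and the base URL, key, and model name are the placeholder values from the example above.

```python
from typing import Iterator


def stream_with_openai(client, model: str, prompt: str) -> Iterator[str]:
    """Yield text fragments using an openai.OpenAI-style client."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in stream:
        if not chunk.choices:
            continue
        # delta.content is None on chunks that carry no text (e.g. role only).
        text = chunk.choices[0].delta.content
        if text:
            yield text


def demo() -> None:
    """Requires the openai package and a server on localhost:23333."""
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:23333/v1", api_key="your-key")
    for fragment in stream_with_openai(client, "internlm2_5-7b-chat", "Hello!"):
        print(fragment, end="", flush=True)
    print()
```

Because the helper depends only on the client object passed in, existing OpenAI-based code can switch to LMDeploy by changing base_url alone.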