Implementation:SeldonIO Seldon core Seldon Model Infer BYTES
| Field | Value |
|---|---|
| Type | External Tool Doc |
| Overview | Concrete CLI tool for sending text-based V2 inference requests with BYTES datatype to HuggingFace models. |
| Source | samples/huggingface.md:L42-90, docs-gb/cli/seldon_model_infer.md:L1-35 |
| Domains | NLP, Inference |
| Implements Principle | SeldonIO_Seldon_core_HuggingFace_Text_Inference |
| External Dependencies | seldon CLI, curl, grpcurl, V2 protocol |
| Knowledge Sources | Repo (https://github.com/SeldonIO/seldon-core), Doc (https://docs.seldon.io/projects/seldon-core/en/v2/) |
| Last Updated | 2026-02-13 00:00 GMT |
Code Reference
Text Generation (REST)
seldon model infer text-gen \
'{"inputs": [{"name": "args", "shape": [1], "datatype": "BYTES", "data": ["Once upon a time"]}]}'
Sentiment Analysis (REST)
seldon model infer sentiment \
'{"inputs": [{"name": "args", "shape": [1], "datatype": "BYTES", "data": ["I love this product"]}]}'
Text Generation (gRPC with base64 encoding)
seldon model infer text-gen --inference-mode grpc \
'{"inputs": [{"name": "args", "shape": [1], "datatype": "BYTES", "data": ["T25jZSB1cG9uIGEgdGltZQ=="]}]}'
Note: "T25jZSB1cG9uIGEgdGltZQ==" is the base64 encoding of "Once upon a time".
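The base64 string above can be reproduced locally. The following is a minimal sketch, assuming a standard `base64` utility is on the PATH; `b64_text` is a hypothetical helper name, not part of the seldon CLI.

```shell
# Hypothetical helper (not part of the seldon CLI): base64-encode a text string.
b64_text() {
  # printf '%s' avoids the trailing newline that echo would add to the payload;
  # tr strips the line wrapping that some base64 implementations insert.
  printf '%s' "$1" | base64 | tr -d '\n'
}

b64_text "Once upon a time"; echo
# prints T25jZSB1cG9uIGEgdGltZQ==
```

Encoding with `echo "Once upon a time" | base64` instead would include a trailing newline in the payload and yield a different string (ending in `ZQo=`), so the model would receive the text with an extra newline.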
Key Parameters
| Parameter | Example | Description |
|---|---|---|
| modelName | text-gen, sentiment, whisper | Target model name (positional argument) |
| inputs[].name | "args" | Input tensor name expected by HuggingFace runtime |
| inputs[].shape | [1] | Number of text strings in the batch |
| inputs[].datatype | "BYTES" | V2 datatype for variable-length text/binary data |
| inputs[].data | ["Once upon a time"] | Array of text strings (plain for REST, base64 for gRPC) |
| --inference-mode | "rest" or "grpc" | Transport protocol (default: REST) |
I/O Contract
Inputs
| Input | Format | Description |
|---|---|---|
| V2 Inference Request | JSON | {"inputs": [{"name": "args", "shape": [N], "datatype": "BYTES", "data": ["text1", "text2", ...]}]} |
Outputs
The output format depends on the target model type:
| Model Type | Output Format | Example |
|---|---|---|
| text-gen | Generated text continuation | {"outputs": [{"name": "output", "datatype": "BYTES", "data": ["Once upon a time there was a..."]}]} |
| sentiment | Label and confidence score | {"outputs": [{"name": "output", "datatype": "BYTES", "data": ["{\"label\": \"POSITIVE\", \"score\": 0.9998}"]}]} |
| whisper | Transcribed text | {"outputs": [{"name": "output", "datatype": "BYTES", "data": ["The transcribed speech text..."]}]} |
For gRPC responses, the output data is base64-encoded and must be decoded by the client.
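Decoding a base64 value from a gRPC response can be done with the same `base64` utility. This is a minimal sketch: `b64_decode` is a hypothetical helper name, the `-d` flag is assumed to be supported (some older platforms spell it `-D` or `--decode`), and the input string stands in for a value copied from a real response.

```shell
# Hypothetical helper (not part of the seldon CLI): decode one base64 value.
b64_decode() {
  # -d is assumed here; older macOS releases use -D or --decode instead.
  printf '%s' "$1" | base64 -d
}

# Stand-in for a value copied from the output data of a gRPC response.
b64_decode "T25jZSB1cG9uIGEgdGltZQ=="; echo
# prints Once upon a time
```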
Usage Examples
Batched sentiment inference
Send multiple text inputs in a single request:
seldon model infer sentiment \
'{"inputs": [{"name": "args", "shape": [3], "datatype": "BYTES", "data": ["I love this product", "This is terrible", "Not bad at all"]}]}'
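In a batched request the `shape` value has to match the number of strings in `data`. One way to keep the two in sync is to generate the payload from the list of texts. The sketch below assumes a POSIX shell with `sed`; `build_bytes_request` is a hypothetical helper, and it escapes only backslashes and double quotes, so texts containing newlines or other control characters would need a proper JSON encoder.

```shell
# Hypothetical helper: build a V2 BYTES request from positional text arguments.
build_bytes_request() {
  items=""
  for text in "$@"; do
    # Escape backslashes and double quotes so each text is a valid JSON string.
    escaped=$(printf '%s' "$text" | sed 's/\\/\\\\/g; s/"/\\"/g')
    items="${items}${items:+, }\"${escaped}\""
  done
  # $# is the number of texts, which becomes the shape of the tensor.
  printf '{"inputs": [{"name": "args", "shape": [%d], "datatype": "BYTES", "data": [%s]}]}\n' "$#" "$items"
}

build_bytes_request "I love this product" "This is terrible" "Not bad at all"
# The result can be passed to the CLI, for example:
#   seldon model infer sentiment "$(build_bytes_request "I love this product" "This is terrible")"
```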
Using curl directly (REST)
curl -X POST http://localhost:9000/v2/models/sentiment/infer \
-H "Content-Type: application/json" \
-d '{"inputs": [{"name": "args", "shape": [1], "datatype": "BYTES", "data": ["I love this product"]}]}'
Using grpcurl directly (gRPC)
grpcurl -plaintext \
-d '{"model_name": "text-gen", "inputs": [{"name": "args", "shape": [1], "datatype": "BYTES", "contents": {"bytes_contents": ["T25jZSB1cG9uIGEgdGltZQ=="]}}]}' \
localhost:9001 inference.GRPCInferenceService/ModelInfer
Pipeline inference (via seldon pipeline infer)
When a model is part of a pipeline, use seldon pipeline infer instead:
seldon pipeline infer speech-to-sentiment \
'{"inputs": [{"name": "args", "shape": [1], "datatype": "BYTES", "data": ["audio data here"]}]}'
Related Pages
- SeldonIO_Seldon_core_HuggingFace_Text_Inference -- principle that this implementation realizes
- SeldonIO_Seldon_core_Seldon_Model_Load_HuggingFace -- prerequisite: models must be deployed and verified before inference can be performed
- SeldonIO_Seldon_core_Seldon_Pipeline_CRD_Multi_Modal -- extends to pipeline-level inference for multi-modal workflows
- SeldonIO_Seldon_core_Seldon_Model_Infer -- specializes the general model infer command for BYTES datatype
- Environment:SeldonIO_Seldon_core_GPU_Inference_Environment