

Implementation:SeldonIO Seldon core Seldon Model Infer BYTES

From Leeroopedia
Revision as of 13:50, 16 February 2026 by Admin (auto-imported from implementations/SeldonIO_Seldon_core_Seldon_Model_Infer_BYTES.md)
| Field | Value |
|---|---|
| Type | External Tool Doc |
| Overview | Concrete CLI tool for sending text-based V2 inference requests with BYTES datatype to HuggingFace models. |
| Source | samples/huggingface.md:L42-90, docs-gb/cli/seldon_model_infer.md:L1-35 |
| Domains | NLP, Inference |
| Implements Principle | SeldonIO_Seldon_core_HuggingFace_Text_Inference |
| External Dependencies | seldon CLI, curl, grpcurl, V2 protocol |
| Knowledge Sources | Repo (https://github.com/SeldonIO/seldon-core), Doc (https://docs.seldon.io/projects/seldon-core/en/v2/) |
| Last Updated | 2026-02-13 00:00 GMT |

Code Reference

Text Generation (REST)

seldon model infer text-gen \
  '{"inputs": [{"name": "args", "shape": [1], "datatype": "BYTES", "data": ["Once upon a time"]}]}'

Sentiment Analysis (REST)

seldon model infer sentiment \
  '{"inputs": [{"name": "args", "shape": [1], "datatype": "BYTES", "data": ["I love this product"]}]}'

Text Generation (gRPC with base64 encoding)

seldon model infer text-gen --inference-mode grpc \
  '{"inputs": [{"name": "args", "shape": [1], "datatype": "BYTES", "contents": {"bytes_contents": ["T25jZSB1cG9uIGEgdGltZQ=="]}}]}'

Note: "T25jZSB1cG9uIGEgdGltZQ==" is the base64 encoding of "Once upon a time".
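The base64 value in the note can be reproduced with a few lines of Python. This is a minimal sketch; the helper name `encode_for_bytes_contents` is hypothetical and not part of the seldon CLI. It also shows a common pitfall: if the text carries a trailing newline (as it does when piped from `echo` without `-n`), the encoded value differs.

```python
import base64


def encode_for_bytes_contents(text: str) -> str:
    """Base64-encode the UTF-8 bytes of `text`; no trailing newline is added."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


prompt = "Once upon a time"
print(encode_for_bytes_contents(prompt))         # T25jZSB1cG9uIGEgdGltZQ==
print(encode_for_bytes_contents(prompt + "\n"))  # T25jZSB1cG9uIGEgdGltZQo=
```

The first value matches the string used in the gRPC example; the second is what results when a newline is accidentally included in the encoded text.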

Key Parameters

| Parameter | Example | Description |
|---|---|---|
| modelName | text-gen, sentiment, whisper | Target model name (positional argument) |
| inputs[].name | "args" | Input tensor name expected by the HuggingFace runtime |
| inputs[].shape | [1] | Tensor shape; [N] for a batch of N text strings |
| inputs[].datatype | "BYTES" | V2 datatype for variable-length text/binary data |
| inputs[].data | ["Once upon a time"] | Array of text strings for REST (plain text); for gRPC the strings are base64-encoded and carried in contents.bytes_contents |
| --inference-mode | rest or grpc | Transport protocol (default: rest) |

I/O Contract

Inputs

| Input | Format | Description |
|---|---|---|
| V2 Inference Request | JSON | {"inputs": [{"name": "args", "shape": [N], "datatype": "BYTES", "data": ["text1", "text2", ...]}]} |
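As an illustration of the input contract, the following sketch assembles a REST request body from a list of strings and derives `shape` from the batch size so the two cannot drift apart. The function name `build_bytes_request` and its `input_name` parameter are hypothetical; only the JSON layout comes from the contract above.

```python
import json


def build_bytes_request(texts, input_name="args"):
    """Build a V2 inference request body with a single BYTES input tensor."""
    if not texts:
        raise ValueError("at least one text string is required")
    return {
        "inputs": [
            {
                "name": input_name,
                "shape": [len(texts)],  # N strings -> shape [N]
                "datatype": "BYTES",
                "data": list(texts),
            }
        ]
    }


payload = build_bytes_request(["I love this product", "This is terrible"])
print(json.dumps(payload))
```

The printed JSON can be passed as the payload argument of `seldon model infer <modelName>`.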

Outputs

The output format depends on the target model type:

| Model Type | Output Format | Example |
|---|---|---|
| text-gen | Generated text continuation | {"outputs": [{"name": "output", "datatype": "BYTES", "data": ["Once upon a time there was a..."]}]} |
| sentiment | Label and confidence score | {"outputs": [{"name": "output", "datatype": "BYTES", "data": ["{\"label\": \"POSITIVE\", \"score\": 0.9998}"]}]} |
| whisper | Transcribed text | {"outputs": [{"name": "output", "datatype": "BYTES", "data": ["The transcribed speech text..."]}]} |
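In the sentiment example, each element of `data` is itself a JSON document serialized as a string, so a client has to parse twice. A minimal sketch, assuming the response has exactly the shape shown in the table (real responses may carry additional fields or a different element layout); `parse_sentiment` is a hypothetical helper name.

```python
import json


def parse_sentiment(response_body: str):
    """Return the label/score dicts embedded in a sentiment response body."""
    response = json.loads(response_body)          # outer V2 response
    results = []
    for output in response["outputs"]:
        for item in output["data"]:
            results.append(json.loads(item))      # inner JSON string per element
    return results


# Sample body built to mirror the sentiment row of the table above.
sample_body = json.dumps({
    "outputs": [{
        "name": "output",
        "datatype": "BYTES",
        "data": [json.dumps({"label": "POSITIVE", "score": 0.9998})],
    }]
})
print(parse_sentiment(sample_body))
```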

When a gRPC response is rendered as JSON, BYTES output values appear base64-encoded and must be decoded by the client.
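Decoding base64-encoded output values can be sketched as follows. The helper `decode_bytes_outputs` is hypothetical, and it assumes the caller has already extracted the list of base64 strings from the response and that the decoded bytes are UTF-8 text (which may not hold for binary outputs).

```python
import base64


def decode_bytes_outputs(encoded_items):
    """Decode a list of base64 strings into UTF-8 text."""
    return [base64.b64decode(item).decode("utf-8") for item in encoded_items]


# The sample value is the base64 string used elsewhere on this page.
print(decode_bytes_outputs(["T25jZSB1cG9uIGEgdGltZQ=="]))  # ['Once upon a time']
```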

Usage Examples

Batched sentiment inference

Send multiple text inputs in a single request:

seldon model infer sentiment \
  '{"inputs": [{"name": "args", "shape": [3], "datatype": "BYTES", "data": ["I love this product", "This is terrible", "Not bad at all"]}]}'

Using curl directly (REST)

curl -X POST http://localhost:9000/v2/models/sentiment/infer \
  -H "Content-Type: application/json" \
  -d '{"inputs": [{"name": "args", "shape": [1], "datatype": "BYTES", "data": ["I love this product"]}]}'
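The same REST call can be prepared from Python using only the standard library. This is a sketch under assumptions: the endpoint `http://localhost:9000` is taken from the curl example and will differ per deployment, and `build_infer_request` is a hypothetical helper. The request is only constructed here; sending it requires a running inference endpoint.

```python
import json
import urllib.request


def build_infer_request(model_name, texts, base_url="http://localhost:9000"):
    """Prepare a POST request for the V2 REST infer endpoint of `model_name`."""
    body = {
        "inputs": [
            {"name": "args", "shape": [len(texts)], "datatype": "BYTES", "data": list(texts)}
        ]
    }
    return urllib.request.Request(
        url=f"{base_url}/v2/models/{model_name}/infer",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_infer_request("sentiment", ["I love this product"])
print(request.get_method(), request.full_url)

# To send it against a live endpoint (not executed here):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```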

Using grpcurl directly (gRPC)

grpcurl -plaintext \
  -d '{"model_name": "text-gen", "inputs": [{"name": "args", "shape": [1], "datatype": "BYTES", "contents": {"bytes_contents": ["T25jZSB1cG9uIGEgdGltZQ=="]}}]}' \
  localhost:9001 inference.GRPCInferenceService/ModelInfer
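The JSON body passed to grpcurl differs from the REST body: text goes into `contents.bytes_contents` as base64 strings and the model name is part of the message. A minimal sketch that builds that body from plain text, with `build_grpc_json` as a hypothetical helper name; the field names are copied from the grpcurl example above.

```python
import base64
import json


def build_grpc_json(model_name, texts, input_name="args"):
    """Build the JSON form of a gRPC ModelInfer request with base64 text."""
    encoded = [base64.b64encode(t.encode("utf-8")).decode("ascii") for t in texts]
    return {
        "model_name": model_name,
        "inputs": [
            {
                "name": input_name,
                "shape": [len(texts)],
                "datatype": "BYTES",
                "contents": {"bytes_contents": encoded},
            }
        ],
    }


grpc_payload = build_grpc_json("text-gen", ["Once upon a time"])
print(json.dumps(grpc_payload))
```

The printed JSON corresponds to the `-d` argument in the grpcurl command.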

Pipeline inference (via seldon pipeline infer)

When a model is part of a pipeline, use seldon pipeline infer instead:

seldon pipeline infer speech-to-sentiment \
  '{"inputs": [{"name": "args", "shape": [1], "datatype": "BYTES", "data": ["audio data here"]}]}'
