Implementation:SeldonIO Seldon core Seldon Model Infer BYTES

Field	Value
Type	External Tool Doc
Overview	CLI command (`seldon model infer`) for sending text-based V2 inference requests with the BYTES datatype to HuggingFace models.
Source	`samples/huggingface.md:L42-90`, `docs-gb/cli/seldon_model_infer.md:L1-35`
Domains	NLP, Inference
Implements Principle	SeldonIO_Seldon_core_HuggingFace_Text_Inference
External Dependencies	seldon CLI, curl, grpcurl, V2 protocol
Knowledge Sources	Repo (https://github.com/SeldonIO/seldon-core), Doc (https://docs.seldon.io/projects/seldon-core/en/v2/)
Last Updated	2026-02-13 00:00 GMT

Code Reference

Text Generation (REST)

seldon model infer text-gen \
  '{"inputs": [{"name": "args", "shape": [1], "datatype": "BYTES", "data": ["Once upon a time"]}]}'

Sentiment Analysis (REST)

seldon model infer sentiment \
  '{"inputs": [{"name": "args", "shape": [1], "datatype": "BYTES", "data": ["I love this product"]}]}'

Text Generation (gRPC with base64 encoding)

seldon model infer text-gen --inference-mode grpc \
  '{"inputs": [{"name": "args", "shape": [1], "datatype": "BYTES", "contents": {"bytes_contents": ["T25jZSB1cG9uIGEgdGltZQ=="]}}]}'

Note: "T25jZSB1cG9uIGEgdGltZQ==" is the base64 encoding of "Once upon a time".
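The encoded value in the note can be reproduced with Python's standard library. This is a minimal sketch that assumes the text is UTF-8; `encode_text` is an illustrative helper name, not part of the seldon CLI.

```python
import base64


def encode_text(text: str) -> str:
    """Return the base64 form of UTF-8 text, as used for BYTES data over gRPC."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


encoded = encode_text("Once upon a time")
print(encoded)  # T25jZSB1cG9uIGEgdGltZQ==
```

Encoding the same text with a trailing newline (for example, piping `echo` without `-n` into `base64`) yields a different string, `T25jZSB1cG9uIGEgdGltZQo=`, so the model would receive the newline as part of its input.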

Key Parameters

Parameter	Example	Description
modelName	`text-gen`, `sentiment`, `whisper`	Target model name (positional argument)
inputs[].name	`"args"`	Input tensor name expected by HuggingFace runtime
inputs[].shape	`[1]`	Tensor shape; for a flat batch of strings, the number of entries in `data`
inputs[].datatype	`"BYTES"`	V2 datatype for variable-length text/binary data
inputs[].data	`["Once upon a time"]`	Array of text strings (plain for REST, base64 for gRPC)
--inference-mode	`"rest"` or `"grpc"`	Transport protocol (default: REST)

I/O Contract

Inputs

Input	Format	Description
V2 Inference Request	JSON	`{"inputs": [{"name": "args", "shape": [N], "datatype": "BYTES", "data": ["text1", "text2", ...]}]}`
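The request format above can be generated from a list of strings so that `shape` always matches the length of `data`. The sketch below is illustrative only: `build_bytes_request` is a hypothetical helper, and the default input name `args` is taken from the examples on this page.

```python
import json


def build_bytes_request(texts, input_name="args"):
    """Build a V2 inference request body carrying one BYTES tensor of strings."""
    if not texts:
        raise ValueError("at least one text input is required")
    return {
        "inputs": [
            {
                "name": input_name,
                "shape": [len(texts)],  # N = number of strings in the batch
                "datatype": "BYTES",
                "data": list(texts),
            }
        ]
    }


payload = build_bytes_request(
    ["I love this product", "This is terrible", "Not bad at all"]
)
# JSON string that can be passed as the request argument to `seldon model infer`.
body = json.dumps(payload)
print(body)
```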

Outputs

The output format depends on the target model type:

Model Type	Output Format	Example
text-gen	Generated text continuation	`{"outputs": [{"name": "output", "datatype": "BYTES", "data": ["Once upon a time there was a..."]}]}`
sentiment	Label and confidence score	`{"outputs": [{"name": "output", "datatype": "BYTES", "data": ["{\"label\": \"POSITIVE\", \"score\": 0.9998}"]}]}`
whisper	Transcribed text	`{"outputs": [{"name": "output", "datatype": "BYTES", "data": ["The transcribed speech text..."]}]}`
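In the sentiment example, each element of `data` is itself a JSON-encoded string, so a client needs a second parsing step to reach the label and score. The sketch below assumes the response has exactly the shape shown in the table; the actual output of a deployed runtime may differ, and `parse_sentiment` is a hypothetical helper.

```python
import json

# Sample response copied from the sentiment row of the table above.
response_text = r'{"outputs": [{"name": "output", "datatype": "BYTES", "data": ["{\"label\": \"POSITIVE\", \"score\": 0.9998}"]}]}'


def parse_sentiment(response: dict) -> list:
    """Decode each JSON-encoded string in the output tensors into a dict."""
    results = []
    for output in response["outputs"]:
        for item in output["data"]:
            results.append(json.loads(item))  # second parse: string -> dict
    return results


predictions = parse_sentiment(json.loads(response_text))
print(predictions[0]["label"], predictions[0]["score"])
```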

For gRPC responses, the output data is base64-encoded and must be decoded by the client.
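Client-side decoding can be sketched as follows. The exact location of the encoded values inside the gRPC response JSON is not specified on this page, so the hypothetical helper `decode_bytes_values` takes the list of base64 strings directly and assumes the decoded bytes are UTF-8 text.

```python
import base64


def decode_bytes_values(encoded_values):
    """Decode a list of base64 strings into UTF-8 text."""
    return [base64.b64decode(value).decode("utf-8") for value in encoded_values]


# Sample value reused from the gRPC request example on this page.
decoded = decode_bytes_values(["T25jZSB1cG9uIGEgdGltZQ=="])
print(decoded)  # ['Once upon a time']
```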

Usage Examples

Batched sentiment inference

Send multiple text inputs in a single request:

seldon model infer sentiment \
  '{"inputs": [{"name": "args", "shape": [3], "datatype": "BYTES", "data": ["I love this product", "This is terrible", "Not bad at all"]}]}'

Using curl directly (REST)

curl -X POST http://localhost:9000/v2/models/sentiment/infer \
  -H "Content-Type: application/json" \
  -d '{"inputs": [{"name": "args", "shape": [1], "datatype": "BYTES", "data": ["I love this product"]}]}'
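The same REST call can be prepared from Python with only the standard library. This is a sketch, not a supported client: `make_infer_request` is a hypothetical helper, and the base URL `http://localhost:9000` is copied from the curl example and depends on how the deployment is exposed.

```python
import json
import urllib.request


def make_infer_request(model_name, texts, base_url="http://localhost:9000"):
    """Prepare a POST request to the V2 infer endpoint for a BYTES text batch."""
    payload = {
        "inputs": [
            {
                "name": "args",
                "shape": [len(texts)],
                "datatype": "BYTES",
                "data": list(texts),
            }
        ]
    }
    return urllib.request.Request(
        url=f"{base_url}/v2/models/{model_name}/infer",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = make_infer_request("sentiment", ["I love this product"])

# To send it against a running deployment (not executed here):
# with urllib.request.urlopen(request) as resp:
#     print(json.load(resp))
```

Building the request separately from sending it makes the URL and body easy to inspect before any network call is made.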

Using grpcurl directly (gRPC)

grpcurl -plaintext \
  -d '{"model_name": "text-gen", "inputs": [{"name": "args", "shape": [1], "datatype": "BYTES", "contents": {"bytes_contents": ["T25jZSB1cG9uIGEgdGltZQ=="]}}]}' \
  localhost:9001 inference.GRPCInferenceService/ModelInfer

Pipeline inference (via seldon pipeline infer)

When a model is part of a pipeline, use seldon pipeline infer instead:

seldon pipeline infer speech-to-sentiment \
  '{"inputs": [{"name": "args", "shape": [1], "datatype": "BYTES", "data": ["audio data here"]}]}'

Related Pages

SeldonIO_Seldon_core_HuggingFace_Text_Inference -- principle that this implementation realizes
SeldonIO_Seldon_core_Seldon_Model_Load_HuggingFace -- prerequisite: models must be deployed and verified before inference can be performed
SeldonIO_Seldon_core_Seldon_Pipeline_CRD_Multi_Modal -- extends to pipeline-level inference for multi-modal workflows
SeldonIO_Seldon_core_Seldon_Model_Infer -- specializes the general model infer command for BYTES datatype
Environment:SeldonIO_Seldon_core_GPU_Inference_Environment
