Principle:Kserve Kserve Authentication Integration
| Knowledge Sources | |
|---|---|
| Domains | Security, Authentication, Networking |
| Last Updated | 2026-02-13 00:00 GMT |
Overview
A pattern for integrating identity provider authentication flows with inference service endpoints so that only authorized clients can invoke model predictions.
Description
Authentication Integration addresses the challenge of securing KServe inference endpoints behind identity-aware proxies and authentication gateways. Because KServe endpoints are exposed through an Istio ingress gateway or the Kubernetes Gateway API, authentication can be added at the network edge without modifying model serving code.
KServe supports two primary authentication integration patterns:
- Identity-Aware Proxy (IAP) -- used in GCP environments, where Google Cloud IAP intercepts requests before they reach the Istio ingress. Clients must obtain a Google-signed OIDC token issued for the IAP OAuth client and send it as a Bearer token in the Authorization header. IAP validates the token and forwards only authenticated requests.
- Dex with Istio -- used in Kubeflow environments, where Dex acts as an OpenID Connect provider. Clients authenticate via Dex (LDAP, SAML, OAuth2), receive a session cookie or token, and Istio's RequestAuthentication policy validates the JWT.
Both patterns delegate authentication to external identity providers, keeping the inference service itself stateless with respect to user identity.
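In the Dex pattern, the credential can reach the edge either as a session cookie or as a bearer token. The helper below is a minimal sketch of that choice; `credential_headers` and its parameters are hypothetical names introduced for illustration, and the cookie name depends on how the deployment's authentication service is configured.

```python
def credential_headers(token: str, cookie_name: str = "") -> dict:
    """Build the HTTP headers that carry a credential to the edge proxy.

    If cookie_name is given, the credential is sent as a session cookie;
    otherwise it is sent as a bearer token in the Authorization header.
    Both parameter names are illustrative, not part of any KServe API.
    """
    if cookie_name:
        return {"Cookie": f"{cookie_name}={token}"}
    return {"Authorization": f"Bearer {token}"}
```

A client would merge the returned dictionary into the headers of its prediction request. Which form is accepted is decided by the edge configuration, not by the InferenceService.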
Usage
Use this principle when:
- Deploying KServe on GCP with IAP-protected endpoints
- Integrating with Kubeflow's Dex-based authentication
- Building client applications that must authenticate before calling inference endpoints
- Securing multi-tenant inference environments
Theoretical Basis
# Authentication flow patterns (NOT implementation code)
IAP Authentication:
1. Client obtains OIDC token from Google Identity service
2. Client sends request with Authorization: Bearer <token>
3. GCP IAP validates token against configured OAuth client
4. If valid → request forwarded to Istio ingress → InferenceService
5. If invalid → 401 Unauthorized returned
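The IAP steps above can be sketched from the client side as follows. This is an illustration under stated assumptions, not a definitive client: it assumes the `google-auth` and `requests` packages are installed, that Application Default Credentials point at a service account allowed through IAP, and that the model is served over the KServe V1 protocol. The function names, URL, and model name are hypothetical.

```python
import json
import urllib.request


def fetch_iap_token(iap_client_id: str) -> str:
    """Obtain a Google-signed OIDC token whose audience is the IAP OAuth client ID.

    Assumes google-auth is installed and Application Default Credentials
    are configured; imported lazily so the rest of the module has no
    third-party dependency.
    """
    import google.auth.transport.requests
    from google.oauth2 import id_token

    auth_request = google.auth.transport.requests.Request()
    return id_token.fetch_id_token(auth_request, iap_client_id)


def build_predict_request(url: str, token: str, instances: list) -> urllib.request.Request:
    """Build a V1 predict request that carries the token as a Bearer credential."""
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Example wiring (hypothetical host, model name, and client ID):
# token = fetch_iap_token("<iap-oauth-client-id>")
# req = build_predict_request(
#     "https://kserve.example.com/v1/models/my-model:predict", token, [[1.0, 2.0]]
# )
# with urllib.request.urlopen(req) as resp:
#     print(resp.status, resp.read())
```

Keeping token acquisition separate from request construction means the same request builder can be reused with a token obtained by any other means.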
Dex Authentication:
1. Client authenticates with Dex (password, LDAP, SAML, etc.)
2. Dex issues JWT/session token
3. Client sends request with token (cookie or Authorization header)
4. Istio RequestAuthentication validates JWT
5. Istio AuthorizationPolicy checks claims (issuer, audience, groups)
6. If valid → request forwarded to InferenceService
7. If the JWT is invalid → 401 Unauthorized returned; if the claims fail the AuthorizationPolicy → 403 Forbidden returned
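The decision made at the edge in steps 4 to 7 can be modeled in a few lines. This is a simplified sketch of the outcome logic only, not how Istio implements it: it assumes JWT signature and expiry verification happened elsewhere, and `Policy`, `decide`, and every value used with them are hypothetical. The exact status code for a given failure depends on how the RequestAuthentication and AuthorizationPolicy resources are written.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Policy:
    """Claim requirements for the authorization step (illustrative fields)."""
    issuer: str
    audience: str
    allowed_groups: set = field(default_factory=set)


def decide(claims: Optional[dict], policy: Policy) -> int:
    """Return the HTTP status the edge would produce for a request.

    `claims` is the payload of a JWT that has already passed signature and
    expiry verification, or None if that verification failed.
    """
    if claims is None:
        return 401  # authentication step rejected the token

    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if claims.get("iss") != policy.issuer or policy.audience not in audiences:
        return 403  # authorization step: issuer or audience mismatch

    user_groups = set(claims.get("groups", []))
    if policy.allowed_groups and not (policy.allowed_groups & user_groups):
        return 403  # authorization step: caller is in no permitted group

    return 200  # request is forwarded to the InferenceService
```

Separating the two failure modes mirrors the split between authentication (is the token genuine?) and authorization (do its claims permit this call?).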
Common pattern:
Client → Identity Provider → Token → Edge Proxy → InferenceService
(Authentication is external; InferenceService receives pre-authenticated requests)