Principle:Kserve Kserve Authentication Integration

From Leeroopedia
Knowledge Sources
Domains Security, Authentication, Networking
Last Updated 2026-02-13 00:00 GMT

Overview

A pattern for integrating identity provider authentication flows with inference service endpoints so that only authorized clients can invoke model predictions.

Description

Authentication Integration addresses the challenge of securing KServe inference endpoints behind identity-aware proxies and authentication gateways. Because KServe endpoints are exposed through an Istio ingress gateway or the Kubernetes Gateway API, authentication can be enforced at the network edge without modifying model serving code.

KServe supports two primary authentication integration patterns:

  • Identity-Aware Proxy (IAP) -- used in GCP environments, where Google Cloud IAP intercepts requests before they reach the Istio ingress. Clients must obtain an OIDC token from Google and send it in the Authorization header as a Bearer token. IAP validates the token and forwards only authenticated requests.
  • Dex with Istio -- used in Kubeflow environments, where Dex acts as an OpenID Connect provider. Clients authenticate through one of Dex's connectors (for example LDAP, SAML, or OAuth2) and receive a session cookie or token; Istio's RequestAuthentication policy then validates the JWT.

Both patterns delegate authentication to external identity providers, keeping the inference service itself stateless with respect to user identity.
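
From the client's side, both patterns reduce to attaching a token to an otherwise ordinary inference call. The following is a minimal sketch of that step, not KServe's or Google's client code. It assumes the token has already been obtained from the identity provider, that the endpoint speaks the KServe V1 predict protocol (`/v1/models/<name>:predict`), and that an optional `Host` header is needed to route through the ingress. The function name `build_predict_request` and all URLs and model names are hypothetical.

```python
import json
import urllib.request
from typing import Optional


def build_predict_request(
    base_url: str,
    model_name: str,
    token: str,
    instances: list,
    host_header: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) an authenticated V1 predict request.

    `token` is assumed to be an OIDC/JWT string already issued by the
    identity provider (Google for IAP, Dex for Kubeflow).
    """
    url = f"{base_url.rstrip('/')}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    headers = {
        # The edge proxy, not the model server, inspects this header.
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    if host_header:
        # Assumption: the ingress routes on the virtual host name.
        headers["Host"] = host_header
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    req = build_predict_request(
        "https://example.invalid",
        "sklearn-iris",
        token="<token-from-identity-provider>",
        instances=[[6.8, 2.8, 4.8, 1.4]],
        host_header="sklearn-iris.default.example.com",
    )
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes it easy to check that the token is attached without touching the network.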

Usage

Use this principle when:

  • Deploying KServe on GCP with IAP-protected endpoints
  • Integrating with Kubeflow's Dex-based authentication
  • Building client applications that must authenticate before calling inference endpoints
  • Securing multi-tenant inference environments

Theoretical Basis

# Authentication flow patterns (NOT implementation code)
IAP Authentication:
  1. Client obtains OIDC token from Google Identity service
  2. Client sends request with Authorization: Bearer <token>
  3. GCP IAP validates token against configured OAuth client
  4. If valid → request forwarded to Istio ingress → InferenceService
  5. If invalid → 401 Unauthorized returned

Dex Authentication:
  1. Client authenticates with Dex (password, LDAP, SAML, etc.)
  2. Dex issues JWT/session token
  3. Client sends request with token (cookie or Authorization header)
  4. Istio RequestAuthentication validates JWT; an invalid JWT → 401 Unauthorized returned
  5. Istio AuthorizationPolicy checks claims (issuer, audience, groups)
  6. If allowed → request forwarded to InferenceService
  7. If denied (including a missing token when the policy requires one) → 403 Forbidden returned

Common pattern:
  Client → Identity Provider → Token → Edge Proxy → InferenceService
  (Authentication is external; InferenceService receives pre-authenticated requests)
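
The edge decision in the Dex flow can be modelled as a small pure function. This is a simplified sketch, not Istio's or IAP's actual implementation. It assumes the token has already been decoded and its signature checked elsewhere (represented by a `signature_ok` flag). It also assumes the commonly documented Istio behaviour that a token failing validation yields 401, while a missing token or a policy denial yields 403. `Token`, `EdgePolicy`, and `edge_decision` are hypothetical names introduced for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Token:
    """Hypothetical decoded view of a JWT; no real cryptography happens here."""
    signature_ok: bool
    iss: str
    aud: str
    exp: float  # expiry as a Unix timestamp
    groups: List[str] = field(default_factory=list)


@dataclass
class EdgePolicy:
    """Hypothetical stand-in for what the edge proxy is configured to accept."""
    issuer: str
    audience: str
    allowed_groups: List[str]


def edge_decision(
    token: Optional[Token], policy: EdgePolicy, now: Optional[float] = None
) -> int:
    """Return the HTTP status the edge would answer with (200 means forward)."""
    now = time.time() if now is None else now
    if token is None:
        # No credentials at all: the authorization policy denies the request.
        return 403
    if not token.signature_ok or token.iss != policy.issuer or token.exp <= now:
        # Token present but fails validation (authentication step).
        return 401
    if token.aud != policy.audience:
        # Valid token, but issued for a different audience (authorization step).
        return 403
    if not set(token.groups) & set(policy.allowed_groups):
        # Valid token, but the caller is in none of the permitted groups.
        return 403
    return 200
```

Only requests for which this function returns 200 would reach the InferenceService, which is why the model server can stay unaware of user identity.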
