Running AI agents on Kubernetes is solved. Governing them is not, and KubeCon EU 2026 made that consensus explicit. Platform teams know how to deploy a workload. The challenge with agents is that once deployed, they do things you cannot fully predict: call external tools, spawn sub-agents, execute multi-step plans across hours. A compromised agent is not an exposed endpoint. It is an autonomous process with cluster credentials and a mandate to take action.
The infrastructure answer is a four-layer stack: cryptographic identity, protocol-aware gateways, Kubernetes admission control, and platform conformance validation. Each layer addresses a distinct failure mode. This post maps the architecture, connects commercial and open-source tools to each layer, and shows how the stack mitigates the OWASP MCP Top 10 risks at the infrastructure level.
The Four-Layer Security Stack
Before examining each layer in detail, here is the complete picture:
graph TB
    subgraph Conformance["Layer 4: Conformance"]
        C["CNCF KARs Certification<br/>Agentic Workload Validation"]
    end
    subgraph Admission["Layer 3: Admission Control"]
        A["Kyverno ValidatingPolicy<br/>OPA / Gatekeeper"]
    end
    subgraph Gateway["Layer 2: Agent Gateway"]
        G["Protocol-Aware Policy Enforcement<br/>MCP / A2A / REST / gRPC"]
    end
    subgraph Identity["Layer 1: Identity"]
        I["SPIFFE ID + X.509 Certificate<br/>Certificate-Bound Tokens (DPoP)"]
    end
    Conformance --> Admission
    Admission --> Gateway
    Gateway --> Identity
Each layer protects a different phase of the agent lifecycle: who the agent is (identity), what traffic it can send and receive (gateway), what gets deployed at all (admission), and whether the platform itself reliably supports agentic workloads (conformance).
The layers are not redundant. Identity answers “who is this agent?” The gateway answers “what is this agent allowed to do right now?” Admission control answers “should this deployment be allowed into the cluster?” Conformance answers “does this Kubernetes distribution reliably support agent workloads as a class?” Removing any one layer creates a gap the others cannot fill.
Layer 1: Agent Identity
The fundamental problem with applying traditional service accounts to agents is shared identity. A service account is issued to a workload class, not a workload instance. Two pods running the same agent image share the same credentials. For a stateless API server, that is fine. For an autonomous agent, it means you cannot distinguish which specific agent instance made a call, revoke access for a single compromised agent without disrupting others, or detect when an agent’s identity is being impersonated.
Google Cloud Agent Identity (Preview)
Google’s Agent Identity, part of the Gemini Enterprise Agent Platform, addresses this by assigning each agent instance a unique SPIFFE ID following this format:
spiffe://myorg.example.com/resources/agent-engine/agents/my-agent
Along with the SPIFFE ID, each agent receives an auto-provisioned X.509 certificate with a 24-hour validity window. The certificate is not just for identification: access tokens are cryptographically bound to it, so a stolen token cannot be replayed without the corresponding certificate private key. This is DPoP (Demonstrating Proof of Possession) enforced at the gateway layer.
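To make the ID format concrete, here is a minimal sketch of how a verifier might split a SPIFFE ID into its trust domain and path before applying policy. The `SpiffeId` class and `parse_spiffe_id` function are illustrative names, not part of any Google or SPIRE SDK, and the checks cover only the structural rules from the SPIFFE ID specification (spiffe scheme, non-empty trust domain, no userinfo, port, query, or fragment).

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class SpiffeId:
    trust_domain: str
    path: str


def parse_spiffe_id(uri: str) -> SpiffeId:
    """Split a SPIFFE ID into trust domain and path, rejecting malformed input."""
    parts = urlsplit(uri)
    if parts.scheme != "spiffe":
        raise ValueError("SPIFFE IDs must use the spiffe:// scheme")
    if not parts.netloc:
        raise ValueError("SPIFFE ID is missing a trust domain")
    # The SPIFFE ID spec disallows userinfo, ports, query strings, and fragments.
    if parts.query or parts.fragment or "@" in parts.netloc or ":" in parts.netloc:
        raise ValueError("SPIFFE IDs cannot carry userinfo, port, query, or fragment")
    return SpiffeId(trust_domain=parts.netloc, path=parts.path)


agent = parse_spiffe_id(
    "spiffe://myorg.example.com/resources/agent-engine/agents/my-agent"
)
print(agent.trust_domain)
print(agent.path)
```

A policy engine can then match on the trust domain and path separately, for example to accept only IDs under `/resources/agent-engine/agents/` in a given trust domain.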
The key security properties, per the official documentation:
- Agent identities cannot be shared across workloads by default
- They cannot be impersonated
- Long-lived service account keys are not supported
- Authorization integrates with IAM allow/deny policies and Principal Access Boundary (PAB) to constrain which resources an agent can even attempt to access
Status: The Agent Identity auth manager is currently in Preview, operating under pre-GA terms with limited support.
Open-Source Identity Alternatives
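The SPIFFE-based pattern works outside GCP. SPIRE (the SPIFFE Runtime Environment) provides SPIFFE identity on any Kubernetes cluster using the same model: each attested workload receives a SPIFFE ID and its own short-lived X.509 SVID (SPIFFE Verifiable Identity Document). How granular the ID is depends on how workloads are registered, so per-instance IDs require registration entries scoped to the individual pod rather than to a shared service account.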
Microsoft’s Agent Governance Toolkit (MIT-licensed, released April 2026) takes a parallel approach via its Agent Mesh component, using Decentralized Identifiers (DIDs) with Ed25519 key pairs. The trust model adds behavioral scoring on top of cryptographic identity: each agent maintains a score from 0 to 1000 across five behavioral tiers, so authorization decisions can factor in observed behavior, not just asserted identity.
The pattern is consistent across vendors: every agent gets a cryptographically verifiable identity tied to a specific instance, not a shared credential class.
Agent Identity Certificate Flow
sequenceDiagram
participant A as Agent Runtime
participant IS as Identity Service
participant GW as Agent Gateway
participant T as Tool / MCP Server
A->>IS: Request SPIFFE identity
IS-->>A: SPIFFE ID + X.509 cert (24hr TTL)
A->>IS: Request access token
IS-->>A: Certificate-bound token (DPoP)
A->>GW: Connect via mTLS + DPoP token
GW->>GW: Verify cert chain + DPoP proof
GW-->>A: Authorized session established
A->>T: Access tool via authorized session
T-->>A: Response
The certificate-bound token model means a stolen token is useless without the corresponding certificate private key. DPoP enforcement at the gateway validates that the connecting agent actually controls the certificate, not just possesses the token.
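The binding check itself is small. The sketch below assumes the token follows the certificate-bound layout from RFC 8705, where the token's `cnf` claim carries an `x5t#S256` value: the base64url-encoded SHA-256 digest of the client certificate. Whether Google's implementation uses exactly this claim is not stated here, so treat the claim names as an assumption. Token signature verification and certificate chain validation are deliberately left out, and the byte strings stand in for real DER-encoded certificates.

```python
import base64
import hashlib
import hmac


def cert_thumbprint(cert_der: bytes) -> str:
    """Base64url-encoded SHA-256 digest of a DER certificate, without padding."""
    digest = hashlib.sha256(cert_der).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def token_bound_to_cert(claims: dict, presented_cert_der: bytes) -> bool:
    """True only if the token's confirmation claim matches the presented cert."""
    expected = claims.get("cnf", {}).get("x5t#S256")
    if not isinstance(expected, str) or not expected:
        return False  # a token with no binding is rejected outright
    actual = cert_thumbprint(presented_cert_der)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("ascii", "replace"), actual.encode("ascii"))


# Placeholder bytes, not real certificates.
agent_cert = b"agent-certificate-der-placeholder"
attacker_cert = b"attacker-certificate-der-placeholder"

claims = {
    "sub": "spiffe://myorg.example.com/resources/agent-engine/agents/my-agent",
    "cnf": {"x5t#S256": cert_thumbprint(agent_cert)},
}

print(token_bound_to_cert(claims, agent_cert))     # True
print(token_bound_to_cert(claims, attacker_cert))  # False
```

The second call is the stolen-token case: the attacker holds a valid token but presents a different certificate on the mTLS connection, so the thumbprints do not match.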
Layer 2: The Agent Gateway
Agent identity establishes who an agent is. The gateway layer controls what that agent can actually reach, with protocol-level awareness of agentic traffic patterns.
Standard API gateways inspect HTTP headers and apply rate limits. Agent gateways inspect MCP tool calls, A2A agent communications, and agentic protocol attributes. That distinction matters because an agent’s HTTP headers might look fine while the MCP tool invocation it contains violates least-privilege. You need a gateway that understands the protocol to enforce policy at the right layer.
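As a rough illustration of what protocol awareness means, the sketch below looks inside a request body instead of at its headers. MCP messages are JSON-RPC 2.0, and a tool invocation uses the `tools/call` method with the tool name in `params.name`. The `TOOL_ALLOWLIST` table, the tool names, and the `authorize_mcp_request` function are hypothetical; a real gateway would load policy from its own configuration and handle many more cases.

```python
import json

# Hypothetical per-agent allowlist keyed by SPIFFE ID.
TOOL_ALLOWLIST = {
    "spiffe://myorg.example.com/resources/agent-engine/agents/my-agent": {
        "search_docs",
        "read_ticket",
    },
}


def authorize_mcp_request(agent_id: str, body: bytes) -> tuple[bool, str]:
    """Return (allowed, reason) for one MCP message sent by agent_id."""
    try:
        msg = json.loads(body)
    except ValueError:
        return False, "body is not valid JSON"
    if not isinstance(msg, dict) or msg.get("jsonrpc") != "2.0":
        return False, "not a JSON-RPC 2.0 message"
    if msg.get("method") != "tools/call":
        return True, "not a tool invocation"
    params = msg.get("params")
    tool = params.get("name") if isinstance(params, dict) else None
    allowed = TOOL_ALLOWLIST.get(agent_id, set())
    if isinstance(tool, str) and tool in allowed:
        return True, f"tool {tool!r} allowed"
    return False, f"tool {tool!r} not in allowlist for {agent_id}"


agent_id = "spiffe://myorg.example.com/resources/agent-engine/agents/my-agent"
request = json.dumps({
    "jsonrpc": "2.0",
    "id": 7,
    "method": "tools/call",
    "params": {"name": "delete_namespace", "arguments": {"name": "prod"}},
}).encode()

print(authorize_mcp_request(agent_id, request))
```

Both an allowed and a denied call would arrive as the same HTTP POST with the same headers, which is why header-level inspection cannot tell them apart.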
Google Cloud Agent Gateway (Private Preview)
The Agent Gateway operates in two modes:
- Client-to-Agent (ingress): Secures external clients communicating with agents on Google Cloud
- Agent-to-Anywhere (egress): Secures agents communicating with external MCP servers, other agents, or APIs
Protocol support covers MCP, A2A, REST, and gRPC with protocol mediation, allowing an agent using one protocol to communicate with a tool expecting another.
graph LR
subgraph Ingress["Ingress Path"]
EC[External Client] -->|"mTLS + IAP"| GW1[Agent Gateway]
GW1 -->|"IAM policy check"| AR[Agent Runtime]
end
subgraph Egress["Egress Path"]
AR2[Agent Runtime] -->|"egress request"| GW2[Agent Gateway]
GW2 --> MA{Model Armor}
MA -->|"clean"| Tool["MCP Server / API"]
MA -->|"injection detected"| Blocked[Request Blocked]
end
The gateway mediates both directions: external clients cannot reach agents without passing IAP identity verification, and agents cannot reach tools without passing authorization policy and Model Armor content inspection.
IAP (Identity-Aware Proxy) is always enabled and cannot be disabled, though it can run in audit-only mode during rollout. Model Armor provides runtime protection against prompt injection and data leakage by inspecting traffic at the gateway layer, so individual agents do not need to implement injection defenses themselves.
A current limitation: authorization policy conditions based on agentic protocol attributes are only supported for MCP, not A2A or other protocols. Agent Gateway is in Private Preview.
Solo.io agentgateway (Open Source)
For non-GCP environments, Solo.io’s agentgateway provides equivalent capabilities as open-source software. Built in Rust, it implements the Kubernetes Gateway API (HTTPRoute, GRPCRoute, TCPRoute, TLSRoute) and provides deep MCP and A2A protocol awareness.
Authorization options include a native Cedar policy engine for fine-grained decisions and External Authorization (ExtAuthz) hooks compatible with OPA and Kyverno. In a Kubernetes ambient mesh, agentgateway deploys as a waypoint proxy handling ingress, egress, or service mesh gateway roles for agent traffic. Solo.io also ships kagent, a companion Kubernetes-native framework for deploying autonomous AI agents that integrates with LLM providers, MCP tools, and other agents.
This means you can run the complete four-layer security model on EKS, GKE, or bare metal: swap Agent Gateway for agentgateway and Agent Identity for SPIRE.
The OPA-Based Gateway Pattern
For teams that need highly expressive authorization logic, a gateway layer built on OPA/Rego provides the most flexibility. The InfoQ reference architecture structures this as three tiers:
- Gateway tier (TypeScript/MCP): Request validation and tool discovery. Agents never interact directly with infrastructure APIs.
- Policy tier (OPA/Rego): Authorization decisions with full Rego expressiveness.
- Execution tier (Python/Kubernetes): Approved actions running in ephemeral namespaces with automatic cleanup.
Here is the core Rego policy demonstrating least-privilege agent authorization:
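package agent.authz

# Written in pre-1.0 Rego syntax; OPA 1.0 and later expect `allow if { ... }`.
# plan_is_registered, is_destroy_plan, and in_change_window are helper rules
# that are not shown in this excerpt.

# Deny by default: every condition below must hold for an action to pass.
default allow = false

allow {
    input.action == "apply_infra"
    allow_actor[input.actor.id][input.plan.env]   # actor may target this environment
    plan_is_registered[input.plan.hash]           # plan hash is known to the registry
    not is_destroy_plan(input.plan.path)          # destroy plans are never applied
    in_change_window(time.parse_rfc3339_ns(input.time))
}

# Environments each bot is allowed to deploy to.
allow_actor["sre-bot"] = {"dev", "staging", "prod"}
allow_actor["deploy-bot"] = {"dev", "staging"}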
Every action passes through the policy engine before execution. The target decision latency is under 100 ms, keeping authorization invisible to the agent. Each approved action runs in a dedicated Kubernetes namespace that is cleaned up after completion, leaving no persistent execution footprint.
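On the calling side, the gateway tier has to assemble the input document the policy expects and ask OPA for a decision. The sketch below uses OPA's REST Data API, where a policy in package `agent.authz` with a rule named `allow` is queried at `/v1/data/agent/authz/allow`. The OPA address, the plan hash, and the plan path are assumptions for illustration, and the helper names are not taken from the reference architecture.

```python
import json
import urllib.request
from datetime import datetime, timezone

OPA_URL = "http://localhost:8181/v1/data/agent/authz/allow"  # assumed local OPA


def build_input(actor_id, env, plan_hash, plan_path, now=None):
    """Assemble the input document with the fields the policy reads."""
    now = now or datetime.now(timezone.utc)
    return {
        "input": {
            "action": "apply_infra",
            "actor": {"id": actor_id},
            "plan": {"env": env, "hash": plan_hash, "path": plan_path},
            "time": now.isoformat(),  # RFC 3339 timestamp for the change-window check
        }
    }


def decision_from_response(payload: dict) -> bool:
    """OPA omits "result" when a rule is undefined; anything but true is a deny."""
    return payload.get("result") is True


def is_allowed(document: dict, url: str = OPA_URL, timeout: float = 2.0) -> bool:
    """POST the input to OPA. Network errors propagate so callers can fail closed."""
    request = urllib.request.Request(
        url,
        data=json.dumps(document).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return decision_from_response(json.load(response))


doc = build_input(
    "deploy-bot", "staging", "abc123", "plans/staging.tfplan",  # hypothetical values
    now=datetime(2026, 4, 1, 12, 0, tzinfo=timezone.utc),
)
print(json.dumps(doc, indent=2))
```

Treating a missing or non-boolean result as a deny keeps the caller consistent with the policy's own default of `allow = false`.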
Layer 3: Admission Control
Gateway policies control runtime behavior. Admission control answers a prior question: should this agent workload be allowed into the cluster at all?
Kubernetes admission webhooks intercept every resource creation event. Two concerns matter most for agent workloads: blocking non-compliant agent deployments at the moment of creation, and scanning existing workloads for compliance drift after policies are introduced.
Kyverno ValidatingPolicy
Kyverno 1.17 promoted ValidatingPolicy to v1 GA. ValidatingPolicy uses CEL expressions via matchConstraints and validations fields, extended with additional Kyverno-specific libraries beyond the standard Kubernetes CEL set.
Here is a policy that blocks any pod in an AI agent namespace without an explicit approval label - the infrastructure control that directly mitigates OWASP MCP09 (shadow MCP servers):
apiVersion: policies.kyverno.io/v1
kind: ValidatingPolicy
metadata:
  name: require-agent-approval-label
spec:
  matchConstraints:
    resourceRules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        resources: ["pods"]
        operations: ["CREATE"]
    # Only namespaces labeled as agent namespaces are subject to this policy.
    namespaceSelector:
      matchLabels:
        workload-type: ai-agent
  validations:
    # CEL's has() only accepts field selections, so the key test uses `in`.
    - expression: "has(object.metadata.labels) && 'agent.kaden-projects.com/approved' in object.metadata.labels"
      message: "Agent pods must carry the agent.kaden-projects.com/approved label. Submit for registry approval first."
This policy fires at CREATE time. A pod without the agent.kaden-projects.com/approved label never schedules. You can extend this pattern to require signed images, specific securityContext settings, or verified SPIFFE identity annotations.
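The same rule can be checked before a manifest ever reaches the cluster, for example in CI. The sketch below mirrors the policy's logic in plain Python against a pod manifest loaded as a dictionary. It is an approximation for pre-flight feedback, not a substitute for the admission policy, and the `admission_error` function is a hypothetical helper.

```python
from typing import Optional

APPROVAL_LABEL = "agent.kaden-projects.com/approved"


def admission_error(pod: dict, namespace_labels: dict) -> Optional[str]:
    """Return the denial message the policy would produce, or None if admitted."""
    # The policy only matches namespaces labeled workload-type: ai-agent.
    if namespace_labels.get("workload-type") != "ai-agent":
        return None
    labels = pod.get("metadata", {}).get("labels") or {}
    if APPROVAL_LABEL in labels:
        return None
    return (
        f"Agent pods must carry the {APPROVAL_LABEL} label. "
        "Submit for registry approval first."
    )


agent_namespace = {"workload-type": "ai-agent"}
unapproved = {"metadata": {"name": "planner", "labels": {"app": "planner"}}}
approved = {"metadata": {"name": "planner", "labels": {APPROVAL_LABEL: "true"}}}

print(admission_error(unapproved, agent_namespace))
print(admission_error(approved, agent_namespace))
```

Like the policy, the check only tests that the label key exists. Requiring a specific value would be a stricter variant.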
Kyverno also ships PolicyReport for compliance visibility. Background scans surface existing workloads that violate policy without blocking live traffic - critical for catching drift after a policy is introduced to a running cluster.
graph TD
PR[Pod CREATE request] --> KV{Kyverno ValidatingPolicy}
KV -->|"Missing approval label"| Deny[Admission Denied]
KV -->|"Label present"| SI{Image signed?}
SI -->|"Unsigned image"| Deny
SI -->|"Signed + labeled"| Admit[Pod admitted to cluster]
Admit --> GW{Agent Gateway}
GW -->|"OPA / Cedar authorization"| Tool[Tool access granted]
GW -->|"Policy violation"| Blocked[Runtime request blocked]
Kyverno enforces admission-time policy before pods schedule. The gateway enforces runtime policy after they are running. Admission stops bad deployments. The gateway stops bad behavior from good deployments.
OPA/Gatekeeper for Complex Authorization
OPA Gatekeeper provides Rego-based admission for scenarios that require more expressive logic than CEL covers. The practical split on most platforms: Kyverno at admission (what gets deployed), OPA at the gateway (what those deployments are permitted to do at runtime). Teams that prefer a single policy language can instead run Gatekeeper at admission and, through agentgateway’s ExtAuthz support, use the same Rego engine for gateway authorization, keeping policy logic consistent across both layers.
Layer 4: Platform Conformance
Identity, gateways, and admission control secure individual agent workloads. Platform conformance validates whether the Kubernetes distribution itself can reliably support agentic workloads as a class. A platform that cannot guarantee stable scheduling, resource isolation, and in-place resize for long-running agent tasks will fail under production agentic load regardless of how well you’ve configured the upper layers.
CNCF KARs v1.35 and Agentic Workload Validation
The CNCF Certified Kubernetes AI program (KARs - Kubernetes AI Requirements) grew from 18 to 31 certified platforms as of March 2026, announced at KubeCon EU. The v1.35 specification covers three workload categories:
- Training: Distributed jobs with accelerators
- Inference: Model serving with latency, routing, and scaling requirements
- Agentic: Multi-step workflows combining tools, memory, and long-running tasks
New requirements added in v1.35:
- KAR-10: High-Performance Pod-to-Pod Communication
- KAR-11: Advanced Inference Ingress
- KAR-41: Disaggregated Inference Support
The v1.35 mandatory alignment includes Stable In-Place Pod Resizing - the ability to adjust CPU and memory limits without restarting a pod. For agents, this matters because a long-running agent task that shifts between low-resource planning phases and high-resource execution phases can adapt its resource allocation without interruption. Workload-Aware Scheduling prevents resource deadlocks during distributed agent coordination.
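For a sense of what a phase change looks like in practice, the sketch below builds the patch body for an in-place resize. Kubernetes applies such changes through the pod's `resize` subresource, for example with `kubectl patch pod <name> --subresource resize --patch '<json>'`. The phase names and resource values in `PHASE_RESOURCES` are made up for illustration, as is the container name.

```python
import json

# Hypothetical resource profiles for the two agent phases described above.
PHASE_RESOURCES = {
    "planning": {"cpu": "250m", "memory": "256Mi"},
    "execution": {"cpu": "2", "memory": "2Gi"},
}


def resize_patch(container: str, phase: str) -> dict:
    """Build a patch body that sets requests and limits for one container."""
    resources = PHASE_RESOURCES[phase]
    return {
        "spec": {
            "containers": [
                {
                    "name": container,
                    "resources": {
                        "requests": dict(resources),
                        "limits": dict(resources),
                    },
                }
            ]
        }
    }


patch = resize_patch("agent", "execution")
print(json.dumps(patch))
```

Setting requests equal to limits is a choice made here for simplicity. Whether a given change is applied without a container restart also depends on the container's resize policy.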
The agentic category validates that certified platforms support complex, multi-step AI agents using the same sandbox isolation mechanisms Kubernetes already provides. The 2026 roadmap replaces self-assessment with a specialized “Verify Conformance Bot” for automated, third-party validation.
For platform teams: KARs certification gives you documented guarantees that agent workloads will schedule, scale, and isolate correctly. Without that certification, you are making assumptions about behavior that may not hold under production agentic load.
Putting It Together
GCP-Native vs Open-Source Stack
| Layer | GCP-Native | Open-Source (Any Kubernetes) |
|---|---|---|
| Identity | Google Agent Identity (SPIFFE + X.509, DPoP) - Preview | SPIRE / Microsoft Agent Governance Toolkit (DIDs) |
| Gateway | Google Agent Gateway (MCP, A2A, REST, gRPC) - Private Preview | Solo.io agentgateway (Rust, Cedar / OPA ExtAuthz) |
| Admission | Kyverno ValidatingPolicy v1 GA | Kyverno ValidatingPolicy v1 GA |
| Conformance | KARs-certified GKE | KARs self-assessment or certified distribution |
The admission layer is identical in both stacks. Kyverno 1.17 ValidatingPolicy works the same on GKE, EKS, or AKS. The differentiation is in the identity and gateway layers, where GCP provides managed, integrated services and the open-source stack requires assembling components.
Mapping OWASP MCP Top 10 to Infrastructure Controls
The OWASP MCP Top 10 identifies the most critical security risks in MCP-enabled agent systems. Here is how the four-layer stack addresses the highest-impact risks:
| OWASP MCP Risk | Infrastructure-Layer Control | Layer |
|---|---|---|
| MCP01: Token Mismanagement | Certificate-bound tokens (DPoP) via Agent Identity. Stolen tokens cannot be replayed without the corresponding X.509 private key. | Identity |
| MCP02: Excessive Permissions | Agent Gateway + least-privilege IAM policies. Each agent identity restricted to specific tools via policy conditions. | Gateway |
| MCP07: Insufficient Auth | mTLS + DPoP enforcement at the gateway. SPIFFE-based identity ensures every connection is mutually authenticated. | Identity + Gateway |
| MCP09: Shadow MCP Servers | Kyverno ValidatingPolicy blocks unlabeled agent pods at admission. No approved label, no scheduling. | Admission |
The MCP01 and MCP07 mitigations require both Layer 1 and Layer 2. Certificate-bound tokens are only meaningful when the gateway enforces DPoP proof-of-possession. The identity and gateway layers reinforce each other.
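One way to reason about these dependencies is to write the mapping down as data and ask which risks lose a control when a layer is absent. The sketch below encodes the four risks from the table, with MCP01 marked as depending on both Identity and Gateway as the paragraph above describes. The structure and function name are illustrative only.

```python
# Layers each mitigation depends on, taken from the table and the note on
# MCP01 and MCP07 requiring both the identity and gateway layers.
CONTROLS = {
    "MCP01: Token Mismanagement": {"Identity", "Gateway"},
    "MCP02: Excessive Permissions": {"Gateway"},
    "MCP07: Insufficient Auth": {"Identity", "Gateway"},
    "MCP09: Shadow MCP Servers": {"Admission"},
}


def risks_affected_by_removing(layer: str) -> list[str]:
    """List the risks whose mitigation depends on the given layer."""
    return sorted(risk for risk, layers in CONTROLS.items() if layer in layers)


for layer in ("Identity", "Gateway", "Admission"):
    print(layer, "->", risks_affected_by_removing(layer))
```

Running it shows that dropping the gateway weakens three of the four mitigations, which is the practical meaning of the two layers reinforcing each other.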
What Is Still Missing
This four-layer model addresses the known attack surface well. Several gaps remain open:
A2A protocol-level policy. Agent Gateway supports MCP attribute-level authorization conditions. The A2A protocol does not yet have equivalent attribute support. Agent-to-agent communication has less granular policy enforcement available today.
Cross-cloud agent identity federation. SPIFFE handles identity within a cluster or trust domain. Agents that need to call services across cloud providers require manual federation setup. There is no standardized cross-cloud agent identity chain.
Automated KARs conformance. The current program uses self-assessment. The planned Verify Conformance Bot is on the 2026 roadmap but not yet available. Platform teams should run manual conformance checks against the published KARs v1.35 specification.
Runtime behavioral anomaly detection. Google’s Security Command Center is adding automatic discovery of unmanaged agentic workloads and an agent security dashboard, but these features are in preview and cover GCP-hosted workloads. General-purpose behavioral anomaly detection for agent workloads - detecting when an agent deviates from expected behavior patterns mid-task - remains an open problem across the industry.
Frequently Asked Questions
What is Agent Identity and how does it differ from a Kubernetes service account?
Agent Identity assigns each AI agent a unique SPIFFE-based cryptographic ID with an auto-provisioned X.509 certificate (24-hour validity, auto-refreshed). Unlike service accounts, agent identities cannot be shared across workloads, cannot be impersonated, and do not support long-lived keys. Access tokens are cryptographically bound to the certificate, so a stolen token cannot be replayed without the corresponding certificate private key. This eliminates the credential-sharing problem inherent in service accounts.
Can I secure AI agents on Kubernetes without Google Cloud?
Yes. The four-layer pattern works on any Kubernetes cluster. Use SPIRE for SPIFFE identity, Solo.io’s agentgateway for MCP/A2A-aware traffic control with Cedar or OPA authorization, Kyverno ValidatingPolicy for admission control, and the CNCF KARs self-assessment for conformance validation. Microsoft’s Agent Governance Toolkit (MIT-licensed) provides an alternative identity mechanism using DIDs and Ed25519 keys, with behavioral trust scoring on top.
What is the OWASP MCP Top 10 and how does it relate to Kubernetes security?
The OWASP MCP Top 10 identifies the most critical security risks in MCP-enabled AI systems, including token mismanagement (MCP01), excessive permissions (MCP02), insufficient auth (MCP07), and shadow MCP servers (MCP09). Each risk maps directly to an infrastructure-layer control: certificate-bound tokens for MCP01, Agent Gateway with least-privilege IAM for MCP02, mTLS and DPoP enforcement for MCP07, and Kyverno admission policies for MCP09.
What are CNCF KARs and do they cover agent security?
KARs (Kubernetes AI Requirements) are the conformance specification for the CNCF Certified Kubernetes AI program. The v1.35 KARs include an agentic workload validation category ensuring certified platforms reliably support complex, multi-step AI agents with proper sandbox isolation, stable in-place pod resizing, and workload-aware scheduling. The program grew from 18 to 31 certified platforms as of March 2026, with automated conformance testing planned for later in 2026.
How does the Agent Gateway prevent prompt injection attacks?
The Agent Gateway integrates with Model Armor, which inspects agent traffic as it passes through the gateway and applies content filters against prompt injection and data leakage before requests reach tools or other agents. This is enforced at the gateway layer, so individual agents do not need to implement their own injection defenses. Model Armor is an optional component that can be enabled on a per-route basis.