To secure AI agents in a CI/CD pipeline, use the two-workflow pattern to isolate untrusted GitHub input from the privileged agent runtime, replace long-lived API keys with OIDC Workload Identity Federation, and enforce default-deny egress network policies on self-hosted Kubernetes runners.
On April 16, 2026, security researcher Aonan Guan and Johns Hopkins University researchers Zhengyu Liu and Gavin Zhong published research called “Comment and Control.” It documented three independent attacks on Claude Code, Gemini CLI, and GitHub Copilot running in GitHub Actions. All three stole credentials through GitHub itself. No external server required.
The finding was not that these agents had a bug. It was that giving an AI agent tools and secrets in the same runtime that processes untrusted GitHub data is the bug.
If you run AI agents in CI/CD pipelines, this is your architecture to defend.
What Comment and Control Proved
Three Agents, One Attack Pattern
Each of the three exploits shares the same structure. An attacker submits untrusted GitHub data containing a crafted payload - a PR title, an issue body, or an issue comment. The AI agent running in GitHub Actions processes that data as part of its normal job. The payload hijacks the agent’s reasoning. The agent executes commands and exfiltrates credentials back through GitHub, where the attacker retrieves them.
sequenceDiagram
participant A as Attacker
participant GH as GitHub
participant WF as GitHub Actions Workflow
participant AI as AI Agent
participant ENV as Environment (Secrets)
A->>GH: Open PR / file issue with injection payload
GH->>WF: Trigger workflow (pull_request / issues / issue_comment)
WF->>AI: Start agent with secrets in environment
AI->>GH: Read PR title / issue body / comment
Note over AI: Payload hijacks agent context
AI->>ENV: Access ANTHROPIC_API_KEY, GITHUB_TOKEN
AI->>GH: Post stolen credentials via PR comment / issue / commit
A->>GH: Retrieve exfiltrated secrets
The universal attack chain: untrusted GitHub data enters the agent runtime, stolen credentials exit through GitHub’s own APIs.
Each vendor variant had different mechanics:
Claude Code Security Review: A malicious PR title broke the prompt context in prompts.py. The agent inherited all environment variables via subprocess, executed injected bash commands (whoami, ps auxeww), and posted ANTHROPIC_API_KEY and GITHUB_TOKEN as a PR comment. Anthropic classified this as CVSS 9.4 Critical (HackerOne #3387969, $100 bounty).
Gemini CLI Action: An issue with a fabricated “Trusted Content Section” in its body caused the agent to treat the injected section as legitimate context and post GEMINI_API_KEY publicly as an issue comment. Google paid $1,337 via VRP (#1609699).
GitHub Copilot Agent: An issue contained visible benign text plus a hidden HTML comment with the payload. The victim assigned the issue to Copilot, seeing only the innocent surface. Copilot parsed the full markdown, ran ps auxeww | base64 -w0 > running-environment-checkmark.txt, and created a PR containing the file. The attacker then decoded GITHUB_TOKEN, GITHUB_COPILOT_API_TOKEN, GITHUB_PERSONAL_ACCESS_TOKEN, and COPILOT_JOB_NONCE from it. GitHub’s three runtime defenses were all bypassed. Base64 encoding defeated secret scanning, and github.com itself was the exfiltration channel, whitelisted by the network firewall. GitHub classified this as “a known architectural limitation” and paid $500 (HackerOne #3544297).
As Guan summarized: “untrusted GitHub data flows into the AI agent, which processes it, executes commands, and exfiltrates credentials through GitHub itself.”
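The point about base64 defeating secret scanning can be illustrated with a small sketch. The regular expression below is a simplified stand-in for a scanner's token pattern, not GitHub's actual implementation, and the token value is fabricated for the example.

```python
import base64
import re

# Hypothetical, simplified pattern for a GitHub-style token prefix.
# Real secret scanners use many more patterns; this is only an illustration.
TOKEN_PATTERN = re.compile(r"gh[pousr]_[A-Za-z0-9]{36}")


def scanner_flags(text: str) -> bool:
    """Return True if the simplified pattern finds a token-like string."""
    return TOKEN_PATTERN.search(text) is not None


# Fabricated value in the shape of a token; not a real credential.
fake_token = "ghs_" + "A1b2C3d4E5f6" * 3
process_dump = f"GITHUB_TOKEN={fake_token}"

# What `ps auxeww | base64 -w0` does to the same bytes.
encoded = base64.b64encode(process_dump.encode()).decode()

print(scanner_flags(process_dump))  # True: plain text matches the pattern
print(scanner_flags(encoded))       # False: the encoded form does not
print(base64.b64decode(encoded).decode() == process_dump)  # True: fully recoverable
```

The encoded output can never match this pattern because the standard base64 alphabet contains no underscore, yet the attacker recovers the original text with a single decode.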
Why AI Agents Are Not Like Traditional CI/CD Tools
Traditional script injection in GitHub Actions requires the workflow author to use untrusted input in a shell command - something like ${{ github.event.issue.title }} directly in a run: step. The attacker needs an existing injection point the developer created by mistake.
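The traditional case can be reproduced outside of GitHub Actions with a minimal sketch. It assumes a POSIX sh is available, and the variable name ISSUE_TITLE is chosen here for illustration. The first call mirrors what happens when the expression is pasted into the script text; the second passes the same value through an environment variable, which is the usual remediation.

```python
import os
import subprocess

# Attacker-controlled issue title containing a shell command substitution.
title = "hello $(echo INJECTED)"

# Unsafe: mirrors ${{ github.event.issue.title }} expanded inside a run: step.
# The title becomes part of the script text, so the shell executes it.
unsafe = subprocess.run(
    ["sh", "-c", f'echo "{title}"'],
    capture_output=True, text=True, check=True,
).stdout.strip()

# Safer: pass the title through an environment variable and let the shell
# expand it as data. Expanded values are not re-parsed as commands.
safe = subprocess.run(
    ["sh", "-c", 'echo "$ISSUE_TITLE"'],
    env=dict(os.environ, ISSUE_TITLE=title),
    capture_output=True, text=True, check=True,
).stdout.strip()

print(unsafe)  # hello INJECTED
print(safe)    # hello $(echo INJECTED)
```

This fix works because there is a clear boundary between code and data. An AI agent has no equivalent boundary: every token it reads can influence what it does next.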
AI agent prompt injection works differently. The agent’s job is to read GitHub data and reason about it. Processing PR titles, issue bodies, and comments is not a misconfiguration. It is the feature. The agent is the injection point. GitHub Actions workflows also auto-trigger on pull_request, issues, and issue_comment events without any victim interaction beyond receiving the PR or issue.
SecurityWeek quoted multiple security experts: the same pattern applies to “any AI agent processing untrusted input with access to tools and secrets,” including Slack bots, Jira agents, email agents, and deployment automation pipelines. GitHub is a particularly acute case because workflows trigger automatically and the exfiltration channel is the same platform the agent authenticates to.
The Six Broader Exploits: All Targeted Credentials
Comment and Control was not isolated. VentureBeat analyzed six exploit disclosures against Codex, Claude Code, Copilot, and Vertex AI over nine months. Every one followed the same pattern: the AI agent held a credential, executed an action, and authenticated to a production system without a human session anchoring the request.
Key incidents beyond Comment and Control:
- March 30, 2026 (BeyondTrust / OpenAI Codex): A crafted GitHub branch name stole Codex’s OAuth token in cleartext. OpenAI classified it Critical P1.
- Adversa / Claude Code: Claude Code silently ignored its own deny rules once a command exceeded 50 subcommands.
- Cymulate / Sandbox Escape: CVE-2026-25725 (CVSS 7.7) in Claude Code exploited conditional read-only filesystem protections applied only when files existed at startup time. Gemini CLI and Codex CLI had related configuration-layer escapes.
Anthropic’s own system card acknowledged that Claude Code’s Security Review feature “is not hardened against prompt injection.” Comment and Control was the proof case for a documented risk that was not mitigated.
VentureBeat’s survey found that only 21% of organizations have runtime visibility into what their AI agents are doing, and 88% reported AI agent security incidents in the last twelve months.
The GitHub Actions Attack Surface
How pull_request_target Amplifies the Risk
The pull_request_target trigger is the most dangerous amplifier for AI agent exploits. Unlike pull_request, which for PRs from forks runs with a read-only token and no access to secrets, pull_request_target runs in the context of the base repository with access to repository secrets and, by default, a GITHUB_TOKEN with write permissions.
GitHub Security Lab documented this as “pwn requests” years before AI agents existed. When an AI agent workflow uses pull_request_target and checks out the PR’s head ref, the attacker’s malicious PR title and body reach the agent with full access to GITHUB_TOKEN (with write permissions), all repository secrets, and any cloud credentials in environment variables.
This is the pattern to eliminate:
on: pull_request_target
jobs:
  ai-review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.event.pull_request.head.sha }}
      - run: claude-code review # Processes attacker-controlled PR content with secrets
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
The GitHub Actions 2026 security roadmap includes a native egress firewall for GitHub-hosted runners and secret scoping that binds credentials to explicit execution contexts. Both directly address the Comment and Control attack surface. Until those controls ship, architectural separation is the only reliable defense.
Secure Workflow Patterns: The Two-Workflow Approach
The fix is separating untrusted input handling from privileged operations. A pull_request workflow (unprivileged, no secrets) handles the event and saves sanitized artifacts. A workflow_run workflow (has secrets) processes only those artifacts - never the raw PR content.
flowchart LR
subgraph UNSAFE["Vulnerable: Single Workflow"]
direction TB
P1[PR Opened] --> T1[pull_request_target fires]
T1 --> A1[Agent runs WITH secrets]
A1 --> E1[Secrets exposed to\nuntrusted PR content]
style E1 fill:#ef4444,color:#fff
end
subgraph SAFE["Secure: Two-Workflow Pattern"]
direction TB
P2[PR Opened] --> T2[pull_request fires\nno secrets]
T2 --> S2[Save PR number to artifact]
S2 --> W2[workflow_run fires\nhas secrets]
W2 --> D2[Agent reads\nsanitized artifact only]
style D2 fill:#22c55e,color:#fff
end
Left: pull_request_target gives the agent secrets while processing attacker-controlled PR content. Right: the two-workflow pattern isolates untrusted content from privileged operations.
Implementation:
# Workflow 1: pull_request (no secrets, unprivileged)
name: validate
on: pull_request
jobs:
  validate:
    runs-on: ubuntu-latest
    permissions:
      contents: read
    steps:
      - uses: actions/checkout@v4
      - run: echo "${{ github.event.number }}" > pr_number.txt
      - uses: actions/upload-artifact@v4
        with:
          name: pr-context
          path: pr_number.txt

# Workflow 2: workflow_run (has secrets, processes artifact only)
name: ai-review
on:
  workflow_run:
    workflows: ["validate"]
    types: [completed]
jobs:
  review:
    runs-on: ubuntu-latest
    permissions:
      actions: read # Needed to download an artifact from another workflow run
      contents: read
      pull-requests: write
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: pr-context
          run-id: ${{ github.event.workflow_run.id }}
          github-token: ${{ secrets.GITHUB_TOKEN }}
      - name: Run AI review on sanitized context
        run: |
          PR_NUMBER=$(cat pr_number.txt)
          # Agent reads the PR number, not the PR title, body, or comments
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
The agent in Workflow 2 receives the PR number. It can fetch specific data through scoped API calls where you control exactly what enters its context - not the raw attacker-controlled content from the event payload.
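One detail worth guarding against: the artifact is written by a job that runs on the PR’s code, so its contents should be treated as untrusted too. A minimal sketch of a validation step for the privileged workflow follows. The function names are chosen here for illustration, and the length cap is an arbitrary defensive assumption.

```python
import re
from pathlib import Path

# PR numbers are positive integers; cap the length defensively.
_PR_NUMBER = re.compile(r"[1-9][0-9]{0,9}")


def parse_pr_number(raw: str) -> int:
    """Parse artifact contents as a PR number, rejecting anything else."""
    candidate = raw.strip()
    if not _PR_NUMBER.fullmatch(candidate):
        raise ValueError("artifact does not contain a valid PR number")
    return int(candidate)


def read_pr_number(path: str = "pr_number.txt") -> int:
    """Read and validate the artifact written by the unprivileged workflow."""
    return parse_pr_number(Path(path).read_text(encoding="utf-8"))


print(parse_pr_number("1234\n"))  # 1234
try:
    parse_pr_number("1234; curl attacker.example")
except ValueError as err:
    print(f"rejected: {err}")
```

Only the validated integer should be passed on to the agent or to any API call that fetches PR data.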
Eliminating Long-Lived Secrets from CI/CD
How Does OIDC Workload Identity Federation Eliminate Long-Lived Secrets?
The Comment and Control exploits succeeded partly because long-lived API keys (ANTHROPIC_API_KEY, GEMINI_API_KEY) and tokens lived as environment variables in the runner. Stealing them once gives persistent access. The architectural fix is to eliminate long-lived secrets from the runner entirely.
With GitHub Actions OIDC, any job granted the id-token: write permission can request a signed JWT containing claims about the repository, branch, environment, and actor. Cloud providers accept this token via Workload Identity Federation and issue short-lived credentials scoped to what that job’s claims allow. No cloud secrets are stored as GitHub Secrets, so there are no persistent cloud credentials to steal.
jobs:
  ai-review:
    runs-on: ubuntu-latest
    permissions:
      id-token: write # Required to request the OIDC token
      contents: read
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/ci-ai-agent-readonly
          aws-region: us-east-1
      - name: Fetch agent API key from Secrets Manager
        run: |
          ANTHROPIC_API_KEY=$(aws secretsmanager get-secret-value \
            --secret-id ci/anthropic-api-key \
            --query SecretString --output text)
          echo "::add-mask::$ANTHROPIC_API_KEY"
          echo "ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY" >> "$GITHUB_ENV"
      - name: Run AI agent
        run: claude-code review
The echo "::add-mask::" call registers the value as a secret in the Actions log, so it is redacted from output. The IAM role ci-ai-agent-readonly controls exactly what the AWS session can access, and the AWS credentials expire with the session. The retrieved ANTHROPIC_API_KEY is a different matter: it is a copy of the stored key and stays valid until that key is rotated, so pair this pattern with frequent rotation to keep any exfiltrated copy short-lived.
For cloud credentials, the OIDC exchange yields only short-lived credentials - there is nothing persistent to steal. For AI vendor API keys that cannot use OIDC directly (Anthropic, Google), the pattern above narrows exposure rather than eliminating it: authenticate to the secrets store with OIDC, retrieve the key at runtime, and mask it in logs. The key no longer lives in GitHub Secrets, and access to it is gated by the job’s OIDC claims. Google Cloud’s Workload Identity Federation supports the same OIDC exchange via its IAM API.
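To make the claims-based exchange concrete, here is a minimal sketch of the decision a cloud trust policy makes. It decodes a fabricated token and matches the issuer and subject claims. Signature verification, which the cloud provider performs against GitHub’s published keys, is deliberately omitted. The organization and repository names are hypothetical.

```python
import base64
import json

GITHUB_ISSUER = "https://token.actions.githubusercontent.com"


def b64url(data: dict) -> str:
    """Encode a dict as an unpadded base64url JWT segment."""
    raw = json.dumps(data).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def decode_claims(jwt: str) -> dict:
    """Decode the JWT payload WITHOUT verifying the signature (demo only)."""
    payload = jwt.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def trust_policy_allows(claims: dict, expected_sub: str) -> bool:
    """Mimic a trust policy condition: exact match on issuer and subject."""
    return claims.get("iss") == GITHUB_ISSUER and claims.get("sub") == expected_sub


# Fabricated token with a placeholder signature; names are hypothetical.
claims = {
    "iss": GITHUB_ISSUER,
    "sub": "repo:example-org/example-repo:ref:refs/heads/main",
    "repository": "example-org/example-repo",
    "ref": "refs/heads/main",
}
token = ".".join([b64url({"alg": "RS256", "typ": "JWT"}), b64url(claims), "sig"])

decoded = decode_claims(token)
print(trust_policy_allows(decoded, "repo:example-org/example-repo:ref:refs/heads/main"))  # True
print(trust_policy_allows(decoded, "repo:example-org/other-repo:ref:refs/heads/main"))    # False
```

Matching the subject exactly, rather than with a broad wildcard, is what keeps a token minted for one repository or branch from assuming a role intended for another.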
Kubernetes-Native Defenses for AI Agent Pipelines
Teams running AI agent jobs on self-hosted Kubernetes runners get a defense layer that GitHub-hosted runners cannot provide. The cluster enforces controls at the infrastructure level, independent of what the agent does with its prompt. For a broader treatment of Kubernetes-native identity, gateway, and governance controls for AI agents beyond CI/CD, see Securing AI Agents at the Infrastructure Layer.
flowchart TD
subgraph CLUSTER["Kubernetes Cluster: ci-agents namespace"]
GATE["OPA Gatekeeper - Admission Control\nEnforce: NetworkPolicy label, dedicated ServiceAccount\nno hostPath mounts, no privileged containers"]
GATE --> NETPOL
subgraph NETPOL["NetworkPolicy: Default-Deny Egress\nAllow: DNS (UDP 53), internal CIDR only"]
subgraph RBAC["RBAC: Dedicated ServiceAccount\npods/log: get, list - nothing else"]
subgraph ESO["ESO: Just-in-Time Secret Injection\nShort-lived credentials from Vault / AWS SM"]
POD["AI Agent Pod\nNo long-lived secrets at rest"]
end
end
end
end
Each layer enforces independently. Gatekeeper blocks misconfigured pods at admission. NetworkPolicy restricts egress at runtime. RBAC limits Kubernetes API access. ESO ensures no long-lived secrets persist in the pod.
Default-Deny Network Policies
The Copilot exploit used github.com as the exfiltration channel because it was whitelisted by the network firewall. A default-deny egress NetworkPolicy restricts agent pods to only the endpoints they actually need, blocking exfiltration to arbitrary destinations even through allowed protocols.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: ai-agent-egress-restrict
  namespace: ci-agents
spec:
  podSelector:
    matchLabels:
      role: ai-agent
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              # Label set automatically on every namespace by Kubernetes
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53 # DNS resolution only
    - to:
        - ipBlock:
            cidr: 10.0.0.0/8 # Internal services only
      ports:
        - protocol: TCP
          port: 443
This allows DNS and HTTPS to internal services. The AI vendor API endpoint can be added as an explicit IP block if needed. GitHub’s APIs are not in the allowlist, so the Comment and Control exfiltration path does not exist for these pods.
Note that this defense requires self-hosted Kubernetes runners. GitHub-hosted runners do not support custom NetworkPolicies.
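The effect of the two egress rules can be modeled in a few lines. This is a simplified sketch: the DNS rule is modeled by protocol and port only, since a namespace selector cannot be expressed as an IP range, and the public address is purely illustrative.

```python
import ipaddress

# Mirrors the ipBlock in the policy above.
INTERNAL = ipaddress.ip_network("10.0.0.0/8")


def egress_allowed(dest_ip: str, protocol: str, port: int) -> bool:
    """Return True if the simplified policy model would allow the connection."""
    if protocol == "UDP" and port == 53:
        return True  # DNS rule (destination namespace selection not modeled)
    if protocol == "TCP" and port == 443:
        return ipaddress.ip_address(dest_ip) in INTERNAL
    return False  # Everything else falls through to default deny


print(egress_allowed("10.20.30.40", "TCP", 443))   # True: internal HTTPS
print(egress_allowed("140.82.112.3", "TCP", 443))  # False: public address
print(egress_allowed("10.20.30.40", "TCP", 22))    # False: port not allowed
```

Any destination outside the internal range is dropped regardless of protocol, which is what removes a public platform as an exfiltration channel for these pods.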
External Secrets Operator for Just-in-Time Injection
ESO syncs secrets from external stores (AWS Secrets Manager, HashiCorp Vault, Azure Key Vault) into Kubernetes Secrets, which pods then mount or read as environment variables. Combined with Vault’s dynamic secrets, each job can get a unique, short-lived credential that expires after the job completes.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: ai-agent-api-key
  namespace: ci-agents
spec:
  refreshInterval: 5m
  secretStoreRef:
    name: vault-backend
    kind: ClusterSecretStore
  target:
    name: ai-agent-secrets
    creationPolicy: Owner
  data:
    - secretKey: ANTHROPIC_API_KEY
      remoteRef:
        key: secret/ci/anthropic
        property: api_key
The Kubernetes Secret ai-agent-secrets exists only as long as the ExternalSecret does, because creationPolicy: Owner ties its lifetime to the ExternalSecret. Create the ExternalSecret when the job starts and delete it when the job ends: the pod mounts the Secret, runs its job, and the underlying credential can be rotated as soon as the pod terminates. Managed this way, no secret sits in a Kubernetes Secret waiting to be read between job runs.
RBAC Scoping for Agent Service Accounts
Agent pods should run under a dedicated ServiceAccount with permissions scoped to what the CI job actually needs. No cluster-admin, no cross-namespace secret access, no ability to modify workloads.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ai-agent-sa
  namespace: ci-agents
automountServiceAccountToken: false # Opt-in, not automatic
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: ai-agent-role
  namespace: ci-agents
rules:
  - apiGroups: [""]
    resources: ["pods/log"]
    verbs: ["get", "list"]
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get"]
    resourceNames: ["agent-config"] # Named resource scope
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ai-agent-binding
  namespace: ci-agents
subjects:
  - kind: ServiceAccount
    name: ai-agent-sa
    namespace: ci-agents # State the subject's namespace explicitly
roleRef:
  kind: Role
  name: ai-agent-role
  apiGroup: rbac.authorization.k8s.io
automountServiceAccountToken: false prevents the Kubernetes API token from being automatically mounted into the pod. If the agent does not need Kubernetes API access, it should not have the token at all.
OPA Gatekeeper Admission Policies
Gatekeeper enforces these configurations at admission time, before pods run. This turns a runtime defense (NetworkPolicy) into a pre-deployment invariant: a pod cannot be admitted to the ci-agents namespace without the correct labels.
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: require-network-policy-label
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    namespaces: ["ci-agents"]
  parameters:
    labels:
      # Require the label the egress NetworkPolicy selects on (role: ai-agent),
      # so every admitted pod is covered by that policy
      - key: "role"
        allowedRegex: "^ai-agent$"
Additional constraints for the ci-agents namespace should enforce: no hostPath volume mounts, no privileged containers, no use of the default ServiceAccount, and image pull from approved registries only. Each constraint closes one class of escape from the isolation model. Teams using Kyverno instead of Gatekeeper can apply equivalent policy-as-code admission controls for AI agent workloads using similar constraint patterns.
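The four additional checks can be sketched as plain logic over a pod manifest. This is an illustration of what the constraints would reject, not Gatekeeper’s Rego or any library’s API. The registry prefix and pod definitions are hypothetical.

```python
# Hypothetical approved registry prefix; replace with your own.
APPROVED_REGISTRIES = ("registry.example.com/",)


def admission_violations(pod: dict) -> list:
    """List the isolation rules a pod manifest would violate."""
    spec = pod.get("spec", {})
    violations = []
    if spec.get("serviceAccountName", "default") == "default":
        violations.append("uses default ServiceAccount")
    for volume in spec.get("volumes", []):
        if "hostPath" in volume:
            violations.append(f"hostPath volume: {volume.get('name')}")
    containers = spec.get("containers", []) + spec.get("initContainers", [])
    for container in containers:
        security = container.get("securityContext") or {}
        if security.get("privileged"):
            violations.append(f"privileged container: {container.get('name')}")
        if not container.get("image", "").startswith(APPROVED_REGISTRIES):
            violations.append(f"unapproved image: {container.get('image')}")
    return violations


good_pod = {"spec": {
    "serviceAccountName": "ai-agent-sa",
    "containers": [{"name": "agent", "image": "registry.example.com/agent:1.0"}],
}}
bad_pod = {"spec": {
    "volumes": [{"name": "host-root", "hostPath": {"path": "/"}}],
    "containers": [{
        "name": "agent",
        "image": "docker.io/example/agent:latest",
        "securityContext": {"privileged": True},
    }],
}}

print(admission_violations(good_pod))  # []
for violation in admission_violations(bad_pod):
    print(violation)
```

Running the same checks in CI before manifests reach the cluster gives earlier feedback, while the admission controller remains the enforcement point.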
The Emerging Standard: Kubernetes Agent Sandbox
The defenses above are layered mitigations on top of standard container infrastructure. The Kubernetes SIG Apps Agent Sandbox project addresses the fundamental problem directly: AI agents that generate and execute code need stronger isolation than a standard Linux container provides. Standard containers share the host kernel. A compromised container process can exploit kernel vulnerabilities to escape.
The Agent Sandbox is a purpose-built CRD for running AI agents as isolated, singleton workloads. It integrates with gVisor (syscall filtering via a user-space kernel) and Kata Containers (full VM-level isolation) to provide kernel-level separation for untrusted code execution.
Key properties:
- Singleton lifecycle management: Designed for workloads that are idle between activity bursts, with support for scaling to zero and rapid state resumption.
- Stable network identity: Each Sandbox gets a stable hostname, enabling multi-agent coordination without dynamic service discovery.
- SandboxWarmPool: Pre-provisions isolated environments to eliminate cold starts. In CI/CD contexts where job latency matters, this makes VM-level isolation practical.
# Install Agent Sandbox CRDs and controller
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.1.0/manifest.yaml
# With SandboxWarmPool extension
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.1.0/extensions.yaml
The Agent Sandbox is in active development. For teams running AI agent workloads on Kubernetes today, gVisor and Kata Containers can be configured via existing RuntimeClass resources without waiting for the Sandbox CRD to reach stability.
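As a rough sketch of the RuntimeClass route, the script below generates a RuntimeClass and a pod that opts into it, emitted as JSON that kubectl can apply. The handler name runsc is the conventional gVisor handler but depends on how the node’s container runtime is configured, and the pod name and image are hypothetical.

```python
import json


def runtime_class(name: str, handler: str) -> dict:
    """Build a RuntimeClass manifest mapping a name to a node-level handler."""
    return {
        "apiVersion": "node.k8s.io/v1",
        "kind": "RuntimeClass",
        "metadata": {"name": name},
        "handler": handler,  # Must match a handler configured on the nodes
    }


def sandboxed_pod(name: str, image: str, runtime_class_name: str) -> dict:
    """Build an agent pod that runs under the given RuntimeClass."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            "namespace": "ci-agents",
            "labels": {"role": "ai-agent"},
        },
        "spec": {
            "runtimeClassName": runtime_class_name,
            "serviceAccountName": "ai-agent-sa",
            "automountServiceAccountToken": False,
            "restartPolicy": "Never",
            "containers": [{"name": "agent", "image": image}],
        },
    }


manifests = {
    "apiVersion": "v1",
    "kind": "List",
    "items": [
        runtime_class("gvisor", "runsc"),
        sandboxed_pod("ai-review-job", "registry.example.com/agent:1.0", "gvisor"),
    ],
}
print(json.dumps(manifests, indent=2))
```

The output can be piped to kubectl apply -f - on a cluster whose nodes have the matching runtime installed. Without that node-level setup, the pod will not be scheduled successfully.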
What Changes When AI Agents Enter Your Pipeline
Traditional CI/CD Security vs. AI Agent CI/CD Security
The existing CI/CD security model was built around scripts executing deterministic commands. AI agents change several foundational assumptions.
| Control | Traditional CI/CD | AI Agent CI/CD |
|---|---|---|
| Injection vector | Unsanitized shell variables | Natural language input (PR title, issue body, comment) |
| Scope of impact | One run: step, if misconfigured | Any tool the agent has access to |
| Secret exposure | Secrets in env vars, used by explicit commands | Agent holds credentials and decides autonomously when to use them |
| Execution trigger | Explicit run: commands | Agent generates and executes commands |
| Exfiltration path | External endpoints, blocked by firewall | Whitelisted channels (github.com, issue comments, git commits) |
| Remediation | Sanitize the variable | Separate untrusted input from the privileged runtime |
| IAM model | Human developer as principal | Agent as autonomous principal with its own credentials |
The IAM model shift is the one that breaks existing security tooling. Traditional IAM tracks what a human developer can do. In Comment and Control, the agent is the principal, holding credentials and making API calls. VentureBeat found that only 21.9% of teams have enrolled agent OAuth credentials into a PAM platform. The rest have agent identities their IAM posture is blind to.
The Vendor System Card Gap
Anthropic’s system card for Claude Code explicitly acknowledged that the Security Review feature “is not hardened against prompt injection.” Comment and Control was the exploit that the system card predicted.
No vendor currently provides a quantified injection resistance rate for specific model versions on specific deployment platforms. The practical implication: you cannot solve this by switching vendors. The risk is systemic to the design of agentic systems that combine untrusted input with powerful tool access. The architecture must assume the agent can be hijacked and limit what it can do when hijacked.
The lack of formal CVEs for all three Comment and Control vulnerabilities compounds the problem for security teams. Anthropic assigned a CVSS score to the HackerOne report. Google paid a VRP bounty. GitHub called it a known architectural limitation. None published a security advisory or requested a CVE. Without CVEs, vulnerability scanners cannot flag exposure and security teams have no artifact to track remediation across their toolchain.
Frequently Asked Questions
Can prompt injection in a GitHub comment really steal my CI/CD secrets?
Yes. The Comment and Control research demonstrated that PR titles, issue bodies, and issue comments can hijack Claude Code, Gemini CLI, and GitHub Copilot running in GitHub Actions. The agents exfiltrated API keys and tokens (ANTHROPIC_API_KEY, GEMINI_API_KEY, GITHUB_TOKEN) back through GitHub itself, requiring no external infrastructure. The exfiltration channel was GitHub’s own APIs - PR comments, issue comments, and git commits - all whitelisted by standard network controls.
What is the difference between traditional script injection and AI agent prompt injection in CI/CD?
Traditional script injection requires the workflow author to use untrusted input in a shell command - for example, run: echo ${{ github.event.issue.title }}. The attacker needs an injection point that a developer created by mistake. AI agent prompt injection exploits the agent’s normal operation of reading and reasoning about GitHub data. The agent is the injection point, not a misconfigured shell command. There is no developer mistake to find and fix; processing PR titles and issue bodies is what the agent is there to do.
Should I stop using AI agents in my CI/CD pipeline?
No, but isolate them architecturally. Use the two-workflow pattern to prevent untrusted GitHub content from reaching the agent’s privileged runtime. Replace long-lived API keys with OIDC Workload Identity Federation and runtime injection from a secrets manager. On Kubernetes self-hosted runners, apply default-deny egress NetworkPolicies, External Secrets Operator for just-in-time credential injection, and dedicated RBAC with minimal permissions. The risk is not the AI agent itself - it is giving the agent tools and secrets in the same runtime that processes untrusted input.
Why did none of the Comment and Control vulnerabilities get CVEs?
As of the April 16, 2026 disclosure, no CVEs were issued and no security advisories were published through GitHub Security Advisories. Anthropic classified the Claude Code finding as CVSS 9.4 Critical via HackerOne. Google paid a VRP bounty. GitHub characterized Copilot’s issue as a known architectural limitation. Without a CVE, vulnerability scanners cannot flag exposure and security teams have no artifact to track remediation. As TheNextWeb noted, this reflects a governance gap in the AI security ecosystem: there is no established framework for disclosing prompt injection vulnerabilities the way traditional software CVEs are handled.
What is the Kubernetes Agent Sandbox and how does it help?
The Agent Sandbox is a SIG Apps project providing a purpose-built Kubernetes CRD for running AI agents as isolated, singleton workloads. It integrates with gVisor (syscall filtering) and Kata Containers (VM-level isolation) to give stronger protection than standard Linux containers, which share the host kernel and are vulnerable to kernel-level escapes. The SandboxWarmPool extension pre-provisions isolated environments to eliminate cold starts, making VM-level isolation practical for CI/CD workflows where job latency matters. The project is currently in active development.