Securing Coding Agents in CI/CD: Practical Baseline

If you run CI/CD for a team that is adopting coding agents, start from one assumption: prompt injection will happen. Securing coding agents in CI/CD means designing the workflow so that an injected instruction cannot do much damage. Do not give an agent secrets, write permissions, broad network access, and untrusted GitHub content in the same job.

The risky pattern is now clear. An agent reads an issue, pull request body, PR comment, commit message, or forked code. That content becomes part of the agent's instructions. If the workflow also has repository write access, CI secrets, package credentials, cloud credentials, or free outbound network access, the agent can become a bridge between attacker-controlled text and privileged automation.

This is a CI/CD architecture problem, not only a model behavior problem. The practical fix is to split trust zones, reduce permissions, control egress, isolate runtime access, use short-lived credentials, and require human approval for high-impact actions.

Who needs to care first

This is for platform engineers, DevSecOps teams, security reviewers, and engineering leaders who are letting agents comment on PRs, edit code, run tests, open branches, use MCP tools, or interact with GitHub APIs. If your agent only reads a local repository with no secrets and no write token, your risk is lower. If your agent handles external contributions or public issue content, treat the workflow as high-risk.

The core question is not "Which agent is safe?" It is "What can this workflow do when the agent follows a malicious instruction?"

Why securing coding agents in CI/CD is different from securing a normal Action

A normal CI job executes code and commands defined by maintainers. That is already sensitive. It can access source code, build artifacts, package registries, cloud systems, release keys, and repository tokens.

A coding agent changes the control path. Natural language becomes operational input. An issue comment can influence a shell command. A PR description can shape a file edit. A malicious instruction hidden in fork code can ask the agent to inspect environment variables, alter workflow files, or post sensitive output somewhere else.

That is why normal CI hardening is necessary but not enough. You still need pinned actions, least privilege, branch protection, and secure secret handling. On top of that, you need controls for prompt injection, nondeterministic tool use, excessive agency, and untrusted text being treated as instructions.

The failure mode to design against

The highest-risk combination is:

  • The agent reads untrusted GitHub content, such as fork code, PR bodies, issue text, comments, or commit messages.
  • The workflow has secrets or a write-capable token.
  • The agent can run commands, edit files, call APIs, or use external tools.
  • The job has broad filesystem access or outbound network access.
  • There is no human approval before merge, publish, deploy, or privileged workflow execution.

This is the agentic version of the classic GitHub Actions "pwn request" problem. With pull_request_target, workflows run in the context of the base repository. That can include the base repository's GITHUB_TOKEN, secrets, and cache access. If the workflow checks out and executes unreviewed fork code in that context, attacker-controlled code can run with trusted privileges.

Agentic workflows widen the same class of mistake. The malicious payload may be text instead of code, and the agent may have tools that can create branches, comment on PRs, call MCP servers, read repo metadata, install packages, or make network requests.
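The highest-risk combination listed above can be written down as a small policy check. This is a minimal sketch, not a real tool: `JobProfile`, its field names, and the rule in `is_high_risk` (untrusted input plus secrets or a write token) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobProfile:
    """Hypothetical summary of what one agent job can see and do."""
    reads_untrusted_content: bool      # fork code, PR bodies, issues, comments
    has_secrets_or_write_token: bool
    can_run_tools: bool                # shell, file edits, API calls, MCP tools
    has_broad_fs_or_network: bool
    has_human_approval_gate: bool


def risk_findings(job: JobProfile) -> list[str]:
    """Return the risk factors from the list above that apply to this job."""
    findings = []
    if job.reads_untrusted_content:
        findings.append("reads untrusted GitHub content")
    if job.has_secrets_or_write_token:
        findings.append("holds secrets or a write-capable token")
    if job.can_run_tools:
        findings.append("can run commands, edit files, or call tools")
    if job.has_broad_fs_or_network:
        findings.append("has broad filesystem or network access")
    if not job.has_human_approval_gate:
        findings.append("has no human approval gate")
    return findings


def is_high_risk(job: JobProfile) -> bool:
    """Assumed rule: untrusted input and privilege must never share a job."""
    return job.reads_untrusted_content and job.has_secrets_or_write_token
```

A check like this is only as good as the profile you feed it, so the useful part is agreeing on the fields and filling them in honestly for every agent workflow.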

Start with the GitHub Actions baseline

Before adding agent-specific controls, fix the Actions foundation.

Use read-only defaults

Set default GITHUB_TOKEN permissions to read-only. Escalate permissions per job only when a job needs them. Do not let convenience decide the permission profile.
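One way to keep this rule from drifting is to lint workflows for it. The sketch below assumes the workflow file has already been parsed into a plain dict by a YAML loader of your choice. The function names are hypothetical, and the check only understands the `permissions` key as a string such as `read-all` or `write-all`, or as a mapping of scope to access level.

```python
def write_scopes(permissions) -> list:
    """Return the scopes that a `permissions` value grants write access to."""
    if isinstance(permissions, str):
        # "write-all" grants everything; other string values grant no writes.
        return ["*"] if permissions == "write-all" else []
    return sorted(s for s, level in permissions.items() if level == "write")


def default_problems(workflow: dict) -> list:
    """Flag a workflow whose default token is not explicitly read-only."""
    if "permissions" not in workflow:
        return ["no workflow-level permissions: token inherits the repository default"]
    scopes = write_scopes(workflow["permissions"])
    return [f"workflow-level write access: {', '.join(scopes)}"] if scopes else []


def job_escalations(workflow: dict) -> dict:
    """List jobs that explicitly escalate to write, so a human can review each."""
    result = {}
    for job_id, job in workflow.get("jobs", {}).items():
        if "permissions" in job:
            scopes = write_scopes(job["permissions"])
            if scopes:
                result[job_id] = scopes
    return result
```

Escalations are reported rather than rejected, because a job that comments on a PR legitimately needs a write scope. The goal is that every such scope is visible and deliberate.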

Split untrusted and trusted workflows

For forked PRs and other untrusted inputs, use a low-privilege pull_request workflow with no secrets and read-only permissions. If you need a privileged follow-up, pass artifacts to a separate workflow_run job that does not execute untrusted code.

This pattern lets you inspect or test untrusted code without handing it the keys to the base repository.
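The privileged half of this pattern stays safe only if it treats the artifact as data, never as code or as a command fragment. Here is a minimal sketch of that handoff, assuming the untrusted job uploads a small JSON file. The field names `pr_number` and `verdict`, the allowed verdicts, and the size cap are all hypothetical.

```python
import json

MAX_ARTIFACT_BYTES = 64 * 1024                         # assumed size cap
ALLOWED_VERDICTS = {"pass", "fail", "needs-review"}    # hypothetical values


def parse_untrusted_result(raw: bytes) -> dict:
    """Validate an artifact from the untrusted job before privileged use."""
    if len(raw) > MAX_ARTIFACT_BYTES:
        raise ValueError("artifact too large")
    data = json.loads(raw.decode("utf-8"))  # decode errors are ValueErrors too
    if not isinstance(data, dict):
        raise ValueError("artifact must be a JSON object")
    pr_number = data.get("pr_number")
    verdict = data.get("verdict")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(pr_number, bool) or not isinstance(pr_number, int) or pr_number <= 0:
        raise ValueError("pr_number must be a positive integer")
    if not isinstance(verdict, str) or verdict not in ALLOWED_VERDICTS:
        raise ValueError("unexpected verdict")
    # Return only the validated fields; everything else is dropped.
    return {"pr_number": pr_number, "verdict": verdict}
```

Returning a fresh dict with only the known fields means that any extra attacker-supplied keys never reach the code that holds the write token.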

Treat pull_request_target as privileged

Do not use pull_request_target to check out and execute unreviewed fork code. GitHub introduced safer defaults in actions/checkout@v7 to refuse common pwn-request checkout patterns, with broader enforcement planned for July 16, 2026. That helps, but it does not cover every way to fetch or execute untrusted code.
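A static check can catch the most common form of this mistake before it merges. The sketch below is illustrative only: it assumes the workflow is already parsed into a dict, the function name is hypothetical, and it detects a single pattern (a `pull_request_target` workflow whose checkout step points `ref` at the pull request head). Other ways of fetching fork code will get past it.

```python
def unsafe_checkout_jobs(workflow: dict) -> list:
    """Return job ids that check out the PR head under pull_request_target."""
    # Assumption: the trigger key is the string "on". Some YAML 1.1 loaders
    # parse a bare `on` key as the boolean True, so normalize before calling.
    triggers = workflow.get("on", {})
    names = {triggers} if isinstance(triggers, str) else set(triggers)
    if "pull_request_target" not in names:
        return []

    hits = []
    for job_id, job in workflow.get("jobs", {}).items():
        for step in job.get("steps", []):
            uses = step.get("uses", "")
            ref = str(step.get("with", {}).get("ref", ""))
            if uses.startswith("actions/checkout@") and "github.event.pull_request.head" in ref:
                hits.append(job_id)
                break  # one finding per job is enough
    return hits
```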

Protect workflow files

Require CODEOWNERS or security approval for changes under .github/workflows. Block or quarantine agent-authored workflow changes until a human reviews them. A malicious or careless workflow edit can move code from an untrusted context into a trusted one.
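A quarantine rule like this can be enforced with a path check on the agent's diff. The following is a sketch under assumptions: the pattern list is hypothetical apart from `.github/workflows`, and the list of changed paths is expected to come from whatever diff tooling you already use.

```python
from fnmatch import fnmatchcase

# Hypothetical pattern list; only .github/workflows comes from the text above.
PROTECTED_PATTERNS = (
    ".github/workflows/*",
    ".github/actions/*",
    "scripts/release/*",
)


def protected_changes(changed_paths) -> list:
    """Return the changed paths that need human review before they can run."""
    # fnmatchcase is case-sensitive on every platform, and its "*" also
    # matches "/" characters, so nested files are covered as well.
    return sorted(
        path for path in changed_paths
        if any(fnmatchcase(path, pattern) for pattern in PROTECTED_PATTERNS)
    )
```

If `protected_changes` returns anything for an agent-authored branch, the pipeline can hold the change for review instead of running it.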

Provider defaults are not the same

Do not build one internal policy around the most permissive or most restrictive vendor default. Copilot, Codex, Claude, and internal agents expose different controls. Your policy should define the minimum bar across all of them.

GitHub Copilot cloud agent

Copilot cloud agent runs in a GitHub Actions-powered environment. It can research repositories, create plans, make changes, create branches, and optionally open PRs.

GitHub limits Copilot's internet access with a firewall by default. That matters because malicious instructions or unexpected behavior could leak code or sensitive data. The caveat is important: the firewall applies to agent-started Bash processes, not necessarily to MCP servers or configured setup steps, and advanced attacks may bypass it.

By default, GitHub does not run Actions workflows automatically when Copilot pushes changes to a PR; a person has to approve the run first. Skipping that approval can let unreviewed Copilot code access repository write permissions or Actions secrets. Keep the approval gate unless you have a stronger compensating control.

Use Copilot's dedicated Agents secrets and variables instead of mixing agent credentials with general Actions secrets. Scope organization-level secrets to the repositories that need them.

OpenAI Codex GitHub Action

The Codex GitHub Action runs Codex inside CI/CD jobs. It can apply patches or post reviews, and it runs codex exec under configured permissions.

Codex provides sandbox modes such as read-only, workspace-write, and danger-full-access. The action also documents a safety-strategy setting that defaults to drop-sudo, along with an unprivileged-user option. Use the least capable mode that supports the job. For review-only work, start with read-only. For patch generation, prefer workspace-write with tight permissions.

Codex defaults network access off for the agent phase. Setup can still access the network before the agent phase, and secrets are removed before the agent phase. That split is useful, but it only helps if you avoid reintroducing broad credentials or network access later in the job.

Claude Code actions

The documentation for Anthropic's Claude Code Base Action states that the action is a thin wrapper and does not enforce trust boundaries on its own. That puts the burden on the workflow author to ensure the working directory and prompt are trusted.

Anthropic's security guidance also warns against bypassing its main permission mechanism through allowed_non_write_users. It recommends job-scoped GITHUB_TOKEN use instead of personal access tokens, because static tokens do not rotate between runs and may be recoverable over time through prompt injection.

Recent research from Microsoft and GMO Flatt showed real Claude Code GitHub Action attack paths involving untrusted GitHub content and privileged workflows. Anthropic shipped fixes for specific issues, but the operational lesson remains: do not treat an agent action as a security boundary.

Control prompt injection by limiting consequences

You cannot reliably prevent every hostile instruction from reaching the model. In CI/CD, the safer design is to assume prompt injection will occur and make sure the agent cannot complete the damaging part.

Use these limits:

  • No secrets in jobs that read untrusted issue, PR, comment, or fork content.
  • No write token unless the job truly needs to write.
  • No automatic workflow execution for agent-authored code until a human approves it.
  • No broad outbound network access by default.
  • No personal access tokens for agent workflows.
  • No unreviewed edits to CI configuration, release scripts, dependency manifests, or security-sensitive policy files.

OWASP's excessive agency framing maps cleanly here: excessive functionality, excessive permissions, and excessive autonomy are the root causes. A useful agent can run tests, install dependencies, call APIs, edit files, and summarize results. A safe agent gets only the subset needed for the task.
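Giving the agent "only the subset needed for the task" can be modeled as a deny-by-default tool table. This is a minimal sketch; the task names and tool names are hypothetical and do not correspond to any particular agent's tool API.

```python
# Hypothetical task-to-tool table. Anything not listed is denied.
TASK_TOOLS = {
    "review": frozenset({"read_file", "post_comment"}),
    "patch": frozenset({"read_file", "edit_file", "run_tests"}),
}


def is_tool_allowed(task: str, tool: str) -> bool:
    """Deny by default: unknown tasks and unlisted tools get nothing."""
    return tool in TASK_TOOLS.get(task, frozenset())


def filter_tool_calls(task: str, requested: list) -> tuple:
    """Split requested tool calls into (allowed, denied) for logging."""
    allowed = [t for t in requested if is_tool_allowed(task, t)]
    denied = [t for t in requested if not is_tool_allowed(task, t)]
    return allowed, denied
```

Keeping the denied list, rather than silently dropping it, gives reviewers a signal when an agent starts asking for tools its task should never need.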

Secrets need their own policy

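Secrets redaction is not a security boundary. Log masking generally matches the literal secret value, so a secret that has been transformed (for example, encoded or split into pieces) may not be masked. Logs, tool output, generated comments, artifacts, and model-visible context can all become leak paths.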
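A few lines are enough to show why a transformed secret can slip past masking. The masker below is a deliberately simplified model of exact-match redaction, not the implementation used by any CI platform, and the secret value is made up.

```python
import base64


def mask_exact(log_line: str, secrets: list) -> str:
    """Simplified masker: replaces only literal occurrences of each secret."""
    for secret in secrets:
        log_line = log_line.replace(secret, "***")
    return log_line


secret = "s3cr3t-token-value"  # made-up value for the demonstration

# The literal value is masked as expected.
plain = mask_exact(f"token={secret}", [secret])

# The same value, base64-encoded by an injected command, passes through.
encoded = base64.b64encode(secret.encode()).decode()
leaked = mask_exact(f"token={encoded}", [secret])

# Anyone who can read the log can reverse the encoding.
recovered = base64.b64decode(leaked.split("=", 1)[1]).decode()
```

After this runs, `plain` contains only the mask, while `recovered` equals the original secret. The defense is to keep the secret out of the job, not to rely on the masker.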

Use short-lived credentials wherever possible. Prefer OIDC for cloud access instead of static cloud keys. Prefer job-scoped GITHUB_TOKEN permissions instead of personal access tokens. Keep agent secrets separate from normal Actions, Codespaces, or Dependabot secrets when the platform supports that separation.

Private dependency access is the hard case. Agents often need package registry credentials to install and test. The safer pattern is to put install steps in a controlled setup phase, avoid exposing reusable credentials to the agent phase, and scope registry tokens to read-only package access.
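The setup-then-agent split can be approximated in a custom runner by launching the agent with a rebuilt environment. This is a sketch with hypothetical names: the allowlist is an assumption, and the agent command is whatever your platform uses.

```python
import os
import subprocess

# Assumed allowlist: the only variables the agent phase is allowed to inherit.
AGENT_ENV_ALLOWLIST = ("PATH", "HOME", "LANG", "CI")


def agent_env(full_env, allowlist=AGENT_ENV_ALLOWLIST) -> dict:
    """Build the agent's environment from an allowlist, not a denylist."""
    return {name: full_env[name] for name in allowlist if name in full_env}


def run_agent_phase(command: list, full_env=None) -> subprocess.CompletedProcess:
    """Run the agent command without the credentials used during setup."""
    env = agent_env(os.environ if full_env is None else full_env)
    return subprocess.run(command, env=env, capture_output=True, text=True, check=False)
```

An allowlist is used because a denylist has to predict every secret name in advance. Note that this covers environment variables only. Credentials that setup wrote to disk, such as a registry config file in the home directory, need to be removed separately before the agent starts.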

Network egress should be default deny

Outbound access turns a local compromise into data exfiltration. Default-deny egress is one of the strongest controls for coding agents in CI/CD.

Allowlist only the domains needed for the job: package registries, source hosts, artifact stores, model APIs, and approved internal services. Keep a separate allowlist for setup steps, agent execution, and MCP tools if your platform supports it. GitHub's Copilot firewall documentation is explicit that setup steps and MCP servers have caveats, so do not assume one firewall setting covers every process.

Also make allowlist debugging practical. If dependency installs fail because a package host is blocked, developers will ask to disable the firewall. Give them logs that show blocked destinations and a review path for adding domains.
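Per-phase allowlists and a readable block log can be sketched together. The code below is only a model of the decision logic: real enforcement belongs in a proxy or firewall, and the phase names and domain lists here are assumptions.

```python
from urllib.parse import urlsplit

# Hypothetical per-phase allowlists; an unknown phase gets no egress at all.
ALLOWLISTS = {
    "setup": frozenset({"pypi.org", "files.pythonhosted.org", "github.com"}),
    "agent": frozenset({"api.github.com"}),
}


def check_egress(phase: str, url: str, blocked_log: list) -> bool:
    """Allow a request only if its exact host is listed for this phase."""
    host = (urlsplit(url).hostname or "").rstrip(".")
    allowed = host in ALLOWLISTS.get(phase, frozenset())
    if not allowed:
        # Record what was blocked so developers can request a reviewed
        # allowlist change instead of asking to disable the firewall.
        blocked_log.append({"phase": phase, "host": host})
    return allowed
```

Exact host matching is a deliberate choice in this sketch. Suffix matching is more convenient, but it can admit attacker-controlled subdomains on shared hosting domains.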

Runtime isolation and observability

Sandboxing, containers, unprivileged users, sudo removal, filesystem limits, and ephemeral runners reduce blast radius. They are not replacements for least privilege, but they make mistakes less expensive.

For observability, collect enough detail to answer what happened after an incident. At minimum, log:

  • Prompt inputs and model outputs that affected actions.
  • Tool calls and shell commands.
  • Files read and changed.
  • Network destinations and blocked egress attempts.
  • GitHub API calls, comments, labels, branches, and PRs created by the agent.
  • OIDC token issuance and credential use.
  • Final diffs and artifacts.
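The items above map naturally onto one structured event per action. Here is a minimal sketch that emits JSON lines. The event kinds, field names, and the `run_id` value are hypothetical, and a real pipeline would ship these lines to storage the job itself cannot modify.

```python
import json
import time

# Hypothetical event kinds, one per bullet in the list above.
EVENT_KINDS = frozenset({
    "prompt", "tool_call", "file_access", "network",
    "github_api", "credential", "artifact",
})


def audit_event(kind: str, run_id: str, detail: dict, clock=time.time) -> str:
    """Serialize one agent action as a single JSON line."""
    if kind not in EVENT_KINDS:
        raise ValueError(f"unknown event kind: {kind}")
    record = {"ts": clock(), "run_id": run_id, "kind": kind, "detail": detail}
    # sort_keys keeps the output stable, which makes diffs and searches easier.
    return json.dumps(record, sort_keys=True)
```

For example, `audit_event("network", "run-1", {"host": "pypi.org", "blocked": True})` produces one line that can be searched by run, by kind, or by destination during incident response.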

Static workflow scanning is still useful. It can catch broad permissions, dangerous pull_request_target patterns, unpinned actions, and secrets in untrusted contexts. But agent behavior is dynamic, so runtime telemetry matters for detection and response.

A practical baseline checklist

Use this as the minimum operating profile for agentic CI/CD:

  • Classify each agent workflow as trusted or untrusted based on the content it reads.
  • Run untrusted PR code only with read-only permissions and no secrets.
  • Separate untrusted analysis from privileged follow-up with a two-workflow pattern.
  • Set default GITHUB_TOKEN permissions to read-only.
  • Escalate permissions per job, not per repository.
  • Use job-scoped tokens and OIDC instead of static personal access tokens.
  • Keep agent secrets separate and narrowly scoped.
  • Disable or tightly allowlist network egress during the agent phase.
  • Review MCP servers as privileged tools, not harmless plugins.
  • Use sandboxing, unprivileged users, and sudo removal where available.
  • Require approval before agent-authored code runs privileged workflows.
  • Require review for changes to .github/workflows and release automation.
  • Log prompts, tool calls, commands, file diffs, API calls, network destinations, and credential events.
  • Document a revocation playbook for secrets exposed through agent workflows.

The operating rule

The safest mental model is straightforward: untrusted GitHub content plus a privileged agent workflow is a security incident waiting to happen. The agent may be helpful, but it is still operating inside CI/CD, where secrets, write permissions, and release paths live.

Secure the workflow so that a malicious instruction can waste a run, fail a check, or produce a bad patch, but cannot steal secrets, publish artifacts, rewrite CI policy, or push trusted code without review. That is the bar for securing coding agents in CI/CD.
