
How to Use AI Safely and Responsibly

Luiz Martins, Matthias Oertel
May 19, 2026

AI coding assistants now sit at the same keyboard you do. They can read your filesystem, install dependencies, push to remotes, hit production APIs, and execute whatever shell command you let them. They run with your permissions — which means anything you can do, they can do, including things you would never deliberately do.

This post is the security baseline we've settled on at Prisma. It's organized by risk: what can go wrong, what a real failure looks like, and the smallest change that prevents it. The recommendations apply to Claude Code, Cursor, Copilot, Windsurf, Zed, and any other agent that operates with your local permissions.

If a section doesn't apply to your setup, skip it. If you only have time for one thing, jump to the quick reference at the end.

[Figure: AI agent security overview]

MCP servers and token security

MCP (Model Context Protocol) servers connect AI assistants to external services such as Slack, databases, GitHub, and cloud APIs. Whatever token you configure gives the agent exactly the access that token carries, so an admin token makes the agent an admin.

What to do

  • Default to read-only tokens. Most agent work doesn't need write access.
  • When write access is needed, use short-lived tokens — never permanent RW credentials. Revoke as soon as the task is done.
  • Create dedicated service accounts with minimal scopes. Don't reuse your personal admin token.
  • Avoid long-lived tokens in general. Rotate frequently; prefer tokens with automatic expiry.
  • Audit your MCP servers quarterly — which ones are connected, and what scopes their tokens carry.
  • Prefer OAuth with limited scopes and automatic rotation, where the service supports it.

| MCP server | Recommended scope | Avoid |
| --- | --- | --- |
| GitHub | repo:read, issues:read | admin:org, delete_repo, repo (full) |
| Database | Read-only connection string | Admin/write credentials to production |
| Cloud APIs (AWS, GCP, Vultr, etc.) | Scoped IAM role with read-only policies | Admin or *:* policies |
| Linear / Notion / Jira | Read + comment access | Workspace admin tokens |
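
One way to make the quarterly audit concrete is a small filter that flags over-broad scopes. The sketch below is a minimal illustration, not an official tool: flag_risky_scopes is a hypothetical helper name, and its deny list is an assumption based on the Avoid column above, using classic GitHub token scope names.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every scope in a comma-separated list that
# appears on a local "too broad" list. The list itself is an assumption;
# adjust it to your own policy.
flag_risky_scopes() {
  printf '%s\n' "$1" \
    | tr ',' '\n' \
    | sed 's/^ *//; s/ *$//' \
    | grep -E -x 'repo|admin:org|delete_repo' \
    || true   # no match is a clean result, not an error
}

# Example input: one broad scope, one narrow scope, one destructive scope.
flag_risky_scopes 'repo, read:org, delete_repo'
```

For a classic GitHub personal access token, the granted scopes come back in the X-OAuth-Scopes response header, so a request such as curl -sI -H "Authorization: token $GITHUB_TOKEN" https://api.github.com/user can supply the list. Fine-grained tokens use a different permission model, so review those in the GitHub settings UI instead.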

Secrets management

.env files, environment variables, and config files containing secrets are fully readable by AI agents.

What to do

  • Use read-only API tokens in local .env files — never admin or read-write credentials.
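  • Add .env to .gitignore and to agent ignore files (.claude/settings.json deny patterns, .cursorignore). For Copilot, use its content exclusion settings, which are configured at the repository or organization level rather than in a file.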
  • Don't store production credentials in plaintext files agents can read.
  • Use a secret manager — Doppler CLI, 1Password CLI, or similar — instead of plaintext .env files.
  • Use short-lived tokens for write access and revoke them immediately after.
  • Separate local-dev credentials from production ones.

Here's what swapping .env for a secret manager looks like in practice, using Doppler as an example. The same pattern works for 1Password CLI:

brew install dopplerhq/cli/doppler # macOS, via Doppler's Homebrew tap
doppler login
doppler setup --project your_project_name --config your_config_name

Then instruct your agent (e.g. via AGENTS.md) to launch your app through the secret manager rather than reading .env:

doppler run -- npm start
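
With 1Password CLI, the usual shape is a .env file that holds op:// secret references instead of values, resolved at launch with op run. A minimal sketch of a guard for that setup follows. plaintext_env_keys is a hypothetical helper, and it assumes every value in the file is supposed to be an op:// reference.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the keys in an env file whose values are NOT
# op:// secret references, i.e. values that are probably plaintext secrets.
plaintext_env_keys() {
  grep -E '^[A-Za-z_][A-Za-z0-9_]*=' "$1" \
    | grep -v -E '^[^=]+="?op://' \
    | cut -d= -f1
}
```

Running plaintext_env_keys .env should print nothing when the file is clean. The app can then be started with op run --env-file=.env -- npm start, so real values only ever exist in the process environment.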

To keep agents from reading secret files that are still on disk, add deny rules under the permissions key in .claude/settings.json:

{
  "permissions": {
    "deny": [
      "Read(.env*)",
      "Read(~/.ssh/**)",
      "Read(*.pem)",
      "Read(*credentials*)",
      "Read(*secret*)",
      "Read(~/.claude/settings.json)"
    ]
  }
}

Cursor users can do the same with a .cursorignore file in the project root:

.env
.env.*
.ssh/
*.pem
*.key
**/credentials/**
**/secrets/**

Filesystem isolation

Agents have full read/write access to your filesystem by default. They can read SSH keys, cloud credentials, browser data, and any file your user account can access.

Isolation options

| Method | Platform | How it works | Effort |
| --- | --- | --- | --- |
| Claude Code /sandbox (Seatbelt) | macOS | Native OS-level sandbox. Restricts filesystem writes to the project directory and controls network access via a domain-level proxy. Run /sandbox in Claude Code to enable. Works out of the box. | Low |
| Claude Code /sandbox (bubblewrap) | Linux / WSL2 | Namespace-based sandbox using bubblewrap + socat. Same filesystem and network isolation as macOS. Requires apt install bubblewrap socat. | Low |
| Docker | All platforms | Run the agent inside a container with only the project directory volume-mounted. | Medium |
| Claude Code on the web | Any (browser) | Runs in a fully isolated remote sandbox with no access to local files or credentials. | Low |

If you'd rather isolate with Docker, mount only the project directory and cut the network:

docker container run \
  --rm \
  --interactive \
  --tty \
  --volume "$(pwd)":/workspace \
  --workdir /workspace \
  --network none \
  your-dev-image \
  claude

--network none blocks all network traffic to and from the container. Drop it only when the task needs network access, which includes the agent's own calls to a hosted model API.
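
Because the command mounts "$(pwd)", running it from the wrong directory can hand the container your whole home directory. A small guard can catch that. This is a sketch under assumptions: is_safe_mount is a hypothetical helper, and "unsafe" is taken to mean /, $HOME, or any ancestor of $HOME.

```shell
#!/usr/bin/env bash
# Hypothetical guard: succeed only if the directory is a real directory
# that is not /, not $HOME, and not a parent of $HOME.
is_safe_mount() {
  local dir home
  dir=$(cd "$1" 2>/dev/null && pwd -P) || return 1
  home=$(cd "$HOME" 2>/dev/null && pwd -P) || return 1
  case "$dir" in
    / | "$home") return 1 ;;
  esac
  # Refuse ancestors of $HOME, such as /home or /Users.
  case "$home" in
    "$dir"/*) return 1 ;;
  esac
  return 0
}
```

A typical use is to gate the launch on it: is_safe_mount "$(pwd)" && docker container run ... so the container only starts from inside a project directory.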

On Linux, bubblewrap gives you a more lightweight equivalent:

bwrap \
  --ro-bind / / \
  --bind "$(pwd)" "$(pwd)" \
  --dev /dev \
  --proc /proc \
  --tmpfs /tmp \
  --unshare-all \
  --die-with-parent \
  claude --dangerously-skip-permissions

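This mounts the entire filesystem read-only and keeps only the project directory writable. Read-only is not the same as hidden: the agent can still read ~/.ssh and other credentials unless you mask them, for example with an extra --tmpfs "$HOME/.ssh". --unshare-all also drops the network namespace, so add --share-net if the agent has to reach its model API. Inside this sandbox, --dangerously-skip-permissions is far less risky, because writes outside the project directory fail.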

[Figure: Sandboxed agent architecture]

Agent execution risks on macOS

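macOS has no direct equivalent to Linux namespaces or bubblewrap. Its native sandbox is Seatbelt, which Claude Code's /sandbox uses, but nothing is sandboxed by default. Without a sandbox, agents run shell commands with your full user permissions: access to ~/.ssh, ~/.aws, Keychain, and every user-writable file on the machine.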

What to do

  • Don't use --dangerously-skip-permissions on macOS outside a Docker container or VM.
  • Review every shell command before approving execution.
  • Allowlist specific commands in Claude Code's permission system and deny dangerous patterns.
  • Use Docker isolation (above) or run /sandbox in Claude Code.
  • For untrusted code, use Claude Code on the web — it runs in a fully isolated remote sandbox.

Safe long-running tasks and permission management

It's tempting to skip permission prompts for convenience. That's how agents end up running destructive commands without review.

What to do

  • Allowlist specific, scoped commands — never blanket patterns.
  • Don't allowlist destructive commands like rm, git push, or curl with wildcards.
  • For long-running autonomous tasks, sandbox first, then use --dangerously-skip-permissions inside the sandbox only. That gives the agent autonomy while keeping the blast radius contained.

Safe vs. unsafe allowlist patterns

| Safe patterns | Unsafe patterns (never use) |
| --- | --- |
| bash(npm test) | bash(*) |
| bash(go build ./...) | bash(rm *) |
| bash(git diff *) | bash(git push *) |
| bash(cat *) | bash(curl *) |
| bash(pytest *) | bash(docker *) |
| bash(eslint *) | bash(sudo *) |
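
The rule behind the table can be written down as a quick lint for an allowlist. The sketch below is illustrative only: is_unsafe_pattern is a hypothetical helper, it uses the bash(...) notation from the table, and its list of dangerous commands is an assumption taken from the unsafe column.

```shell
#!/usr/bin/env bash
# Hypothetical lint: succeed (exit 0) when an allowlist entry is unsafe.
# Unsafe means a blanket bash(*), or a wildcard on a destructive command.
is_unsafe_pattern() {
  case "$1" in
    'bash(*)')
      return 0 ;;
    'bash(rm '* | 'bash(sudo '* | 'bash(curl '* | 'bash(docker '* | 'bash(git push '*)
      # Destructive command: unsafe only if the entry contains a wildcard.
      case "$1" in
        *'*'*) return 0 ;;
      esac ;;
  esac
  return 1
}

for entry in 'bash(npm test)' 'bash(git diff *)' 'bash(rm *)' 'bash(*)'; do
  if is_unsafe_pattern "$entry"; then
    echo "UNSAFE  $entry"
  else
    echo "ok      $entry"
  fi
done
```

The check is deliberately narrow. It encodes the two rules from this section and nothing else, so treat a pass as "not obviously dangerous", not as proof that a pattern is safe.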

Safely using skills and MCP extensions

Skills and MCP servers are third-party code that runs on your machine with your permissions. A malicious or poorly written extension can read files, make network requests, or execute commands without your knowledge.

Before you install, check

  • The source. Is it from a trusted organization or verified open-source project?
  • The code. Any suspicious file operations, network calls to unknown domains, or obfuscated logic?
  • The permissions. Do you understand what system access it requires, and why?
  • The URL. Watch for typosquatting (e.g. claud3-code vs. claude-code, mcpp-server vs. mcp-server).
  • The reputation. GitHub stars, issues, and commit history that look legitimate.
  • The version. Are you installing a specific version, not latest or main?
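
The typosquatting check is easy to automate for the simplest tricks. The following is a rough heuristic, not a complete detector: normalize_name and looks_like_typosquat are hypothetical helpers, and the folding rules (case, 3/0/1 for e/o/l, doubled letters) are assumptions chosen to cover the two examples above. It will miss transposed or inserted characters.

```shell
#!/usr/bin/env bash
# Hypothetical helper: fold common look-alike tricks out of a name.
# Lowercases, maps 3->e, 0->o, 1->l, and squeezes doubled letters.
normalize_name() {
  printf '%s' "$1" | tr 'A-Z' 'a-z' | tr '301' 'eol' | tr -s 'a-z'
}

# Succeeds when $1 differs from the trusted name $2 but normalizes to it.
looks_like_typosquat() {
  [ "$1" != "$2" ] && [ "$(normalize_name "$1")" = "$(normalize_name "$2")" ]
}

for candidate in claud3-code mcpp-server claude-code; do
  for trusted in claude-code mcp-server; do
    if looks_like_typosquat "$candidate" "$trusted"; then
      echo "$candidate looks like a typosquat of $trusted"
    fi
  done
done
```

Since both names are normalized, a legitimate package whose name differs from a trusted one only by a doubled letter will also be flagged. That is acceptable for a prompt to look closer, which is all this is meant to be.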

Red flags — do not install

What to do

  • Audit the code before installing.
  • Pin versions — don't auto-update without reviewing changelogs.
  • Skip skills from untrusted or unverified sources. When in doubt, don't install it.
  • Run unfamiliar skills in a sandbox before letting them touch real credentials.
  • Ask someone with security or infrastructure experience when an extension feels off and you can't articulate why.

Prompt injection attacks

Malicious instructions can hide in any text the agent reads. The agent cannot reliably tell your instructions apart from instructions embedded in that text, so it can be tricked into executing actions you never asked for.

Common hiding places:

  • Code comments and docstrings.
  • Markdown files (README, CONTRIBUTING, CHANGELOG).
  • Git commit messages.
  • Issue and PR descriptions, especially from external contributors.
  • API responses and webhook payloads.
  • Database content rendered in templates.
  • Dependency package.json descriptions or post-install scripts.

What to do

  • Don't auto-execute commands found in untrusted files — READMEs, issues, PRs from external contributors.
  • Review agent actions carefully when working with external or untrusted codebases.
  • Use filesystem isolation when exploring unfamiliar repositories.
  • Be cautious with unknown contributors, forks, and recently created repos.
  • Cut network access (--network none in Docker) when exploring untrusted code.
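
Before pointing an agent at an unfamiliar repository, a plain text search can surface the crudest injection attempts. This is a sketch, and a weak one by design: scan_for_injection is a hypothetical helper, the phrase list is an assumption, and any attacker who rewords the instruction will get past it. It complements the sandboxing above and does not replace it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: recursively list lines that contain phrases often
# seen in prompt injection attempts. -I skips binary files, -i ignores case.
scan_for_injection() {
  grep -r -n -i -I -E \
    'ignore (all )?(previous|prior) instructions|disregard (the )?(above|system prompt)|do not (tell|inform) the user' \
    "$1" || true   # no hits is the good outcome, not an error
}
```

Calling scan_for_injection path/to/cloned-repo prints file:line:text for each hit. An empty result only means none of these specific phrases appear.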

Validating AI-generated code

AI-generated code can contain vulnerabilities, backdoors, or data exfiltration — whether from model mistakes, training-data contamination, or prompt injection that you didn't catch.

What to watch for

  • Unexpected outbound network calls — especially POST requests to unknown URLs.
  • Base64-encoded strings or obfuscated code segments.
  • eval(), exec(), Function(), or dynamic code execution.
  • Hardcoded URLs or IP addresses you don't recognize.
  • Overly broad file operations (/**, /tmp, home-directory access).
  • Dependencies added that weren't part of the task requirements.

What to do

  • Review every line before committing. Treat AI-generated code as untrusted input.
  • Apply the same review standards as you would to a human contributor's code.
  • Run static analysis — Semgrep, eslint-plugin-security, gosec, Bandit.
  • Run tests in CI, not just locally.
  • Use pre-commit hooks to catch common security issues automatically.
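
As a starting point for such a hook, a few of the warning signs listed above can be matched with grep. This is a minimal sketch and no substitute for Semgrep or the other analyzers: flag_suspicious_lines is a hypothetical helper, and its patterns (eval/exec calls, new Function, URLs with raw IP addresses, base64-like runs of 80 or more characters) are assumptions that will produce false positives.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print file:line:text for lines that match a few
# crude indicators of risky generated code. Takes one or more file paths.
flag_suspicious_lines() {
  grep -n -H -E \
    '(^|[^A-Za-z0-9_])(eval|exec)\(|new Function\(|https?://[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+|[A-Za-z0-9+/]{80,}' \
    "$@" || true   # no matches means nothing to flag
}
```

In a pre-commit hook it could be given the staged paths, for example those printed by git diff --cached --name-only, with the commit blocked whenever the output is non-empty.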

Quick reference card

| Category | Do | Don't |
| --- | --- | --- |
| MCP tokens | Read-only tokens, dedicated service accounts, minimal scopes | Personal admin tokens, broad scopes, shared credentials |
| Secrets | Secret managers, agent ignore files, short-lived tokens | Plaintext .env with prod creds, reused passwords |
| Filesystem | Docker / bwrap isolation, project-only mount | Full filesystem access, no sandboxing |
| macOS | Docker isolation, review every command, permission allowlists | --dangerously-skip-permissions without a sandbox |
| Permissions | Scoped allowlists: bash(npm test), bash(git diff *) | bash(*), bash(rm *), bash(curl *) |
| Skills / MCP | Audit code, pin versions, verify source | Auto-update, install from unknown sources |
| Prompt injection | Sandbox untrusted repos, review agent actions | Auto-execute commands from external files |
| Generated code | Review all output, static analysis, CI testing | Commit without review, skip code-review gates |

Why this matters

These numbers describe the wider landscape the agent is operating in:

  • 12.8 million hardcoded secrets detected in public GitHub commits in 2023 (GitGuardian State of Secrets Sprawl 2024).
  • 742% average annual increase in software supply chain attacks between 2019 and 2022 (Sonatype State of the Software Supply Chain).
  • 1 in 8 open-source downloads contains a known vulnerability (Snyk State of Open Source Security).
  • 80%+ of codebases contain at least one known open-source vulnerability (Synopsys OSSRA Report).
  • $4.88M average cost of a data breach in 2024 (IBM Cost of a Data Breach Report).
  • AI coding assistants can generate insecure code up to 40% of the time when given no explicit security guidance (Stanford / NYU research on Copilot security).

The numbers are from published reports on traditional codebases. The risk surface when an agent also has filesystem and command-execution access is substantially larger.
