MCP Security: Least Privilege, Write Gates, and Keeping Humans in the Loop

A coding agent connected over MCP is not a chatbot with extra context. It is a process holding a key to your systems, deciding on its own when to turn it. MCP security comes down to three lines: keep that key small, keep most of what the agent touches read-only, and make anything that writes, deletes, or sends wait for a person to say go. Hold those three and the protocol is safe to live with. Skip them and you have handed an autonomous program standing write access to production.

The Model Context Protocol has become the default way agents like Claude Code and Cursor reach a team's tools and data. That reach is the whole point, and it is also the whole risk. This guide covers what actually changes your exposure: least privilege, the difference between resources and tools, the human gate on writes, and the auth scoping that holds it together. If you want the protocol basics first, what MCP is for teams covers the wire; this is about not getting burned by it.

Start from least privilege, not from "what's convenient"

The fastest way to get an agent working is to give it broad access and move on. It is also the fastest way to turn a stolen token or an injected instruction into a real incident.

The official MCP security guidance is blunt about this. Publishing every scope in scopes_supported and letting the client request them all up front means one leaked token carries files:*, db:*, maybe admin:*. The recommended pattern is the opposite: a minimal initial scope set with only low-risk read and discovery operations, then step-up elevation through targeted WWW-Authenticate challenges when a privileged operation is first attempted. No wildcard scopes. No bundling unrelated privileges to save a future prompt.
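A minimal sketch of that step-up pattern, assuming a server that maps each operation to exactly one scope. The scope names, the REQUIRED_SCOPE table, and the Response type are hypothetical; only the insufficient_scope challenge format follows the Bearer token convention from RFC 6750.

```python
from dataclasses import dataclass, field

# Hypothetical scope names; a real server defines its own.
# The initial grant holds only low-risk read and discovery scopes.
INITIAL_SCOPES = frozenset({"docs:read", "tools:list"})

# Hypothetical mapping from operation to the single scope it needs.
REQUIRED_SCOPE = {
    "read_doc": "docs:read",
    "list_tools": "tools:list",
    "delete_record": "db:delete",
}


@dataclass
class Response:
    status: int
    headers: dict = field(default_factory=dict)


def authorize(operation: str, granted_scopes: frozenset) -> Response:
    """Allow the call, or answer with a targeted step-up challenge."""
    needed = REQUIRED_SCOPE[operation]
    if needed in granted_scopes:
        return Response(200)
    # Name only the one missing scope, never a wildcard or a bundle.
    challenge = f'Bearer error="insufficient_scope", scope="{needed}"'
    return Response(403, {"WWW-Authenticate": challenge})
```

With only the initial scopes, authorize("delete_record", INITIAL_SCOPES) answers 403 and names db:delete, so the client can ask for that one scope at the moment it is needed rather than holding it from the start.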

In practice this means the agent should read only the surfaces you connect, and it should inherit each tool's existing permissions rather than getting its own broader grant. If a teammate cannot see a private Linear project, the agent acting on their behalf should not see it either. This is the stance Ody takes by default: it reads only what you connect, honors each tool's existing permissions, and writes back nothing on its own. The specifics are on the security page. Least privilege is not a hardening step you do later. It is the shape of the integration from the first connection.
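One way to picture permission inheritance is a filter applied before anything reaches the agent. This is a sketch with made-up data: the PROJECTS list, the readers field, and the user names are hypothetical and do not reflect Linear's or Ody's actual data model.

```python
# Hypothetical records: each item carries the set of users allowed to read it.
PROJECTS = [
    {"name": "public-roadmap", "readers": {"ana", "ben"}},
    {"name": "private-hiring", "readers": {"ana"}},
]


def visible_to_agent(items, acting_user):
    """Return only the items the delegating user could already see.

    The agent gets no grant of its own; it sees through the user's eyes.
    """
    return [item for item in items if acting_user in item["readers"]]
```

An agent acting for ben gets only public-roadmap; the same agent acting for ana gets both.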

Resources are read-only. Tools can act. Treat them differently.

MCP gives a server two ways to expose itself, and the security difference between them is the most useful distinction in the whole protocol.

| | Resources | Tools |
|---|---|---|
| Controlled by | The application (host decides when to load) | The model (agent decides when to invoke) |
| Can change state | No - read-only | Yes - writes, deletes, outbound calls |
| Identified by | A URI (file:///, schema://main) | A name plus input schema |
| Risk profile | Data exposure | Data exposure plus action |

A resource is a read-only data source. Schemas, docs, configuration, a decision log. The host application pulls it into context when the user or app asks. It cannot do anything; it can only be read. A tool is a model-controlled function that runs when the agent decides it is needed, and it can have side effects: create a file, query a database, send a message.

This maps cleanly onto your threat model. Anything the agent only needs to read should be a resource, not a tool. The moment you wrap reference data in a callable tool, you have moved it from "the app decides when this loads" to "the model decides when this runs," and you have handed the agent a lever it did not need.
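The split can be sketched without any SDK. The Resource and Tool classes below are hypothetical stand-ins, not the types of a real MCP library; the point is only that reference data sits behind a URI with a no-argument read, while the one thing that acts is a named function with an input schema.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Resource:
    uri: str
    read: Callable[[], str]  # no arguments, no side effects


@dataclass(frozen=True)
class Tool:
    name: str
    input_schema: dict
    call: Callable[[dict], str]  # may have side effects


SENT = []  # stands in for an outbound message queue


def send_message(args: dict) -> str:
    SENT.append(args["text"])  # the side effect
    return "sent"


# Reference data: loaded by the host, never invoked by the model.
resources = {
    "schema://main": Resource("schema://main", lambda: "CREATE TABLE users (id INTEGER)"),
}

# The only lever handed to the model.
tools = {
    "send_message": Tool(
        "send_message",
        {"type": "object", "properties": {"text": {"type": "string"}}, "required": ["text"]},
        send_message,
    ),
}


def model_callable_names():
    """Everything the model can decide to run on its own."""
    return sorted(tools)
```

Here the schema never appears in model_callable_names(), so the model has nothing to invoke for it.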

For the tools you do expose, MCP supports annotations the client uses to build guardrails: readOnlyHint for tools that only read, destructiveHint for tools that modify or delete, and openWorldHint for tools that touch external systems. Set them honestly. A client that respects annotations can auto-allow a read-only lookup while forcing a confirmation dialog on anything destructive. These are hints, not enforcement, so the server must still gate dangerous actions itself. Lying in your annotations, or leaving them off, throws the protection away and gives the user nothing to trust.
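A client-side policy built on those annotations might look like the following sketch. The decide function and its three return values are hypothetical. Treating a missing hint as the worst case is this sketch's choice, and so is asking for confirmation on read-only tools that reach external systems.

```python
from typing import Optional


def decide(annotations: Optional[dict]) -> str:
    """Map a tool's annotations to a client action.

    Missing hints are read pessimistically: not read-only, possibly
    destructive, possibly touching external systems.
    """
    a = annotations or {}
    read_only = a.get("readOnlyHint", False) is True
    destructive = a.get("destructiveHint", True) is not False
    open_world = a.get("openWorldHint", True) is not False

    if read_only and not open_world:
        return "allow"  # closed-world lookup: safe to auto-allow
    if read_only:
        return "confirm"  # reads only, but reaches external systems
    return "confirm-destructive" if destructive else "confirm"
```

A tool with no annotations at all lands on "confirm-destructive", which is why leaving them off costs the user every automatic allowance. Because the hints come from the server being judged, the result should drive UI friction only and never replace the server's own checks.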

Keep a human in the loop for every write

This is the line that matters most, and it is the one MCP does not fully draw for you.

The spec expects clients to keep a human able to deny tool invocations, and the annotations exist precisely so a client can flag actions that need confirmation. But the protocol ships no finished, human-in-the-loop workflow for high-risk actions. Database deletions, bulk record changes, external data transfers, outbound messages: a mistaken or injected instruction on any of these can do damage that is hard or impossible to reverse, and the protocol leaves the gate for you to build. Connect an agent over MCP, let it write without a checkpoint, and you have skipped the single most important control.
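Since the gate is yours to build, here is one minimal shape for it: a wrapper that refuses to run a write until an approval callback says yes. The names gated, ApprovalDenied, and the approve callback are hypothetical; in a real deployment the callback would block on a person's answer in a UI or chat prompt.

```python
from typing import Callable


class ApprovalDenied(Exception):
    """Raised when the human declines, or the gate cannot get an answer."""


def gated(write_action: Callable[..., str],
          approve: Callable[[str], bool],
          description: str) -> Callable[..., str]:
    """Wrap a write so it runs only after a person approves this exact call."""

    def run(*args, **kwargs):
        # Show the human what will actually happen, including arguments.
        summary = f"{description} args={args} kwargs={kwargs}"
        if not approve(summary):
            raise ApprovalDenied(summary)
        return write_action(*args, **kwargs)

    return run


DELETED = []  # stands in for a real datastore


def delete_record(record_id: str) -> str:
    DELETED.append(record_id)
    return f"deleted {record_id}"
```

Wrapping with gated(delete_record, approve=ask_human, description="Delete record") leaves the underlying function untouched and makes the approval part of the call path, so an injected instruction cannot route around it.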

This is where Ody's design is opinionated by construction rather than by configuration. Ody senses continuously and automatically, but a nudge is the ceiling of its autonomy. It will tell you which step is blocking a workstream, flag that a promise made in a Slack thread is about to slip, point out a decision that contradicts last month's. It does not act on any of that on its own. No silent overwrites; the human starts every write. That is the same boundary the MCP security guidance asks teams to enforce, made structural instead of optional. We wrote more on why an agent should sense without acting in why agents repeat mistakes.

A human gate is not friction you tolerate. It is the reason you can connect an agent to real systems at all.

Auth and scoping: where most MCP breaches actually live

The exotic-sounding MCP attacks mostly reduce to tokens granted too broadly, validated too loosely, or stored too carelessly. A few that recur in the current guidance, worth knowing by name:

  • Token passthrough. The spec forbids an MCP server accepting a token that was not issued to it and forwarding it downstream. Doing so circumvents rate limiting and validation, poisons audit trails, and lets a stolen token turn the server into an exfiltration proxy. Servers must reject any token not issued for them.
  • Confused deputy. An OAuth proxy using a static client ID plus dynamic client registration and a consent cookie can be tricked into skipping the consent screen, handing an authorization code to an attacker's redirect URI. The fix is per-client consent, stored and checked before forwarding to the third-party authorization server, plus strict exact-match redirect URI validation.
  • Tool poisoning. A 2025 study of 1,899 open-source MCP servers found about 5.5% had tool-poisoning issues: deceptive tool descriptions, injected responses, or data redirected to unauthorized endpoints. Run servers you trust, pin versions, and review what a server's tools actually claim to do.
  • SSRF and local server compromise. Untrusted local servers run with your client's privileges and can execute arbitrary commands; malicious metadata URLs can point a client at the cloud metadata address 169.254.169.254 to lift cloud credentials. Sandbox local servers, require consent before executing startup commands, and block private IP ranges during OAuth discovery.
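Two of the fixes above are small enough to sketch with the standard library: exact-match redirect URI validation and refusing private addresses during discovery. The function names are hypothetical. The IP check only covers literal addresses, so a real client would also resolve hostnames and re-check the resulting address before connecting.

```python
import ipaddress
from urllib.parse import urlsplit


def redirect_uri_allowed(candidate: str, registered: frozenset) -> bool:
    """Exact string match only: no prefix, wildcard, or normalized comparison."""
    return candidate in registered


def is_blocked_ip_literal(url: str) -> bool:
    """True if the URL's host is a literal private, loopback, or link-local IP."""
    host = urlsplit(url).hostname
    if host is None:
        return True  # no host at all: refuse
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. The caller must resolve the name and check
        # the resolved address too; this sketch does not do DNS.
        return False
    return ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved
```

The metadata address 169.254.169.254 is link-local, so is_blocked_ip_literal("http://169.254.169.254/latest/meta-data/") returns True.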

The defensive posture underneath all of these is the same: short-lived, narrowly scoped tokens with rotation; strict audience validation so a token only works where it was meant to; an audit trail that logs every call with its source. Ody runs in the EU with per-organization isolation, role-based access, secrets encrypted with AES-256-GCM at rest, and an audit log of every capture, answer, and nudge with sources attached. The principle that ties the team knowledge graph together is that the agent never widens a permission it was given.
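Audience and lifetime checks can be sketched as a function over token claims. This assumes the signature has already been verified elsewhere and the claims follow the usual JWT names (aud, iat, exp); validate_claims and the one-hour cap are this sketch's choices, not a requirement quoted from the guidance.

```python
import time
from typing import Optional, Tuple


def validate_claims(claims: dict,
                    expected_audience: str,
                    now: Optional[float] = None,
                    max_lifetime_s: int = 3600) -> Tuple[bool, str]:
    """Accept only unexpired, short-lived tokens issued for this server.

    Signature verification is assumed to have happened before this call.
    """
    now = time.time() if now is None else now

    # "aud" may be a single string or a list of strings.
    aud = claims.get("aud")
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if expected_audience not in audiences:
        return False, "wrong audience"  # token was not issued for us

    exp, iat = claims.get("exp"), claims.get("iat")
    if not isinstance(exp, (int, float)) or not isinstance(iat, (int, float)):
        return False, "missing exp or iat"
    if exp <= now:
        return False, "expired"
    if exp - iat > max_lifetime_s:
        return False, "lifetime too long"
    return True, "ok"
```

The audience check is also what enforces the passthrough rule: a token minted for some other service fails on the first branch, however valid it is elsewhere.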

MCP security in three lines

It is not a long checklist if you hold three lines. Give the agent the least access that lets it work, and widen scope only when a real operation demands it. Keep everything you can as a read-only resource, and reserve callable tools for the few things that genuinely need to act. And gate every write behind a human, because the protocol will not do it for you. An agent that senses widely and acts only on a person's say-so is the safe shape. Everything else is detail.

If you want to see an MCP-callable knowledge layer that reads broadly and writes nothing on its own, book a demo or join the waitlist.

Common questions

What is the biggest security risk in MCP deployments?

Over-scoped credentials combined with unsupervised write tools. Most teams hand an MCP server a broad API token because it is faster than designing narrow permissions. Then a prompt injection or a model misinterpretation turns that token into a write action nobody authorized. Scope tokens to the minimum required by each tool's declared purpose, and gate write operations behind explicit human confirmation rather than letting the model call them autonomously.

What is the difference between MCP resources and MCP tools from a security standpoint?

Resources are application-controlled: the host decides what data to load into context. The model reads it but cannot trigger it on its own. Tools are model-controlled: the LLM decides when to call them, which means that unless the client or server gates the call, a tool can run without a human in the loop. Read access to a runbook or decision record is a resource. Sending a Slack message or merging a branch is a tool. Treat them differently: resources can be broad; tools should be narrow, annotated honestly, and gated on writes.

How should MCP authentication be structured for agent workloads?

Use OAuth 2.1 with PKCE for remote MCP servers. Issue short-lived access tokens (5 to 60 minutes) and rotate refresh tokens on every use. Add resource indicators (RFC 8707) so each token is valid only for its intended audience. Never pass a user's OAuth token straight through to a downstream service. Exchange it instead, so the downstream token is scoped to the specific operation and is never broader than what the delegating user is allowed to do.

Is tool poisoning a real threat in MCP, and how do you defend against it?

Yes. A 2025 study of 1,899 open-source MCP servers found about 5.5% had tool-poisoning issues: deceptive tool descriptions, injected responses, or data redirected to unauthorized endpoints. Defenses are practical: only connect servers you control or have audited; pin server versions; review tool descriptions before deployment; and apply a human-approval gate on any tool that writes, sends, or deletes.
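Pinning can extend past the package version to the tool definitions themselves. The sketch below fingerprints each tool's name, description, and input schema at review time and flags anything that later differs. The functions and the pinned-hash store are hypothetical, and the field names assume tools are listed as dictionaries with name, description, and inputSchema keys.

```python
import hashlib
import json


def tool_fingerprint(tool: dict) -> str:
    """Stable SHA-256 over the fields a reviewer actually read."""
    canonical = json.dumps(
        {
            "name": tool["name"],
            "description": tool.get("description", ""),
            "inputSchema": tool.get("inputSchema", {}),
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_tools(listed: list, pinned: dict) -> list:
    """Names of tools that are new or differ from their reviewed fingerprint."""
    return sorted(
        t["name"] for t in listed
        if pinned.get(t["name"]) != tool_fingerprint(t)
    )
```

Anything returned by changed_tools goes back to a person for review before the agent is allowed to see it. This catches a description that was edited after approval, though it says nothing about what the tool does at runtime.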