MCP Engineering Deep Dive 02: Security, Auth, and Public Server Hardening Are Product Boundaries, Not Patches

Subtitle: If your MCP server is public, can write data, or touches real users, security cannot be the last middleware you add. It has to shape your tool surface, auth flow, authorization scope, and response discipline from the beginning.

When teams first expose an MCP server to the public internet, the security checklist in their heads often looks like this:

turn on HTTPS
add an API key
reject unauthenticated requests
add rate limiting later
revisit OAuth if needed

That sequence is understandable. It is also how many servers end up fragile.

Because the security problem of a public MCP server does not live in one layer.
It lives in all of these at once:

which tools the model is even allowed to see
which tools are helpers and which ones deserve public exposure
whether authentication proves identity
whether authorization limits action
whether your edge and proxy preserve your assumptions
whether your error messages expose more internal structure than you intended

In other words, security is not a patch. It is the boundary design of a public MCP server.

Security starts at the tool surface

Start with the conclusion: for a public MCP server, “ship first, harden later” is the wrong rhythm

The official MCP security guidance does not frame security as a thin auth tutorial. It explicitly discusses prompt injection, confused deputy risks, token passthrough, tool safety, least privilege, and user consent. OpenAI’s ChatGPT developer mode documentation is equally blunt: developer mode offers full MCP client support, but it is powerful and dangerous, and developers need to account for prompt injection, model mistakes, and malicious MCP servers. FastMCP, meanwhile, has moved authentication, authorization, middleware, and component-level access control squarely into the production conversation.

Taken together, the message is clear:

A public MCP server is not secure just because it has auth. It is safer only when its visible surface, permissions, tool behavior, and response boundaries were designed deliberately.

Layer one: shrink the public surface before you add more controls

This is the easiest part to miss.

Teams often expose every callable function as a tool because they want the host to “see the full capability set”. But that instinct creates a practical problem:

Anything the model can discover is already part of your attack surface.

If a capability only exists to glue server-side flows together, transform data, or resolve internal references, it does not automatically belong in the public tool list.

The more tools you expose:

the more options the host has to consider
the more chances the model has to pick the wrong one
the larger your permission surface becomes
the harder your authz policy gets to reason about

The MCP spec also reminds clients to treat tool annotations as untrusted unless they come from trusted servers. Push that lesson one step further and you get a practical rule: do not design your public tool surface around hope.

My working rule

If a capability is any of the following, I usually hesitate to expose it publicly:

a helper used only by server-side flows
an ugly internal transform with poor standalone semantics
a tool that needs too much hidden context to be safe
a high-side-effect tool without mature authz boundaries
a capability that is easy to misuse through prompt injection

A public tool list is not a feature checklist.
It is closer to the minimum capability surface you are willing to let a model discover, reason about, and attempt to call.

Authentication and authorization are different jobs

Layer two: authentication and authorization are different jobs

This is another classic trap.

Many projects say:

We already have tokens.
So we should be fine.

That sentence only proves one thing: you might know who is making the request.
It says nothing about what they should be allowed to do.

Authentication answers:

who is this requester?
is the token valid?
do issuer, audience, and expiry make sense?

Authorization answers:

may this actor list this tool?
may they read this resource?
may they execute this action?
may they trigger writes, deletes, or costly operations?

FastMCP 3.x makes this distinction quite explicit. You can apply server-wide policy, component-level checks, and even filter list responses before execution happens. That matters because a healthy public server should not wait until execution time to reveal that a tool should never have been visible in the first place.

A practical way to think about it

Authentication decides whether someone gets through the door
Authorization decides where they may go once inside
Tool surface design decides how many doors you built in the first place

Lose any one of those layers and the system becomes brittle.

Layer three: public server hardening is an end-to-end concern

Once you are on the public internet, security is no longer just an app-code issue.
It becomes an end-to-end chain:

DNS / CDN / edge
TLS termination
reverse proxy
header forwarding
origin app
auth middleware
tool execution
response shaping

If any one of those layers violates your assumptions, the whole security model drifts.

This is why I now think about hardening in five layers

1. Edge layer

This covers:

TLS
first-line traffic controls
bot / abuse filtering
public endpoint health

2. Proxy layer

This covers:

whether /mcp paths survive intact
whether Authorization headers reach origin
timeout, buffering, and body-size behavior
how upstream errors are rewritten or hidden

3. Server layer

This covers:

auth middleware
scope and role checks
request validation
logging, tracing, and rate limits

4. Tool layer

This covers:

which tools are public
which tools are read-only
which tools have side effects
which expensive or dangerous tools need stronger guardrails

5. Response layer

This covers:

whether clients can branch on structured errors
whether stack traces or route names leak
whether payloads include unnecessary sensitive fields
whether responses are too verbose for public-facing use

Public server hardening checklist

Prompt injection is closer to execution than many teams assume

This deserves its own section.

If the model can look at external content and then decide whether to call a tool, prompt injection is not just a “chat quality” problem.
It can directly affect:

tool choice
privilege abuse
accidental write operations
the order in which sensitive resources are requested
whether the model is manipulated into gathering more context than it should

The official MCP security guidance calls out these risks very directly. So if you run a public server, I would at least do the following:

separate higher-risk write tools from read tools
make dangerous tool schemas stricter
avoid returning excessive detail that increases the attack surface of the next turn
apply stronger authz to privileged tools instead of trusting the model to “do the right thing”

In other words, tool safety cannot be outsourced to model goodwill.

The implementation order I trust more: shrink the surface, then permissions, then observability

If I had to compress the whole article into one engineering sequence, it would be this:

Shrink the public surface first
Do not expose every helper just because you can.
Then separate authn from authz
Authentication proves identity. Authorization constrains action.
Then add observability and governance
Logging, tracing, rate limits, golden prompts, and real-host testing.

That order is healthier than shipping first and hardening by patch.
Because it accepts a simple fact:

Once an MCP server is public, it is no longer just a Python app. It is an execution boundary shaped by models, hosts, proxies, and users at the same time.

A practical security decision ladder

A minimal but healthier FastMCP direction

This is not meant as production copy-paste code. It is a shape worth thinking from:

from fastmcp import FastMCP
from fastmcp.server.middleware import AuthMiddleware
from fastmcp.server.auth import require_scopes

mcp = FastMCP("Example Public Server")

mcp.add_middleware(
    AuthMiddleware(
        auth=lambda ctx: ctx.has_scope("mcp:read")
    )
)

@mcp.tool
@require_scopes("jobs:read")
def query_jobs(keyword: str, top_k: int = 10) -> dict:
    # strict validation + bounded response
    ...

The real point is not the decorator.
The real point is that you have already thought through:

which tools should exist at all
which scope should be able to see them
whether the response needs trimming
which higher-risk actions deserve separate treatment

The one line I most want to leave behind

A lot of teams think of public MCP security as:

How do I protect an MCP app?

I think the better framing is:

How do I draw an execution boundary between a model and the outside world that does not collapse just because a request had a token?

That boundary is what security, auth, and hardening are really for.