Specification as Attack Surface: Why Ambiguity Is a Vulnerability in Agent-First Architectures
Updated March 12, 2026

In The Interface Security Imperative, we mapped the attack surfaces that emerge when agents interact with external systems — APIs, tools, data sources, communication channels. Every tool an agent can call is an attack surface. Every integration point is a security boundary.
But there's an attack surface upstream of all of those. One that exists before the agent makes its first API call, before it processes its first input, before it executes a single action.
The specification.
The most dangerous security decisions are the ones nobody made.
Specifically: the gaps in the specification. The places where the requirements are ambiguous, incomplete, or simply absent. In traditional software, those gaps are a project management problem — rework, missed deadlines, frustrated engineers. In agent-first architectures, they're a security problem. Every ambiguous requirement is a decision the agent will make silently, without flagging that it's guessing.
This isn't a new vulnerability class. It's a known problem — one that software engineering has studied for decades — manifesting in a context where the consequences are fundamentally different.
The known problem
Requirements ambiguity has been studied since the field of software engineering had a name. Boehm and Basili's foundational research established that finding and fixing a software defect after delivery costs up to 100x more than catching it during requirements. They identified two major sources of avoidable rework: hastily-specified requirements and nominal-case-only development where off-nominal requirements get accommodated late. Software projects spend 40-50% of their effort on avoidable rework, and 80% of that traces back to 20% of the defects — most of which originate in the requirements.
ISO/IEC/IEEE 29148, the international standard for requirements engineering, defines an unambiguous requirement as one that has exactly one interpretation. The earlier IEEE 830 standard made the same point: a specification is unambiguous only when every requirement has a single interpretation.
Here's where it gets interesting. Research on detecting ambiguities in requirements documents found that engineers often don't notice ambiguity at all — they quietly assume a specific interpretation and proceed. The researchers documented a case where a misinterpreted requirement in a safety-critical system meant that dangerous conditions would not trigger a shutoff. The interpretation was a function of the reader's background, not the specification's clarity.
Ambiguity doesn't announce itself. It just picks a meaning and keeps walking.
None of this is new. What's new is what happens when the "reader" is an agent.
What changes when the reader is an agent
A human engineer who encounters an ambiguous requirement has options: ask a colleague, check the wiki, file a ticket, make an assumption and document it, or escalate. Most of these create a signal that ambiguity was encountered. Even the worst case — making a silent assumption — produces an artifact that another human will eventually review.
An agent does something different. It infers, proceeds, and produces confident-looking output with no signal that it guessed.
Humans hedge. Agents don't.
The OWASP Top 10 for LLM Applications (2025) formalized this as LLM09: Misinformation. When an LLM encounters ambiguity or gaps in its instructions, it rarely admits uncertainty — instead filling the gap with a statistically plausible answer. The outputs use the correct tone, structure, and terminology. Nothing signals the information was fabricated or that the agent chose one interpretation over another.
Here's the mechanism that makes specification ambiguity a security concern:
- The agent encounters an underspecified requirement. "Handle authentication" doesn't specify OAuth, JWT, session cookies, API keys, or SAML. "Process user data" doesn't define what "process" means, what "user data" includes, or what to do with edge cases.
- The agent selects an interpretation. Not randomly — based on patterns in its training data. But the selection isn't informed by your security policy, your compliance requirements, your threat model, or your business context. It's informed by what was statistically common in the data the model was trained on.
- The agent proceeds with confidence. The output looks correct. The code compiles. The workflow runs. No uncertainty flag, no "I assumed X because the spec didn't say" annotation, no signal that a security-relevant decision was made by inference rather than by specification.
- The gap between what was specified and what was built becomes a vulnerability. Not because the agent did something wrong — it did exactly what it was designed to do. But because the decision it made may violate a security boundary, a compliance requirement, or an architectural constraint that was never specified.
This pattern isn't adversarial. It doesn't require an attacker. It's the default behavior of any system that fills gaps in its instructions with inferred intent.
You don't need a threat actor when the spec has holes. The agent will fill them for you.
The silence is the danger
In traditional software, ambiguous requirements create bugs. Bugs are visible — they cause failures, generate error messages, break tests. The feedback loop is slow and expensive, but it eventually surfaces the problem.
In agent-first architectures, ambiguous specifications create silent boundary violations. The agent's output is functional. It passes tests — at least, the tests derived from the same incomplete specification. The violation is invisible until someone discovers it through a security review, a compliance audit, or an incident.
The scariest output is the one that works perfectly and was never specified.
OWASP LLM06: Excessive Agency addresses a related dimension: agents acting beyond their intended scope because they were granted capabilities without adequate constraints. The specification is where those constraints should originate. When the specification is ambiguous about what an agent should and should not do, the agent defaults to its broadest interpretation of its capabilities.
Consider these scenarios — not exploitation techniques, but normal agent behavior when specifications are incomplete:
Authentication: The spec says "implement user authentication." The agent chooses JWT with a 24-hour expiration, stores tokens in localStorage, and uses a symmetric signing key. Each of those decisions has security implications. None of them were specified. Each reflects a common pattern in training data, not a security-informed architectural choice.
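Concretely, agent output for that requirement might look like the sketch below. This is a hypothetical reconstruction, not output from any particular model; the signing scheme, key, and function names are assumptions for illustration. Each "Inferred:" comment marks a security decision the spec never made:

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical agent output for "implement user authentication".
# Every "Inferred:" comment marks a decision absent from the spec.

SECRET = b"app-secret"  # Inferred: symmetric signing key (illustrative value)

def _b64(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def issue_token(user_id: str) -> str:
    header = {"alg": "HS256", "typ": "JWT"}       # Inferred: JWT, not sessions
    payload = {
        "sub": user_id,
        "exp": int(time.time()) + 24 * 3600,      # Inferred: 24-hour expiry
    }
    body = f"{_b64(json.dumps(header).encode())}.{_b64(json.dumps(payload).encode())}"
    sig = hmac.new(SECRET, body.encode(), hashlib.sha256).digest()
    return f"{body}.{_b64(sig)}"

def verify_token(token: str) -> bool:
    body, _, sig = token.rpartition(".")
    expected = _b64(hmac.new(SECRET, body.encode(), hashlib.sha256).digest())
    return hmac.compare_digest(expected, sig)
```

The code is coherent and it works — which is exactly the problem. Nothing in it records that the algorithm, the key type, and the expiry window were all chosen by inference.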
Data handling: The spec says "collect user preferences." The agent collects preferences and logs them — including the request metadata, IP addresses, and session identifiers that happened to be in the same data structure. The spec didn't say what to collect. It also didn't say what not to collect.
Error handling: The spec says "handle errors gracefully." The agent returns detailed error messages including stack traces, internal paths, and database schema information. Graceful, in the training data, often means verbose. The spec didn't define the boundary between graceful and information-leaking.
Third-party integration: The spec says "send notifications." The agent calls an external service with a request body that includes user data the service doesn't need. The spec didn't define a data minimization boundary for outbound integrations.
In each case, the agent produced working software that met the literal requirement. In each case, the result contains a security-relevant decision that was never made by a human.
Why this is an attack surface, not just a bug class
A bug is an error in implementation. An attack surface is a point where a system can be probed, tested, or exploited.
Specification ambiguity is an attack surface because:
It's systematic. Every ambiguous requirement produces the same class of risk — an agent-inferred decision with security implications. Ten ambiguous requirements? Ten silent security decisions. Research on requirements engineering consistently shows that ambiguity is pervasive in natural-language specifications — it's the norm, not the exception.
It's predictable. An adversary who understands how a model interprets ambiguous specifications can predict the decisions the agent will make. The patterns aren't random; they reflect the training data distribution. This makes agent-inferred decisions a structured, analyzable attack surface — not a random bug.
If your attacker can read the same training data your agent learned from, they can predict every gap you left open.
It scales with agent adoption. Every additional agent consuming underspecified requirements multiplies the number of silent security decisions. In multi-agent systems, OWASP's agentic threat analysis notes that misinformation compounds through memory, tool use, and inter-agent interactions — one agent's inferred decision becomes another agent's assumed fact.
It's invisible to conventional testing. Tests derived from the same incomplete specification will pass, because the tests encode the same assumptions the agent made. Negative testing — verifying what the system should not do — requires a complete specification of prohibited behavior. If the specification doesn't define what's out of bounds, you can't test for boundary violations.
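A negative test only becomes writable once the prohibition exists in the spec. A minimal sketch in Python, where the handler and the prohibition wording are assumptions, not a recommended implementation:

```python
import json

# Hypothetical handler an agent might produce for "handle errors gracefully"
# AFTER the spec adds an explicit prohibition: "do not leak internals".
def handle_error(exc: Exception) -> dict:
    # The prohibition forces this to stay terse: message only, no internals.
    return {"error": str(exc)}

# Negative test: derivable only because the prohibition is explicit.
def test_error_response_leaks_no_internals() -> None:
    try:
        raise RuntimeError("db timeout")
    except RuntimeError as exc:
        body = json.dumps(handle_error(exc))
    assert "Traceback" not in body                        # no stack traces
    assert "/home/" not in body and "/var/" not in body   # no internal paths
```

Without the prohibition, there is nothing to assert against, and the verbose, information-leaking variant passes every test you have.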
The NIST Secure Software Development Framework (SP 800-218) makes threat modeling an explicit practice in secure development. But threat modeling requires a specification to model against. When the specification is incomplete, the threat model has blind spots that mirror the specification's gaps — and the agent will fill those blind spots with inferred behavior that was never modeled.
Your tests pass because they inherited the same blind spots as your spec.
ROBOT framework mapping
This attack surface maps primarily to two ROBOT framework components:
Boundaries
The specification is where boundary definitions originate. A Boundary specification defines what an agent can and cannot do — its permitted actions, its data access scope, its interaction constraints. When the specification is ambiguous about boundaries, the agent operates with inferred boundaries. Inferred boundaries are, by definition, not enforced boundaries.
As we established in Interfaces Define Capability and Risk: the interface is the boundary of safe action. The specification is the document that defines where those boundaries are. Gaps in one produce gaps in the other.
Taskflow
Taskflow defines the expected sequence of operations, the conditions under which each step executes, and the constraints that govern the overall workflow. An ambiguous specification produces ambiguous taskflow — the agent has latitude to reorder operations, skip steps it considers unnecessary, or add steps it considers helpful. Each of those decisions may be reasonable in isolation and problematic in combination.
An agent that reorders your workflow isn't disobedient. It's under-constrained.
The pattern described in Structure Beats Prompts applies directly: structural constraints outperform natural-language instructions because they're unambiguous by design. A taskflow contract that says "validate input, then query, then filter, then return" leaves less room for interpretation than a specification that says "look up the user's data and give them what they need."
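A sketch of what such a structural constraint could look like in code, assuming a hypothetical `Taskflow` class; this is illustrative, not the ROBOT contract format:

```python
from typing import Any, Callable

# Illustrative sketch: a taskflow contract that makes step order unambiguous
# by construction. Class and method names are assumptions for illustration.
class Taskflow:
    def __init__(self, *steps: str):
        self.order = list(steps)
        self.done: list[str] = []

    def run(self, name: str, fn: Callable[..., Any], *args: Any) -> Any:
        expected = self.order[len(self.done)]
        if name != expected:
            raise RuntimeError(
                f"taskflow violation: expected {expected!r}, got {name!r}"
            )
        self.done.append(name)
        return fn(*args)

flow = Taskflow("validate", "query", "filter", "return")
flow.run("validate", lambda x: x, "user-123")   # in order: allowed
# flow.run("filter", ...)  # out of order: would raise, "query" must run first
```

The contract leaves the agent nothing to interpret about ordering: a skipped or reordered step fails loudly instead of silently succeeding.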
What specification verification needs to look like
Let's be direct: this is an open problem.
Current approaches to specification verification — formal methods, property-based testing, contract verification — were designed for systems where a human writes the specification, a human writes the implementation, and another human verifies the correspondence. They assume the implementer can be queried about their interpretation. They assume ambiguity is detectable through code review.
Agent-first architectures break those assumptions. The implementer can't explain why it chose one interpretation over another. The code it produces looks intentional. The standard review process — "does the implementation match the spec?" — fails when both the reviewer and the agent read the same ambiguous spec and converge on the same interpretation.
Code review doesn't catch ambiguity when the reviewer shares the same blind spots.
What we need is verification that works in the other direction: starting from the agent's output and working backward to identify where inferred decisions were made. That requires:
- Specification coverage analysis. For every security-relevant decision in the agent's output, trace it back to a specific requirement. If the decision can't be traced to an explicit requirement, it was inferred — and it needs human review.
- Boundary enumeration. Every specification should explicitly enumerate what is out of bounds, not just what is in scope. The OWASP Attack Surface Analysis Cheat Sheet recommends this for applications; it's even more critical when the implementer can't distinguish between "not mentioned because it's out of scope" and "not mentioned because it's obvious."
- Inference flagging. Agent systems should signal when they fill a specification gap. This is an Observability requirement: the audit trail should distinguish between actions taken because they were specified and actions taken because they were inferred. The gap between the two is the specification's attack surface.
- Adversarial specification review. Before an agent consumes a specification, review it from the perspective of "what decisions will the agent have to make that aren't specified here?" This is threat modeling applied to the specification itself — not the system the specification describes, but the specification as an input to an agent.
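Coverage analysis and inference flagging can be sketched together: a check that partitions the agent's recorded decisions into traced and inferred. The `Decision` record and its field names are assumptions for illustration, not a standard format:

```python
from dataclasses import dataclass
from typing import Optional

# Sketch of specification coverage analysis: every security-relevant decision
# either traces to a requirement ID or gets flagged as inferred.
@dataclass
class Decision:
    description: str
    requirement_id: Optional[str]  # None means the agent filled a spec gap

def flag_inferred(decisions: list[Decision]) -> list[Decision]:
    """Return decisions that cannot be traced to an explicit requirement."""
    return [d for d in decisions if d.requirement_id is None]

decisions = [
    Decision("use OAuth 2.0 authorization code flow", "AUTH-4"),
    Decision("store tokens in localStorage", None),   # inferred: needs review
    Decision("24-hour token expiry", None),           # inferred: needs review
]
for d in flag_inferred(decisions):
    print(f"UNSPECIFIED: {d.description}")
```

The mechanics are trivial; the hard part is upstream, in getting agent systems to emit the decision records in the first place.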
These approaches are in development across the industry, including at Atypical Tech. The NIST AI Risk Management Framework (AI 100-1) calls for organizations to establish policies and procedures to address AI risks, including risks that emerge from the gap between intended and actual system behavior. Specification ambiguity is a primary source of that gap.
We don't have a complete solution. But we have a clear problem definition, established research that validates the risk, and a framework — ROBOT — that provides the structural components for addressing it.
What you can do now
You don't need to wait for the verification problem to be solved to reduce your specification attack surface. These practices apply today:
- Audit your specifications for security-relevant ambiguity. Every requirement that touches authentication, authorization, data handling, external integrations, or error behavior should be specific enough that two different agents would produce the same implementation. If a requirement can be interpreted two ways, it will be.
- Define what's out of bounds, not just what's in scope. For every agent-consumed specification, include explicit constraints: "Do not store tokens in client-side storage." "Do not include PII in log output." "Do not call external services with data beyond [enumerated fields]." Agents are excellent at following explicit constraints. They're unreliable at inferring them.
- Treat agent-inferred decisions as unreviewed code. Any decision the agent made that can't be traced to an explicit requirement should go through the same review process as unreviewed code from an unknown contributor. Because that's exactly what it is.
- Separate specification from implementation. Don't let the agent that writes the implementation also interpret the specification. Use a structured specification format — Taskflow contracts, schema definitions, boundary enumerations — that reduces the interpretation space before the agent begins.
- Add specification completeness to your threat model. When you threat model a system that involves agent implementation, include "incomplete specification" as a threat source. For each identified threat, ask: "Is the defense against this threat specified, or will the agent have to infer it?"
- Monitor for specification drift. As agents produce output over time, the gap between the original specification and the agent's evolving interpretation can widen. Regular reconciliation between the spec and actual behavior is an Observability practice that catches drift before it becomes a vulnerability.
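The "define what's out of bounds" practice can be made mechanical at runtime. A minimal sketch, assuming the spec enumerates the permitted outbound fields; the constraint format and names here are hypothetical:

```python
# Sketch of enforcing an explicit out-of-bounds constraint: outbound payloads
# may only contain fields enumerated in the spec. Names are illustrative.

ALLOWED_OUTBOUND_FIELDS = {"user_id", "notification_text"}  # from the spec

def enforce_minimization(payload: dict) -> dict:
    """Reject any outbound payload carrying fields the spec never authorized."""
    extra = set(payload) - ALLOWED_OUTBOUND_FIELDS
    if extra:
        raise ValueError(f"boundary violation: unspecified fields {sorted(extra)}")
    return payload

enforce_minimization({"user_id": "u1", "notification_text": "hi"})  # allowed
# enforce_minimization({"user_id": "u1", "email": "a@b.c"})  # would raise
```

An allow-list like this inverts the agent's default: instead of "not mentioned, so probably fine," anything not mentioned is a violation.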
The best time to close a spec gap is before the agent reads it. The second best time is now.
The specification is the first security boundary
In The Interface Security Imperative, we concluded that the integration layer is the primary security boundary in agent-first architectures. That remains true. But the specification is what defines what that boundary should look like.
When the specification is complete, the boundary is defined. When the specification is ambiguous, the boundary is inferred. When the boundary is inferred, it's not a boundary — it's an assumption.
Assumptions, in security, are vulnerabilities.
This isn't a novel observation. It's the application of a well-established principle — ambiguous requirements produce defective software — to a context where the defects are silent, the implementer can't explain its reasoning, and the cost of late detection is compounded by the speed and scale at which agents operate.
The teams that will build secure agent systems aren't just the ones that secure every API and monitor every tool invocation. They're the ones that recognize the specification itself as a security artifact — and treat every gap in it as a vulnerability to be closed before the agent starts.
Evaluate your own agent specifications. The Safe Autonomy Readiness Checklist covers specification completeness as part of its 43-item assessment across 8 sections.
Building agent systems and want to ensure your specifications are security-complete? We should talk.
References
- Boehm & Basili, Software Defect Reduction Top 10 List — IEEE Computer, 2001. Cost-of-change curve: requirements defects 100x more expensive to fix post-delivery
- ISO/IEC/IEEE, 29148:2018 Systems and Software Engineering — Life Cycle Processes — Requirements Engineering — international standard defining unambiguous requirements
- Kamsties, Berry & Paech, Detecting Ambiguities in Requirements Documents Using Inspections — silent misinterpretation of ambiguous safety-critical requirements
- OWASP, LLM09:2025 Misinformation — hallucination, confident-but-wrong outputs, agentic amplification
- OWASP, LLM06:2025 Excessive Agency — agents acting beyond intended scope due to insufficient constraints
- OWASP, Attack Surface Analysis Cheat Sheet — boundary enumeration for application security
- NIST, SP 800-218: Secure Software Development Framework — threat modeling as explicit secure development practice
- NIST, AI Risk Management Framework (AI 100-1) — managing risks from gaps between intended and actual AI system behavior
- Berry, Kamsties & Krieger, From Contract Drafting to Software Specification: Linguistic Sources of Ambiguity — pervasiveness of ambiguity in natural-language specifications
- Rule 11 Reader, The Insecurity of Ambiguous Standards — how ambiguous protocol specifications produce security vulnerabilities in implementations