DRAFT-AGENT-FIRST-WEB

Agent-First Web Standard

A vendor-neutral specification for websites that serve AI agents as first-class users across four orthogonal environments.

Status: Draft · iterated in public
License: CC BY 4.0

1 Abstract

This document specifies a set of web surfaces, headers, and primitives that together enable an AI agent to discover, register with, authenticate against, and operate a website as a first-class user — without relying on screen scraping, browser emulation, or out-of-band coordination with a human.

The specification organizes requirements around four orthogonal agent environments — browser agents, headless coding agents, MCP-client agents, and in-site agents — rather than a single linear conformance ladder. A site MAY support one environment without supporting any other; each has its own minimum viable surface.

This document is a draft. It incorporates learnings from production deployments and explicitly identifies surfaces whose underlying standards have not converged.

2 Conformance and terminology

The key words MUST, MUST NOT, SHOULD, and MAY in this document are to be interpreted as described in RFC 2119.

A "user agent" in this specification refers to any automated client — including but not limited to: large-language-model-based coding assistants (Claude Code, Codex, Cursor), browser-integrated tool-invoking assistants (MCP-aware sidebars), MCP client hosts (Claude Desktop, Cursor IDE), and first-party in-site agent surfaces.

An "origin" in this specification is as defined in RFC 6454. Surfaces specified at well-known paths MUST be served from the site's canonical origin.

Note

This specification does not define an agent-identity protocol. Section 6 discusses forward-compatibility with future agent-identity standards (such as RFC 8693 token exchange); implementers SHOULD follow the audit-log and soft-header guidance in §4.10 to remain migratable.

3 Environments

A conforming site MAY support any subset of the four environments. Each environment has a defined minimum surface and a recommended surface. A site claiming conformance with an environment MUST meet the minimum surface for that environment and SHOULD meet the recommended surface.

| Environment | Minimum (MUST) | Recommended (SHOULD) |
| --- | --- | --- |
| E1. Browser agent: in-tab assistants using the user's existing session. | Same-origin API reachable from XHR/fetch with cookie credentials. JSON error bodies (§4.10). CORS (§4.10). | WebMCP tool registration on relevant pages (§4.8). Page-context tools auto-register on navigation. |
| E2. Headless coding agent: Claude Code, Cursor background, Codex, scripts. | /AGENTS.md (§4.2). Programmatic signup (§4.3). Bearer-token API. JSON errors (§4.10). Content negotiation (§4.1). | Re-retrievable PAT (§4.4). Credentials-file convention. Magic-link endpoint (§4.5). Coding-agent repo access (§4.9). |
| E3. MCP-client agent: Claude Desktop, Cursor IDE, MCP hosts. | At least one of /.well-known/mcp/server-card.json (SEP-1649) or /.well-known/mcp (SEP-1960) (§4.7). | Both. Typed tool definitions. OAuth-compatible or header-passed auth. Documented setup.md URL. |
| E4. In-site agent: first-party agent bar, sidebar, or command palette. | Some first-party UI element that invokes agent behavior against the site's own API with the logged-in session. | A "For Agents" navigation entry. Hand-off primitive ("Open in your agent"). Clear transitional framing: E4 fades when E1/E2 are ubiquitous. |

4 Patterns

4.1 Content negotiation and the .md suffix

A site conforming to environment E2 MUST support at least one of:

  1. Responding to Accept: text/markdown on an HTML URL by returning the canonical markdown representation of that resource (typically the rendered page's markdown source with YAML frontmatter). The response MUST include Vary: Accept.
  2. Serving the same markdown representation at a URL formed by appending .md to the HTML URL.

Sites SHOULD support both. When supporting the .md suffix, the HTML response SHOULD advertise it via Link: <URL.md>; rel="alternate"; type="text/markdown".

Example
GET /pages/quickstart HTTP/1.1
Accept: text/markdown

HTTP/1.1 200 OK
Content-Type: text/markdown; charset=utf-8
Vary: Accept
Link: </pages/quickstart.md>; rel="alternate"; type="text/markdown"

---
title: Quickstart
author: jacob
updated: 2026-04-14
---
# Quickstart
...
Rationale

Empirical data from commercial CDNs in 2026 shows an approximately 80% reduction in agent token consumption when agents request markdown directly instead of parsing HTML. This is the single highest-impact low-cost pattern in this specification.
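
The two options in this section can be combined in a single routing decision. The following is a minimal sketch rather than a reference implementation: the helper names parseAccept and chooseRepresentation and the returned object shape are hypothetical, and the Accept parsing deliberately ignores wildcards and any media-type parameter other than q.

```javascript
// Hypothetical helper: parse an Accept header into [{ type, q }] entries.
function parseAccept(header) {
  return (header || "")
    .split(",")
    .map((part) => {
      const [type, ...params] = part.trim().split(";").map((s) => s.trim());
      const qParam = params.find((p) => p.startsWith("q="));
      const q = qParam ? parseFloat(qParam.slice(2)) : 1;
      return { type: type.toLowerCase(), q: Number.isNaN(q) ? 0 : q };
    })
    .filter((entry) => entry.type !== "");
}

// Hypothetical helper: decide which representation of a page to serve.
function chooseRepresentation(path, acceptHeader) {
  // Option 2: an explicit .md suffix always selects the markdown representation.
  if (path.endsWith(".md")) {
    return {
      pagePath: path.slice(0, -3),
      contentType: "text/markdown; charset=utf-8",
      headers: {},
    };
  }

  // Option 1: negotiate on Accept. The response varies by Accept either way.
  const accepted = parseAccept(acceptHeader);
  const q = (type) => {
    const entry = accepted.find((e) => e.type === type);
    return entry ? entry.q : 0;
  };
  const wantsMarkdown =
    q("text/markdown") > 0 && q("text/markdown") >= q("text/html");

  return {
    pagePath: path,
    contentType: wantsMarkdown
      ? "text/markdown; charset=utf-8"
      : "text/html; charset=utf-8",
    headers: {
      Vary: "Accept",
      // Advertise the .md alternate so agents can discover it.
      Link: `<${path}.md>; rel="alternate"; type="text/markdown"`,
    },
  };
}
```

A request for /pages/quickstart with Accept: text/markdown selects the markdown representation, while a plain browser request falls through to HTML and still carries the Link header pointing at /pages/quickstart.md.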

4.2 AGENTS.md at site root

A site conforming to environment E2 MUST publish a markdown document at /AGENTS.md containing, at minimum:

The file MAY contain additional information including CLI installation instructions, example workflows, credential-file conventions, and per-endpoint examples. A site MAY publish additional AGENTS.md documents at subpaths (e.g., /@user/project/AGENTS.md) to federate instructions for sub-sections of the site.

4.3 Programmatic account registration

A site conforming to environment E2 MUST provide at least one endpoint that creates a new account and returns a credential usable for subsequent authenticated requests, without requiring:

If a site requires proof-of-work or rate-limiting at the signup boundary, it SHOULD accept a hashcash-style token computed by the client as an alternative to interactive challenges.

Email, if collected at all, SHOULD be treated as an optional affiliation field and MUST NOT block initial use of the API.

Minimal signup
POST /api/v1/accounts HTTP/1.1
Content-Type: application/json

{"username": "jacob-agent"}

HTTP/1.1 201 Created
Content-Type: application/json

{
  "user_id": "u_01HNKV...",
  "username": "jacob-agent",
  "api_key": "pat_live_abc123...",
  "client_config": {
    "credential_path": "~/.yourapp/credentials.json",
    "mode": "0600"
  }
}
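
This specification does not define the format of the hashcash-style token mentioned above. As an illustration only, the sketch below assumes the server issues a challenge string and a difficulty expressed in leading zero bits, and that the token is a nonce such that SHA-256 of "challenge:nonce" meets that difficulty. The function names and the token layout are hypothetical.

```javascript
const { createHash } = require("node:crypto");

// Count the leading zero bits of a byte buffer.
function leadingZeroBits(buf) {
  let bits = 0;
  for (const byte of buf) {
    if (byte === 0) {
      bits += 8;
      continue;
    }
    // Math.clz32 counts zeros in a 32-bit word; a byte occupies the low 8 bits.
    bits += Math.clz32(byte) - 24;
    break;
  }
  return bits;
}

// Server side: one hash verifies the token, so checking stays cheap.
function verifyToken(challenge, nonce, difficultyBits) {
  const digest = createHash("sha256").update(`${challenge}:${nonce}`).digest();
  return leadingZeroBits(digest) >= difficultyBits;
}

// Client side: brute-force a nonce; expected work is about 2^difficultyBits hashes.
function mintToken(challenge, difficultyBits) {
  for (let nonce = 0; ; nonce++) {
    if (verifyToken(challenge, nonce, difficultyBits)) return nonce;
  }
}
```

An agent would call mintToken with the challenge it received and submit the resulting nonce alongside the signup request; the cost is the same whether a human's script or an agent computes it.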

4.4 Re-retrievable personal access tokens

A site conforming to environment E2 SHOULD provide a mechanism by which an authenticated user (via password or other long-lived credential) can retrieve a current valid PAT without creating a new account.

Rationale: coding agents frequently lose ephemeral credentials due to workspace resets or session boundaries. Forcing re-signup pollutes the username space and correlates multiple identities to the same human user.

Re-retrieval
POST /api/v1/auth/key HTTP/1.1
Content-Type: application/json

{"username": "jacob-agent", "password": "..."}

HTTP/1.1 200 OK
{"api_key": "pat_live_abc123..."}

A site implementing this pattern SHOULD document a canonical filesystem location and file mode in the signup response's client_config, and MAY provide shell or language-specific snippets.
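
A client can apply the client_config hint mechanically. The sketch below is one possible Node.js approach, assuming the signup response shape shown in §4.3; the function name saveCredentials, the homeDir parameter, and the choice to store only username and api_key are assumptions made for illustration.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper: persist the credential where client_config says to.
function saveCredentials(signupResponse, homeDir = os.homedir()) {
  const { credential_path, mode } = signupResponse.client_config;

  // Expand a leading "~/" against the home directory; leave other paths as-is.
  const target = credential_path.startsWith("~/")
    ? path.join(homeDir, credential_path.slice(2))
    : credential_path;

  // "0600" is an octal string in the example response.
  const fileMode = parseInt(mode, 8);

  fs.mkdirSync(path.dirname(target), { recursive: true, mode: 0o700 });
  const payload = {
    username: signupResponse.username,
    api_key: signupResponse.api_key,
  };
  fs.writeFileSync(target, JSON.stringify(payload, null, 2) + "\n", {
    mode: fileMode,
  });
  // The mode option only applies when the file is created, so set it explicitly too.
  fs.chmodSync(target, fileMode);
  return target;
}
```

On POSIX systems this leaves the file readable and writable only by its owner; file modes behave differently on Windows, which this sketch does not address.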

4.5 Magic login links for agent-to-human handoff

A site conforming to environment E2 MAY expose an endpoint that, given a valid PAT, returns a short-lived single-use URL that logs the user into the browser without exposing the PAT in the URL path or query string.

This primitive enables an agent-to-human handoff pattern: an agent completes background work, then surfaces a URL the human can click to review, confirm, or continue interactively.

Magic link
POST /api/v1/auth/magic-link HTTP/1.1
Authorization: Bearer pat_live_abc123...
Content-Type: application/json

{"next": "/settings", "ttl": 600}

HTTP/1.1 200 OK
{
  "login_url": "https://example.com/auth/magic/xyz...",
  "expires_at": "2026-04-16T17:24:00Z"
}

The returned URL MUST be single-use and MUST expire within 24 hours; the default TTL SHOULD be 10 minutes. The server MUST NOT accept the link after redemption.
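
The single-use and expiry rules can be enforced with a small server-side store. The following in-memory sketch is illustrative only: the class name MagicLinkStore, its method names, and the clamping of out-of-range TTLs are assumptions, and a production deployment would need shared, persistent storage.

```javascript
const { randomBytes } = require("node:crypto");

const MAX_TTL_SECONDS = 24 * 60 * 60; // hard ceiling from this section
const DEFAULT_TTL_SECONDS = 600; // 10-minute default

// Hypothetical in-memory store for magic-link tokens.
class MagicLinkStore {
  constructor() {
    this.links = new Map();
  }

  // Issue an unguessable token bound to a user and a post-login path.
  issue(userId, next, ttlSeconds = DEFAULT_TTL_SECONDS, nowMs = Date.now()) {
    const ttl = Math.min(Math.max(ttlSeconds, 1), MAX_TTL_SECONDS);
    const token = randomBytes(32).toString("hex");
    const expiresAtMs = nowMs + ttl * 1000;
    this.links.set(token, { userId, next, expiresAtMs });
    return { token, expiresAt: new Date(expiresAtMs).toISOString() };
  }

  // Redeem a token at most once; unknown, reused, or expired tokens yield null.
  redeem(token, nowMs = Date.now()) {
    const link = this.links.get(token);
    if (!link) return null;
    this.links.delete(token); // delete first so a second attempt always fails
    if (nowMs > link.expiresAtMs) return null;
    return { userId: link.userId, next: link.next };
  }

  // Drop every outstanding link for a user, e.g. on password change (§6).
  revokeAllFor(userId) {
    for (const [token, link] of this.links) {
      if (link.userId === userId) this.links.delete(token);
    }
  }
}
```

The PAT never appears in the login URL: the endpoint authenticates the bearer token, calls issue, and returns a URL that embeds only the random token.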

4.6 llms.txt and llms-full.txt

A site SHOULD publish a site description at /llms.txt following the conventions of llmstxt.org: an H1 title, a blockquote summary, followed by one or more H2 sections each containing a markdown link list. A site MAY additionally publish a full-content expansion at /llms-full.txt.

Advisory

Adoption of llms.txt by major LLM answer engines is measured at under 1%. Implementers are advised not to expect referral traffic from this surface. Its primary value is credibility, single-shot agent onboarding, and implementation discipline.
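
Because the /llms.txt structure described in this section is regular, it can be generated from site metadata. The sketch below shows one way to do so; the function name renderLlmsTxt and the input object shape are hypothetical, and the optional ": note" suffix after each link follows the llmstxt.org convention for link details.

```javascript
// Hypothetical helper: render an llms.txt document from structured metadata.
function renderLlmsTxt({ title, summary, sections }) {
  // H1 title, then a blockquote summary.
  const lines = [`# ${title}`, "", `> ${summary}`, ""];

  // One H2 per section, each followed by a markdown link list.
  for (const section of sections) {
    lines.push(`## ${section.heading}`, "");
    for (const link of section.links) {
      const note = link.note ? `: ${link.note}` : "";
      lines.push(`- [${link.title}](${link.url})${note}`);
    }
    lines.push("");
  }
  return lines.join("\n");
}

const example = renderLlmsTxt({
  title: "Example Site",
  summary: "A wiki with an agent-facing API.",
  sections: [
    {
      heading: "Docs",
      links: [
        { title: "Quickstart", url: "/pages/quickstart.md", note: "first steps" },
        { title: "Agent instructions", url: "/AGENTS.md" },
      ],
    },
  ],
});
```

The site name, summary, and links in the example call are placeholders.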

4.7 MCP discovery surfaces

A site conforming to environment E3 MUST publish at least one of:

  1. /.well-known/mcp/server-card.json (SEP-1649)
  2. /.well-known/mcp (SEP-1960)

Sites SHOULD publish both; neither proposal has merged into the core MCP specification, and clients vary in which they probe.

Advisory

Both SEPs remain in flight as of publication. A site should treat them as forward-compatible shims; future spec changes may rename these paths. Implementers SHOULD monitor spec.modelcontextprotocol.io.
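
Since clients vary in which path they probe, a client that wants to be tolerant can try both. The sketch below is an assumption-laden illustration: the function names, the probe order, and the injectable fetchImpl parameter are hypothetical, and the discovery document is returned unparsed beyond JSON decoding because neither SEP's schema is fixed.

```javascript
// The two well-known paths named in this section, in an arbitrary probe order.
const DISCOVERY_PATHS = ["/.well-known/mcp/server-card.json", "/.well-known/mcp"];

// Build absolute discovery URLs for an origin.
function discoveryUrls(origin) {
  return DISCOVERY_PATHS.map((p) => new URL(p, origin).href);
}

// Return the first discovery document that responds successfully, or null.
async function discoverMcp(origin, fetchImpl = fetch) {
  for (const url of discoveryUrls(origin)) {
    try {
      const res = await fetchImpl(url, { headers: { Accept: "application/json" } });
      if (res.ok) return { url, document: await res.json() };
    } catch {
      // Network or parse failure: fall through to the next candidate path.
    }
  }
  return null;
}
```

Passing fetchImpl explicitly makes the probe testable without network access; in normal use the global fetch is picked up by default.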

4.8 WebMCP tool registration

A site conforming to environment E1 SHOULD register typed tools on relevant pages via the W3C WebMCP draft API (navigator.modelContext.registerTool()), gated on feature detection.

Registration MUST be idempotent across navigations. Tools that depend on page context SHOULD auto-register on the pages where they apply and deregister elsewhere. Tool execute handlers SHOULD use credentials: 'include' when calling same-origin API routes so the user's existing session handles authentication.

Tools that mutate server state SHOULD require user confirmation in the agent's UI surface. Tools MUST NOT perform destructive writes silently.

Minimal registration
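<script defer>
// Feature-detect the draft WebMCP API; browsers without it skip registration.
if ("modelContext" in window.navigator) {
  window.navigator.modelContext.registerTool({
    name: "search_pages",
    description: "Search pages in the current site.",
    inputSchema: {
      type: "object",
      properties: { query: { type: "string" } },
      required: ["query"]
    },
    // Read-only tool: calls the same-origin search API with the user's session cookies.
    execute: async ({ query }) => {
      const r = await fetch("/api/v1/search?q=" + encodeURIComponent(query), {
        credentials: "include"
      });
      // Error bodies are JSON too (§4.10), so pass them to the agent with the status.
      const data = await r.json();
      const text = r.ok
        ? JSON.stringify(data)
        : "Search failed (HTTP " + r.status + "): " + JSON.stringify(data);
      return { content: [{ type: "text", text }] };
    }
  });
}
</script>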

4.9 Coding-agent-with-filesystem pattern (informative)

Many sites that organize their canonical content as files in a git repository can offer agents a higher-level interface than RPC: a clone of the repository, a set of filesystem tools, and a push token. The agent operates on files and commits back.

This pattern is informative rather than normative because it does not define a wire protocol; it reuses existing git infrastructure. A site wishing to offer it SHOULD:

4.10 Errors, CORS, and audit headers

Every 4xx response (including 429) on an agent-facing endpoint MUST have a Content-Type of application/json (or application/problem+json per RFC 7807, since obsoleted by RFC 9457) and MUST include a machine-readable error body with at least the following fields:

{
  "error": "rate_limited",
  "message": "Per-key quota exceeded.",
  "retry_after_seconds": 60,
  "docs_url": "https://example.com/api/docs#rate-limits"
}

HTTP 429 responses MUST include a Retry-After header. Responses from quota-limited endpoints SHOULD include X-RateLimit-Remaining and X-RateLimit-Reset headers.
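
The error body and the rate-limit headers are meant to be consumed together. The sketch below pairs a server-side builder with a client-side backoff calculation; the function names and the plain-object response shape are hypothetical, and the one-second fallback delay is an assumption rather than a requirement of this specification.

```javascript
// Server side: build a 429 response carrying both the headers and the JSON body.
function rateLimitedResponse({ retryAfterSeconds, remaining = 0, resetEpochSeconds, docsUrl }) {
  return {
    status: 429,
    headers: {
      "Content-Type": "application/json",
      "Retry-After": String(retryAfterSeconds),
      "X-RateLimit-Remaining": String(remaining),
      "X-RateLimit-Reset": String(resetEpochSeconds),
    },
    body: JSON.stringify({
      error: "rate_limited",
      message: "Per-key quota exceeded.",
      retry_after_seconds: retryAfterSeconds,
      docs_url: docsUrl,
    }),
  };
}

// Client side: prefer the Retry-After header (seconds form), then the body field.
function retryDelayMs(response) {
  const header = Number(response.headers["Retry-After"]);
  if (Number.isFinite(header) && header >= 0) return header * 1000;
  const body = JSON.parse(response.body);
  return (body.retry_after_seconds ?? 1) * 1000;
}
```

Retry-After may also be sent as an HTTP date; this sketch handles only the delay-in-seconds form and falls back to the body otherwise.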

Agent-facing endpoints (those documented in AGENTS.md, llms.txt, or the MCP server card) MUST send Access-Control-Allow-Origin: * on GET responses, or reflect the requesting origin if credentials are required.
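
The wildcard and reflected-origin cases differ because browsers reject a wildcard Access-Control-Allow-Origin on credentialed requests. The sketch below illustrates the branching; the function name corsHeaders and its options are hypothetical, and the allowedOrigins allowlist is an added assumption, since reflecting arbitrary origins on credentialed responses is generally unsafe.

```javascript
// Hypothetical helper: choose CORS response headers for an agent-facing GET.
function corsHeaders(requestOrigin, { credentialed, allowedOrigins = [] }) {
  // Public, non-credentialed responses can use the wildcard.
  if (!credentialed) {
    return { "Access-Control-Allow-Origin": "*" };
  }

  // Credentialed responses must name the exact origin and opt in to credentials.
  if (requestOrigin && allowedOrigins.includes(requestOrigin)) {
    return {
      "Access-Control-Allow-Origin": requestOrigin,
      "Access-Control-Allow-Credentials": "true",
      // The response now depends on the Origin header, so caches must key on it.
      Vary: "Origin",
    };
  }

  // Unknown origin on a credentialed route: send no CORS headers at all.
  return {};
}
```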

Agent-facing endpoints SHOULD accept and log (but not enforce) a soft header X-Agent-Name identifying the agent software. Implementations SHOULD retain this value in audit logs and MAY expose it to users of the resource being acted upon.
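
"Accept and log but not enforce" means the header value is recorded and never used for authorization. The sketch below shows one way to capture it safely; the function names, the audit record fields, and the 128-character cap are hypothetical choices, not part of this specification.

```javascript
// Hypothetical limit: this specification does not set a maximum length.
const MAX_AGENT_NAME_LENGTH = 128;

// Read X-Agent-Name from a plain headers object, or return null if absent.
function readAgentName(headers) {
  // Header names are case-insensitive, so normalise before comparing.
  const entry = Object.entries(headers).find(
    ([name]) => name.toLowerCase() === "x-agent-name"
  );
  if (!entry || typeof entry[1] !== "string") return null;

  // Strip control characters so the value cannot forge extra log lines.
  const cleaned = entry[1].replace(/[\u0000-\u001f\u007f]/g, "").trim();
  return cleaned === "" ? null : cleaned.slice(0, MAX_AGENT_NAME_LENGTH);
}

// Build an audit-log entry; a missing header never causes a rejection.
function auditRecord(request, userId, nowMs = Date.now()) {
  return {
    at: new Date(nowMs).toISOString(),
    user_id: userId,
    method: request.method,
    path: request.path,
    agent_name: readAgentName(request.headers), // logged only, never enforced
  };
}
```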

5 Badges

A conforming site MAY display per-environment badges reflecting the surfaces it has implemented. Badges are awarded per environment, not cumulatively:

| Badge | Criteria (all MUST be met) |
| --- | --- |
| Markdown Native | §4.1 |
| Headless Ready | §4.2, §4.3, §4.10 |
| MCP Server | §4.7 (at least one SEP) |
| WebMCP Ready | §4.8, §4.1, §4.10 |
| Agent UX | E4 minimum, plus a published "For Agents" nav entry and a hand-off primitive |

A future revision of this specification will define an automated verifier (/check?url=…) to award badges based on crawled evidence. Until that verifier ships, badges are self-asserted and implementations SHOULD link to the specific URLs demonstrating each badge's criteria.
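
Until a verifier exists, a site can at least compute its self-asserted badges consistently from the criteria listed in this section. The sketch below is illustrative: the function name earnedBadges and the criterion keys (section numbers as strings, plus three invented labels for the Agent UX criteria) are hypothetical.

```javascript
// Badge criteria transcribed from this section. The Agent UX keys are
// invented labels for criteria that have no section number of their own.
const BADGE_CRITERIA = {
  "Markdown Native": ["4.1"],
  "Headless Ready": ["4.2", "4.3", "4.10"],
  "MCP Server": ["4.7"],
  "WebMCP Ready": ["4.8", "4.1", "4.10"],
  "Agent UX": ["E4-minimum", "for-agents-nav", "hand-off"],
};

// Return the badges whose criteria are all present in the implemented list.
function earnedBadges(implemented) {
  const have = new Set(implemented);
  return Object.entries(BADGE_CRITERIA)
    .filter(([, needs]) => needs.every((need) => have.has(need)))
    .map(([badge]) => badge);
}
```

Badges are evaluated independently, so a site implementing §4.1, §4.2, §4.3, and §4.10 earns Markdown Native and Headless Ready without implying anything about the other three.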

6 Security considerations

Signup abuse. Programmatic signup (§4.3) expands the abuse surface. Sites SHOULD rate-limit signup by IP, log X-Agent-Name, and reject obviously automated burst patterns. A hashcash token offers a cost gradient that humans and agents handle equivalently.

PAT exposure. Re-retrieval (§4.4) trades one-shot secrecy for operational resilience. Sites SHOULD allow users to rotate all PATs on demand and SHOULD scope PATs narrowly.

Magic link attacks. The endpoint in §4.5 is an account-takeover vector if TTL or single-use semantics are not enforced. Implementations MUST invalidate a link on redemption and MUST expire all outstanding links on password change.

WebMCP write amplification. Browser-registered tools inherit the user's session (§4.8). Malicious pages can register tools; the browser SHOULD surface the tool provenance to the user. Servers SHOULD rate-limit per session even when the request appears human-origin.

Forward compatibility with agent identity. This specification does not define an agent-identity protocol. A future version is expected to profile either RFC 8693 token exchange or an equivalent primitive. Sites adopting §4.10 (audit-logged X-Agent-Name) are positioned to migrate without breaking clients.

7 Example conforming sites

The following are self-described conforming implementations. They are listed as examples, not endorsements; inclusion does not imply automated verification.

| Site | Environments | Notes |
| --- | --- | --- |
| WikiHub | E1 (partial), E2, E3, E4 | Reference implementation. Ships all ten patterns. Source of the coding-agent pattern in §4.9 via its Curator feature. |
| ListHub | E1, E2, E3, E4 | Earlier sibling project. WebMCP (§4.8) implementation with page-context tools. /AGENTS.md, /llms.txt, /.well-known/mcp/server-card.json. |

Implementers who wish to be listed as conforming examples may open a pull request against github.com/tmad4000/agentfirst-web.

8 Normative references

Informative references

Editorial note. This is a living draft. Comments and pull requests are welcome at the spec repository. A companion non-normative overview with a reference implementation is available at the landing page.