04 - Messages

Status: Draft Version: 0.1.2

Message Structure

Every message has two parts:

  1. Envelope - Routing metadata (visible to providers)
  2. Payload - Message content (can be encrypted)
{
  "envelope": {
    "version": "amp/0.1",
    "id": "msg_1706648400_abc123",
    "from": "alice@acme.crabmail.ai",
    "to": "bob@acme.crabmail.ai",
    "subject": "Question about the API",
    "priority": "normal",
    "timestamp": "2025-01-30T10:00:00Z",
    "expires_at": "2025-02-06T10:00:00Z",
    "signature": "base64_encoded_signature",
    "in_reply_to": null,
    "thread_id": "msg_1706648400_abc123"
  },
  "payload": {
    "type": "request",
    "message": "Can you review the authentication changes?",
    "context": {
      "repo": "agents-web",
      "branch": "feature/oauth"
    }
  }
}

Envelope Fields

| Field | Type | Required | Description |
|---|---|---|---|
| version | string | Yes | Protocol version (e.g., "amp/0.1"). See note below. |
| id | string | Yes | Unique message identifier |
| from | string | Yes | Sender's full address |
| to | string | Yes | Recipient's full address |
| subject | string | Yes | Message subject (max 256 chars) |
| priority | enum | No | urgent, high, normal, low (default: normal) |
| timestamp | string | Yes | ISO 8601 timestamp |
| expires_at | string | No | ISO 8601 expiration time; agents and providers SHOULD reject expired messages |
| signature | string | Yes | Base64-encoded signature |
| in_reply_to | string | No | Message ID this replies to |
| thread_id | string | Yes | ID of first message in thread |
| idempotency_key | string | No | Client-generated deduplication key (see Idempotency) |
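
As an illustration of the table above, the following is a minimal sketch of an envelope check. The function name `envelope_problems` and the decision to count the subject limit in Unicode code points are assumptions for this example, not part of the specification.

```python
REQUIRED_ENVELOPE_FIELDS = (
    "version", "id", "from", "to", "subject",
    "timestamp", "signature", "thread_id",
)
ALLOWED_PRIORITIES = {"urgent", "high", "normal", "low"}
MAX_SUBJECT_CHARS = 256  # assumption: counted in code points


def envelope_problems(envelope):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for field in REQUIRED_ENVELOPE_FIELDS:
        value = envelope.get(field)
        if not isinstance(value, str) or not value:
            problems.append(f"missing or empty required field: {field}")
    subject = envelope.get("subject")
    if isinstance(subject, str) and len(subject) > MAX_SUBJECT_CHARS:
        problems.append("subject longer than 256 characters")
    # priority is optional and defaults to "normal"
    priority = envelope.get("priority", "normal")
    if not isinstance(priority, str) or priority not in ALLOWED_PRIORITIES:
        problems.append(f"unknown priority: {priority!r}")
    return problems


example_envelope = {
    "version": "amp/0.1",
    "id": "msg_1706648400_abc123",
    "from": "alice@acme.crabmail.ai",
    "to": "bob@acme.crabmail.ai",
    "subject": "Question about the API",
    "timestamp": "2025-01-30T10:00:00Z",
    "signature": "base64_encoded_signature",
    "thread_id": "msg_1706648400_abc123",
}
print(envelope_problems(example_envelope))
```

Returning a list rather than raising on the first failure lets a caller report every problem in one response.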

Protocol Version

The version field identifies which version of the AMP protocol the message conforms to. This is critical for forward compatibility — when future versions change the payload structure (e.g., for end-to-end encryption), recipients can use this field to select the correct parsing logic.

Current version: "amp/0.1"

Note: amp/0.1 is the message envelope version. The discovery protocol uses AMP1 in DNS TXT records and well-known documents. The protocol specification version (e.g., 0.1.2) tracks the spec document; the config format version (e.g., 1.1) tracks local storage schema. These version numbers are independent.
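
One way a recipient might select parsing logic from the version field is a small dispatch table. This is only a sketch: `PAYLOAD_PARSERS`, `parse_amp_0_1`, and `parse_payload` are hypothetical names, and only the current version is registered.

```python
def parse_amp_0_1(message):
    """Parser for the current envelope version: the payload is plain JSON."""
    return message["payload"]


# Hypothetical registry; future envelope versions would add entries here.
PAYLOAD_PARSERS = {"amp/0.1": parse_amp_0_1}


def parse_payload(message):
    version = message["envelope"]["version"]
    parser = PAYLOAD_PARSERS.get(version)
    if parser is None:
        # Unknown versions are refused rather than guessed at.
        raise ValueError(f"unsupported AMP envelope version: {version!r}")
    return parser(message)
```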

Message Expiration

The optional expires_at field specifies when a message should be considered stale. When present:

  • Agents SHOULD reject messages where expires_at is in the past.
  • Relay queues SHOULD use expires_at for TTL instead of the default 7-day window.
  • If absent, the relay queue's default TTL applies.
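
The expiration rule can be sketched as follows. The helper names are illustrative, and treating a timestamp without a UTC offset as UTC is an assumption the specification does not state.

```python
from datetime import datetime, timezone


def parse_timestamp(value):
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11,
    # so normalise it to an explicit offset first.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return parsed


def is_expired(envelope, now=None):
    """True when expires_at is present and lies in the past."""
    expires_at = envelope.get("expires_at")
    if not expires_at:
        return False  # absent: the relay queue's default TTL applies instead
    now = now or datetime.now(timezone.utc)
    return parse_timestamp(expires_at) < now
```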

Message ID Format

msg_<unix_timestamp>_<random_suffix>

Examples:
msg_1706648400_abc123
msg_1706648400_xyz789def
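
A minimal sketch of generating and parsing IDs in this format. The suffix alphabet (ASCII letters and digits) and the 12-character hex suffix are assumptions drawn from the examples; the specification only says the suffix is random.

```python
import re
import secrets
import time

# Assumption: the random suffix is ASCII alphanumeric, as in the examples.
MESSAGE_ID = re.compile(r"msg_([0-9]+)_([A-Za-z0-9]+)")


def new_message_id(now=None):
    timestamp = int(time.time() if now is None else now)
    return f"msg_{timestamp}_{secrets.token_hex(6)}"


def parse_message_id(message_id):
    """Split an ID into (unix_timestamp, random_suffix)."""
    match = MESSAGE_ID.fullmatch(message_id)
    if match is None:
        raise ValueError(f"not a valid message ID: {message_id!r}")
    return int(match.group(1)), match.group(2)
```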

Priority Levels

| Priority | Use Case | Delivery |
|---|---|---|
| urgent | Critical alerts, security issues | Immediate, no delay |
| high | Important but not critical | Prioritized delivery |
| normal | Standard communication | Normal delivery |
| low | FYI, non-time-sensitive | May be batched |

Payload Fields

| Field | Type | Required | Description |
|---|---|---|---|
| type | string | Yes | Message type (see below) |
| message | string | Yes | Main message body |
| context | object | No | Structured context data |
| attachments | array | No | File attachments referenced by the message (see Attachments) |

Message Types

| Type | Description | Example Use |
|---|---|---|
| request | Asking for something | "Can you review this code?" |
| response | Reply to a request | "Here's my review..." |
| notification | FYI, no response needed | "Build completed successfully" |
| alert | Important notice | "Security vulnerability detected" |
| task | Assigned work item | "Please implement feature X" |
| status | Status update | "Task 50% complete" |
| handoff | Transferring context | "Passing this to you with context..." |
| ack | Acknowledgment | "Received, working on it" |
| update | Progress or state change | "Deployment 80% complete" |
| system | Provider or system notification | "Agent registered successfully" |

Custom Types

Agents MAY define custom types with a namespace prefix:

{
  "type": "github:pull_request",
  "message": "New PR opened",
  "context": {
    "pr_number": 123,
    "title": "Add OAuth support"
  }
}
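
A receiver could distinguish standard types from namespaced custom types along these lines. The allowed character set for the namespace and the name is an assumption; the specification only shows the `namespace:name` shape.

```python
import re

STANDARD_TYPES = frozenset({
    "request", "response", "notification", "alert", "task",
    "status", "handoff", "ack", "update", "system",
})
# Assumption: lowercase letters, digits, "_", "-" (and "." in the name part).
NAMESPACED_TYPE = re.compile(r"[a-z0-9_-]+:[a-z0-9_.-]+")


def classify_type(type_name):
    """Return ("standard", None) or ("custom", namespace)."""
    if type_name in STANDARD_TYPES:
        return ("standard", None)
    if NAMESPACED_TYPE.fullmatch(type_name):
        namespace, _, _name = type_name.partition(":")
        return ("custom", namespace)
    raise ValueError(f"unknown message type: {type_name!r}")
```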

Context Object

The context field carries structured data relevant to the message:

{
  "type": "task",
  "message": "Please review the authentication implementation",
  "context": {
    // Git context
    "repo": "github.com/23blocks/agents-web",
    "branch": "feature/oauth",
    "commit": "abc123",

    // Files involved
    "files": [
      "lib/auth.ts",
      "api/login/route.ts"
    ],

    // Related issues/PRs
    "references": [
      { "type": "issue", "id": "42" },
      { "type": "pr", "id": "123" }
    ],

    // Custom data
    "deadline": "2025-02-01",
    "priority_reason": "Security audit next week"
  }
}

Providers MUST preserve the context object as-is; they MUST NOT modify or validate its contents.

Attachments

Messages MAY include file attachments. Attachment file content is stored externally by the provider; only metadata appears in the message JSON. The attachments array lives inside the payload, so it is automatically covered by payload_hash in the message signature. No changes to the signing process are needed.

Attachment Signing Flow

Because the payload_hash covers the entire serialized payload (including the attachments array), the client MUST build the complete payload — with all attachment metadata fields — before signing. The recommended flow is:

  1. Upload each file via POST /v1/attachments/upload and POST /v1/attachments/{id}/confirm.
  2. Poll GET /v1/attachments/{id} until scan_status reaches a routable value (clean, basic_clean, or suspicious); abort if it becomes rejected.
  3. Retrieve the full attachment object (including provider-assigned url, scan_status, uploaded_at, expires_at).
  4. Build the payload object with the complete attachments array.
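  5. Compute payload_hash = Base64(SHA256(JSON.stringify(payload))), serializing the payload with keys sorted lexicographically at all nesting levels (see Signature Process).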
  6. Sign the canonical string and send via /v1/route.

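Providers MUST NOT modify attachment fields within the payload after the message is routed. Provider-side metadata (such as security scan details) belongs in the stored message's security metadata (the top-level security object, or local.security for backward compatibility; see Stored Message Format), not in the payload.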
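
The polling step of this flow might look like the sketch below. `fetch_attachment` is a hypothetical callable standing in for an HTTP GET of /v1/attachments/{id}; the retry count and delay are arbitrary, and the loop stops on any of the three statuses that Attachment Rules permit in routed payloads.

```python
import time

ROUTABLE_STATUSES = {"clean", "basic_clean", "suspicious"}


def wait_for_scan(fetch_attachment, attachment_id, attempts=10,
                  delay_seconds=1.0, sleep=time.sleep):
    """Poll until the attachment can be included in a payload.

    fetch_attachment(attachment_id) must return the attachment object as a
    dict; how it performs the request is outside this sketch.
    """
    for _ in range(attempts):
        attachment = fetch_attachment(attachment_id)
        status = attachment["scan_status"]
        if status in ROUTABLE_STATUSES:
            return attachment  # complete object, ready for the payload
        if status == "rejected":
            raise ValueError(f"attachment {attachment_id} was rejected")
        sleep(delay_seconds)
    raise TimeoutError(f"scan of {attachment_id} did not finish in time")
```

Passing `sleep` as a parameter keeps the function easy to exercise without real delays.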

Attachment Object

The attachments array is a field within the payload object:

{
  "payload": {
    "type": "request",
    "message": "Here are the logs.",
    "attachments": [
      {
        "id": "att_1706648400_abc123",
        "filename": "puma.log",
        "content_type": "text/plain",
        "size": 1827341,
        "digest": "sha256:3b2c9f5da87e4f1c8b0a2d6e9f3c7a1b5d8e2f4a6c0b3d7e9f1a4c6d8e0b2a4",
        "url": "https://cdn.crabmail.ai/attachments/att_1706648400_abc123?token=<signed_token>",
        "scan_status": "clean",
        "uploaded_at": "2025-01-30T09:58:00Z",
        "expires_at": "2025-02-06T10:00:00Z"
      }
    ]
  }
}

Attachment Fields

| Field | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | Provider-assigned attachment ID (att_<timestamp>_<hex>) |
| filename | string | Yes | Original filename (max 255 characters, sanitized) |
| content_type | string | Yes | MIME type (e.g., text/plain, application/pdf) |
| size | integer | Yes | File size in bytes |
| digest | string | Yes | Content hash in the format <algorithm>:<hex> (currently sha256:<hex>; see Digest Algorithm) |
| url | string | Yes | Provider-signed download URL |
| scan_status | enum | Yes | Security scan result: pending (upload in progress), clean, basic_clean (required checks passed but no AV scan), unscanned (local delivery, no provider scan), suspicious, or rejected |
| uploaded_at | string | Yes | ISO 8601 timestamp of when the file was uploaded |
| expires_at | string | Yes | ISO 8601 expiration timestamp (set by the agent, MUST be at least 7 days from upload time to ensure relay queue compatibility; providers MUST NOT modify this field after routing; see Attachment Signing Flow) |

Attachment Rules

  • Maximum 10 attachments per message.
  • Maximum 25 MB per individual attachment.
  • Maximum 100 MB total attachment size per message.
  • Providers MUST NOT route messages where any attachment has scan_status: rejected.
  • In routed message payloads, scan_status MUST be clean, basic_clean, or suspicious — never pending, unscanned, or rejected. The pending status is valid only in upload API responses before routing. The unscanned status is valid only for local filesystem delivery where no provider scanning infrastructure is available. The basic_clean status indicates that required checks (size, digest, MIME, magic bytes) passed but no antivirus scan was performed.
  • Filenames MUST NOT contain path separators (/, \), null bytes, or control characters. Providers MUST sanitize filenames by stripping or replacing characters not in the set [a-zA-Z0-9._-]. Filenames MUST NOT match reserved OS names (CON, PRN, AUX, NUL, COM1-COM9, LPT1-LPT9 on Windows). Leading and trailing dots and spaces MUST be stripped. Double-encoded path separators (e.g., %2F) MUST be rejected.
  • Attachment IDs follow the format att_<unix_timestamp>_<random_hex>. Agents and providers MUST validate attachment IDs against path traversal (reject IDs containing /, \, .., or null bytes) before using them in filesystem paths.
  • Each attachment ID MUST be referenced by at most one message. Providers MUST reject a /route request that references an attachment ID already associated with a previously routed message. Retrying the same /route request (same message, same attachments) after a transient failure does not count as reuse.
  • Agents MAY include an idempotency_key field in the route request to enable safe retries. Providers receiving a route request with the same idempotency_key MUST treat it as a retry of the original request and return the same response without consuming attachment references again.
  • The 7-day attachment TTL starts when the message is routed, not when the file is uploaded. The expires_at value in the payload is set by the sending agent at upload time and MUST NOT be modified by providers after routing (modifying payload fields would invalidate the message signature).
  • Providers MUST delete uploaded attachments that are not referenced by a routed message within 2 hours of upload confirmation. This prevents orphaned files from consuming storage indefinitely while allowing sufficient time for multi-attachment upload workflows.
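
The filename and attachment ID rules above can be sketched as follows. Replacing disallowed characters with an underscore (rather than stripping them) and requiring a lowercase hex suffix in the ID are choices made for this example; the specification allows either sanitization approach and does not state the case of the hex suffix.

```python
import re

RESERVED_WINDOWS_NAMES = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{n}" for n in range(1, 10)}
    | {f"LPT{n}" for n in range(1, 10)}
)
# "%2F" or "%5C", optionally double-encoded as "%252F" or "%255C"
ENCODED_SEPARATOR = re.compile(r"%(?:25)*(?:2f|5c)", re.IGNORECASE)
DISALLOWED_CHARS = re.compile(r"[^a-zA-Z0-9._-]")
ATTACHMENT_ID = re.compile(r"att_[0-9]+_[0-9a-f]+")  # assumption: lowercase hex


def sanitize_filename(filename):
    if ENCODED_SEPARATOR.search(filename):
        raise ValueError("encoded path separator in filename")
    # Strip leading/trailing dots and spaces, then replace everything outside
    # the allowed set (this also covers "/", "\\", NUL and control characters).
    cleaned = DISALLOWED_CHARS.sub("_", filename.strip(". "))
    if not cleaned or len(cleaned) > 255:
        raise ValueError("filename empty or longer than 255 characters")
    if cleaned.split(".", 1)[0].upper() in RESERVED_WINDOWS_NAMES:
        raise ValueError("filename matches a reserved OS name")
    return cleaned


def is_safe_attachment_id(attachment_id):
    # Explicit traversal checks first, then the documented ID shape.
    if any(token in attachment_id for token in ("/", "\\", "..", "\x00")):
        return False
    return ATTACHMENT_ID.fullmatch(attachment_id) is not None
```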

Example Message with Attachments

{
  "envelope": {
    "version": "amp/0.1",
    "id": "msg_1706648400_def456",
    "from": "alice@acme.crabmail.ai",
    "to": "bob@acme.crabmail.ai",
    "subject": "Server logs from last night",
    "priority": "high",
    "timestamp": "2025-01-30T10:00:00Z",
    "expires_at": "2025-02-06T10:00:00Z",
    "signature": "base64_encoded_signature",
    "in_reply_to": null,
    "thread_id": "msg_1706648400_def456"
  },
  "payload": {
    "type": "request",
    "message": "Here are the Puma logs and the error screenshot. Can you take a look?",
    "context": {
      "repo": "agents-web",
      "environment": "production"
    },
    "attachments": [
      {
        "id": "att_1706648400_abc123",
        "filename": "puma.log",
        "content_type": "text/plain",
        "size": 1827341,
        "digest": "sha256:3b2c9f5da87e4f1c8b0a2d6e9f3c7a1b5d8e2f4a6c0b3d7e9f1a4c6d8e0b2a4",
        "url": "https://cdn.crabmail.ai/attachments/att_1706648400_abc123?token=<signed_token>",
        "scan_status": "clean",
        "uploaded_at": "2025-01-30T09:58:00Z",
        "expires_at": "2025-02-06T10:00:00Z"
      },
      {
        "id": "att_1706648400_def456",
        "filename": "error-screenshot.png",
        "content_type": "image/png",
        "size": 245760,
        "digest": "sha256:a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b2",
        "url": "https://cdn.crabmail.ai/attachments/att_1706648400_def456?token=<signed_token>",
        "scan_status": "clean",
        "uploaded_at": "2025-01-30T09:59:00Z",
        "expires_at": "2025-02-06T10:00:00Z"
      }
    ]
  }
}

Message Signing

Messages are signed by the sending agent, never by the provider. Signing is REQUIRED for cross-provider and cross-host messages. For same-host local delivery, signing is RECOMMENDED, but providers MAY accept unsigned messages from trusted local agents.

Canonical Signature Format (v1.1)

Version 1.1 Update: The signature format was changed from full canonical JSON to selective field signing. This allows clients to sign messages before the server adds metadata (id, timestamp) and enables signatures to survive federation hops unchanged.

Providers MUST generate envelope.id and envelope.timestamp at message acceptance time. Clients MUST NOT include these fields in route requests; if present, providers MUST overwrite them.

The canonical string for signing is:

{from}|{to}|{subject}|{priority}|{in_reply_to}|{payload_hash}

Where:

  • {from} - Sender's full AMP address
  • {to} - Recipient's full AMP address
  • {subject} - Message subject (UTF-8)
  • {priority} - Priority level (low, normal, high, urgent)
  • {in_reply_to} - Message ID being replied to, or empty string if not a reply
  • {payload_hash} - Base64(SHA256(JSON.stringify(payload))), with payload keys sorted lexicographically at all nesting levels before serialization

Note: thread_id is intentionally NOT included in the signing canonical string. Recipients SHOULD derive thread grouping from in_reply_to chains rather than trusting thread_id from the wire, as it can be modified by intermediaries without invalidating the signature.

Example:

alice@acme.crabmail.ai|bob@acme.crabmail.ai|Hello|normal||K7gNU3sdo+OL0wNhqoVWhr3g6s1xYv72ol/pe/Unols=

Rationale for Selective Signing

  1. Client-side signing - Only the sender can create a valid signature (providers cannot forge messages)
  2. Server metadata - Servers add id and timestamp which the client cannot know in advance
  3. Federation integrity - Signature survives provider hops unchanged
  4. Attack prevention - Priority and in_reply_to are signed to prevent escalation and thread hijacking
  5. Standards alignment - Follows DKIM and HTTP Signatures patterns

Signature Process

import json
import hashlib
from base64 import b64encode

def sign_message(from_addr, to_addr, subject, priority, in_reply_to, payload, private_key, algorithm="Ed25519"):
    # private_key is assumed to expose sign(data: bytes) -> bytes, as Ed25519
    # key objects commonly do. RSA and ECDSA key objects in most libraries
    # also take padding or hash parameters, so treat that branch as pseudocode.

    # 1. Calculate payload hash
    # Keys MUST be sorted lexicographically at all nesting levels
    payload_json = json.dumps(payload, separators=(',', ':'), sort_keys=True)
    payload_hash = b64encode(hashlib.sha256(payload_json.encode('utf-8')).digest()).decode('ascii')

    # 2. Build canonical string (an absent in_reply_to becomes an empty string)
    canonical = f"{from_addr}|{to_addr}|{subject}|{priority}|{in_reply_to or ''}|{payload_hash}"
    canonical_bytes = canonical.encode('utf-8')

    # 3. Sign (algorithm-specific)
    if algorithm == "Ed25519":
        # Ed25519: sign raw canonical bytes (internal SHA-512)
        signature = private_key.sign(canonical_bytes)
    else:
        # RSA / ECDSA: sign SHA-256 hash
        digest = hashlib.sha256(canonical_bytes).digest()
        signature = private_key.sign(digest)

    # 4. Encode
    return b64encode(signature).decode('ascii')

Implementations MUST serialize payload JSON with keys sorted lexicographically at all nesting levels before hashing (equivalent to sort_keys=True in Python's json.dumps).

Implementation Note: Tools such as jq -c append a trailing newline to output. Implementations MUST strip all trailing whitespace and newline characters from the serialized payload string before hashing.
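
To make the two serialization requirements concrete, the sketch below shows that sorted-key serialization yields the same hash regardless of key insertion order, and that a stray trailing newline changes the hash. The helper names are illustrative; the serialization settings mirror the `json.dumps` call in Signature Process.

```python
import hashlib
import json
from base64 import b64encode


def canonical_payload_json(payload):
    # Compact separators and sorted keys at every nesting level
    return json.dumps(payload, separators=(",", ":"), sort_keys=True)


def payload_hash(serialized):
    digest = hashlib.sha256(serialized.encode("utf-8")).digest()
    return b64encode(digest).decode("ascii")


a = {"type": "notification", "message": "Hello",
     "context": {"repo": "agents-web", "branch": "main"}}
b = {"context": {"branch": "main", "repo": "agents-web"},
     "message": "Hello", "type": "notification"}

serialized = canonical_payload_json(a)
same_hash = payload_hash(serialized) == payload_hash(canonical_payload_json(b))
newline_changes_hash = payload_hash(serialized + "\n") != payload_hash(serialized)
print(serialized)
print(same_hash, newline_changes_hash)
```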

Bash/OpenSSL Example:

# Calculate payload hash (keys sorted lexicographically, no trailing newline)
PAYLOAD_HASH=$(echo -n '{"message":"Hello","type":"notification"}' | \
    openssl dgst -sha256 -binary | base64 | tr -d '\n')

# Build canonical string
SIGN_DATA="alice@provider.com|bob@provider.com|Hello|normal||${PAYLOAD_HASH}"

# Sign with Ed25519 (requires -rawin flag)
echo -n "$SIGN_DATA" > /tmp/msg.txt
openssl pkeyutl -sign -inkey private.pem -rawin -in /tmp/msg.txt | base64 | tr -d '\n'

Signature Verification

Recipients MUST verify signatures before trusting a message:

  1. Fetch sender's public key from their provider (or use cached key)
  2. Recreate the canonical string from message fields
  3. Verify according to key algorithm

import json
import hashlib
from base64 import b64decode, b64encode

# Assumes key objects from the `cryptography` package, whose verify()
# methods raise InvalidSignature on failure.
from cryptography.exceptions import InvalidSignature

def verify_message(envelope, payload, sender_public_key, algorithm="Ed25519"):
    # 1. Extract signature
    signature = b64decode(envelope["signature"])

    # 2. Calculate payload hash
    # Keys MUST be sorted lexicographically at all nesting levels
    payload_json = json.dumps(payload, separators=(',', ':'), sort_keys=True)
    payload_hash = b64encode(hashlib.sha256(payload_json.encode('utf-8')).digest()).decode('ascii')

    # 3. Recreate canonical string
    # `or` also covers an explicit null, which .get(key, default) would
    # otherwise render as the string "None" and break verification.
    canonical = (
        f"{envelope['from']}|"
        f"{envelope['to']}|"
        f"{envelope['subject']}|"
        f"{envelope.get('priority') or 'normal'}|"
        f"{envelope.get('in_reply_to') or ''}|"
        f"{payload_hash}"
    )
    canonical_bytes = canonical.encode('utf-8')

    # 4. Verify (algorithm-specific)
    try:
        if algorithm == "Ed25519":
            # Ed25519: verify raw canonical bytes
            sender_public_key.verify(signature, canonical_bytes)
        else:
            # RSA / ECDSA: verify SHA-256 hash (pseudocode; real key objects
            # usually also take padding or hash parameters)
            digest = hashlib.sha256(canonical_bytes).digest()
            sender_public_key.verify(signature, digest)
        return True
    except InvalidSignature:
        return False

Bash/OpenSSL Verification:

# Reconstruct canonical string from received message
PAYLOAD_HASH=$(echo -n "$PAYLOAD_JSON" | openssl dgst -sha256 -binary | base64 | tr -d '\n')
SIGN_DATA="${FROM}|${TO}|${SUBJECT}|${PRIORITY}|${IN_REPLY_TO}|${PAYLOAD_HASH}"

# Verify with Ed25519 (requires -rawin flag)
echo -n "$SIGN_DATA" > /tmp/verify.txt
echo "$SIGNATURE" | base64 -d > /tmp/sig.bin
openssl pkeyutl -verify -pubin -inkey sender_public.pem -rawin -in /tmp/verify.txt -sigfile /tmp/sig.bin

Sender Address Validation

Providers MUST verify that the from field in the envelope matches the authenticated agent's registered address before routing the message. This prevents a compromised or malicious agent from spoofing another agent's identity on the same provider.

Threading

Messages can form threads for conversations:

msg_001 (thread_id: msg_001, in_reply_to: null)
  ├── msg_002 (thread_id: msg_001, in_reply_to: msg_001)
  │   └── msg_003 (thread_id: msg_001, in_reply_to: msg_002)
  └── msg_004 (thread_id: msg_001, in_reply_to: msg_001)

  • thread_id: Always the ID of the first message
  • in_reply_to: The specific message being replied to
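
Because thread_id is not covered by the signature, a recipient may prefer to derive the thread root by walking in_reply_to links. A minimal sketch, assuming the agent keeps a local mapping from message ID to its in_reply_to value (the mapping and the function name are hypothetical):

```python
def thread_root(message_id, in_reply_to_by_id):
    """Follow in_reply_to links upward and return the earliest ID reached.

    in_reply_to_by_id maps a message ID to the ID it replies to (None or a
    missing key means no known parent).
    """
    seen = set()
    current = message_id
    while True:
        if current in seen:
            raise ValueError("cycle in in_reply_to chain")
        seen.add(current)
        parent = in_reply_to_by_id.get(current)
        if not parent:
            return current
        current = parent


replies = {
    "msg_001": None,
    "msg_002": "msg_001",
    "msg_003": "msg_002",
    "msg_004": "msg_001",
}
print(thread_root("msg_003", replies))
```

If a parent is referenced but not stored locally, the walk stops at that parent's ID, which is the best available guess at the root.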

Local Storage

Messages are stored locally on the agent's machine:

~/.agent-messaging/
├── messages/
│   ├── inbox/
│   │   └── <sender>/
│   │       └── msg_<id>.json
│   └── sent/
│       └── <recipient>/
│           └── msg_<id>.json
└── attachments/
    └── <att_id>/
        └── <filename>

Note: The attachments/ directory is at the top level of the agent storage directory, not nested under messages/. This allows attachments to be referenced by messages in both inbox and sent folders.

When downloading attachments, agents MUST verify that SHA256(downloaded_bytes) matches the digest field in the attachment metadata before processing the file content.

Agents SHOULD restrict attachment directories to permissions 0700 (owner only). Agents MUST clean up downloaded attachments when the parent message is deleted. Agents SHOULD also periodically remove downloaded attachments whose parent message expires_at has passed. Cleanup may be triggered on message deletion, on inbox list, or via a scheduled background task. Providers need not be notified of client-side attachment deletion.
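
A sketch of creating the per-attachment directory with owner-only permissions, following the layout shown above. It assumes a POSIX filesystem and that `attachment_id` and `filename` have already been validated; the function name is illustrative, and the demo uses a temporary directory in place of ~/.agent-messaging.

```python
import os
import stat
import tempfile
from pathlib import Path


def attachment_path(storage_root, attachment_id, filename):
    """Create <root>/attachments/<att_id>/ with mode 0700 and return the file path."""
    attachments_root = Path(storage_root) / "attachments"
    directory = attachments_root / attachment_id
    directory.mkdir(parents=True, exist_ok=True)
    # mkdir's mode argument is filtered by the umask, so set permissions explicitly.
    os.chmod(attachments_root, 0o700)
    os.chmod(directory, 0o700)
    return directory / filename


with tempfile.TemporaryDirectory() as root:
    target = attachment_path(root, "att_1706648400_abc123", "puma.log")
    relative = target.relative_to(root).as_posix()
    mode = stat.S_IMODE(target.parent.stat().st_mode)
    print(relative, oct(mode))
```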

Digest Algorithm

The digest field uses a prefixed format: <algorithm>:<hex>. The current protocol version requires sha256. Future versions MAY add sha384 or sha512 prefixes. Implementations MUST reject digest values with unrecognized algorithm prefixes rather than silently ignoring the prefix.

Rationale: The sha256:<hex> format follows the Docker/OCI content-addressable storage convention. This differs from W3C Subresource Integrity (sha256-<base64>) and the IETF Digest HTTP header (SHA-256=<base64>) which use base64 encoding. Hex encoding was chosen for consistency with Docker registries and for easier debugging (hex digests are more readable in logs).
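
Digest verification on download, including rejection of unrecognized algorithm prefixes, might be sketched like this. The function name and the choice to compare hex case-insensitively are assumptions.

```python
import hashlib

# Only sha256 is required by the current protocol version.
SUPPORTED_DIGESTS = {"sha256": hashlib.sha256}


def verify_digest(data, digest):
    """Return True if data matches a '<algorithm>:<hex>' digest string."""
    algorithm, separator, expected_hex = digest.partition(":")
    if not separator or algorithm not in SUPPORTED_DIGESTS:
        # Unknown prefixes are rejected, never silently ignored.
        raise ValueError(f"unsupported digest algorithm: {algorithm!r}")
    actual_hex = SUPPORTED_DIGESTS[algorithm](data).hexdigest()
    return actual_hex == expected_hex.lower()
```

An agent would call this on the downloaded bytes before handing the file to any other processing step.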

Stored Message Format

{
  "envelope": { ... },
  "payload": { ... },
  "local": {
    "received_at": "2025-01-30T10:00:05Z",
    "status": "unread",
    "read_at": null,
    "delivery_method": "websocket",
    "verified": true
  },
  "security": {
    "trust_level": "external",
    "injection_flags": [],
    "quarantine_id": null,
    "window_scan_result": null,
    "sender_risk_score": 15,
    "sender_risk_level": "medium"
  }
}

The optional security object captures trust and threat assessment metadata at delivery time. When present, it allows agents to make informed decisions without re-verifying signatures or re-running scans.

| Field | Type | Description |
|---|---|---|
| trust_level | string | "verified", "external", or "untrusted" (see 07 - Security) |
| injection_flags | array | Injection pattern categories detected (e.g., ["instruction_override"]) |
| quarantine_id | string | Quarantine entry ID if the message was quarantined, null otherwise |
| window_scan_result | string | Result of multi-message window scan: null (not scanned or clean), "flagged", "quarantined", or "blocked" |
| sender_risk_score | integer | Sender's risk score at delivery time (0–100), null if unavailable |
| sender_risk_level | string | Sender's risk level at delivery time: "low", "medium", "high", "critical", or null |

Note: The security object is a top-level field alongside envelope, payload, and local. It is NOT nested inside local. This separates routing metadata (local) from security assessment metadata (security). Providers populating local.security (as shown in 07 - Security) for backward compatibility MAY continue to do so, but the top-level security object is the canonical location for new fields.

Message Status

| Status | Description |
|---|---|
| unread | Not yet read by agent |
| read | Read by agent |
| archived | Archived (hidden from default view) |

Idempotency

The optional idempotency_key field in the envelope enables end-to-end deduplication across federation hops and relay queues.

Format

Keys SHOULD be UUID v4 strings prefixed with idk_:

idk_550e8400-e29b-41d4-a716-446655440000

Behavior

  • Senders SHOULD include an idempotency_key on every message. Senders MUST include an idempotency_key when retrying a failed send.
  • Providers MUST store idempotency keys for at least 24 hours. Providers SHOULD store keys for at least 7 days (matching relay queue TTL).
  • When a provider receives a message with an idempotency_key it has already seen, it MUST return the original response without re-routing the message.
  • When forwarding messages via federation, providers MUST preserve the idempotency_key in the envelope. The receiving provider MUST respect the key for deduplication on its side.
  • The idempotency_key is NOT included in the signing canonical string (same rationale as id and timestamp — it is routing metadata, not message content).

Migration Note: The idempotency_key was previously only available as a field in the /v1/route request body (see 08 - API). Moving it to the envelope ensures it travels with the message through federation and relay. Providers receiving a route request with idempotency_key at the top level SHOULD copy it into the envelope. If idempotency_key appears in both the route request body and the envelope, the envelope value takes precedence.
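
The key format and the "return the original response" behavior can be sketched with an in-memory store. This is an assumption-laden illustration: a real provider would persist keys, and `IdempotencyCache` and `route_once` are hypothetical names.

```python
import time
import uuid


def new_idempotency_key():
    # UUID v4 with the idk_ prefix
    return f"idk_{uuid.uuid4()}"


class IdempotencyCache:
    """In-memory sketch; keys are kept for ttl_seconds (7 days by default)."""

    def __init__(self, ttl_seconds=7 * 24 * 3600, clock=time.time):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expiry_time, response)

    def route_once(self, key, route):
        now = self._clock()
        # Drop entries whose retention window has passed.
        self._entries = {k: v for k, v in self._entries.items() if v[0] > now}
        if key in self._entries:
            return self._entries[key][1]  # retry: original response, no re-routing
        response = route()
        self._entries[key] = (now + self._ttl, response)
        return response
```

Here `route` is any zero-argument callable that performs the actual routing and returns the response to cache.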

JSON Profile

AMP messages MUST conform to the following JSON constraints. These rules improve interoperability between implementations and ensure deterministic payload hashing for signatures.

Rules

  1. No null values in payload fields. Omit the field instead of setting it to null. Envelope fields that are optional (e.g., in_reply_to, expires_at) SHOULD be omitted rather than set to null.
  2. No duplicate object keys. If a JSON object contains duplicate keys, the behavior is undefined per RFC 8259. AMP implementations MUST reject messages with duplicate keys in any object within the envelope or payload.
  3. Top-level payload must be an object. The payload field MUST be a JSON object ({}), not an array or primitive.
  4. Numeric values must be finite. JSON numbers MUST NOT be NaN, Infinity, or -Infinity. Integer values SHOULD be used where the field semantics are integral (e.g., size, port numbers).

Rationale

  • Omission over null: Reduces ambiguity (null vs absent vs empty string are three distinct states in many languages; AMP collapses null and absent into one).
  • No duplicate keys: Ensures JSON.parse produces the same result across all implementations, which is critical for payload hash computation.
  • Finite numbers: Prevents non-standard JSON extensions from producing unparseable messages.

Note: These constraints apply to the wire format (the JSON as transmitted). Implementations MAY use null internally for representing absent optional fields, but MUST strip null values before serialization.
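
The profile rules map onto hooks that Python's `json.loads` already provides. The following is a minimal sketch under that assumption; the function names are illustrative, and treating nulls nested anywhere inside the payload as violations is one reading of rule 1.

```python
import json
import math


def _reject_duplicate_keys(pairs):
    # json.loads hands every object over as a list of (key, value) pairs
    keys = [key for key, _ in pairs]
    if len(keys) != len(set(keys)):
        raise ValueError("duplicate object key")
    return dict(pairs)


def _reject_constant(name):
    # Called for the non-standard tokens NaN, Infinity and -Infinity
    raise ValueError(f"non-finite number: {name}")


def _finite_float(text):
    # Catches literals such as 1e999 that overflow to infinity
    value = float(text)
    if not math.isfinite(value):
        raise ValueError(f"non-finite number: {text}")
    return value


def has_null_field(value):
    if isinstance(value, dict):
        return any(v is None or has_null_field(v) for v in value.values())
    if isinstance(value, list):
        return any(has_null_field(v) for v in value)
    return False


def parse_message(text):
    message = json.loads(text, object_pairs_hook=_reject_duplicate_keys,
                         parse_constant=_reject_constant,
                         parse_float=_finite_float)
    payload = message.get("payload") if isinstance(message, dict) else None
    if not isinstance(payload, dict):
        raise ValueError("payload must be a JSON object")
    if has_null_field(payload):
        raise ValueError("payload contains a null field")
    return message


def is_valid_message(text):
    try:
        parse_message(text)
        return True
    except ValueError:
        return False


def strip_nulls(value):
    """Drop null-valued object fields before serialization."""
    if isinstance(value, dict):
        return {k: strip_nulls(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [strip_nulls(v) for v in value]
    return value
```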

Size Limits

| Component | Limit |
|---|---|
| Subject | 256 characters |
| Message body | 64 KB |
| Context object | 256 KB |
| Total message (JSON) | 512 KB |
| Attachments per message | 10 |
| Single attachment size | 25 MB |
| Total attachments per message | 100 MB |

Attachment file content is stored externally by the provider; only attachment metadata appears in the message JSON. The 512 KB message limit applies to the JSON document, not to the referenced attachment files.
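
A sketch of checking these limits before routing. It assumes 1 KB = 1024 bytes and 1 MB = 1024 KB, and measures JSON sizes on a compact UTF-8 serialization; the specification does not pin down either choice, and the function name is illustrative.

```python
import json

KIB = 1024          # assumption: binary units
MIB = 1024 * KIB


def _json_bytes(value):
    # Assumption: size is measured on compact UTF-8 JSON
    text = json.dumps(value, separators=(",", ":"), ensure_ascii=False)
    return len(text.encode("utf-8"))


def size_limit_problems(message):
    """Return a list of violated limits; an empty list means within limits."""
    problems = []
    envelope = message.get("envelope", {})
    payload = message.get("payload", {})
    if len(envelope.get("subject", "")) > 256:
        problems.append("subject exceeds 256 characters")
    if len(payload.get("message", "").encode("utf-8")) > 64 * KIB:
        problems.append("message body exceeds 64 KB")
    if "context" in payload and _json_bytes(payload["context"]) > 256 * KIB:
        problems.append("context object exceeds 256 KB")
    if _json_bytes(message) > 512 * KIB:
        problems.append("total message JSON exceeds 512 KB")
    attachments = payload.get("attachments", [])
    sizes = [attachment.get("size", 0) for attachment in attachments]
    if len(attachments) > 10:
        problems.append("more than 10 attachments")
    if any(size > 25 * MIB for size in sizes):
        problems.append("an attachment exceeds 25 MB")
    if sum(sizes) > 100 * MIB:
        problems.append("attachments exceed 100 MB in total")
    return problems
```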


Previous: 03 - Registration | Next: 05 - Routing