Model Context Protocol

The AI-native MCP interface for ARDA

MCP is how AI agents connect to ARDA as a first-class tool provider. One protocol, full access to discovery orchestration, governance, and artifact management — with typed schemas, structured errors, and autonomy policies enforced on every call.

AI-native by design

ARDA exposes every capability — profiling, discovery, governance, artifacts — as typed MCP tools that any compliant client can discover and invoke without custom glue code.

Two transports, one catalog

Local agents connect over stdio for zero-latency development loops. Remote orchestrators use streamable HTTP with SSE for production deployments. The tool surface is identical on both.

Governed by default

Every tool call passes through ARDA's governance stack. Autonomy policies constrain what an agent session can do, and violations return structured errors — not silent failures.

Resources & prompts

Beyond tools, the server exposes documentation resources and workflow prompts so agents can plan effectively without relying on stale training data or hallucinated procedures.

What is the Model Context Protocol?

MCP is an open standard that lets AI models call external tools, read resources, and follow prompts through a single, discoverable interface. Instead of hard-coded REST wrappers per service, an MCP client connects to a server and receives a typed catalog of everything available — tools with JSON-schema inputs and outputs, resources with MIME types, and prompts with parameter templates.

ARDA chose MCP as its primary agent interface because scientific discovery is inherently multi-step and judgment-heavy. An agent needs to profile data, select a mode, launch a run, inspect claims, decide whether to promote — and each step depends on structured output from the last. MCP's typed tool calls and resource reads map directly onto this workflow, eliminating the parsing fragility of free-text API wrappers.

As ARDA adds new tools, any connected agent automatically discovers them at connection time. No SDK update, no code change — the manifest is always current.

[Figure: Governance shield — MCP tools governed by autonomy policies]

Server setup

The ARDA MCP server ships as a standalone binary, arda-mcp-server (also bundled with the Python SDK). Install it, choose a transport, and point your client at it.

# Install via pip (bundles arda-mcp-server)
pip install arda-sdk

# Or download the standalone binary
curl -fsSL https://get.arda.vareon.com/mcp | sh

# Verify
arda-mcp-server --version
# arda-mcp-server 1.4.0 (protocol 2025-03-26)

stdio mode

The client spawns arda-mcp-server as a subprocess and communicates over stdin/stdout using JSON-RPC. Zero network configuration — ideal for local development, IDE integrations, and single-machine pipelines.

arda-mcp-server --transport stdio

Streamable HTTP mode

The server listens on a port and uses HTTP POST for requests and SSE for streaming responses. Supports long-lived sessions, reconnection, and load balancing — recommended for production deployments where agents run on separate infrastructure.

arda-mcp-server --transport http --port 3100

Client configuration

Point your MCP-compatible client at the server. Below are copy-paste configs for the most popular clients.

Cursor IDE

Add to .cursor/mcp.json in your project root:

{
  "mcpServers": {
    "arda": {
      "command": "arda-mcp-server",
      "args": ["--transport", "stdio"],
      "env": {
        "ARDA_API_KEY": "arda_key_..."
      }
    }
  }
}

Claude Desktop

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "arda": {
      "command": "arda-mcp-server",
      "args": ["--transport", "stdio"],
      "env": {
        "ARDA_API_KEY": "arda_key_..."
      }
    }
  }
}

Custom agent (HTTP)

Start the server in HTTP mode and connect with any MCP client library:

# Start the server
ARDA_API_KEY="arda_key_..." arda-mcp-server \
  --transport http --port 3100

# Connect from Python (requires the mcp client library)
import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

async def main():
    async with streamablehttp_client(
        "http://localhost:3100/mcp"
    ) as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print(f"{len(tools.tools)} tools available")

asyncio.run(main())

Authentication

MCP sessions authenticate through the environment, not inside the protocol. The server reads credentials at startup and binds them to every tool call for the session's lifetime.

API key in environment

Set ARDA_API_KEY before launching the server. The key is scoped to a project and determines the session's governance policy, budget, and allowed tier ceiling.

export ARDA_API_KEY="arda_key_proj01_..."
arda-mcp-server --transport stdio

Session handoff

For platform-managed deployments, the ARDA web UI can launch an MCP session on behalf of a logged-in user. The session token is passed to the server via ARDA_SESSION_TOKEN, inheriting the user's permissions, project context, and autonomy policy. When the token expires, the server rejects new tool calls with a structured auth/expired error.

Tool catalog

Nine tools cover the full discovery lifecycle. Each has a typed JSON-schema input, a structured output, and governance metadata. Tools are grouped by domain — data, discovery, governance, and artifacts.

data.upload

Upload a dataset to the current project. Accepts CSV, Parquet, and HDF5. Returns a dataset ID for downstream operations.

Input

{
  "file_path": "string — local path or presigned URL",
  "name": "string — human-readable dataset name",
  "description": "string (optional) — dataset context"
}

Output

{
  "dataset_id": "ds_29xK4m",
  "name": "pendulum_timeseries",
  "size_bytes": 2048576,
  "rows": 15000,
  "columns": 8,
  "created_at": "2026-03-28T10:00:00Z"
}

data.profile

Run a comprehensive statistical profile on a dataset. Returns column metadata, distributions, quality flags, and recommended discovery modes.

Input

{
  "dataset_id": "string — target dataset ID"
}

Output

{
  "dataset_id": "ds_29xK4m",
  "rows": 15000,
  "columns": [
    { "name": "time", "dtype": "float64", "nulls": 0,
      "min": 0.0, "max": 30.0 },
    { "name": "theta", "dtype": "float64", "nulls": 0,
      "min": -1.57, "max": 1.57 },
    ...
  ],
  "quality": { "score": 0.94, "flags": [] },
  "recommended_modes": ["symbolic", "neuro_symbolic"]
}

discover.run

Submit a discovery run. Accepts one of four modes — symbolic, neural, neuro_symbolic, or cde — with mode-specific parameters. Returns a run ID for polling.

Input

{
  "dataset_id": "string — dataset to analyze",
  "mode": "'symbolic' | 'neural' | 'neuro_symbolic' | 'cde'",
  "parameters": {
    "max_complexity": "number (optional, symbolic)",
    "epochs": "number (optional, neural)",
    "target_columns": ["string"],
    "input_columns": ["string"]
  }
}

Output

{
  "run_id": "run_7fGh2p",
  "status": "queued",
  "mode": "symbolic",
  "dataset_id": "ds_29xK4m",
  "submitted_at": "2026-03-28T10:05:00Z"
}
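
Before submitting, an agent can validate the arguments client-side and avoid an invalid_params round trip. A minimal sketch — the mode names and required keys mirror the input schema above, but the helper itself is hypothetical, not part of any ARDA SDK:

```python
# Hypothetical client-side validator for discover.run arguments.
# Mode names and required keys are taken from the input schema above.
VALID_MODES = {"symbolic", "neural", "neuro_symbolic", "cde"}

def validate_discover_args(args: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload looks valid."""
    problems = []
    if not args.get("dataset_id"):
        problems.append("dataset_id is required")
    if args.get("mode") not in VALID_MODES:
        problems.append(f"mode must be one of {sorted(VALID_MODES)}")
    if not isinstance(args.get("parameters", {}), dict):
        problems.append("parameters must be an object")
    return problems

# A well-formed request produces no problems:
ok = validate_discover_args({
    "dataset_id": "ds_29xK4m",
    "mode": "symbolic",
    "parameters": {"target_columns": ["theta"], "max_complexity": 8},
})
bad = validate_discover_args({"mode": "quantum"})
```

Catching a bad mode locally saves a round trip; the server still enforces the real schema either way.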

discover.status

Check the current status of a discovery run. Returns the pipeline stage, progress percentage, and estimated time remaining.

Input

{
  "run_id": "string — run to check"
}

Output

{
  "run_id": "run_7fGh2p",
  "status": "running",
  "stage": "symbolic_regression",
  "progress": 0.62,
  "eta_seconds": 145,
  "stages_completed": ["ingestion", "feature_engineering"],
  "stages_remaining": ["symbolic_regression", "validation"]
}
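
A typical agent polls discover.status until the run reaches a terminal state. A sketch of that loop with the status fetch injected, so it works with any client wrapper — fetch_status stands in for a real tools/call, and the terminal status names beyond "completed" are assumptions:

```python
import time

def wait_for_run(fetch_status, run_id: str, poll_seconds: float = 5.0,
                 timeout_seconds: float = 3600.0, sleep=time.sleep):
    """Poll until the run reaches a terminal status, then return it.

    fetch_status(run_id) stands in for a discover.status tools/call.
    "failed" and "cancelled" as terminal states are assumptions.
    """
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        status = fetch_status(run_id)
        if status["status"] in ("completed", "failed", "cancelled"):
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"run {run_id} did not finish in {timeout_seconds}s")

# Example with a fake fetcher that completes on the third poll:
states = iter([{"status": "queued"}, {"status": "running"},
               {"status": "completed"}])
final = wait_for_run(lambda _: next(states), "run_7fGh2p",
                     sleep=lambda _: None)
```

The eta_seconds field in the real response can be used to choose a smarter poll interval than a fixed five seconds.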

discover.claims

Retrieve typed claims produced by a completed discovery run. Supports filtering by claim type — equation, causal_graph, conservation_law, or invariant.

Input

{
  "run_id": "string — completed run ID",
  "type_filter": "'equation' | 'causal_graph' | 'conservation_law' | 'invariant' (optional)"
}

Output

{
  "run_id": "run_7fGh2p",
  "claims": [
    {
      "claim_id": "clm_Ax92",
      "type": "equation",
      "expression": "d²θ/dt² = -(g/L)·sin(θ)",
      "fitness": 0.9987,
      "complexity": 5,
      "tier": "explore",
      "scope": { "domain": "θ ∈ [-π/2, π/2]" }
    },
    ...
  ],
  "total": 12
}

governance.promote

Promote a claim to a higher governance tier. Requires a rationale that is recorded immutably in the Evidence Ledger. Subject to autonomy policies.

Input

{
  "claim_id": "string — claim to promote",
  "target_tier": "'validate' | 'publish'",
  "rationale": "string — human or agent justification"
}

Output

{
  "claim_id": "clm_Ax92",
  "previous_tier": "explore",
  "current_tier": "validate",
  "ledger_entry_id": "le_881f",
  "promoted_at": "2026-03-28T10:12:00Z"
}

governance.ledger

Query the Evidence Ledger for governance events. Supports filtering by event type, resource, and time range. Each entry includes a content hash for integrity verification.

Input

{
  "project_id": "string (optional)",
  "event_type": "'promotion' | 'negative_control' | 'policy_violation' (optional)",
  "since": "ISO8601 timestamp (optional)",
  "limit": "number (optional, default 50)"
}

Output

{
  "entries": [
    {
      "entry_id": "le_881f",
      "event_type": "promotion",
      "resource_id": "clm_Ax92",
      "actor": "agent:cursor-session-04a",
      "timestamp": "2026-03-28T10:12:00Z",
      "content_hash": "sha256:9f86d0..."
    },
    ...
  ],
  "total": 34,
  "has_more": false
}
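
The content_hash lets agents spot-check ledger integrity locally. ARDA's exact canonicalization scheme isn't documented here, so the sketch below assumes SHA-256 over a sorted-key, compact JSON serialization of the entry payload — treat it as an illustration of the pattern, not the actual scheme:

```python
import hashlib
import json

def entry_hash(payload: dict) -> str:
    """Assumed scheme: sha256 over sorted-key, compact JSON. Illustrative only."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode()).hexdigest()

payload = {"event_type": "promotion", "resource_id": "clm_Ax92",
           "actor": "agent:cursor-session-04a"}
digest = entry_hash(payload)
# Re-hashing the same payload must reproduce the stored digest
```

The point of the pattern: any party holding the entry payload can recompute the hash and compare it against the ledger's stored value without trusting the server.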

artifacts.list

List artifacts produced by a discovery run — model checkpoints, visualizations, export bundles, and intermediate results.

Input

{
  "run_id": "string — source run ID",
  "type_filter": "'checkpoint' | 'visualization' | 'export' (optional)"
}

Output

{
  "artifacts": [
    {
      "artifact_id": "art_kL3m",
      "type": "visualization",
      "name": "pareto_front.png",
      "size_bytes": 184320,
      "content_hash": "sha256:3c7a2b..."
    },
    ...
  ],
  "total": 6
}

artifacts.download

Download a specific artifact by ID. Returns the artifact content or a presigned download URL for large files.

Input

{
  "artifact_id": "string — artifact to download",
  "format": "'raw' | 'base64' (optional, default 'raw')"
}

Output

{
  "artifact_id": "art_kL3m",
  "name": "pareto_front.png",
  "download_url": "https://storage.arda.vareon.com/...",
  "expires_at": "2026-03-28T11:00:00Z"
}
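
Presigned URLs expire, so check expires_at before handing the URL to a downloader. A stdlib-only sketch — the timestamp format matches the output above; the safety margin is an arbitrary choice:

```python
from datetime import datetime, timezone

def url_still_valid(expires_at: str, now=None, margin_seconds: float = 30.0) -> bool:
    """True if the presigned URL has at least margin_seconds of life left."""
    expiry = datetime.fromisoformat(expires_at.replace("Z", "+00:00"))
    now = now or datetime.now(timezone.utc)
    return (expiry - now).total_seconds() > margin_seconds

# 30 minutes of validity left: safe to download
fresh = url_still_valid("2026-03-28T11:00:00Z",
                        now=datetime(2026, 3, 28, 10, 30, tzinfo=timezone.utc))
# Already past expiry: re-fetch the artifact record instead
stale = url_still_valid("2026-03-28T11:00:00Z",
                        now=datetime(2026, 3, 28, 11, 5, tzinfo=timezone.utc))
```

When the URL has expired, call artifacts.download again for a fresh one rather than retrying the stale link.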

[Figure: MCP tool invocations flowing through the ARDA discovery pipeline]

Resources

Resources are read-only data that agents can fetch for planning and context. Documentation resources are versioned alongside the platform; session-scoped resources update in real time between tool calls.

Documentation

Structured descriptions of discovery modes, governance tiers, and operational procedures. Always matches the tools available in the current session.

arda://docs/modes

arda://docs/governance

arda://docs/faq

Session context

Live state for the current session — active runs, pending claims, budget remaining, and governance events. Updates between tool calls.

arda://session/runs

arda://session/claims

arda://session/budget

Governance state

Current policy constraints in agent-readable terms: allowed tools, reachable tiers, and budget. Read this before attempting promotions to avoid policy violations.

arda://governance/policy

arda://governance/budget

arda://governance/history
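
Reading arda://governance/policy first lets an agent rule out doomed promotions locally instead of burning a call on a policy/tier_ceiling error. The policy document's exact shape isn't specified here, so the sketch below assumes fields named tier_ceiling and allowed_tools:

```python
# Tier order comes from the governance tiers used elsewhere in this page;
# the policy dict shape (tier_ceiling, allowed_tools) is an assumption.
TIER_ORDER = ["explore", "validate", "publish"]

def promotion_allowed(policy: dict, target_tier: str) -> bool:
    """Check a planned governance.promote call against the session policy."""
    if "governance.promote" not in policy.get("allowed_tools", []):
        return False
    ceiling = policy.get("tier_ceiling", "explore")
    return TIER_ORDER.index(target_tier) <= TIER_ORDER.index(ceiling)

policy = {"tier_ceiling": "validate",
          "allowed_tools": ["discover.run", "governance.promote"]}
can_validate = promotion_allowed(policy, "validate")  # within the ceiling
can_publish = promotion_allowed(policy, "publish")    # exceeds the ceiling
```

The server still enforces the real policy on every call; this check only saves the round trip and keeps violations out of the ledger.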

Workflow prompts

Prompts are structured templates for common research workflows. An agent requests a prompt and receives a step-by-step plan with parameter placeholders — encoding ARDA domain expertise about effective tool sequencing and governance compliance.

guided-discovery

Profile → select mode → run → inspect claims → promote top candidates. Handles the 80% case for a single-dataset, single-mode workflow with guardrails for each step.

comparative-modes

Run the same dataset through multiple discovery modes, collect claims from each, and produce a comparative summary ranked by fitness and complexity. Useful when the best mode isn't obvious from the profile.

governance-review

Walk through all explore-tier claims in a project, run negative controls, and produce promotion recommendations with rationale. Designed for end-of-campaign audits.

data-quality-audit

Deep profile a dataset, flag quality issues, suggest preprocessing steps, and assess viability for each discovery mode before any runs are submitted.

End-to-end example

A complete session: an agent connects, uploads and profiles a dataset, runs symbolic discovery, inspects the resulting claims, and promotes the best one — all through standard MCP messages.

// 1. Initialize — discover available tools and resources
→ { "jsonrpc":"2.0", "id":1, "method":"initialize",
    "params":{ "clientInfo":{"name":"research-agent","version":"1.0"},
               "capabilities":{} }}
← { "jsonrpc":"2.0", "id":1, "result":{
    "serverInfo":{"name":"arda-mcp-server","version":"1.4.0"},
    "capabilities":{"tools":{},"resources":{},"prompts":{}} }}

// 2. List tools — get typed catalog
→ { "jsonrpc":"2.0", "id":2, "method":"tools/list" }
← { "jsonrpc":"2.0", "id":2, "result":{ "tools":[
    {"name":"data.upload",    "inputSchema":{...}},
    {"name":"data.profile",   "inputSchema":{...}},
    {"name":"discover.run",   "inputSchema":{...}},
    {"name":"discover.status","inputSchema":{...}},
    {"name":"discover.claims","inputSchema":{...}},
    {"name":"governance.promote","inputSchema":{...}},
    {"name":"governance.ledger", "inputSchema":{...}},
    {"name":"artifacts.list",    "inputSchema":{...}},
    {"name":"artifacts.download","inputSchema":{...}}
  ]}}

// 3. Upload dataset
→ { "jsonrpc":"2.0", "id":3, "method":"tools/call",
    "params":{ "name":"data.upload",
      "arguments":{"file_path":"/data/pendulum.csv",
                    "name":"pendulum_timeseries"}}}
← { "jsonrpc":"2.0", "id":3, "result":{"content":[
    {"type":"text","text":"Uploaded pendulum_timeseries → ds_29xK4m (15,000 rows, 8 cols)"}
  ]}}

// 4. Profile the dataset — check quality and get mode recommendations
→ { "jsonrpc":"2.0", "id":4, "method":"tools/call",
    "params":{ "name":"data.profile",
      "arguments":{"dataset_id":"ds_29xK4m"}}}
← { "jsonrpc":"2.0", "id":4, "result":{"content":[
    {"type":"text","text":"Quality: 0.94. 8 columns, 0 nulls. Recommended: symbolic, neuro_symbolic."}
  ]}}

// 5. Run symbolic discovery
→ { "jsonrpc":"2.0", "id":5, "method":"tools/call",
    "params":{ "name":"discover.run",
      "arguments":{"dataset_id":"ds_29xK4m","mode":"symbolic",
                    "parameters":{"target_columns":["theta"],
                                  "max_complexity":8}}}}
← { "jsonrpc":"2.0", "id":5, "result":{"content":[
    {"type":"text","text":"Run submitted: run_7fGh2p. Status: queued."}
  ]}}

// 6. Poll status until complete
→ { "jsonrpc":"2.0", "id":6, "method":"tools/call",
    "params":{ "name":"discover.status",
      "arguments":{"run_id":"run_7fGh2p"}}}
← { "jsonrpc":"2.0", "id":6, "result":{"content":[
    {"type":"text","text":"Stage: symbolic_regression (62%). ETA: 145s."}
  ]}}

// ... poll again ...

← { "jsonrpc":"2.0", "id":7, "result":{"content":[
    {"type":"text","text":"Status: completed. 12 claims produced."}
  ]}}

// 7. Get claims — filter for equations
→ { "jsonrpc":"2.0", "id":8, "method":"tools/call",
    "params":{ "name":"discover.claims",
      "arguments":{"run_id":"run_7fGh2p","type_filter":"equation"}}}
← { "jsonrpc":"2.0", "id":8, "result":{"content":[
    {"type":"text","text":"clm_Ax92: d²θ/dt² = -(g/L)·sin(θ)  fitness=0.9987  complexity=5\nclm_Bx41: θ̈ = -9.81·θ  fitness=0.9812  complexity=3\n..."}
  ]}}

// 8. Promote the best claim to validate tier
→ { "jsonrpc":"2.0", "id":9, "method":"tools/call",
    "params":{ "name":"governance.promote",
      "arguments":{"claim_id":"clm_Ax92","target_tier":"validate",
                    "rationale":"Exact pendulum equation recovered with 0.9987 fitness, physically interpretable terms"}}}
← { "jsonrpc":"2.0", "id":9, "result":{"content":[
    {"type":"text","text":"clm_Ax92 promoted: explore → validate. Ledger entry: le_881f."}
  ]}}

In practice, agents use MCP client libraries that handle JSON-RPC serialization, transport management, and reconnection. The raw messages above illustrate protocol semantics for integration builders.

[Figure: Data profiling results guiding discovery mode selection]

Autonomy policies

Every MCP session operates under an autonomy policy that constrains what the agent can do. Policies are set by project administrators and bound to the session at authentication time.

What policies control

  • Tier ceiling — maximum governance tier an agent can promote to (e.g., explore only, or up to validate)
  • Compute budget — total GPU-hours or run count allowed per session
  • Tool allowlist — which tools the session can invoke (e.g., read-only sessions block governance.promote)
  • Mode restrictions — which discovery modes are available

Policy enforcement

Policies are enforced server-side on every tool call. When a call violates a policy, the server returns a structured error — never a silent failure:

{
  "error": {
    "code": -32600,
    "message": "Policy violation",
    "data": {
      "type": "policy/tier_ceiling",
      "detail": "Session limited to explore tier",
      "policy_id": "pol_abc",
      "requested": "validate",
      "allowed": "explore"
    }
  }
}
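
Because data.type is machine-readable, an agent can branch on it instead of parsing error messages. A sketch of the dispatch — the recovery actions are illustrative, not prescribed by ARDA:

```python
def classify_error(error: dict) -> str:
    """Map a structured ARDA error to a coarse recovery action (illustrative)."""
    etype = error.get("data", {}).get("type", "")
    if etype.startswith("policy/"):
        return "replan"          # constraint violated: choose a different step
    if etype.startswith("auth/"):
        return "reauthenticate"  # refresh credentials, then retry
    if etype == "rate_limit":
        return "backoff"         # wait retry_after_ms, then retry
    return "abort"               # unknown error: surface to a human

err = {"code": -32600, "message": "Policy violation",
       "data": {"type": "policy/tier_ceiling",
                "detail": "Session limited to explore tier"}}
action = classify_error(err)
```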

Error handling

The server uses standard JSON-RPC error codes extended with ARDA-specific error types. Every error includes a data field with machine-readable details so agents can recover programmatically.

Error type               Code    When it happens
policy/tier_ceiling      -32600  Promotion exceeds session's allowed tier
policy/budget_exceeded   -32600  Compute budget exhausted for this session
policy/tool_denied       -32600  Tool not in session allowlist
auth/expired             -32001  Session token or API key expired
auth/invalid             -32001  Missing or malformed credentials
rate_limit               -32002  Too many requests — includes retry_after_ms
resource/not_found       -32602  Dataset, run, claim, or artifact ID not found
invalid_params           -32602  Schema validation failed on tool input
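
Since rate_limit errors carry retry_after_ms, a well-behaved client waits exactly that long rather than guessing a backoff. A sketch with the sleep injected for testability — the wrapper and the invoke() contract are illustrative, not part of any SDK:

```python
import time

def call_with_retry(invoke, max_attempts: int = 3, sleep=time.sleep):
    """Retry invoke() on rate_limit errors, honoring retry_after_ms.

    invoke() is assumed to return either {"result": ...} or {"error": {...}}.
    """
    for _ in range(max_attempts):
        response = invoke()
        error = response.get("error")
        if not error or error.get("data", {}).get("type") != "rate_limit":
            return response
        wait_ms = error["data"].get("retry_after_ms", 1000)
        sleep(wait_ms / 1000.0)
    return response

# Fake invoker: rate-limited once, then succeeds
responses = iter([
    {"error": {"data": {"type": "rate_limit", "retry_after_ms": 250}}},
    {"result": {"ok": True}},
])
waits = []
out = call_with_retry(lambda: next(responses), sleep=waits.append)
```

Non-rate-limit errors are returned immediately so the caller can route them through its normal error handling.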

Pair MCP with the CLI for terminal workflows or the Python SDK for programmatic access. All three share the same governed API surface.