ARDA Platform

Documentation

ARDA is an AI-native research discovery engine. Integration surfaces sit alongside human guides, and everything is organized so AI agents and research teams can connect to the same governed, agent-native platform.

  • MCP integrations
  • Human guides & references

What is ARDA?

ARDA is a research discovery engine built by Vareon. It combines four discovery modes — Symbolic, Neural, Neuro-Symbolic, and Causal — under a single governed platform so teams can move from raw observations to defensible scientific claims without switching tools mid-project.

The platform serves two audiences with equal depth. Integration builders connect through the Model Context Protocol, the REST API, the Python SDK, or the CLI. Human researchers use the same underlying engine through guided workflows, campaign management, and governance controls that make evidence traceable from exploration through publication.

Every interaction—whether initiated by an external service or a researcher in a notebook—passes through the same governance stack: the Truth Dial for maturity classification, the Evidence Ledger for immutable provenance, and Negative Controls for falsification checks. This shared backbone is what makes ARDA a governed engine rather than a collection of disconnected utilities.

[Figure: ARDA documentation hub overview showing the platform architecture]

Discovery architecture

Four discovery modes, one governed pipeline

ARDA does not prescribe a single discovery method. Instead, it offers four complementary modes that can be composed within a single campaign or run, depending on the nature of the data and the research question.

Symbolic

Symbolic discovery operates over structured representations—ontologies, knowledge graphs, and formal rule systems. It excels when domain expertise can be encoded as explicit constraints and when the search space is well-defined. Symbolic runs produce claims grounded in logical entailment, making their provenance chain straightforward to audit. This mode is particularly suited to regulatory domains where traceability matters as much as the finding itself.
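To make the entailment-based provenance concrete, here is a minimal sketch in plain Python. The rule names, record fields, and `symbolic_pass` function are all hypothetical illustrations, not ARDA's actual API: the point is only that when a claim is produced by an explicit named rule, the claim can carry an auditable pointer back to that rule.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Claim:
    statement: str
    entailed_by: str  # name of the rule that produced it: the auditable provenance link

def symbolic_pass(records, rules):
    """Evaluate each named rule against each record; every match
    becomes a claim that cites the rule it follows from."""
    claims = []
    for name, predicate, template in rules:
        for record in records:
            if predicate(record):
                claims.append(Claim(template.format(**record), entailed_by=name))
    return claims

# A domain constraint encoded as an explicit, auditable rule (illustrative data).
rules = [
    ("dose-threshold",
     lambda r: r["dose_mg"] > 100 and r["response"] == "adverse",
     "{compound} shows an adverse response above 100 mg"),
]
records = [
    {"compound": "CMP-17", "dose_mg": 150, "response": "adverse"},
    {"compound": "CMP-42", "dose_mg": 40, "response": "none"},
]
claims = symbolic_pass(records, rules)
```

Because every claim names the rule that entailed it, an auditor can replay the logic without re-running the engine, which is the property that makes this mode attractive in regulated settings.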

Neural

Neural discovery uses learned representations to surface patterns in high-dimensional data that resist hand-crafted rules. It is the right starting point for exploratory work on large, heterogeneous datasets where the structure of the answer is not yet known. Neural runs generate candidate claims that typically require downstream validation—either through governance promotion or by pairing with a symbolic or neuro-symbolic pass for confirmation.

Neuro-Symbolic

Neuro-Symbolic discovery combines learned pattern recognition with symbolic reasoning in a single pass. The neural component proposes candidate structures, and the symbolic component prunes or refines them against domain constraints. This hybrid approach is useful when working with semi-structured data—rich enough for neural methods to find signal, but governed by known invariants that symbolic layers can enforce.
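The propose-then-prune shape of a hybrid pass can be sketched in a few lines. This is an illustration, not ARDA's implementation: the "neural" stage is stubbed as a score threshold, and the invariant, field names, and data are all invented for the example.

```python
def neuro_symbolic_pass(candidates, constraints, min_score=0.5):
    """Hybrid pass: a neural stage proposes, a symbolic stage disposes."""
    # Neural stage (stubbed): keep candidates the learned model scored highly.
    proposed = [c for c in candidates if c["score"] >= min_score]
    # Symbolic stage: enforce every known domain invariant on each proposal.
    return [c for c in proposed if all(check(c) for check in constraints)]

# Hypothetical invariant: a regulatory edge cannot point back at its own source.
def no_self_loop(c):
    return c["source"] != c["target"]

candidates = [
    {"source": "gene-A", "target": "gene-B", "score": 0.91},
    {"source": "gene-C", "target": "gene-C", "score": 0.88},  # violates the invariant
    {"source": "gene-D", "target": "gene-E", "score": 0.20},  # below the neural threshold
]
kept = neuro_symbolic_pass(candidates, [no_self_loop])
```

Only the first candidate survives both stages: the neural score filters out weak signal, and the symbolic constraint rejects a structurally impossible proposal the model nonetheless scored highly.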

Causal (CDE)

Causal mode answers intervention-style questions with directional claims, confidence qualifiers, and explicit scope—so teams can reason about what drives what, not only what correlates. It is especially relevant in pharmaceutical, environmental, and policy-adjacent research where downstream decisions depend on mechanism-oriented evidence.
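The three qualifiers named above—direction, confidence, and scope—can be pictured as the minimum shape of a causal claim. The class and field names below are a hypothetical sketch for illustration, not the platform's data model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CausalClaim:
    cause: str
    effect: str
    direction: str   # directional statement, e.g. "increases" or "decreases"
    confidence: str  # explicit qualifier rather than a bare point estimate
    scope: str       # the population and conditions the claim holds under

# A mechanism-oriented claim carries its limits with it (illustrative values).
claim = CausalClaim(
    cause="compound-X exposure",
    effect="enzyme-Y activity",
    direction="increases",
    confidence="moderate",
    scope="adult cohort, 48-hour exposure window",
)
```

Because scope travels with the claim, a downstream decision-maker can see immediately where the intervention evidence does and does not apply, rather than inferring it from context.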

[Figure: ARDA discovery pipeline showing how data flows through the four discovery modes]

The governance stack

Governance in ARDA is not an afterthought bolted onto a discovery engine—it is the structural layer that makes discovery outputs trustworthy. Three components work together to ensure that every claim carries its full provenance chain from raw data through to external publication.

Truth Dial

The Truth Dial classifies every claim into maturity tiers: Explore, Validate, and Publish. Claims begin in the Explore tier, where they represent preliminary observations. As evidence accumulates and validation checks pass, claims can be promoted through tiers. Each promotion event is recorded with its rationale, reviewer identity, and the evidence snapshot that justified it. The Publish tier represents claims that have met your organization's bar for external communication.
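A minimal sketch of the tier ordering and the promotion record described above, in plain Python. The names (`Tier`, `PromotionEvent`, `promote`) and the rule that promotions only move upward are assumptions made for the illustration; the source specifies only the three tiers and what each promotion event records.

```python
from dataclasses import dataclass
from enum import IntEnum

class Tier(IntEnum):
    EXPLORE = 1   # preliminary observations
    VALIDATE = 2  # evidence accumulated, checks passed
    PUBLISH = 3   # meets the organization's bar for external communication

@dataclass(frozen=True)
class PromotionEvent:
    claim_id: str
    from_tier: Tier
    to_tier: Tier
    rationale: str          # why the promotion was justified
    reviewer: str           # identity of the approver
    evidence_snapshot: str  # pointer to the evidence that backed the decision

def promote(claim_id, current, target, rationale, reviewer, snapshot):
    """Record a promotion; in this sketch a claim only moves up the dial."""
    if target <= current:
        raise ValueError("promotion must move a claim to a higher tier")
    return PromotionEvent(claim_id, current, target, rationale, reviewer, snapshot)
```

The key property is that promotion is an event, not a flag flip: each move up the dial leaves behind a record of who approved it, why, and against what evidence.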

Evidence Ledger

The Evidence Ledger is an append-only record of every significant event in a discovery session: data ingestion, mode selection, parameter choices, intermediate results, promotion decisions, and negative control outcomes. Entries are content-addressed, meaning any tampering with historical records is detectable. The ledger serves both compliance and reproducibility—an auditor can reconstruct the path from raw data to published claim without access to the original runtime.
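The tamper-evidence property of a content-addressed, append-only log can be demonstrated with a short hash-chain sketch. This is a toy model built on `hashlib`, assumed for illustration only; each entry's ID commits to both its payload and the previous entry, so editing any historical record breaks verification.

```python
import hashlib
import json

class EvidenceLedger:
    """Toy append-only log: entry IDs chain together, so any edit to
    an earlier entry invalidates every hash from that point forward."""

    def __init__(self):
        self.entries = []

    def append(self, event: dict) -> str:
        prev = self.entries[-1]["id"] if self.entries else "genesis"
        body = json.dumps(event, sort_keys=True)  # canonical form for hashing
        entry_id = hashlib.sha256((prev + body).encode()).hexdigest()
        self.entries.append({"id": entry_id, "prev": prev, "event": event})
        return entry_id

    def verify(self) -> bool:
        """Recompute the chain from the start; False means history was altered."""
        prev = "genesis"
        for e in self.entries:
            body = json.dumps(e["event"], sort_keys=True)
            expected = hashlib.sha256((prev + body).encode()).hexdigest()
            if e["prev"] != prev or e["id"] != expected:
                return False
            prev = e["id"]
        return True
```

An auditor holding only the entries can rerun `verify` offline, which mirrors the claim above that the path from raw data to published claim can be reconstructed without access to the original runtime.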

Negative Controls

Negative Controls are falsification checks that run alongside or after discovery passes. They test whether the observed patterns survive when key assumptions are relaxed, when data subsets are permuted, or when alternative explanations are fed into the same pipeline. A claim that has passed negative controls carries stronger evidence than one that has only been confirmed positively. The platform records which controls were run, their outcomes, and whether any controls were skipped along with the reason for the omission.
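One of the checks described above, permuting data subsets, can be sketched as a label-permutation control. The function names and data are invented for the example; the idea is that if an observed group difference is real, randomly shuffled labels should almost never reproduce it.

```python
import random

def mean_diff(values, labels):
    """Difference in group means between 'treated' and 'control' labels."""
    treated = [v for v, lab in zip(values, labels) if lab == "treated"]
    control = [v for v, lab in zip(values, labels) if lab == "control"]
    return sum(treated) / len(treated) - sum(control) / len(control)

def permutation_control(values, labels, observed, trials=1000, seed=0):
    """Negative control: shuffle the labels repeatedly; the pattern should
    NOT survive. Returns the fraction of shuffles matching or beating the
    observed effect (a high fraction falsifies the claim)."""
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if abs(mean_diff(values, shuffled)) >= abs(observed):
            hits += 1
    return hits / trials
```

A claim whose effect all but vanishes under shuffled labels has survived this control; one whose effect is matched by random permutations has not, and the record of that outcome travels with the claim.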

For Integrations

MCP, manifests, sessions, and policies—designed so external agents and services can connect to ARDA the same way they connect to any other first-class tool.

For Humans

Guides, references, and operational playbooks for the people who set objectives, approve promotions, and sign the compliance narrative.

How to use these docs

If you are building an integration

Start with the Agent Integration overview to understand sessions, policies, and manifests. Then choose your primary access surface: MCP for interactive tool use and session-aware research, or the REST API for stateless service-to-service calls. The Python SDK wraps the REST layer with typed models and retry logic for pipeline use cases.
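The retry behavior mentioned for the SDK can be illustrated with a generic backoff decorator. This is a standalone sketch of the pattern, not the SDK's actual code or API: transient failures are retried with exponential backoff before the error is surfaced to the caller.

```python
import time
from functools import wraps

def with_retries(attempts=3, backoff=0.5):
    """Sketch of SDK-style retry logic: retry transient connection
    failures with exponential backoff, then surface the last error."""
    def decorate(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            delay = backoff
            for attempt in range(1, attempts + 1):
                try:
                    return fn(*args, **kwargs)
                except ConnectionError:
                    if attempt == attempts:
                        raise  # out of attempts: let the caller handle it
                    time.sleep(delay)
                    delay *= 2  # exponential backoff between attempts
        return wrapper
    return decorate
```

In pipeline code this kind of wrapper keeps a single flaky network hop from failing an entire batch run, which is why it belongs in the SDK layer rather than in every call site.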

Integration builders should also review the governance documentation to understand how autonomy policies constrain what their integration can do—and how to request expanded permissions when a project requires them.

If you are a researcher or operator

Start with Human documentation for guided learning paths that walk through your first governed run, campaign setup, and claim promotion workflow. When you are ready to automate, the CLI guide covers terminal workflows and the SDK reference covers notebook and pipeline integration.

Researchers who need to explain governance to external partners—regulatory reviewers, funding bodies, or collaborative institutions—should pay particular attention to the governance storytelling section in the human docs, which covers how to present Truth Dial tiers and ledger provenance to non-technical audiences.