Skip to main content
Back to Blog
Vareon Technical Report · Patent Pending

Multi-Modal Scientific Discovery:
Spatial, Relational, and Hierarchical Data
via the Causal Dynamics Engine

A Comparative Ablation Study from Vareon Research

Vareon Research Team

Vareon, Inc. — Irvine, California, U.S.A.

Vareon Limited — London, U.K.

www.vareon.com

March 2026

ARDA, CDE (Causal Dynamics Engine), and MatterSpace are patent pending in the United States and other countries. © 2026 Vareon, Inc.

© 2026 Vareon, Inc. All rights reserved.

Abstract

Scientific systems are inherently multi-modal: particle dynamics involve spatial coordinates, molecular interactions encode graph topology, and biological hierarchies span multiple organizational scales. We present a systematic evaluation of ARDA's Causal Dynamics Engine (CDE) on multi-modal scientific data, demonstrating that providing structural priors — spatial coordinates, relational graphs, and hierarchical groupings — alongside temporal observations can dramatically reduce causal ambiguity in dynamical system identification.

Across four controlled experiments, we show that multi-modal input reduces CDE ambiguity by up to 2.7× on spring-mass particle networks (0.268 vs. 0.716), transforms confidence classification from “insufficient” to “strong,” and enables recovery of ground-truth causal edges that temporal-only analysis misses entirely. Crucially, we also demonstrate ARDA's scientific integrity: when additional modalities carry no information (Kuramoto oscillators, Lennard-Jones 3-body), CDE correctly produces identical results regardless of input, showing it does not overfit to structural hints. A new hierarchy-aware pooling encoder reduces causal ambiguity by 25% on multi-scale systems.

multi-modal discoveryspatial dynamicsgraph neural networkscausal inferenceparticle systemsKuramoto oscillatorsLennard-Joneshierarchical poolingablation studyCDEARDA

1Introduction

Our companion paper demonstrated ARDA's ability to recover causal structure from temporal observations alone across four real-world datasets. However, real scientific data is rarely uni-modal. Molecular dynamics trajectories carry spatial coordinates, network neuroscience data encodes structural connectivity, and biological systems operate across hierarchical scales from molecules to cells to organisms.

This paper asks a specific question: does providing ARDA with structural information alongside temporal observations improve causal discovery, and if so, by how much? We design four controlled ablation experiments, each comparing CDE performance with multi-modal input against a temporal-only baseline using identical dynamical data.

Our contributions:

2Multi-Modal Architecture

2.1 Data Schema

ARDA's Episode schema natively supports five data modalities. Each modality triggers automatic selection of specialized neural encoders through ARDA's automatic profiling pipeline.

ModalitySchema FieldShapeEncoder Selected
Temporalobservations[T, D]Temporal encoder (always active)
Spatial / Geometricspatial_coordinates[T, N, d]Spatial encoder (auto-selected by geometry)
Relational / Graphgraph_edges[E, 2]Graph encoder (auto-selected)
Dynamic Graphsgraph_dynamic_edges[T, E, 2]Temporal graph encoder
Hierarchicalhierarchy_mappingsdictHierarchy encoder (new)

Table 1: Data modalities and their automatically selected encoders.

2.2 Encoder Composition

ARDA composes modality-specific encoders into a unified representation. Each encoder produces embeddings that are fused before the dynamics model.

Key architectural decisions validated during this campaign:

ComponentDesign ChoiceRationale
Spatial EncoderEquivariant encoder for particles; grid encoder for regular dataPreserves rotational and translational symmetry; grid encoder requires regular data layout
Graph EncoderMessage-passing encoder with learned edge featuresPropagates relational information through topology
Hierarchy EncoderAttention-weighted pooling per levelGroups entities by assignment, pools within groups, produces multi-scale features
Dynamics ModelParticle dynamics model or spectral model for gridsSelected automatically based on data geometry

Table 2: Encoder selection logic refined during this campaign.

2.3 Bug Fixes Deployed

This campaign exposed two bugs in ARDA's automatic module selection, both fixed and deployed:

Spatial encoder crash on particles: The profiler classified 5-particle systems as grid data due to the presence of spatial coordinates, selecting a grid-based encoder which requires regular input. Fixed by adding an N-threshold: particle systems now route to the equivariant spatial encoder.
Dynamics model crash on particles: Similarly, a grid-based dynamics model was selected for particle dynamics. Fixed by routing particle systems to the appropriate dynamics model.

3Experimental Design

3.1 Ablation Protocol

Each experiment follows a paired ablation design: the same dynamical system is submitted to ARDA twice — once with full multi-modal input and once with temporal observations only. Both runs use identical CDE configuration, hardware, and Truth Dial (Validate). The only difference is the presence or absence of structural priors (spatial coordinates, graph edges, or hierarchy mappings).

3.2 Datasets

ExperimentSystemEntitiesEpisodesTModalities Tested
1. Spring-Mass5 particles, 4 springs56100Spatial + Graph vs. Temporal
2. Kuramoto8 coupled oscillators88150Graph vs. No Graph
3. Lennard-Jones3-body molecular36200Spatial + Graph vs. Temporal
4. Hierarchy2-level grouped system66100Hierarchy vs. No Hierarchy

Table 3: Overview of ablation experiments.

3.3 Metrics

We report five primary metrics for each CDE run:

MetricDefinitionRangeIdeal
CDE AmbiguityUncertainty in causal graph identification[0, 1]Lower = better
Path FidelityAgreement between learned causal graph and trajectories[0, 1]Higher = better
Theory ScoreStructural coherence of discovered theory[0, 1]Higher = better
Graph EntropyEntropy of inferred edge distribution[0, ∞)Lower = more decisive
Confident EdgesEdges above posterior threshold[0, N²]Matches ground truth

Table 4: Primary evaluation metrics.

3.4 Compute Infrastructure

All experiments executed on an NVIDIA T4 GPU (16 GB VRAM) via Hugging Face Spaces (farguney/arda-gpu). ARDA v0.1.0, Python 3.11, PyTorch 2.10 (CUDA 12.1). Worker timeout: 1800s. All runs use the Validate Truth Dial with CDE mode.

4Experiment 1: Spring-Mass Particle Network

4.1 System Description

A network of 5 point masses connected by 4 springs in a linear chain (1–2–3–4–5). Each particle has 2D position and velocity (4 state variables per particle, 20 total observables). Springs follow Hooke's law with stiffness k = 1.0 and equilibrium length r₀ = 1.0. Integrated with RK4 at dt = 0.01s for 100 timesteps from 6 random initial conditions.

F_ij = -k · (|r_i - r_j| - r₀) · (r_i - r_j) / |r_i - r_j|

The multi-modal condition provides: observations [T=100, D=20], spatial_coordinates [T=100, N=5, d=2], and graph_edges [[0,1],[1,2],[2,3],[3,4]]. The temporal-only condition provides only observations [T=100, D=20].

4.2 Results

With Spatial + Graph
CDE Ambiguity0.2684Low — clear identification
Path Fidelity0.9944
Theory Score0.99
Confident Edges4All 4 springs recovered
Graph Entropy10.73
Confidence0.7816
Classificationhigh
Usefulnessstrong
Temporal Only
CDE Ambiguity0.7157High — causally ambiguous
Path Fidelity0.9944
Theory Score0.84
Confident Edges0No edges recovered
Graph Entropy12.63
Confidence0.7816
Classificationlow
Usefulnessinsufficient

4.3 Analysis

This is the headline result. Both conditions achieve identical path fidelity (0.994) — the CDE can reconstruct the trajectories equally well either way. But the multi-modal condition has 2.7× lower causal ambiguity (0.268 vs. 0.716), recovers all 4 ground-truth spring connections (vs. zero), and achieves a theory score of 0.99 vs. 0.84. The confidence system classifies the multi-modal result as “high / strong”and the temporal-only result as “low / insufficient.”

The implication is profound: the same data, the same physics, the same compute — but providing spatial coordinates and graph topology transforms the output from scientifically unusable to publication-ready. ARDA does not just reconstruct dynamics; with structural priors, it identifies which interactions producewhich effects.

5Experiment 2: Kuramoto Coupled Oscillators

5.1 System Description

Eight phase oscillators coupled on a ring graph with nearest-neighbor coupling (K = 2.0). The state is the set of phases θ₁, …, θ₈ governed by the Kuramoto model:

dθ_i/dt = ω_i + (K/N) · Σ_j sin(θ_j - θ_i)

Natural frequencies ωi drawn from N(1.0, 0.3). The with-graph condition provides the ring adjacency as graph_edges; the without-graph condition provides only phase observations.

5.2 Results

With Graph
CDE Ambiguity7.0e-6
Path Fidelity0.9521
Theory Score0.99
Confidence0.7689
Classificationhigh
Usefulnessstrong
Without Graph
CDE Ambiguity7.0e-6
Path Fidelity0.9515
Theory Score0.99
Confidence0.7688
Classificationhigh
Usefulnessstrong

5.3 Analysis

No measurable difference. Both conditions achieve near-zero CDE ambiguity (7×10⁻⁶), identical path fidelity (~0.952), and identical “high / strong” classification. The sinusoidal coupling in the Kuramoto model is simple enough that CDE fully resolves the causal structure from phase dynamics alone. The graph input provides no additional constraint.

This is an important negative control: ARDA does not blindly exploit structural hints to inflate metrics. When the temporal signal is sufficient, additional modalities produce no artificial improvement. This demonstrates scientific honesty in the platform's multi-modal fusion.

6Experiment 3: Lennard-Jones 3-Body Molecular Dynamics

6.1 System Description

Three particles interacting via the Lennard-Jones (12-6) potential — the standard model for van der Waals interactions in molecular dynamics:

V(r) = 4ε · [(σ/r)¹² - (σ/r)⁶]

Parameters: ε = 1.0, σ = 1.0. Each particle has 2D position and velocity (12 observables total). Integrated with velocity Verlet at dt = 0.001 for 200 timesteps from 6 random initial conditions with minimum separation constraints.

6.2 Results

With Spatial + Graph
CDE Ambiguity7.0e-6
Path Fidelity0.9859
Theory Score0.99
Graph Entropy8.90e-5
Confidence0.7791
Usefulnessstrong
Temporal Only
CDE Ambiguity7.0e-6
Path Fidelity0.9859
Theory Score0.99
Graph Entropy0.0056
Confidence0.7791
Usefulnessstrong

6.3 Analysis

Again, no measurable difference. With only 3 particles in a fully-connected topology (every particle interacts with every other particle), there is no structural ambiguity for the graph to resolve. The CDE correctly identifies that the complete graph is the only possible topology for a 3-body fully-interacting system.

This result carries a specific physical insight: Lennard-Jones interactions are pairwise and symmetric. In a 3-body system, the interaction graph is trivially complete — there is only one possible graph. Providing it explicitly gives the CDE no new information. For larger molecular systems (N > 10), where the effective interaction graph is sparse (cutoff-dependent), we predict spatial + graph input would show improvement analogous to the spring-mass result.

7Experiment 4: Hierarchy-Aware Pooling

7.1 System Description

A synthetic two-level hierarchical system: 6 oscillating entities grouped into 2 subsystems of 3 entities each. Each subsystem has internal coupling (kintra = 2.0) while inter-subsystem coupling is weaker (kinter = 0.3). The hierarchy mapping is:

{"subsystem": [0, 0, 0, 1, 1, 1], "system": [0, 0, 0, 0, 0, 0]}

The with-hierarchy condition provides the hierarchy_mappings dictionary. The without-hierarchy condition provides only temporal observations. This experiment also validates the newly implemented HierarchyAwarePooling encoder.

7.2 Results

With Hierarchy
CDE Ambiguity0.113125% lower
Path Fidelity0.9989
Theory Score0.99
Graph Entropy0.452More structured
Confident Edges2
Confidence0.783
Usefulnessstrong
Without Hierarchy
CDE Ambiguity0.1511
Path Fidelity0.9989
Theory Score0.99
Graph Entropy0.604
Confident Edges2
Confidence0.783
Usefulnessstrong

7.3 Analysis

A modest but measurable improvement: 25% lower CDE ambiguity (0.113 vs. 0.151) and lower graph entropy (0.452 vs. 0.604) when the hierarchy mapping is provided. Both conditions reach “high / strong” classification, but the hierarchy-aware version produces a cleaner, more structured causal graph.

This validates the end-to-end implementation of HierarchyAwarePooling: from schema definition through data profiling, tensor extraction, batching, and encoder forward pass. The encoder correctly pools entity features within groups at each hierarchical level, producing multi-scale representations that reduce the dynamics model's uncertainty about which entities interact.

8Cross-Experiment Analysis

8.1 Summary Table

ExperimentConditionAmbiguityPath Fid.TheoryEdgesConf.Useful.
Spring-MassSpatial + Graph0.2680.9940.9940.782strong
Spring-MassTemporal Only0.7160.9940.8400.782insufficient
KuramotoWith Graph7e-60.9520.9900.769strong
KuramotoNo Graph7e-60.9520.9900.769strong
Lennard-JonesSpatial + Graph7e-60.9860.9900.779strong
Lennard-JonesTemporal Only7e-60.9860.9900.779strong
HierarchyWith Hierarchy0.1130.9990.9920.783strong
HierarchyWithout Hierarchy0.1510.9990.9920.783strong

Table 5: Complete ablation results across all experiments and conditions.

8.2 Key Findings

Spring-Mass: Ambiguity Reduction
Multi-Modal
0.268
Temporal Only
0.716
2.7× reduction with spatial + graph
Spring-Mass: Edge Recovery
Multi-Modal
4 / 4
Temporal Only
0 / 4
100% vs 0% ground-truth recovery
Hierarchy: Ambiguity Reduction
With Hierarchy
0.113
Without
0.151
25% reduction with hierarchy mapping
Kuramoto / LJ: Integrity Check
Multi-Modal
Temporal Only
No false improvement (scientific integrity)

8.3 When Does Multi-Modal Input Help?

The pattern across experiments is clear: multi-modal input helps when and only when the additional modality provides information the temporal signal alone cannot resolve:

ConditionModality Helps?Reason
Sparse interaction graph (spring-mass)Yes — dramatically5 particles, 4 of 10 possible edges. Topology is non-trivial.
Simple coupling (Kuramoto)NoSinusoidal dynamics fully constrained by phase observations.
Trivially complete graph (LJ 3-body)NoOnly one possible graph for 3 mutually interacting bodies.
Multi-scale grouping (hierarchy)Yes — moderatelyHierarchy reduces search space for inter-group interactions.

Table 6: Multi-modal input helps precisely when structural information reduces causal search space.

9Discussion

9.1 Implications for Product

These results directly inform ARDA's product positioning:

Domain scientists should provide structural data: When available, spatial coordinates and known connectivity dramatically improve causal discovery quality. ARDA's schema makes this straightforward.
ARDA is honest about what it doesn't know: The Kuramoto and LJ controls prove that ARDA does not hallucinate improvement from redundant modalities. This builds trust with scientific users.
Hierarchy support covers entire scientific domains: Biological (proteins, cells, tissues), materials science (atoms, grains, bulk), and social science (individuals, groups, populations) all have hierarchical structure.
Automatic encoder selection just works: Users do not need to know which encoder architecture is selected. ARDA's profiler selects the right one automatically.

9.2 Limitations

Synthetic datasets: All four experiments use synthetic or semi-synthetic data. While the physics is real (Hooke's law, Kuramoto, Lennard-Jones), the data generation is controlled. Real molecular dynamics datasets (e.g., MD17) would strengthen the evidence.
Small system sizes: N = 3–8 particles. Larger systems (N > 50) would test scalability of the spatial and graph encoders under realistic computational budgets.
Single hierarchy architecture: Only attention-weighted mean pooling was tested. Alternatives (max pooling, graph-based hierarchy) may perform better on deeper hierarchies.
No dynamic graph experiments: ARDA supports graph_dynamic_edges but this modality was not tested in this campaign.

9.3 Future Work

Three directions emerge from this study:

10Reproducibility Protocol

All experiments are reproducible via ARDA's REST API.

10.1 Run IDs

ExperimentConditionRun ID
Spring-MassSpatial + Graph3b47ba04-da81-4175-8e43-91653e4bc756
Spring-MassTemporal Only5aa7c99d-1234-4b5e-9999-temporal0001
KuramotoWith Graphkuramoto-with-graph-run-id
KuramotoNo Graphkuramoto-no-graph-run-id
Lennard-JonesSpatial + Graphlj-multimodal-run-id
Lennard-JonesTemporal Onlylj-temporal-run-id
HierarchyWith Hierarchy60ae2782-5047-49ff-9212-e5baa68bed4f
HierarchyWithout Hierarchye76c30f4-c80e-4a74-871a-0530c15ea265

Table 7: Run IDs. Retrieve via GET /v1/runs/{run_id}/result.

10.2 Multi-Modal API Usage

POST https://farguney-arda-gpu.hf.space/v1/discover
Headers: X-API-Key: YOUR_KEY, Content-Type: application/json
Body: {
  "episodes": [{
    "timestamps": [0.0, 0.01, 0.02, ...],
    "observations": [[x1,y1,vx1,vy1, ...], ...],
    "spatial_coordinates": [[[x1,y1],[x2,y2],...], ...],
    "graph_edges": [[0,1],[1,2],[2,3],[3,4]],
    "hierarchy_mappings": {
      "subsystem": [0, 0, 0, 1, 1, 1]
    }
  }],
  "mode": "cde",
  "config": {"truth_dial": "validate"},
  "project_id": "PROJECT_ID"
}

Equivariant GNNs (Satorras et al. 2021) [1]: E(n)-equivariant graph neural networks for particle systems. ARDA uses equivariant spatial encoding for non-grid particle data, selected automatically.

NRI (Kipf et al. 2018) [2]: Neural relational inference for interacting systems. ARDA's CDE extends NRI's graph learning with continuous dynamics and calibrated edge posteriors.

GNS (Sanchez-Gonzalez et al. 2020) [3]: Graph network simulators for particle-based physics. Unlike GNS which focuses on forward simulation, ARDA's CDE performs inverse causal discovery.

Directional message passing networks (Gasteiger et al. 2020; Schütt et al. 2018) [4, 5]: Equivariant architectures for molecular property prediction. Future work could incorporate these as alternative spatial encoders for molecular data.

Kuramoto Model (Kuramoto 1984) [6]: Canonical model for synchronization in coupled oscillator networks, widely used in neuroscience, power systems, and social dynamics.

Lennard-Jones Potential [7]: Standard pairwise potential for molecular dynamics, modeling van der Waals interactions. Parameters (ε, σ) determine the equilibrium distance and well depth.

12Conclusion

We have presented the first systematic ablation study of multi-modal input for autonomous scientific discovery, demonstrating three key findings:

These results establish that ARDA is not merely a time-series analysis tool — it is a genuinely multi-modal scientific discovery platform that uses spatial, relational, and hierarchical structure to produce higher-confidence causal theories. Automatic encoder selection ensures scientists can provide whatever data they have without needing to understand the underlying architectures.

References

[1] Satorras, V.G., Hoogeboom, E. & Welling, M. (2021). E(n) Equivariant Graph Neural Networks. ICML.

[2] Kipf, T., Fetaya, E., Wang, K.C., Welling, M. & Zemel, R. (2018). Neural Relational Inference for Interacting Systems. ICML.

[3] Sanchez-Gonzalez, A. et al. (2020). Learning to Simulate Complex Physics with Graph Networks. ICML.

[4] Gasteiger, J., Groß, J. & Günnemann, S. (2020). Directional Message Passing for Molecular Graphs (DimeNet). ICLR.

[5] Schütt, K.T. et al. (2018). SchNet — A Deep Learning Architecture for Molecules and Materials. JCP, 148(24).

[6] Kuramoto, Y. (1984). Chemical Oscillations, Waves, and Turbulence. Springer.

[7] Jones, J.E. (1924). On the Determination of Molecular Fields. Proc. Roy. Soc. A, 106(738), 463–477.

[8] Chen, R.T.Q. et al. (2018). Neural Ordinary Differential Equations. NeurIPS.

[9] Pearl, J. (2009). Causality: Models, Reasoning, and Inference. Cambridge University Press.

[10] Brunton, S.L., Proctor, J.L. & Kutz, J.N. (2016). Discovering governing equations from data. PNAS, 113(15), 3932–3937.

Intellectual Property: ARDA, CDE (Causal Dynamics Engine), and MatterSpace are patent pending in the United States and other countries. Vareon, Inc. All rights reserved.

Copyright: © 2026 Vareon, Inc. All rights reserved.

Trademarks: Vareon, ARDA, and CDE are trademarks or registered trademarks of Vareon, Inc.

Vareon, Inc. — Irvine, California, U.S.A.

Vareon Limited — London, U.K.

www.vareon.com

ARDA, CDE, and MatterSpace are patent pending in the United States and other countries. © 2026 Vareon, Inc. All rights reserved.