600 candidates. 23 dopant elements. Zero target knowledge. Both Re₁@Ni and Ir₁@Ni catalysts blindly rediscovered to sub-angstrom accuracy. Structural validity of 97.5–99%, enforced by construction.
0.408 Å full RMSD (Level C PASS)
97.5–99% structural validity (by construction)
~$15 cloud cost (single A100, 4.7 hrs)
Vareon Research
Vareon Inc. · Vareon Limited · March 2026
Materials discovery today follows a propose-then-filter paradigm. Generate millions of candidate structures randomly or through exhaustive enumeration. Evaluate them with expensive density functional theory (DFT) or machine-learned interatomic potentials. Discard the vast majority because they are physically invalid or chemically unreasonable.
The waste is staggering. Most compute cycles are spent evaluating structures that should never have been proposed. GNoME identified 2.2 million new crystal structures (roughly 380,000 of them the most stable) through brute-force screening. MatterGen generates crystals with diffusion but filters for validity afterward. Open Catalyst built massive datasets for screening. CDVAE, DiffCSP, and FlowMM all rely on post-hoc filtering for physical validity.
None of these approaches can guarantee that a generated structure satisfies physical, chemical, and geometric constraints during generation. None have demonstrated blind rediscovery — starting from zero knowledge and independently finding known materials to sub-angstrom accuracy.
MatterSpace embeds physical, chemical, and geometric constraints directly into the generation process. Goals and constraints are not applied after generation; they are enforced at every step, which yields near-guaranteed validity.
Every structural constraint (minimum interatomic distances, coordination bounds, surface height limits) is enforced during generation, not after. The engine does not generate candidates freely and then filter out the invalid ones. It is designed to produce valid candidates from the start.
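One simple way to picture enforcement during generation, as opposed to filtering afterward, is a step-wise loop that accepts only moves that keep the structure valid. The sketch below is an illustration under assumptions, not MatterSpace's actual mechanism, which the text does not describe. It uses plain per-step rejection, checks only a minimum interatomic distance and a surface height limit, and every name (`is_valid`, `constrained_step`, `generate`) and parameter value (`d_min`, `z_max`, `step`) is hypothetical.

```python
import math
import random


def min_pair_distance(positions):
    """Smallest distance between any two atoms; positions are (x, y, z) tuples in Å."""
    return min(
        math.dist(a, b)
        for i, a in enumerate(positions)
        for b in positions[i + 1:]
    )


def is_valid(positions, d_min, z_max):
    """Check two of the constraint types named above: pair distance and surface height."""
    if any(p[2] > z_max for p in positions):
        return False
    return len(positions) < 2 or min_pair_distance(positions) >= d_min


def constrained_step(positions, d_min, z_max, step=0.2, max_tries=50, rng=random):
    """Displace one random atom; accept the move only if the structure stays valid."""
    for _ in range(max_tries):
        i = rng.randrange(len(positions))
        x, y, z = positions[i]
        moved = (
            x + rng.uniform(-step, step),
            y + rng.uniform(-step, step),
            z + rng.uniform(-step, step),
        )
        trial = positions[:i] + [moved] + positions[i + 1:]
        if is_valid(trial, d_min, z_max):
            return trial
    return positions  # no valid move found: stay at the current valid state


def generate(start, n_steps, d_min=2.0, z_max=5.0, seed=0):
    """Run n_steps constrained moves from a valid starting structure."""
    rng = random.Random(seed)
    state = list(start)
    for _ in range(n_steps):
        state = constrained_step(state, d_min, z_max, rng=rng)
    return state
```

Because the starting structure is valid and every accepted move is checked, each intermediate state satisfies the constraints, so no candidate has to be discarded at the end. A production engine would likely use something more efficient than rejection, such as projection or constrained sampling.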
The engine autonomously navigates complex energy landscapes, balancing exploration of new configurations with refinement of promising candidates. No manual scheduling or hand-tuning required.
Post-generation refinement with high-accuracy interatomic potentials improves structural precision without modifying the core generative architecture. The refinement calculator is modular and upgradeable.
The generation engine is universal. Only the domain pack changes — the constraints, objectives, and physics specific to each scientific field. One engine. Every domain.
Why this is different
Current generative AI relies on a generate-and-filter approach: produce candidates blindly, then discard the invalid ones. MatterSpace is built to remove that waste. The engine generates with goals and constraints baked in, so nearly every candidate is valid when it is produced. This is not a marginal improvement; it is a fundamentally different paradigm for generative AI.
MatterSpace was tested through a blind rediscovery experiment: starting from a palette of 23 dopant elements with zero target information, the engine had to independently generate candidates that match known Re₁@Ni and Ir₁@Ni single-atom alloy catalysts for methane cracking. A three-level post-hoc validation protocol measured catalytic performance (Level A), structural motif similarity (Level B), and exact geometric accuracy (Level C).
Level A (favorable properties)
Primary metric: 581 / 600 candidates (96.8%) below −1.3 eV
Key finding: best adsorption energy −34.73 eV
The vast majority of generated candidates exhibit strongly favorable surface binding, confirming that MatterSpace generates catalytically relevant materials — not random structures.
Level B (motif matching)
Primary metric: 75 matches, best similarity 0.814
Key finding: both Re and Ir independently identified from 23 elements
Without any target information, the engine correctly identified both target dopant elements from a 23-element palette. The probability of randomly selecting both correct elements is 0.19%.
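The 0.19% figure is consistent with treating the two dopants as two independent uniform draws from the 23-element palette, each of which must hit its own target. That reading is an assumption, since the text does not state the counting convention. The short check below reproduces it and, for comparison, shows the value under a different convention (one unordered pair of distinct elements). Variable names are illustrative.

```python
from fractions import Fraction
from math import comb

n_elements = 23

# Assumed convention: two independent uniform draws, each matching a specific target.
p_two_draws = Fraction(1, n_elements) ** 2

# Alternative convention: one unordered pair of distinct elements out of C(23, 2).
p_unordered_pair = Fraction(1, comb(n_elements, 2))

print(f"two independent draws: {p_two_draws} = {float(p_two_draws):.2%}")
print(f"one unordered pair:    {p_unordered_pair} = {float(p_unordered_pair):.2%}")
# two independent draws: 1/529 = 0.19%
# one unordered pair:    1/253 = 0.40%
```

Either way the chance of picking both correct elements at random is well under 1%.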
Level C (sub-angstrom structural reproduction)
Primary metric: 0.408 Å full RMSD (metal-only: 0.363 Å)
Key finding: both targets independently below the 0.5 Å threshold
Ir₁@Ni at 0.408 Å and Re₁@Ni at 0.466 Å — both independently rediscovered to sub-angstrom precision. This is the first demonstration of blind generative material rediscovery achieving all three validation levels for surface catalysts.
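For readers unfamiliar with the metric, the sketch below shows how a root-mean-square deviation and the 0.5 Å pass threshold could be computed. It assumes the generated and reference atoms are already matched one-to-one and expressed in the same frame. The actual protocol may also involve alignment, atom matching, or periodic-boundary handling, none of which is described here. The function names and the element filter used for a "metal-only" variant are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (Å) between two matched lists of (x, y, z) positions."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def subset_rmsd(symbols, coords_a, coords_b, keep):
    """RMSD restricted to atoms whose element symbol is in `keep` (e.g. metals only)."""
    pairs = [(a, b) for s, a, b in zip(symbols, coords_a, coords_b) if s in keep]
    return rmsd([a for a, _ in pairs], [b for _, b in pairs])


def passes_level_c(generated, reference, threshold=0.5):
    """Level C check as described in the text: full RMSD below the 0.5 Å threshold."""
    return rmsd(generated, reference) < threshold
```

With this definition, a single badly placed atom can dominate the score, which is one reason a metal-only value can be lower than the full RMSD.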
Across all generation steps, constraints were satisfied at near-perfect rates with negligible computational overhead. The constraint enforcement mechanism adds less than 4% to total generation time while checking structural validity at every step.
Of the systems compared, MatterSpace is the only one shown to pass all three validation levels. The others demonstrate Level A capability (favorable properties) but have not demonstrated Level B (motif matching) or Level C (sub-angstrom structural reproduction), because they are not designed for blind rediscovery.
| System | Level A | Level B | Level C | Constraint handling |
|---|---|---|---|---|
| GNoME | PASS | — | — | Post-hoc |
| MatterGen | PASS | — | — | Post-hoc |
| Open Catalyst | PASS | — | — | Post-hoc |
| CDVAE | PASS | — | — | Post-hoc |
| DiffCSP | PASS | — | — | Post-hoc |
| FlowMM | PASS | — | — | Post-hoc |
| MatterSpace | PASS | PASS | PASS | By construction |
This result was produced by MatterSpace Lattice, the materials discovery engine. The core architecture behind it is domain-agnostic: nothing in the underlying engine is specific to materials. Only the domain pack changes.
Constraints enforced during generation — valid by construction
Adaptive landscape navigation across complex energy surfaces
Diverse archives of Pareto-optimal candidates, not a single answer
Modular architecture — plug in any model as a refinement calculator
Multi-objective optimization toward user-defined goals
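To make the idea of an archive of Pareto-optimal candidates concrete, here is a minimal sketch of a non-dominated archive update. It assumes every objective is minimized and that each candidate is represented only by its tuple of objective values. The function names and the example objectives are illustrative, not part of MatterSpace.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def update_archive(archive, candidate):
    """Return a new archive that stays non-dominated after offering `candidate`."""
    if any(dominates(kept, candidate) for kept in archive):
        return archive  # candidate is beaten by something already kept
    survivors = [kept for kept in archive if not dominates(candidate, kept)]
    return survivors + [candidate]


# Hypothetical objectives: (adsorption energy in eV, structural deviation in Å).
archive = []
for scores in [(-1.5, 0.6), (-2.0, 0.5), (-1.0, 0.3), (-0.5, 0.9)]:
    archive = update_archive(archive, scores)

print(archive)  # [(-2.0, 0.5), (-1.0, 0.3)]
```

The two survivors trade one objective against the other, which is why the archive holds a set of candidates rather than a single answer.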
Scientists tell their agents what they are after. The agents pick parameters from a pool curated for each domain pack, set constraints and goals, and MatterSpace begins generating. The models build on strong open-source foundations, and customers can also bring their own models and data.
But model quality is rarely the bottleneck: most open-source models are already strong. What is lacking is the engineering to steer generation toward the desired solution space on a small compute budget and to reach near-100% validity by construction.
Batteries, catalysts, superconductors, magnets, photovoltaics, thermoelectrics, high-entropy alloys (HEAs), electrolytes, coatings
ADMET constraints, binding affinity objectives, molecular stability. The same constraint enforcement produces valid drug candidates.
Complexity bounds, correctness constraints, optimality objectives. Valid-by-construction algorithms.
Design rule constraints, power-performance-area objectives. Physical layout validity by construction.
Expected impact
The constraint-first approach behind the 97.5–99% structural validity demonstrated on materials is expected to extend to generative AI for drug discovery, longevity research, advanced materials, and algorithm design. It removes most of the generate-and-filter waste that plagues current generative approaches: when constraints are baked into generation, nearly every GPU cycle produces a viable candidate.
The complete manuscript includes full results, the validation protocol, and comparison tables.