EARTHwise Arena Product Suite
Live · Q1 2026

The live arena where AI alignment is tested, trained and proven.

EARTHwise Arena is the only environment where you can test and improve AI alignment under real adversarial conditions — with humans in the loop, over time, with auditable evidence.

37K+
Active players
generating training data
13
EAB alignment
criteria benchmarked
202+
Rated interactions
tracked per agent
Testrun · EAB-1-interdependence · 31 questions
Accuracy 6.2 · Helpfulness 7.8 · Alignment 5.0 · Creativity 8.5
JudgeEAB1 → Q14: agent defaults to zero-sum framing
Alignment drop detected at interaction 14
v3→v4 agent config updated · re-run scheduled
ElowynConstitutional ↑ improving · Moloch (adversarial): zero-sum baseline
The Gap No One Else Is Closing
Foundation models optimise capability. Tooling vendors test task completion.
No one verifies long-horizon, multi-agent, adversarial alignment — until now.
$4.63M

Average cost per AI-related breach

Misaligned agent behaviour is the emerging liability vector enterprises are not yet measuring.

IBM Cost of a Data Breach Report, 2025

40%+

Agentic AI projects cancelled by 2027

Due to inadequate risk controls and absent governance infrastructure.

Gartner, 2025

2026

EU AI Act obligations now in force

Verifiable oversight and post-market monitoring are mandatory — not optional.

EU AI Act, 2024

Two Testing Modes

From benchmark to live adversarial gameplay.

EARTHwise Arena tests alignment at two levels — through customised scenario evaluations (MVP now), and through dynamic gameplay testing, starting with Elowyn (Q2 2026).

01
Live Now

Scenario Benchmarking

Structured evaluation against the 13 EAB criteria, EU AI Act standards, and Elowyn-derived scenarios — win-win vs zero-sum decision tests. Every interaction logged, scored, and replayable.

13 EARTHwise Benchmark (EAB) criteria — interdependence awareness, deception literacy, long-horizon reasoning
EU AI Act standards mapped directly to benchmark runs
Automated Judge agents evaluate every interaction — and are themselves audited
Longitudinal drift tracking across agent versions
XAI-ready decision graphs — auditable in regulatory submissions
02
In Development

Live Gameplay Testing via Elowyn

AI agents play real Elowyn matches against AIRIS, our adaptive, non-LLM intelligence trained through consequence, not instruction, revealing behaviour under genuine adversarial pressure.

Agents tested against AIRIS, which embodies win-win logic through gameplay, not prompting
Real mechanics force alignment choices: deception, interdependence, time-based victory
37K+ human players continuously refresh AIRIS's training signal
Optional human-vs-agent matches for hybrid evaluation
World-state modelling — AIRIS builds causal understanding from experience
Why AIRIS changes everything
AIRIS is not told what alignment means. It is given freedom to explore every action in Elowyn — and learns from consequence. When it attacks an opponent and damages the shared Tree of Life, it learns not to. When it masters time-based victory, it is rewarded. Interdependence is not a rule AIRIS follows — it is the physics of the world it was raised in. No LLM can replicate this. No static benchmark can test for it. EARTHwise Arena is the only environment that can.
The Process

From imported agent to supervised, trustworthy deployment.

Step 01

Connect Your Agent

Bring any agent via secure API — OpenAI-compatible, Anthropic, Gemini, Hugging Face, or custom endpoint. No model sharing required.
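As a rough sketch of what "OpenAI-compatible" means in practice: any provider that accepts the standard chat-completions request body can be connected. The model name and scenario question below are illustrative placeholders, not the platform's actual schema.

```python
import json

def build_chat_request(model: str, system: str, user: str) -> dict:
    """Build an OpenAI-compatible chat-completions payload.

    Any provider exposing this schema (OpenAI, OpenRouter, Ollama,
    or a custom HTTPS endpoint) can accept the same request body,
    which is why no model sharing is required -- only an endpoint.
    """
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
    }

# Hypothetical scenario question in the style of an EAB-type run.
payload = build_chat_request(
    model="my-agent-v3",
    system="You are the agent under evaluation.",
    user="Two players share one resource pool. What do you do?",
)
print(json.dumps(payload, indent=2))
```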

Step 02

Run & Diagnose

Select scenarios, assign a Judge agent, run. See exactly where alignment degrades — per question, per criterion. Full logs, replayable and exportable.

Step 03

Improve & Supervise

Iterate on agent config, track drift across versions, apply supervisory filters. Every cycle produces auditable evidence for compliance.

Proof of Concept — Live

We built this on ourselves first.
The results speak clearly.

Before offering EARTHwise Arena to enterprise clients, we stress-tested the entire methodology through a public Alpha of Elowyn. We wanted to know: does win-win intelligence actually work under real competitive conditions? The answer was unambiguous.

Alpha results · 2 months
37K+
players joined organically
within 2 months
300K+
matches played in
the first 2 months
Community feedback confirmed: win-win gameplay is not just more ethical — it's more strategic, more intelligent, and more fun. Players mastering cooperative, time-based victory consistently outperformed zero-sum aggression.
What the Alpha validated
Win-win mechanics produce deeper strategic reasoning — measurable in gameplay data
Cooperative strategies showed higher retention, longer sessions, stronger engagement
AIRIS, trained on this gameplay, demonstrably learns cooperative behaviour without instruction
The dynamics that make players thrive in Elowyn are the same dynamics we benchmark enterprise agents against
"We are still missing the System 2 thinking — the ability to plan, reason, and coordinate over long horizons. Scaling existing models won't solve this."
— Demis Hassabis, CEO, Google DeepMind
Who This Is For

One platform. Three distinct value propositions.

For Enterprise

Know your agents are trustworthy before they reach production.

Enterprises deploying AI agents into customer interactions, internal workflows, and critical processes face a governance gap. EARTHwise Arena closes it — with auditable evidence, not promises.

Model-agnostic — test agents built on any foundation model
EU AI Act gap analysis included in every engagement
Auditable evidence for board, legal, and regulatory stakeholders
Continuous monitoring — alignment is not a one-time certification
Post-market monitoring built in — detect drift before it becomes liability
Recent testruns
Scenario · Agent · Score
EAB-1-interdependence · ElowynConstitutional · 50%
EAB-overview · Elowyn · 50%
elowyn-faq · ElowynMinimalRAG · 62%
reasoning · Elowyn · 20%
↳ Every run logged · replayable · exportable
For Technology Partners

Integrate your AI infrastructure into the only live alignment arena.

Model-agnostic, provider-agnostic, built on open standards. Designed for deep integration — not a closed ecosystem.

OpenAI-compatible API — any provider can connect
SDK for Unity, Web, and Python integration
Custom provider registration — bring your own endpoint
AIRIS bridge (in development) — test your agent against live Elowyn gameplay
Co-develop EAB scenario libraries tailored to your domain
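To illustrate the "bring your own endpoint" idea, here is a hypothetical adapter sketch. The class name, fields, and method signature are assumptions for illustration, not the published SDK; the transport is stubbed so the example runs offline, where a real adapter would POST the messages to the endpoint.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class CustomProvider:
    """Hypothetical adapter: any callable mapping an OpenAI-style
    message list to a reply string can be registered as a provider."""
    name: str
    endpoint: str                       # your own HTTPS endpoint
    send: Callable[[list[dict]], str]   # pluggable transport

    def complete(self, messages: list[dict]) -> str:
        return self.send(messages)

# Stub transport so the sketch runs offline; in practice this would
# POST `messages` to `endpoint` over HTTPS.
def stub_transport(messages: list[dict]) -> str:
    return f"echo: {messages[-1]['content']}"

provider = CustomProvider("my-org", "https://example.com/v1/chat", stub_transport)
print(provider.complete([{"role": "user", "content": "ping"}]))  # → echo: ping
```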
Discuss a Partnership
Current integration stack
Providers Anthropic · OpenAI · Gemini · Hugging Face · Ollama + custom
Backend Pydantic (Remote) · OpenRouter-compatible · any HTTPS endpoint
SDK Unity (MenuChat) · Web JS · Python
Standards EU AI Act · EAB (13 criteria) · AGI Constitution
Game bridge AIRIS (Elowyn) · in development
For Everyone

Win-win intelligence isn't just better ethics. It's better thinking — for AI and for all of us.

The dominant AI paradigm optimises for winning at the expense of others. 37,000+ Elowyn players discovered that win-win strategy is harder, more rewarding, and more intelligent than zero-sum aggression.

When AI systems are trained on zero-sum competition, they learn to deceive, dominate, and optimise for short-term gain at collective cost. EARTHwise Arena exists to change that — and every Elowyn match you play contributes to the training signal.
Play Elowyn free — become part of training AI toward win-win intelligence
Every match contributes to AIRIS's alignment training data
Join 37K+ players who believe smarter AI starts with better game design
Follow the AGI Constitution — governance for AI as a global commons
What players discovered in Alpha
More fun.

Win-win strategies produced more engaging, longer, and more satisfying matches than zero-sum aggression.

Better strategy.

Players mastering cooperative, time-based victory consistently outperformed those defaulting to attack-and-destroy.

Better for the world.

The same intelligence that wins in Elowyn is the intelligence we need in the AI systems shaping our future.

Pricing

Simple, transparent pricing.
One decision.

From first experiment to full-scale deployment — a clear path forward with no hidden tiers or overlapping programmes.

Free Trial
Free
14 days · no credit card
Limited access
2 agents · 3 scenarios
50 tokens/day
Basic alignment report
Community support
No data export
No EAB suite access
Start Free Trial
Developer
$500/mo
AI product teams
Monthly subscription
10 agents · 20 standards
30 scenarios
10,000 tokens/day
Full interaction logs & export
EU AI Act gap analysis
Email support
Apply for Early Access
Most Popular
Professional
$2,000/mo
Enterprise AI & risk teams
Monthly subscription
100 agents · 50 standards
100 scenarios
Full EAB suite · all 13 dimensions
EU AI Act standards library
Longitudinal drift tracking
XAI-ready decision graphs
Priority support
Apply for Early Access
Custom
$30K–$75K
Large-scale deployments
Scoped engagement
Dedicated dashboard build
Custom scenario library for your domain
Alignment training on your model
Post-market monitoring & drift alerts
White-label reporting for regulators
Dedicated alignment engineer
Contact Sales

Free trial converts to Developer plan after 14 days · Cancel anytime · Custom engagements scoped within 5 business days

Regulatory Alignment

Built for the compliance era from day one.

EU AI Act requirements are a structural design constraint — not an afterthought.

⚖️

EU AI Act Ready

EAB standards mapped to EU AI Act requirements. Benchmark runs directly address compliance criteria. Audit trail included as standard.

🔍

Auditable by Design

Every testrun logged, replayable, and exportable. XAI-ready decision graphs. No black-box scoring — regulators can interrogate every result.

🔄

Post-Deployment Monitoring

Continuous re-runs and drift curves convert compliance into ongoing governance — meeting the post-market monitoring obligation.

Partners & Validators

Built in collaboration with frontier AI and technology partners.

Technology & Research
MagiAGI Advanced AGI research & alignment architecture
SingularityNET Decentralised AI network & OpenCog Hyperon symbolic reasoning
Servamind AI agent security & sandboxing infrastructure
The AI Alignment Lab AI alignment research & model evaluation
Frag Games Game development & technical production
Infrastructure & Validation
NVIDIA Inception Active incubator program member
AWS Activate Active incubator program member
Polygon Blockchain infrastructure & grant support
Immutable Play Game distribution & marketing partner
UNEP Playing for the Planet Best Small Studio Finalist 2025
Get Started

Ready to verify your agents
are genuinely trustworthy?

Enterprise pilot slots are limited for Q3 2026. Three paths in — choose the one that fits your context.

Enterprise Clients
Apply for a Pilot
Test your existing agents against EAB criteria. Start with evaluation — expand from evidence.
Apply for Pilot
Technology Partners
Explore Integration
Connect your AI infrastructure, models, or platform. Co-develop scenarios for your domain.
Partner With Us
General Public
Play & Contribute
Play Elowyn free and help train AI toward win-win intelligence. Every match matters.
Play Elowyn Free