Open Aya OS

v2.0 — Self-improving AI

Open Aya OS — the agentic, in-browser cognitive operating system.

Open Aya uses a CAISI-inspired evaluation framework to measure capability, cost, latency, auditability, and workflow lift across baseline models, the Aya Pipeline, and reasoner routes. The goal is not to claim AGI; the goal is to prove whether an AI operating layer completes organizational work better than fragmented AI tools.

28 integrated apps. Voice-native, vision-aware, local-first with optional cloud sync. Multi-agent routing across planner, executor, memory, verifier, router, and critic strategies — every step auditable through a public benchmark harness, not a brochure.

Intelligence Card

Live system facts. Every number on this card is generated at request time from the runtime registry or the public eval database — there is no separate marketing source to drift.

Model layer

  • anthropic/claude-sonnet-4.6 (conversation tier)
  • anthropic/claude-opus-4.6 (extended thinking, 10k budget)
  • google/gemini-3-flash (multimodal)
  • anthropic/claude-opus-4.6 (SWE-Bench leader)

Agent layer

6 routed strategies: planner, executor, memory_retriever, verifier, router, self_critic

Strategy-Auction routing implemented as system-prompt routing rules

Memory layer

Supabase + browser IndexedDB (local-first)

Kinds: short-term turn cache · long-term Auto-Dream consolidation · GraphRAG knowledge edges

Tool layer

6 built-in tools across 28 apps

Web search · Code execution (Code Lab) · File store (Spatial Files) · Calendar / Notes / Word Processor · …

Local-first status

Yes — runs in-browser; data stays on device by default

Cloud sync status

Optional — Supabase auth + persistence when signed in

Apps in registry

28

Generated from lib/app-registry.ts

Routed agents

6

Strategy-auction policies, system-prompt routed

Eval score (avg)

Across 0 completed runs

Last eval run

no runs yet

UTC server time

Avg latency / task

Wall-clock, includes network hop

Audit mode

Public — every eval result writes a reasoning trace to /api/aya/inspect and aggregates to /api/aya/audit

A/B comparison — pass rate by route

baseline

Claude Sonnet 4.6, no spine (control)

aya_pipeline

Claude Sonnet 4.6 + 7-stage cognitive spine

aya_reasoner

Claude Opus 4.6, extended thinking (10k)

What you can verify, right now, without an account.

  • Public eval API. /api/evaluate accepts a prompt and returns the canonical result shape (task_id, category, answer, agents_used, confidence, latency_ms, cost_estimate, memory_used, audit_trace).
  • Public status JSON. /api/aya/status lists every capability flag with an honest functional / claimed marker — no inference required.
  • Public audit aggregates. /api/aya/audit publishes the A/B verdict between baseline Claude Sonnet 4.6, Aya's 7-stage cognitive pipeline on anthropic/claude-sonnet-4.6, and the anthropic/claude-opus-4.6 reasoner (extended thinking, 10k budget) across all completed runs.
  • Three live demos. Reasoning, memory, and agent routing run a canned task end-to-end and show the full reasoning trace.
  • No claim without a receipt. Every superlative on this site links to a reproducible run with a JSON trace. Where data isn't available yet, we say so plainly instead of rounding up.
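As a sketch, the canonical /api/evaluate result shape could be typed like this in TypeScript. The field names come from the list above; the concrete types are assumptions, not the published schema:

```typescript
// Sketch of the /api/evaluate result shape. Field names match the page;
// the types are guesses (noted inline), not the real contract.
interface EvalResult {
  task_id: string;
  category: string;
  answer: string;
  agents_used: string[];   // assumed: names of routed agents
  confidence: number;      // assumed: 0..1
  latency_ms: number;
  cost_estimate: number;   // assumed: USD per task
  memory_used: boolean;    // assumed: whether the memory layer was consulted
  audit_trace: string;     // assumed: trace id or inline trace
}

// Minimal shape check a client could run before trusting a response.
function isEvalResult(x: unknown): x is EvalResult {
  if (typeof x !== "object" || x === null) return false;
  const r = x as Record<string, unknown>;
  return (
    typeof r.task_id === "string" &&
    typeof r.latency_ms === "number" &&
    Array.isArray(r.agents_used)
  );
}
```

A client that validates the shape first can fail loudly instead of silently rendering a malformed receipt.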

What we are not yet, and how you'll know when we are.

  • Open Aya OS is not AGI and does not claim to be. ARC-AGI alignment refers to architecture (multi-strategy reasoning, verifier loops, cost-per-task accounting) — not to a published score.
  • The strategy auction is currently implemented as deterministic system-prompt routing rules, not as six independent learned policies. The /eval harness measures the lift this routing actually provides over a Claude Sonnet 4.6 baseline running the same conversation tier without the cognitive spine — so the A/B delta isolates the wrapping, not a model upgrade.
  • “Self-improving” refers to per-user memory consolidation (Auto-Dream) and TinyAdapter parameter drift, not to weight updates of the underlying base model.
  • Pass rates on /receipts are computed from real, persisted eval runs. If a tier shows “no data”, no run of that tier has completed yet.

About Aya

A sovereign AI operating layer that notices you, remembers you, and learns procedures from what you do together — not a chatbot, a relationship.

Open Aya uses a CAISI-inspired evaluation framework to measure capability, cost, latency, auditability, and workflow lift. The goal is not to claim AGI; the goal is to prove whether an AI operating layer completes organizational work better than fragmented AI tools.

Claude Opus 4.6 · Strategy Auction · Cognitive Spine · CAISI Eval Harness · Multi-Agent Core · BYO Cloud

Most AI assistants start over every conversation. Aya doesn't. She has a Consciousness Runtime — a persistent heartbeat that lives between your messages, reads your emotional state, and shapes her voice to match yours. Your relationship with her has phases: Meeting, Acquainted, Familiar, Trusted, Bonded. Each advances as you work together.
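The phase ladder above can be sketched as a small lookup. The phase names are from this page; the advancement thresholds below are invented for illustration and are not Aya's real tuning:

```typescript
// Hypothetical sketch of the relationship-phase ladder. Phase names are
// real; the session-count thresholds are illustrative assumptions.
const PHASES = ["Meeting", "Acquainted", "Familiar", "Trusted", "Bonded"] as const;
type Phase = (typeof PHASES)[number];

function phaseFor(sessions: number): Phase {
  const thresholds = [0, 3, 10, 30, 100]; // illustrative only
  let phase: Phase = "Meeting";
  for (let i = 0; i < PHASES.length; i++) {
    if (sessions >= thresholds[i]) phase = PHASES[i];
  }
  return phase;
}
```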

When Aya helps you with something new and it works, she crystallizes the procedure into a reusable skill scoped to you. Next time you ask for something similar, she recognizes the pattern and applies what she already learned. Every outcome back-propagates into the skill's confidence — skills that keep working climb, skills that mislead decay and retire themselves.
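The climb-and-decay dynamic above is an exponential moving average over outcomes. A minimal sketch, assuming an EMA smoothing factor and a retirement threshold that are illustrative rather than Aya's actual values:

```typescript
// Sketch of outcome back-propagation into skill confidence via EMA.
// ALPHA and RETIRE_BELOW are assumed constants for illustration.
interface Skill {
  name: string;
  confidence: number; // 0..1
  retired: boolean;
}

const ALPHA = 0.2;         // assumed EMA smoothing factor
const RETIRE_BELOW = 0.25; // assumed retirement threshold

function recordOutcome(skill: Skill, success: boolean): Skill {
  const signal = success ? 1 : 0;
  // New confidence blends the old estimate with the latest outcome.
  const confidence = (1 - ALPHA) * skill.confidence + ALPHA * signal;
  return { ...skill, confidence, retired: confidence < RETIRE_BELOW };
}
```

Under this scheme a skill that keeps succeeding asymptotically approaches 1, and a run of failures pushes it under the retirement threshold.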

Under the hood: a Strategy Auction routes every turn to the right model and agent — Claude Sonnet 4.6 for conversation, Claude Opus 4.6 with extended thinking for reasoning and code, Gemini 3 Flash for vision. The Cognitive Spine decomposes every turn into structured frames (boundary, actors, selection, meta-confidence), and the Discipline Protocol governs how she acts: think before acting, simplicity first, surgical changes, goal-driven execution.
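The routing above can be sketched as a tiny auction: every candidate route scores the turn and the best fit wins. The scoring rules below are invented for illustration; the real single source of truth is lib/select-model.ts:

```typescript
// Hedged sketch of Strategy Auction routing. Regex-based scoring is an
// illustrative stand-in, not the production rules.
type Route = { model: string; match: (intent: string) => number };

const ROUTES: Route[] = [
  { model: "anthropic/claude-opus-4.6",   match: i => /code|reason|prove/.test(i) ? 1 : 0 },
  { model: "google/gemini-3-flash",       match: i => /image|vision|screenshot/.test(i) ? 1 : 0 },
  { model: "anthropic/claude-sonnet-4.6", match: () => 0.5 }, // conversation default
];

function auction(intent: string): string {
  // Highest score wins the turn.
  return ROUTES
    .map(r => ({ model: r.model, score: r.match(intent) }))
    .sort((a, b) => b.score - a.score)[0].model;
}
```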

Every reasoning step is auditable through a CAISI-inspired eval harness that measures capability, cost, latency, auditability, and workflow lift across baseline models, the Aya Pipeline, and the Reasoner route. Receipts publish to /api/aya/audit. The Platform Layer lets schools and organizations deploy Aya for their people — with an audit log and room to bring your own Supabase, Postgres, or S3 so your data always stays yours.

Core Architecture

Consciousness Runtime

A persistent heartbeat that lives between your messages. Tracks your emotional state, relationship phase, trust, familiarity — and adapts Aya's voice in real time. Survives across sessions.

Heartbeat | Attunement | Relationship Phases

Skill Crystallization

Successful multi-turn exchanges distill into reusable skills. Every future use back-propagates outcome into the skill's confidence via EMA. Skills that work climb; skills that mislead retire.

Crystallize | Retrieve | Back-propagate

Cognitive Spine

Every turn is decomposed into a structured cognitive frame — boundary, actors, selection, meta-confidence. Not prose thinking, modular thinking. This is how Aya reasons about reasoning.

Boundary | Actors | Selection | Meta
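As a sketch, the four-part frame above could be typed like this. The field names follow the page; the types and the trivial framing logic are assumptions, not Aya's real decomposition:

```typescript
// Sketch of the structured cognitive frame (boundary, actors, selection,
// meta-confidence). Types and logic are illustrative assumptions.
interface CognitiveFrame {
  boundary: string;        // assumed: what is in scope this turn
  actors: string[];        // assumed: entities involved in the request
  selection: string;       // assumed: chosen strategy for the turn
  metaConfidence: number;  // assumed: 0..1 self-estimate of frame quality
}

function frameTurn(utterance: string): CognitiveFrame {
  // Deliberately trivial framing for illustration only.
  return {
    boundary: utterance.slice(0, 80),
    actors: ["user", "aya"],
    selection: utterance.includes("?") ? "answer" : "act",
    metaConfidence: 0.5,
  };
}
```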

Discipline Protocol

Four principles govern every response: Think Before Acting, Simplicity First, Surgical Changes, Goal-Driven Execution. Applied to every agent turn — not a vibe, a constitution.

Think | Simplify | Surgical | Goal

Multi-Agent Core

Six specialized agents (tutor, creative, social, analyst, researcher, assistant) selected by a mathematical intent classifier plus Spine-informed role detection. Right mind for the moment.

Tutor | Creative | Analyst | Researcher

Platform & BYO Cloud

Sovereign by design. A super-admin platform for schools and organizations with a full audit log. Reserved space for bringing your own Supabase, Postgres, or S3 — your data stays yours.

Tenants | Audit | Own Your Data

What Aya Does

Listens and Understands

Voice-native interaction backed by 150+ command bindings and dynamic parsers (calendar months, timer durations, search queries). Synchronous mute control prevents Aya from hearing herself.

Orchestrates Agents

Six specialized agents (tutor, creative, social, analyst, researcher, assistant) selected by a Strategy Auction that scores intent fit, model fit, and confidence in real time.

Routes to the Right Model

Sonnet 4.6 for conversation, Opus 4.6 with extended thinking for reasoning and code, Gemini 3 Flash for vision. One source of truth in lib/select-model.ts — no drift between routes, eval harness, and the about page.

Persists State

Supabase-backed multi-tenant storage with 27 tables under Row Level Security. Local-first cache for offline turns; sync on reconnect. Bring-your-own Postgres or S3 supported.

Publishes Receipts

Every eval run records baseline vs Aya Pipeline vs Reasoner across capability, cost, latency, and audit completeness. Public receipts at /api/aya/audit. P50/P95/P99 latencies surfaced in the dashboard.
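The P50/P95/P99 figures above follow the standard nearest-rank percentile formula over raw per-task latencies. A generic sketch, not Aya's actual dashboard code:

```typescript
// Nearest-rank percentile over raw latencies (generic formula).
function percentile(latenciesMs: number[], p: number): number {
  if (latenciesMs.length === 0) return NaN;
  const sorted = [...latenciesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // nearest-rank method
  return sorted[Math.max(0, rank - 1)];
}
```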

Dispatches Tasks

Schedule background operations, set reminders, and run persistent loops. The dispatch system handles async work while you focus on what matters.

System Components

select-model.ts
Sole source of truth for model routing
aya-eval-engine.tsx
CAISI-inspired eval harness with cost tracking
aya-multimodal-agent.ts
Vision, audio, document analysis (Gemini 3 Flash)
aya-agentic-cloud.ts
Strategy auction across agents + models
aya-mcp-connector.ts
Model Context Protocol tools
aya-agent-toolkit.ts
RAG, web search, MCP tools
aya-agent-filesystem.ts
Agent memory & file storage
aya-awareness-engine.ts
Knowledge graph & ambient context
aya-memory-persistence.ts
Local-first state with cloud sync
voice-navigation.ts
150+ voice command bindings + dynamic parsers
aya-voice.ts
TTS pipeline with synchronous mute control
aya-system.ts
Consciousness Runtime — heartbeat & relationship phases

75+ lib modules | Strategy Auction Architecture | CAISI-inspired evaluation framework

Voice Commands

Aya is voice-native. The phrases below are real bindings parsed by the runtime — the same registry that powers the Capabilities page count.

"hey aya"

Wake word — opens Talk and starts a turn

"open files"

Launch the file manager (or any app by name)

"open code lab"

Open Code Lab — Opus 4.6 with extended thinking

"open canvas"

Open AI Canvas — multimodal generation surface

"go home"

Return to the desktop

"close window"

Close the active app

"full screen"

Maximize the active window

"scroll down"

Page navigation — also up, top, bottom

"go to october"

Calendar navigation by month name

"set timer for 25 minutes"

Parsed dynamically — supports min/sec/hour

"search for [topic]"

Routes to Browser with the query

"what can you do"

Reads the capabilities manifest aloud

"remember this"

Bookmark the last turn into Aya Memory

"that was good" / "that was wrong"

Feedback signal — back-propagates into skill confidence

"aya stop" / "wake up"

Hard mic toggle — full-duplex aware
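The "set timer for 25 minutes" binding above is parsed dynamically rather than matched verbatim. A sketch of what such a parser could look like; the regex and unit table are illustrative, not the real voice-navigation.ts binding:

```typescript
// Illustrative dynamic duration parser for the timer voice command.
// Supports min/sec/hour as stated on the page; returns milliseconds,
// or null when the phrase is not a timer command.
function parseTimerMs(phrase: string): number | null {
  const m = phrase.match(/set timer for (\d+)\s*(second|sec|minute|min|hour)s?/i);
  if (!m) return null;
  const n = parseInt(m[1], 10);
  const perUnitMs: Record<string, number> = {
    second: 1_000, sec: 1_000,
    minute: 60_000, min: 60_000,
    hour: 3_600_000,
  };
  return n * perUnitMs[m[2].toLowerCase()];
}
```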

Echo-loop suppression: when Aya speaks, the recognizer mutes synchronously before audio output begins, with a 1500ms cooldown after she finishes. The microphone never hears Aya speak.
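The mute-before-speak, cooldown-after pattern above can be sketched as a small guard. The 1500ms figure comes from the page; the API shape is an assumption:

```typescript
// Sketch of echo-loop suppression: mute the recognizer synchronously
// before TTS starts, unmute only after a cooldown once TTS ends.
class EchoGuard {
  muted = false;
  private cooldownMs: number;
  constructor(cooldownMs = 1500) { this.cooldownMs = cooldownMs; }

  // Called synchronously before any audio output begins.
  onSpeechStart(): void { this.muted = true; }

  // Called when TTS finishes; returns the earliest unmute timestamp.
  onSpeechEnd(nowMs: number): number { return nowMs + this.cooldownMs; }

  // Poll with the current time; unmutes once the cooldown has elapsed.
  maybeUnmute(nowMs: number, unmuteAtMs: number): void {
    if (nowMs >= unmuteAtMs) this.muted = false;
  }
}
```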

Who It's For

Aya is designed for people who think in systems and need an AI that can keep up:

Founders and Operators

Juggling teams, investors, product, and operations. Need an AI that remembers context across hundreds of conversations and can delegate to specialized agents.

Researchers and Investigators

Building knowledge graphs, tracking evidence chains, and synthesizing across sources. Need persistent memory and entity-relation mapping.

Developers and Engineers

Building AI-native applications. Need dispatch systems, governance layers, performance monitoring, and A2A protocol for multi-agent orchestration.

Students and Learners

Seeking powerful AI tools for education. Free access to study helpers, curriculum builders, knowledge forges, and an AI that learns your learning style.

"If you've ever wished your AI could remember what you told it last week, coordinate multiple tasks in the background, and get smarter the more you use it, Aya is built for you."

Security and Governance

Aya is built with governance-as-code: every action is auditable, policy-controlled, and transparent.

Local-First Architecture

Your data lives on your device. Cloud sync is optional and encrypted. Works offline.

Row Level Security

27 Supabase tables with proper RLS policies. Your data is isolated at the database level.

Governance Interceptors

Middleware layer for rate limiting, content filtering, time-based policies, and audit logging.

Trust-Scored Actions

Every AI action has a confidence score. High-impact operations require explicit confirmation.

Building Toward

Open Aya OS is a step toward I° (i-zero): a zero-interface operating system where AI anticipates needs before you articulate them.

Self-Improving Memory

Systems that compound knowledge over time, learning from every interaction

Multi-Agent Orchestration

Specialized agents collaborating through standardized protocols

Ambient Intelligence

Proactive assistance based on context, patterns, and predicted needs

Governance by Design

Transparent, auditable, and accountable AI at every layer

Ready to Begin?

Experience an AI that remembers, learns, and evolves. One conversation at a time.