v2.0 — Self-improving AI

Open Aya OS — the agentic, in-browser cognitive operating system.

Open Aya uses a CAISI-inspired evaluation framework to measure capability, cost, latency, auditability, and workflow lift across baseline models, the Aya Pipeline, and reasoner routes. The goal is not to claim AGI; the goal is to prove whether an AI operating layer completes organizational work better than fragmented AI tools.

34 integrated apps. Voice-native, vision-aware, local-first with optional cloud sync. Multi-agent routing across planner, executor, memory retriever, verifier, router, and self-critic strategies, with every step auditable through a public benchmark harness, not a brochure.

Open Aya OS

Intelligence Card

Live system facts. Every number on this card is generated at request time from the runtime registry or the public eval database — there is no separate marketing source to drift.

Live

Model layer

anthropic/claude-sonnet-4.6 (conversation tier)
anthropic/claude-opus-4.6 (extended thinking, 10k budget)
google/gemini-3-flash (multimodal)
anthropic/claude-opus-4.6 (SWE-Bench leader)

Agent layer

6 routed strategies: planner, executor, memory_retriever, verifier, router, self_critic

Strategy-Auction routing implemented as system-prompt routing rules

Memory layer

Supabase + browser IndexedDB (local-first)

Kinds: short-term turn cache · long-term Auto-Dream consolidation · GraphRAG knowledge edges

Tool layer

6 built-in tools across 34 apps

Web search · Code execution (Code Lab) · File store (Spatial Files) · Calendar / Notes / Word Processor · …

Local-first status

Yes — runs in-browser; data stays on device by default

Cloud sync status

Optional — Supabase auth + persistence when signed in

Apps in registry

Generated from lib/app-registry.ts

Routed agents

Strategy-auction policies, system-prompt routed

Eval score (avg)

—

Across 0 completed runs

Last eval run

no runs yet

UTC server time

Avg latency / task

—

Wall-clock, includes network hop

Audit mode

Public — every eval result writes a reasoning trace to /api/aya/inspect and aggregates to /api/aya/audit

A/B comparison — pass rate by route

baseline

—

Claude Sonnet 4.6, no spine (control)

aya_pipeline

—

Claude Sonnet 4.6 + 7-stage cognitive spine

aya_reasoner

—

Claude Opus 4.6, extended thinking (10k)
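
A minimal sketch of how pass rate by route could be computed from persisted runs. The three route names come from this card; the EvalRun record shape and both function names are assumptions for illustration, not the harness's actual code. A route with no completed runs yields null so that it renders as "no data" rather than 0%.

```typescript
// Route names are taken from the card; everything else here is hypothetical.
type Route = "baseline" | "aya_pipeline" | "aya_reasoner";

interface EvalRun {
  route: Route;
  passed: boolean;
}

// Returns the pass rate per route, or null when a route has no completed runs.
function passRateByRoute(runs: EvalRun[]): Record<Route, number | null> {
  const totals: Record<Route, { passed: number; total: number }> = {
    baseline: { passed: 0, total: 0 },
    aya_pipeline: { passed: 0, total: 0 },
    aya_reasoner: { passed: 0, total: 0 },
  };
  for (const run of runs) {
    totals[run.route].total += 1;
    if (run.passed) totals[run.route].passed += 1;
  }
  const rate = (r: Route): number | null =>
    totals[r].total === 0 ? null : totals[r].passed / totals[r].total;
  return {
    baseline: rate("baseline"),
    aya_pipeline: rate("aya_pipeline"),
    aya_reasoner: rate("aya_reasoner"),
  };
}

// Formats a rate for display; null means no run of that route has completed.
function formatRate(rate: number | null): string {
  return rate === null ? "no data" : `${(rate * 100).toFixed(1)}%`;
}
```

Keeping "no runs" distinct from "zero passes" is the design point: an empty tier should never read as a failing one.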

What you can verify, right now, without an account.

Public eval API. /api/evaluate accepts a prompt and returns the canonical result shape (task_id, category, answer, agents_used, confidence, latency_ms, cost_estimate, memory_used, audit_trace).
Public status JSON. /api/aya/status lists every capability flag with an honest functional / claimed marker — no inference required.
Public audit aggregates. /api/aya/audit publishes the A/B verdict between baseline Claude Sonnet 4.6, Aya's 7-stage cognitive pipeline on anthropic/claude-sonnet-4.6, and the anthropic/claude-opus-4.6 reasoner (extended thinking, 10k budget) across all completed runs.
Three live demos. Reasoning, memory, and agent routing run a canned task end-to-end and show the full reasoning trace.
No claim without a receipt. Every superlative on this site links to a reproducible run with a JSON trace. Where data isn't available yet, we say so plainly instead of rounding up.
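
The canonical result shape listed above can be written down as a type plus a runtime check. This is a sketch under stated assumptions: the field names are the ones this page lists, but the field types (string, number, boolean, array) are guesses, and isEvalResult is a hypothetical helper rather than part of the public API.

```typescript
// Field names come from the page; the types are assumptions.
interface EvalResult {
  task_id: string;
  category: string;
  answer: string;
  agents_used: string[];
  confidence: number;
  latency_ms: number;
  cost_estimate: number;
  memory_used: boolean;
  audit_trace: unknown[];
}

// Runtime guard: true only if every listed field is present with the assumed type.
function isEvalResult(value: unknown): value is EvalResult {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.task_id === "string" &&
    typeof v.category === "string" &&
    typeof v.answer === "string" &&
    Array.isArray(v.agents_used) &&
    v.agents_used.every((a) => typeof a === "string") &&
    typeof v.confidence === "number" &&
    typeof v.latency_ms === "number" &&
    typeof v.cost_estimate === "number" &&
    typeof v.memory_used === "boolean" &&
    Array.isArray(v.audit_trace)
  );
}
```

A client would apply the guard to the parsed JSON body of an /api/evaluate response before trusting any field.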

What we are not yet, and how you'll know when we are.

Open Aya OS is not AGI and does not claim to be. ARC-AGI alignment refers to architecture (multi-strategy reasoning, verifier loops, cost-per-task accounting) — not to a published score.
The strategy auction is currently implemented as deterministic system-prompt routing rules, not as six independent learned policies. The /eval harness measures the lift this routing actually provides over a Claude Sonnet 4.6 baseline running the same conversation tier without the cognitive spine — so the A/B delta isolates the wrapping, not a model upgrade.
“Self-improving” refers to per-user memory consolidation (Auto-Dream) and TinyAdapter parameter drift, not to weight updates of the underlying base model.
Pass rates on /receipts are computed from real, persisted eval runs. If a tier shows “no data”, no run of that tier has completed yet.
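
To make "deterministic system-prompt routing rules" concrete, here is a minimal sketch. The six strategy names come from the Agent layer above; the rules, their regular expressions, the default, and the prompt wording are all hypothetical and only illustrate the first-match-wins pattern, not Aya's actual rule set.

```typescript
type Strategy =
  | "planner"
  | "executor"
  | "memory_retriever"
  | "verifier"
  | "router"
  | "self_critic";

interface RoutingRule {
  name: string;
  matches: (prompt: string) => boolean;
  strategies: Strategy[];
}

// Hypothetical rules, evaluated in order; the first match wins.
const RULES: RoutingRule[] = [
  {
    name: "recall",
    matches: (p) => /\b(remember|last time|earlier)\b/i.test(p),
    strategies: ["memory_retriever", "executor"],
  },
  {
    name: "multi_step",
    matches: (p) => /\b(plan|steps|then)\b/i.test(p),
    strategies: ["planner", "executor", "verifier"],
  },
];

// Hypothetical fallback when no rule matches.
const DEFAULT_STRATEGIES: Strategy[] = ["executor", "self_critic"];

// Same prompt in, same strategies out: no learned policy, no randomness.
function route(prompt: string): Strategy[] {
  for (const rule of RULES) {
    if (rule.matches(prompt)) return rule.strategies;
  }
  return DEFAULT_STRATEGIES;
}

// The chosen strategies are injected into the system prompt as plain text.
function buildSystemPrompt(strategies: Strategy[]): string {
  return `Active strategies: ${strategies.join(", ")}`;
}
```

Because the routing is a pure function of the prompt, any lift measured against the baseline can be reproduced run for run.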

Server-rendered at request time

Capes

34 apps. 124+ voice commands. All real.

Open Aya uses a CAISI-inspired evaluation framework to measure capability, cost, latency, auditability, and workflow lift. The goal is not to claim AGI — it's to prove whether an AI operating layer completes organizational work better than fragmented AI tools.

Every entry below is loaded from the same registry the OS reads at boot. No marketing inflation, no “coming soon” placeholders.
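
As an illustration of what "loaded from the same registry" can look like, the sketch below models a registry entry and resolves a spoken phrase to an app. The AppEntry shape and the function names are assumptions; the real lib/app-registry.ts may differ. The two sample entries copy their names, descriptions, and voice commands from the listing on this page.

```typescript
// Hypothetical entry shape; the real lib/app-registry.ts may differ.
interface AppEntry {
  name: string;
  category: "Productivity" | "Learning" | "Creative" | "Utility" | "Entertainment";
  description: string;
  voiceCommands: string[];
}

// Two entries copied from this page, for illustration only.
const SAMPLE_REGISTRY: AppEntry[] = [
  {
    name: "Calculator",
    category: "Utility",
    description: "Simple calculator with voice input support",
    voiceCommands: ["open calculator", "calculator"],
  },
  {
    name: "Timer",
    category: "Utility",
    description: "Countdown timer with notifications",
    voiceCommands: ["set timer", "timer", "countdown"],
  },
];

// Lowercase, trim, and collapse whitespace so "  Set  Timer " matches "set timer".
function normalize(utterance: string): string {
  return utterance.toLowerCase().trim().replace(/\s+/g, " ");
}

// Exact-phrase lookup; returns undefined when no app claims the phrase.
function findAppByVoiceCommand(registry: AppEntry[], utterance: string): AppEntry | undefined {
  const phrase = normalize(utterance);
  return registry.find((app) => app.voiceCommands.includes(phrase));
}

// Derive headline counts from the registry instead of hard-coding them.
function registryCounts(registry: AppEntry[]): { apps: number; voiceCommands: number } {
  return {
    apps: registry.length,
    voiceCommands: registry.reduce((n, app) => n + app.voiceCommands.length, 0),
  };
}
```

Computing the app and voice-command totals from the same array the lookup uses is what keeps a page count from drifting away from the runtime.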


Model Stack

Live values from lib/select-model.ts. Same IDs the production /api/aya-chat route sends to the AI Gateway.

Conversation

anthropic/claude-sonnet-4.6

conversation tier

Reasoning

anthropic/claude-opus-4.6

extended thinking, 10k budget

Vision

google/gemini-3-flash

multimodal

Code

anthropic/claude-opus-4.6

SWE-Bench leader

Baseline (control)

Claude Sonnet 4.6

micro / classifier

Embeddings

openai/text-embedding-3-small

vector embeddings

Override via AYA_MODEL_* env vars in Vercel dashboard. Changes propagate on next request — no rebuild required.
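
A sketch of how a per-tier override with a built-in default might be resolved at request time. The default model IDs are the ones shown in this Model Stack; the tier keys and the exact variable suffixes (for example AYA_MODEL_VISION) are assumptions, since the page only gives the AYA_MODEL_* prefix, and selectModel here is a hypothetical stand-in for whatever lib/select-model.ts actually exports.

```typescript
// Tier keys are hypothetical labels for the rows in the Model Stack.
type Tier = "conversation" | "reasoning" | "vision" | "code" | "embeddings";

// Defaults copied from the Model Stack shown on this page.
const DEFAULT_MODELS: Record<Tier, string> = {
  conversation: "anthropic/claude-sonnet-4.6",
  reasoning: "anthropic/claude-opus-4.6",
  vision: "google/gemini-3-flash",
  code: "anthropic/claude-opus-4.6",
  embeddings: "openai/text-embedding-3-small",
};

// Reads the override on every call, so a changed variable takes effect on the
// next request. The AYA_MODEL_<TIER> naming is an assumption.
function selectModel(tier: Tier, env: Record<string, string | undefined>): string {
  const override = env[`AYA_MODEL_${tier.toUpperCase()}`]?.trim();
  return override ? override : DEFAULT_MODELS[tier];
}
```

In a Node runtime the second argument would typically be process.env; passing it in explicitly keeps the function easy to test.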

Productivity

Get-things-done apps for daily work — notes, calendar, tasks, docs.

8 apps

App Builder

Visual app creation tool

“build app” · “app builder” · “create app”

Calendar

Calendar with event management

“open calendar” · “calendar” · “show schedule”

Code Lab

Code editor with live preview

“open code” · “code lab” · “code editor”

Notes

Note-taking with voice dictation, drawing canvas, and sticky notes

“open notes” · “notes” · “take note” · +1 more

Talk

Execution-first AI assistant with tool generation

“open talk” · “talk to aya” · “chat”

Word Processor

Rich text editor with AI assistance, voice dictation, and dual input modes

“word processor” · “open document” · “write document” · +2 more

Workflow Composer

Automation workflow builder

“open workflow” · “workflow” · “automate”

Workspace

AI-powered productivity workspace for work and study

“workspace” · “aya workspace” · “work” · +5 more

Learning

Study tools, flashcards, curriculum trackers, knowledge surfaces.

5 apps

AI Curriculum

Structured AI engineering learning path

“ai curriculum” · “ai engineer” · “learning path”

Knowledge Forge

Knowledge management and learning

“open knowledge” · “knowledge forge” · “learn”

Knowledge Map

Galaxy-style knowledge visualization

“open map” · “knowledge map” · “explore”

Learning Paths

Personalized learning journeys

“learning paths” · “courses” · “study plan”

Study Helper

Study session planning and tracking

“study helper” · “help me study” · “flashcards”

Creative

AI canvas, generative tools, design and writing surfaces.

3 apps

AI Canvas

Interactive AI drawing and brainstorming

“open canvas” · “ai canvas” · “draw”

AI Studio

AI-powered voice, music, and video creation

“open studio” · “ai studio” · “create”

Image Studio

Image editing with AI enhancement

“create image” · “image studio” · “edit image”

Utility

Calculators, timers, dashboards, system inspectors.

14 apps

About Aya

About Open Aya OS and the I° mission

“about aya” · “about” · “info”

Admin Console

Organization management, user roles, and Aya configuration

“admin” · “admin dashboard” · “admin console” · +2 more

AI Analytics

Multi-model AI performance tracking and analytics

“ai analytics” · “analytics” · “performance”

Aya Dashboard

Multi-agent coordination system inspired by Epiminds Lucy

“aya dashboard” · “dashboard” · “agents”

Beam Transfer

Fast file transfer utility

“beam file” · “transfer” · “send file”

Browser

Web browser with bookmarks

“open browser” · “web browser” · “search web”

Calculator

Simple calculator with voice input support

“open calculator” · “calculator”

Developer Hub

Developer tools and API explorer

“developer” · “dev hub” · “developer hub” · +1 more

Eval Harness

ARC, Turing, and functional-OS benchmarks with reasoning trace inspection — verifiable receipts for Aya's cognitive claims

“eval harness” · “evaluation” · “benchmarks” · +5 more

Files

Futuristic file explorer with network view

“open files” · “file manager” · “my files”

Forge

Quantum manifold optimization and memory substrate control

“forge” · “quantum” · “manifold” · +2 more

Memory

View and manage Aya's learning memories

“memory” · “aya memory” · “memories” · +1 more

Settings

System settings and preferences

“open settings” · “settings” · “preferences”

Timer

Countdown timer with notifications

“set timer” · “timer” · “countdown”

Entertainment

Games, TVs, ambient experiences, downtime apps.

4 apps

Media Hub

Radio, Live TV, Retro TV, and Auracast broadcasting

“open media hub” · “media hub” · “open radio” · +4 more

Music Player

Music player with playlist management

“open music” · “play music” · “music player”

Neon Type

Cyberpunk typing defense game

“open typing game” · “neon type” · “typing practice”

Photo Gallery

Photo viewer with favorites

“open photos” · “photo gallery” · “show photos”

This page uses export const dynamic = "force-dynamic" — server-rendered on every request from lib/select-model.ts and lib/app-registry.ts. What you see above is the live production stack. A crawler grep'ing this HTML sees exactly what /api/aya-chat serves.

