Announcements · Apr 12, 2026 · AgentOS Team

Announcing AgentOS: Open-Source TypeScript AI Agent Runtime

AgentOS is an open-source TypeScript runtime for building AI agents with cognitive memory, HEXACO personality, multi-agent orchestration, runtime tool forging, 37 channel adapters, and 21 LLM providers. Apache 2.0 licensed.

"So we and our elaborately evolving computers may meet each other halfway. Someday a human being, named perhaps Fred White, may shoot a robot named Pete Something-or-other, which has come out of a General Electric factory, and to his surprise see it weep and bleed. And the dying robot may shoot back and, to its surprise, see a wisp of gray smoke arise from the electric pump that it supposed was Mr. White's beating heart."

— Philip K. Dick, The Android and the Human, 1972

Most agent frameworks treat the LLM as a function call, hand the result back to your application, and let your application do everything that should outlive the call. Memory across sessions, personality consistency, tool registration, multi-agent coordination, retry on tool failure: all of it ends up in handlers you write yourself. After enough handlers, the application code is the agent and the framework is a thin shim over the model.

AgentOS puts those concerns inside the runtime. Persistent cognitive memory, optional HEXACO personality, runtime tool forging in a hardened sandbox, six multi-agent orchestration strategies, streaming guardrails, a voice pipeline, and one dispatch interface across the major LLM providers. This post is the announcement that the project is open source under Apache 2.0 and that the public benchmark numbers are real.

The short version: npm install @framers/agentos.

What AgentOS Is

AgentOS is a TypeScript runtime for building AI agents that adapt, remember, and collaborate. Every agent is a Generalized Mind Instance (GMI): a persistent cognitive core with personality traits, episodic memory, and autonomous decision-making.

```shell
npm install @framers/agentos
```

```typescript
import { agent } from '@framers/agentos';

const bot = agent({
  provider: 'anthropic',
  instructions: 'You are a helpful assistant.',
  personality: { openness: 0.9, conscientiousness: 0.85 },
  memory: { enabled: true, cognitive: true },
});

const reply = await bot.session('demo').send('What is AgentOS?');
console.log(reply.text);
```

What Makes It Different

Cognitive Memory

Cognitive memory comprises 8 neuroscience-grounded mechanisms modulated by the agent's HEXACO personality.

Memory follows a 4-tier hierarchy (working memory, episodic, semantic, observational) that consolidates upward automatically. This approach is grounded in the same ACT-R cognitive architecture principles used by recent systems like Memory Bear and CortexGraph.
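
To give a feel for the ACT-R principle mentioned above, here is a minimal sketch of ACT-R's base-level activation formula, B = ln(Σ t_j^(-d)), where each t_j is the time since a past access and d is a decay rate (0.5 is the conventional default). The function name, the one-second clamp, and the sample values are assumptions for illustration; this is not the AgentOS implementation.

```typescript
// Sketch of ACT-R base-level activation: B = ln(sum_j t_j^(-d)).
// Hypothetical helper for illustration only; not the AgentOS memory code.

/** Seconds elapsed since each past access of one memory trace. */
type AccessAges = number[];

function baseLevelActivation(agesSeconds: AccessAges, decay = 0.5): number {
  if (agesSeconds.length === 0) return -Infinity; // never accessed
  let sum = 0;
  for (const age of agesSeconds) {
    // Clamp to 1 second so a just-accessed trace does not divide by zero.
    sum += Math.pow(Math.max(age, 1), -decay);
  }
  return Math.log(sum);
}

// A trace accessed a minute, an hour, and a day ago...
const rehearsed = baseLevelActivation([60, 3600, 86400]);
// ...versus one accessed only once, a day ago.
const stale = baseLevelActivation([86400]);

console.log(rehearsed > stale); // true
```

Frequently and recently accessed traces score higher, which is the property a consolidation step could use when deciding what to keep in a faster tier.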

Multi-Agent Orchestration

6 coordination strategies for teams of specialized agents:

| Strategy | Description | Use Case |
| --- | --- | --- |
| Sequential | Linear pipeline, each agent refines previous output | Editing chains, translation pipelines |
| Parallel | Fan-out to all agents simultaneously | Research, brainstorming, redundancy |
| Debate | Agents argue positions, synthesize consensus | Decision-making under uncertainty |
| Review loop | Author and reviewer iterate until quality threshold | Content QA, code review |
| Hierarchical | Manager delegates to specialized workers | Task decomposition |
| Graph (DAG) | Dependency-based execution with conditional branching | Complex multi-step workflows |

Agents share memory through the AgentCommunicationBus and coordinate via the AgencyRegistry.
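
To make the Sequential and Parallel strategies concrete, here is a minimal standalone sketch. The `Agent` type and both runner functions are hypothetical stand-ins, and the stub agents replace real LLM calls; the AgentOS orchestration API itself may look quite different.

```typescript
// Hypothetical types for illustration; not the AgentOS orchestration API.
type Agent = (input: string) => Promise<string>;

// Sequential: each agent refines the previous agent's output.
async function runSequential(agents: Agent[], input: string): Promise<string> {
  let current = input;
  for (const agent of agents) {
    current = await agent(current);
  }
  return current;
}

// Parallel: fan the same input out to every agent and collect all outputs.
async function runParallel(agents: Agent[], input: string): Promise<string[]> {
  return Promise.all(agents.map((agent) => agent(input)));
}

// Stub agents standing in for LLM-backed ones.
const draft: Agent = async (text) => `draft(${text})`;
const edit: Agent = async (text) => `edit(${text})`;

// Resolves to 'edit(draft(topic))'.
runSequential([draft, edit], 'topic').then((out) => console.log(out));
// Resolves to an array with 'draft(topic)' and 'edit(topic)'.
runParallel([draft, edit], 'topic').then((out) => console.log(out));
```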

When strategy: 'hierarchical' is paired with emergent: { enabled: true }, the manager LLM gets a spawn_specialist tool alongside its delegate_to_<name> tools. Calling it forges a new sub-agent at runtime via EmergentAgentForge, gates it through EmergentAgentJudge for safety review, and adds the new specialist to the live roster, so delegate_to_<spawned-role> becomes available on the next turn. Spawning is bounded by planner.maxSpecialists, planner.maxJudgeCalls, and an optional human-in-the-loop (HITL) beforeEmergent gate. See Hierarchical + emergent agent spawning for the worked example.

[Figure: AgentOS spawning a security_audit_specialist agent at runtime, side-by-side with the source code]

The image above is captured from a real node examples/emergent-hierarchical-spawning.mjs run. The team starts with researcher + writer; the prompt requires a security audit; the manager calls spawn_specialist, EmergentAgentJudge approves the spec, and security_audit_specialist joins the live roster. The [FORGE] line is the moment that happens. Reproduce: node examples/emergent-hierarchical-spawning.mjs after npm install @framers/agentos and export OPENAI_API_KEY=....
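
The spawn, judge, and roster-update flow described in this section can be modeled in a few lines. This is a standalone simulation under stated assumptions: the `Roster` class, the `Judge` type, and the `Bounds` shape are hypothetical, the judge is a plain predicate instead of an LLM review, and the new delegate tool is added immediately instead of on the next turn. Only the `delegate_to_<role>` naming and the two bound names are taken from the text above.

```typescript
// Standalone simulation of bounded specialist spawning.
// All types and names here are hypothetical; this is not the AgentOS API.
interface SpecialistSpec {
  role: string;
  instructions: string;
}

interface Bounds {
  maxSpecialists: number;
  maxJudgeCalls: number;
}

type Judge = (spec: SpecialistSpec) => boolean;

class Roster {
  readonly tools = new Set<string>();
  private spawned = 0;
  private judgeCalls = 0;
  private readonly judge: Judge;
  private readonly bounds: Bounds;

  constructor(initialRoles: string[], judge: Judge, bounds: Bounds) {
    this.judge = judge;
    this.bounds = bounds;
    for (const role of initialRoles) this.tools.add(`delegate_to_${role}`);
  }

  // Returns true when the specialist was approved and added to the roster.
  spawnSpecialist(spec: SpecialistSpec): boolean {
    if (this.spawned >= this.bounds.maxSpecialists) return false;
    if (this.judgeCalls >= this.bounds.maxJudgeCalls) return false;
    this.judgeCalls += 1;
    if (!this.judge(spec)) return false; // rejected by the safety review
    this.spawned += 1;
    this.tools.add(`delegate_to_${spec.role}`);
    return true;
  }
}

// Toy judge: approve any spec that has non-empty instructions.
const roster = new Roster(
  ['researcher', 'writer'],
  (spec) => spec.instructions.trim().length > 0,
  { maxSpecialists: 1, maxJudgeCalls: 3 },
);

const spawnedOk = roster.spawnSpecialist({
  role: 'security_audit_specialist',
  instructions: 'Audit the proposed design for security issues.',
});
console.log(spawnedOk, [...roster.tools]);
```

Checking the bounds before calling the judge keeps a runaway manager from spending review calls on specialists it could never add.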

Emergent Tool Forging

Agents create new tools at runtime when no existing tool fits the task:

  • Compose mode: chains existing tools into pipelines (safe by construction)
  • Sandbox mode: generates code in a memory-bounded, time-limited isolation environment

An EmergentJudge reviews safety and correctness before activation. Approved tools promote through 3 trust tiers: session, agent, shared. The EmergentToolRegistry tracks usage and confidence scores.
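
As a rough illustration of compose mode and the three trust tiers, the sketch below builds a new tool purely by chaining existing ones and steps a tier forward one level at a time. The `Tool` type, `composeTools`, and `promote` are hypothetical names, and the promotion rule (one tier per call, capped at shared) is an assumption; the text does not say what triggers a promotion.

```typescript
// Hypothetical shapes for illustration; not the AgentOS tool interface.
type Tool = (input: string) => string;

// Compose mode: the new tool only calls tools that already exist,
// so no newly generated code is executed.
function composeTools(...steps: Tool[]): Tool {
  return (input) => steps.reduce((value, step) => step(value), input);
}

// Trust tiers named in the text, in promotion order.
const TIERS = ['session', 'agent', 'shared'] as const;
type Tier = (typeof TIERS)[number];

// Assumed rule: move up one tier per promotion, stopping at 'shared'.
function promote(tier: Tier): Tier {
  const next = TIERS.indexOf(tier) + 1;
  return TIERS[Math.min(next, TIERS.length - 1)];
}

const trim: Tool = (s) => s.trim();
const upper: Tool = (s) => s.toUpperCase();
const shout = composeTools(trim, upper);

console.log(shout('  hello ')); // HELLO
console.log(promote('session')); // agent
```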

Production Infrastructure

TypeScript Native

Full type safety with Zod-validated structured output. ESM-first architecture. The TypeDoc API reference documents every public class, interface, and function.

How AgentOS Compares

| Capability | AgentOS | LangGraph | CrewAI | Mastra | VoltAgent |
| --- | --- | --- | --- | --- | --- |
| Language | TypeScript | Python + JS | Python | TypeScript | TypeScript |
| Cognitive memory | 8 mechanisms + Ebbinghaus decay | Checkpoints | Short/long-term | Semantic | Conversation + RAG |
| Personality | HEXACO 6-factor | None | Role descriptions | None | None |
| Channel adapters | 37 | None | None | None | None |
| Voice pipeline | 12 STT + 12 TTS | None | None | None | None |
| Guardrails | 6 packs | Middleware | Basic | None | Module |
| Tool forging | Runtime creation | None | None | None | None |

See AgentOS vs LangGraph vs CrewAI vs Mastra vs VoltAgent for the full comparison.

What we measured (and what we didn't)

AgentOS ships with agentos-bench, an Apache 2.0 memory benchmark suite. We publish bootstrap CIs at 10k resamples on every headline number and the per-cell run JSON for replication. The recent results:

  • LongMemEval-S: 85.6% [82.4%, 88.6%] at $0.0090 per correct, with a gpt-4o reader and 4-second average latency. That beats Mastra OM with gpt-4o (84.2% published) on accuracy at a matched reader. It beats EmergenceMem Simple Fast (80.6% measured in our harness; their public reference repo ships with no LICENSE) by +5.0 points at 6.5x lower cost-per-correct. It is statistically tied with EmergenceMem Internal's published 86.0%, but Emergence's number comes from a closed-source SaaS at emergence.ai/web-automation-api, not a library you can install. AgentOS ships the full architecture under Apache-2.0.
  • LongMemEval-M (1.5M tokens, 500 sessions per haystack): 70.2% [66.0%, 74.0%] at $0.0078 per correct with reader-router top-K=5. Competitive with the strongest published M results in the original LongMemEval paper (Wu et al., ICLR 2025, Table 3). At matched reader Top-5, the result is 4.5 points above the paper's round-level configuration (65.7%) and 1.2 below its session-level configuration (71.4%); it is 1.8 below the paper's overall best (72.0% at round-level Top-10).
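
For readers unfamiliar with bootstrap confidence intervals, here is a generic percentile-bootstrap sketch at 10,000 resamples. It is not the agentos-bench code: the function names, the seeded PRNG, and the synthetic outcome array (428 correct out of 500, chosen only because it gives 85.6%) are assumptions for illustration, and the real harness may differ in seeding or resampling details.

```typescript
// Percentile bootstrap CI for accuracy. Generic sketch, not agentos-bench.

// Small seeded PRNG (mulberry32) so the resampling is reproducible.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

interface Interval {
  mean: number;
  lo: number;
  hi: number;
}

function bootstrapCI(
  outcomes: boolean[],
  resamples = 10000,
  alpha = 0.05,
  seed = 42,
): Interval {
  const n = outcomes.length;
  const rand = mulberry32(seed);
  const means: number[] = [];
  for (let r = 0; r < resamples; r++) {
    // Draw n outcomes with replacement and record the resampled accuracy.
    let correct = 0;
    for (let i = 0; i < n; i++) {
      if (outcomes[Math.floor(rand() * n)]) correct++;
    }
    means.push(correct / n);
  }
  means.sort((x, y) => x - y);
  const lo = means[Math.floor((alpha / 2) * resamples)];
  const hi = means[Math.min(resamples - 1, Math.floor((1 - alpha / 2) * resamples))];
  const mean = outcomes.filter(Boolean).length / n;
  return { mean, lo, hi };
}

// Synthetic per-question outcomes: 428 correct, 72 incorrect.
const outcomes: boolean[] = Array.from({ length: 500 }, (_, i) => i < 428);
console.log(bootstrapCI(outcomes));
```

The interval is read off the sorted resampled accuracies, so its width reflects the sample size: fewer questions give a wider bracket around the same point estimate.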

We do not run benchmarks against vendors that don't ship complete standalone runnables. We do not claim X-times-cheaper unless reader model and config match between the two systems being compared. The entire methodology (judges, sample sizes, judge FPR probes, adversarial calibration) is documented in agentos-bench/docs.

This is what an honest benchmark looks like. If something on this list is wrong, file an issue against agentos-bench and we'll fix it or retract the claim.

Get Started

FAQ

Why TypeScript? Most AI infrastructure is Python. Most production application code is JavaScript or TypeScript. The runtime that lives inside an application should match the application's language. AgentOS does. The Python interop story is via the API server (REST/SSE) or via JSON Schema-generated types from the artifact schema.

Is AgentOS a LangGraph alternative? It's an alternative if your job is "build an AI agent with memory and personality and tools." It is not an alternative if your job is "compose Python research code into a workflow." Different jobs. We have a head-to-head comparison post with the honest-cost rule applied.

Does AgentOS lock me into a specific LLM? No. 21 provider adapters ship; you can add yours in ~50 lines if it's not in the list. The provider abstraction is decoupled from the agent abstraction.
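
To show what "decoupled" can mean in practice, here is a minimal sketch of a provider contract and a toy adapter. The `LLMProvider` and `ChatMessage` interfaces and the `ask` helper are hypothetical; the actual AgentOS adapter interface is not shown in this post and may differ.

```typescript
// Hypothetical adapter contract; the real AgentOS provider interface may differ.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface LLMProvider {
  readonly name: string;
  complete(messages: ChatMessage[]): Promise<string>;
}

// A trivial provider that echoes the last user message; handy for tests.
const echoProvider: LLMProvider = {
  name: 'echo',
  async complete(messages) {
    const lastUser = [...messages].reverse().find((m) => m.role === 'user');
    return lastUser ? `echo: ${lastUser.content}` : '';
  },
};

// Agent-side code depends only on the interface, never on a specific vendor.
async function ask(provider: LLMProvider, question: string): Promise<string> {
  return provider.complete([{ role: 'user', content: question }]);
}

// Resolves to 'echo: ping'.
ask(echoProvider, 'ping').then((text) => console.log(text));
```

Swapping vendors then means writing another object that satisfies the same interface, with no change to the code that calls `ask`.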

What's the deal with HEXACO personality? It's optional. Pass personality and the runtime biases retrieval, decision routing, and tool selection by the trait vector. Don't pass it and the runtime acts personality-neutral. We don't make HEXACO the centerpiece because not every agent needs it; we just make it work cleanly when you want it.

Is the cognitive memory feature the same as RAG? No. RAG retrieves documents. Cognitive memory is a layered system covering short-term context, episodic memory (events with time and emotion encoding), semantic memory (facts and relationships), and an Ebbinghaus-style decay model for forgetting. RAG is one of the retrievers cognitive memory composes with. Read Cognitive Memory Beyond RAG for the deeper version.
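
A common way to express Ebbinghaus-style forgetting is the exponential retention curve R(t) = exp(-t / S), where t is elapsed time and S is the memory's strength. The sketch below applies it as a recall filter. The `MemoryItem` shape, the hour units, and the 0.3 threshold are illustrative assumptions, not AgentOS parameters.

```typescript
// Ebbinghaus-style retention: R(t) = exp(-t / S), S = memory strength.
// Names, units, and the threshold are assumptions for illustration.
function retention(elapsedHours: number, strengthHours: number): number {
  return Math.exp(-elapsedHours / strengthHours);
}

interface MemoryItem {
  text: string;
  ageHours: number;
  strengthHours: number;
}

// Keep only memories whose estimated retention is still above a threshold.
function recallable(items: MemoryItem[], threshold = 0.3): MemoryItem[] {
  return items.filter((m) => retention(m.ageHours, m.strengthHours) >= threshold);
}

const memories: MemoryItem[] = [
  // exp(-24 / 72) is about 0.72, so this one survives.
  { text: 'user prefers dark mode', ageHours: 24, strengthHours: 72 },
  // exp(-48 / 12) is about 0.018, so this one is forgotten.
  { text: 'one-off typo correction', ageHours: 48, strengthHours: 12 },
];

console.log(recallable(memories).map((m) => m.text));
```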

Can AgentOS run agents that talk on the phone? Yes. The voice pipeline (12 STT providers, 12 TTS providers) plus telephony adapters in the channels system gets you a voice agent that runs as a phone call. We have a case study with Wilds.ai on the companion side of the same stack.

What about safety? Six guardrail packs ship: topicality, ML classifiers (toxicity / prompt-injection / NSFW), grounding (NLI), PII redaction, code safety (static analysis), and skills (curated SKILL.md execution). Each has its own README in the packages/agentos-ext-* tree.
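
To illustrate the kind of transformation a PII redaction guardrail performs, here is a deliberately simple regex-based sketch. The two patterns (email addresses and US-style SSNs) and the `redact` function are assumptions for illustration; they are not the rules shipped in the AgentOS guardrail packs, and real PII detection needs far more than two regexes.

```typescript
// Minimal PII redaction sketch; not the AgentOS guardrail implementation.
const PATTERNS: Array<[string, RegExp]> = [
  ['EMAIL', /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g],
  ['SSN', /\b\d{3}-\d{2}-\d{4}\b/g],
];

// Replace every match of each pattern with a bracketed label.
function redact(text: string): string {
  return PATTERNS.reduce(
    (out, [label, pattern]) => out.replace(pattern, `[${label}]`),
    text,
  );
}

console.log(redact('Mail jane.doe@example.com, SSN 123-45-6789.'));
// Mail [EMAIL], SSN [SSN].
```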

Is this production-ready for X? Define X. We don't publish a generic "production-ready" label because the answer depends on what you're shipping. We do publish concrete benchmark numbers, concrete safety posture, and concrete provider compatibility. Read those, judge for yourself, file issues when we miss.

Apache 2.0 licensed. Built by Manic Agency / Frame.dev.
