UnieAI Agent Core — the agent harness & runtime

Introducing UnieAI Agent Core

[Diagram labels: Tools · MCP, Session, Harness, Sandbox, Orchestration]

One harness, orchestrated.

Agent Core decides when the model reasons, which tools it calls via MCP, and what it remembers, and it runs everything safely in a sandbox.

AIME 2025 accuracy (higher is better)

Baseline scores from the public leaderboard; the UnieAI bar is our internal result.

GPT-5.2 (xhigh): 99.0%
MiniMax-M2 × UnieAI Agent Core 2: 97.2%
GPT-5.2 (medium): 96.7%
gpt-oss-120b (high): 93.4%
gpt-oss-20B (high): 89.3%
Nova 2.0 Pro: 89.0%
Claude 4.5 Haiku: 83.7%
MiniMax-M2 (baseline): 78.3%

Stronger models, stable agents.

A purpose-built harness makes models stronger and more reliable. In our internal evaluation, Agent Core 2 lifts MiniMax-M2 on AIME 2025 from 78.3% to 97.2%.

Agent Core optimizes the CPU: hundreds of agents per process.

UnieInfra optimizes the GPU: 2× throughput at low load.

+18.9 pt

AIME 2025 uplift on MiniMax-M2 (78.3% → 97.2%) with Agent Core 2

100s–1,000

Concurrent agent turns per replica vs. ~10–40 for sandbox-per-agent*

~2–4 MB

Memory per agent turn — ~0 CPU while waiting on model & tools*

No cold start: stateless replicas, linear horizontal scaling

converging

Two halves of one agent-inference engine.

UnieAI Agent Core

CPU efficiency

Decoupled, async agent runtime — hundreds of concurrent turns per process, single-digit MB each.

UnieInfra

GPU efficiency

Token-efficient throughput density and low TTFT for agent inference. Converging with Agent Core into one engine.

One harness — smarter models, cheaper turns.

Planning & orchestration

Decides when the model reasons, when it calls tools, and how a task is decomposed.

Tools & MCP

Bash, file edit, web & KB search, and any MCP server as a pluggable tool source.

Sandbox

Run code and tools safely — decoupled from the agent loop, not a VM per agent.

Memory & session

A resumable timeline ledger persists every turn for replay and observability.

Agentic RAG

Tree-based retrieval grounds answers in your knowledge base, with sources.

Async I/O runtime

I/O-bound turns are multiplexed: a waiting turn uses ~0 CPU and holds only its context memory.
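To make the multiplexing idea concrete, here is a minimal sketch using Python's standard asyncio. It is not Agent Core's implementation; `fake_model_call`, `run_turn`, and `run_many` are hypothetical names, and the sleep stands in for a real model or tool request. It shows only the general pattern: many turns share one process, and a turn that is waiting on I/O is parked on the event loop rather than occupying a thread or a VM.

```python
import asyncio
import time


async def fake_model_call(latency_s: float) -> str:
    # Hypothetical stand-in for a model or tool request.
    # Awaiting the sleep yields control to the event loop, so a waiting
    # turn does no CPU work until its response is ready.
    await asyncio.sleep(latency_s)
    return "ok"


async def run_turn(turn_id: int, latency_s: float) -> tuple[int, str]:
    # One agent turn: issue the request, wait, return the result.
    result = await fake_model_call(latency_s)
    return turn_id, result


async def run_many(n_turns: int, latency_s: float) -> tuple[list[tuple[int, str]], float]:
    # Launch all turns concurrently in a single process and time the batch.
    start = time.perf_counter()
    results = await asyncio.gather(
        *(run_turn(i, latency_s) for i in range(n_turns))
    )
    elapsed = time.perf_counter() - start
    return list(results), elapsed


if __name__ == "__main__":
    results, elapsed = asyncio.run(run_many(n_turns=500, latency_s=0.2))
    print(f"{len(results)} turns finished in {elapsed:.2f}s")
```

Because every turn spends its time waiting, the batch completes in roughly the latency of one call rather than the sum of all of them. Real turns also carry context memory and some CPU work between waits, which this sketch leaves out.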

01 / 02

intelligence

A better harness makes open models smarter

The same open model gets materially stronger and more reliable when it runs inside a purpose-built harness — better planning, better tool use, stable loops.

Agent Core 2 lifts MiniMax-M2 on AIME 2025 from 78.3% to 97.2%
Stable agent loops on open-source models
The same harness powers UnieAI Chat, Code & Studio

02 / 02

economics

Hundreds of agents per process — not dozens of VMs

Traditional frameworks give every agent its own sandbox, each costing hundreds of MB to gigabytes of memory and seconds of cold start, so high concurrency means spinning up hundreds of VMs. Agent Core instead decouples the agent into efficient services and multiplexes I/O-bound turns in a single process.

~2–4 MB per turn; ~0 CPU while waiting on model & tools
No cold start — turns spin up instantly
Stateless replicas scale linearly: 10 replicas ≈ 2,560 concurrent turns
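The arithmetic behind these figures can be sketched in a few lines. The 256 turns per replica is inferred from "10 replicas ≈ 2,560 concurrent turns" rather than stated directly, the function names are hypothetical, and the memory estimate counts only per-turn context at ~2–4 MB, ignoring the base footprint of the process itself.

```python
# Inferred from the page: 2,560 concurrent turns / 10 replicas.
TURNS_PER_REPLICA = 256

# Per-turn context memory range quoted on the page, in MB.
MB_PER_TURN_LOW = 2
MB_PER_TURN_HIGH = 4


def cluster_capacity(replicas: int, turns_per_replica: int) -> int:
    # Linear scaling assumption: stateless replicas add capacity independently.
    return replicas * turns_per_replica


def context_memory_mb(turns: int, mb_per_turn: float) -> float:
    # Rough context memory for a number of in-flight turns.
    return turns * mb_per_turn


if __name__ == "__main__":
    total = cluster_capacity(10, TURNS_PER_REPLICA)
    low = context_memory_mb(TURNS_PER_REPLICA, MB_PER_TURN_LOW)
    high = context_memory_mb(TURNS_PER_REPLICA, MB_PER_TURN_HIGH)
    print(f"10 replicas: {total} concurrent turns")
    print(f"per replica: roughly {low:.0f}-{high:.0f} MB of turn context")
```

Under these assumptions, a full replica holds about 0.5–1 GB of turn context, which is in the range the page quotes for a single agent's sandbox in the sandbox-per-agent model.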

Build agents on a harness you can trust.

Talk to engineering