UnieAI Agent Core — the agent harness & runtime

the harness

UnieAI Agent Core — a better harness for open models

Agent Core is the harness that makes open models smarter and the runtime that makes them cheap to run. It decouples session, sandbox, filesystem, orchestration and tools into efficient services — so hundreds of agents run in one process, not dozens of VMs.

Talk to engineering

Concurrent agents on one 4 GB host (more is better)

Turns are I/O-bound: a waiting turn uses ~0 CPU and only ~3 MB of context memory.

Sandbox-per-agent (VM / container): ~10–40
UnieAI Agent Core: hundreds to ~1,000

~3 MB per agent turn
~0 CPU while waiting
0 s cold start

* Modeling estimate, not a load test.

+18.9 pt

AIME 2025 uplift on MiniMax-M2 (78.3% → 97.2%) with Agent Core 2

100s–1,000

Concurrent agent turns per replica vs. ~10–40 for sandbox-per-agent*

~3 MB

Memory per agent turn — ~0 CPU while waiting on model & tools*

0 s

Cold start: stateless, linear horizontal scaling

converging

Two halves of one agent-inference engine.

UnieAI Agent Core

CPU efficiency

Decoupled, async agent runtime — hundreds of concurrent turns per process, single-digit MB each.

UnieInfra

GPU efficiency

Token-efficient throughput density and low time to first token (TTFT) for agent inference. Converging with Agent Core into one engine.

One harness — smarter models, cheaper turns.

Planning & orchestration

Decides when the model reasons, when it calls tools, and how a task is decomposed.

Tools & MCP

Bash, file edit, web & KB search, and any MCP server as a pluggable tool source.

Sandbox

Run code and tools safely — decoupled from the agent loop, not a VM per agent.

Memory & session

A resumable timeline ledger persists every turn for replay and observability.

Agentic RAG

Tree-based retrieval grounds answers in your knowledge base, with sources.

Async I/O runtime

I/O-bound turns are multiplexed — a waiting turn uses ~0 CPU, only its context memory.
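As a rough illustration of why multiplexed I/O-bound turns are cheap, the sketch below runs many simulated turns on a single Python asyncio event loop. It is not Agent Core's implementation: fake_model_call, run_turn and run_many are hypothetical names, and asyncio.sleep stands in for the network wait on a model or tool.

```python
import asyncio
import time


async def fake_model_call(latency_s: float) -> str:
    # Hypothetical stand-in for a model or tool request. Sleeping yields
    # control to the event loop, so a waiting turn burns no CPU.
    await asyncio.sleep(latency_s)
    return "ok"


async def run_turn(agent_id: int, latency_s: float) -> tuple[int, str]:
    # One agent turn: nearly all of its wall time is spent waiting on I/O.
    reply = await fake_model_call(latency_s)
    return agent_id, reply


async def run_many(n_agents: int, latency_s: float) -> list[tuple[int, str]]:
    # Multiplex every turn onto one event loop in one process.
    turns = [run_turn(i, latency_s) for i in range(n_agents)]
    return await asyncio.gather(*turns)


if __name__ == "__main__":
    start = time.perf_counter()
    results = asyncio.run(run_many(n_agents=500, latency_s=0.2))
    elapsed = time.perf_counter() - start
    print(f"{len(results)} turns finished in {elapsed:.2f} s")
```

Because the turns overlap while they wait, total wall time stays close to the latency of a single call rather than growing with the number of agents.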

01 / 02

intelligence

A better harness makes open models smarter

The same open model gets materially stronger and more reliable when it runs inside a purpose-built harness — better planning, better tool use, stable loops.

Agent Core 2 lifts MiniMax-M2 on AIME 2025 from 78.3% to 97.2%
Stable agent loops on open-source models
The same harness powers UnieAI Chat, Code & Studio

02 / 02

economics

Hundreds of agents per process — not dozens of VMs

Traditional frameworks give every agent its own sandbox, each costing hundreds of MB to several GB of memory and seconds of cold start, so high concurrency means spinning up hundreds of VMs. Agent Core decouples the agent into efficient services and multiplexes I/O-bound turns in a single process.

~2–4 MB per turn; ~0 CPU while waiting on model & tools
No cold start — turns spin up instantly
Stateless replicas scale linearly: 10 replicas ≈ 2,560 concurrent turns
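The figures above can be checked with simple arithmetic. The sketch below takes the page's own numbers (2,560 turns across 10 replicas, ~2–4 MB per turn) and assumes the linear scaling the page claims; the function names are hypothetical, and the inputs are modeling estimates rather than load-test results.

```python
# Figures quoted on this page (modeling estimates, not measurements).
TOTAL_TURNS = 2_560          # concurrent turns across all replicas
REPLICAS = 10
MB_PER_TURN_LOW = 2
MB_PER_TURN_HIGH = 4


def turns_per_replica(total_turns: int, replicas: int) -> int:
    # Linear scaling implies an even split of turns across replicas.
    return total_turns // replicas


def context_memory_mb(turns: int, mb_per_turn: int) -> int:
    # Context memory held by one replica for its concurrent turns.
    return turns * mb_per_turn


def projected_turns(replicas: int, per_replica: int) -> int:
    # Projection under the page's linear-scaling assumption.
    return replicas * per_replica


if __name__ == "__main__":
    per_replica = turns_per_replica(TOTAL_TURNS, REPLICAS)
    low = context_memory_mb(per_replica, MB_PER_TURN_LOW)
    high = context_memory_mb(per_replica, MB_PER_TURN_HIGH)
    print(f"{per_replica} turns per replica, {low}-{high} MB of context memory")
    print(f"25 replicas: {projected_turns(25, per_replica)} concurrent turns")
```

By these numbers each replica carries 256 concurrent turns and roughly 512 to 1,024 MB of context memory, which fits within the 4 GB host used in the comparison.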

Build agents on a harness you can trust.
