Agentic Cache Offloading — What Forces SSD?

Working-set storage projections for agentic caches — from lean plan templates to fat agent memory stores.

📏 Modeled: steady-state working-set (cache footprint), not cumulative history. Templates + index = RAM. Artifacts (tools, experiences, traces) = disk proxy. → see Background tab for details

Preset

Cache shape

Template size

Embedding

Tool-call results ? Cached API/tool responses (search snippets, code outputs)

Experiences ? ACE-style playbook entries, learned strategies

Exec traces ? Full reasoning chains, reward traces, step logs

Multipliers

Replicas

Tenants

Index overhead ? ANN graph + metadata beyond raw vectors

Cache RAM fraction ? % of system RAM for cache (rest: model, KV, OS)

Domain growth ? Cumulative ceiling annual expansion

Agent adoption

Years

Bottom line

Lean plan-template caches produce a tiny working set (500K templates at 2 KiB = ~1 GiB). SSD is forced by artifact richness x copies , not queries/day. Minimal plan caching does not force SSD by 2030. Multi-tenant replication + embedding indexes + rich artifacts (tool results, experiences, exec traces) can.