Agentic Cache Offloading — What Forces SSD?

Working-set storage projections for agentic caches — from lean plan templates to fat agent memory stores.

📏 Modeled: steady-state working-set (cache footprint), not cumulative history. Templates + index = RAM. Artifacts (tools, experiences, traces) = disk proxy. → see Background tab for details
Preset
Cache shape
Template size
Embedding
Tool-call results ? Cached API/tool responses (search snippets, code outputs)
Experiences ? ACE-style playbook entries, learned strategies
Exec traces ? Full reasoning chains, reward traces, step logs
Multipliers
Replicas
Tenants
Index overhead ? ANN graph + metadata beyond raw vectors
Cache RAM fraction ? % of system RAM for cache (rest: model, KV, OS)
Domain growth ? Cumulative ceiling annual expansion
Agent adoption
Years
Bottom line
Lean plan-template caches produce a tiny working set (500K templates at 2 KiB = ~1 GiB). SSD is forced by artifact richness x copies , not queries/day. Minimal plan caching does not force SSD by 2030. Multi-tenant replication + embedding indexes + rich artifacts (tool results, experiences, exec traces) can.