Agent Memory Architectures

How agents store and recall facts across a long task using short-term and long-term memory.

Why agents need memory

A model sees only what fits in its context window. For a long task, the agent must store useful facts and retrieve them later, so memory becomes a deliberate system rather than just the raw prompt.

Two layers

Short-term memory is the running context: recent messages, tool results, and the current plan. It is fast but limited in size.
Long-term memory is an external store such as a vector database or a notes file. The agent writes summaries there and searches them when needed.
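
The two layers can be pictured as a small data structure. This is a minimal sketch, not a reference design: the class name AgentMemory, its method names, and the window size of 4 are all hypothetical, and a plain list stands in for a real vector database or notes file.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class AgentMemory:
    # Short-term layer: a bounded window of recent items.
    # The limit of 4 is an arbitrary illustrative value.
    short_term: deque = field(default_factory=lambda: deque(maxlen=4))
    # Long-term layer: an unbounded store. A list stands in for a
    # vector database or notes file (assumption for this sketch).
    long_term: list = field(default_factory=list)

    def observe(self, item: str) -> None:
        """Add a message or tool result to the running context."""
        self.short_term.append(item)

    def write_note(self, summary: str) -> None:
        """Persist a summary to the long-term store."""
        self.long_term.append(summary)


memory = AgentMemory()
for i in range(6):
    memory.observe(f"turn {i}")
memory.write_note("User prefers metric units.")

print(list(memory.short_term))
print(memory.long_term)
```

Note that the bounded deque silently drops the oldest items once it is full, which is why anything worth keeping has to be written to the long-term layer before it falls out of the window.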

How recall works

When the context fills up, older turns are summarized and pushed to long-term storage. Before acting, the agent queries that store for relevant past facts and pulls the top matches back into context.
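
The recall loop described above might look roughly like the following sketch. Everything here is an assumption made for illustration: embed is a toy word-count stand-in for a real embedding model, summarize is a placeholder for a model call, and the function names compact and recall, along with the max_turns, keep_recent, and top_k parameters, are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def summarize(turns: list[str]) -> str:
    """Placeholder summarizer; a real agent would likely call a model here."""
    return " | ".join(turns)


def compact(context: list[str], store: list[str],
            max_turns: int, keep_recent: int) -> list[str]:
    """If the context is over budget, summarize older turns into the store.

    Assumes keep_recent >= 1.
    """
    if len(context) <= max_turns:
        return context
    older, recent = context[:-keep_recent], context[-keep_recent:]
    store.append(summarize(older))
    return recent


def recall(query: str, store: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k stored notes most similar to the query."""
    q = embed(query)
    ranked = sorted(store, key=lambda note: cosine(q, embed(note)), reverse=True)
    return ranked[:top_k]


store = [
    "The deploy target is the staging cluster.",
    "User prefers tabs over spaces.",
]
context = [
    "t1: user asked for CSV export",
    "t2: tool returned 3 rows",
    "t3: plan is to add tests",
    "t4: tests pass",
]

# Four turns exceed the budget of three, so the two oldest are summarized.
context = compact(context, store, max_turns=3, keep_recent=2)

# Before acting, pull the best-matching note back into context.
hits = recall("which cluster is the deploy target", store, top_k=1)
context = hits + context
```

Swapping the word-count embed for a real embedding model would change the quality of the matches but not the shape of the loop: compact on overflow, query before acting, prepend the top matches.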

Design choices

What to save: facts, decisions, and user preferences, not raw chatter.
When to retrieve: on demand by similarity, or on a fixed schedule.
How to forget: expire stale entries so memory does not drift or bloat.
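
Two of these choices, what to save and how to forget, can be expressed as simple filters. The sketch below is one possible reading, under assumptions: the Entry fields, the kind labels, and the time-to-live rule are all hypothetical, and timestamps are passed in explicitly so the example is deterministic.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    text: str
    kind: str          # hypothetical labels: "fact", "decision", "preference", "chatter"
    created_at: float  # seconds since an arbitrary epoch


# Kinds worth persisting, following the "what to save" rule above.
SAVE_KINDS = {"fact", "decision", "preference"}


def should_save(entry: Entry) -> bool:
    """Keep facts, decisions, and preferences; skip raw chatter."""
    return entry.kind in SAVE_KINDS


def expire(entries: list[Entry], now: float, ttl_seconds: float) -> list[Entry]:
    """Drop entries older than ttl_seconds (a simple time-to-live policy)."""
    return [e for e in entries if now - e.created_at <= ttl_seconds]


candidates = [
    Entry("API key rotates weekly", "fact", created_at=0.0),
    Entry("lol ok", "chatter", created_at=50.0),
    Entry("Chose PostgreSQL over SQLite", "decision", created_at=900.0),
    Entry("User wants terse replies", "preference", created_at=950.0),
]

saved = [e for e in candidates if should_save(e)]
fresh = expire(saved, now=1000.0, ttl_seconds=600.0)

print([e.text for e in fresh])
```

A fixed time-to-live is only one way to forget. Preferences, for example, may deserve a longer lifetime than transient facts, which would mean a per-kind rule instead of a single ttl_seconds value.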

Memory done well lets an agent stay coherent across hours of work that no single window could hold.

Key idea