Every model has a context window. None of them have memory. The difference matters more than most builders realize.
A context window is scratch paper. It holds what you paste in while the model reasons over it, and it is wiped when the conversation ends. Every new session starts blank. You re-explain your project, your preferences, and the decisions you already made last week. The model performs well for twenty minutes, then the session ends and all of that work evaporates.
People paper over this in three ways, and each one fails.
The first is longer context windows. Anthropic, OpenAI, and Google have been in a context-length arms race for two years. One million tokens, two million, pin the whole codebase. This helps with single-session reasoning but does nothing for continuity. A one-million-token window that resets at session boundaries is still a scratch pad, just a bigger one.
The second is fine-tuning. Train a model on your data, the reasoning goes, and it will know what you know. In practice, fine-tuning costs real money, takes real time, captures only the data you had on training day, and produces a frozen artifact. Your project changes weekly. Your preferences change. Your team rotates. The fine-tuned model does not.
The third is platform-specific memory. ChatGPT has memory. Claude has projects. Cursor has rules. These are useful inside their own walls but none of them cross boundaries. Your context in ChatGPT does not follow you to Claude. Your Cursor rules do not help when you open Claude Desktop. The agents proliferate, the context stays siloed, and the re-explanation tax compounds.
The missing piece is a memory layer that is independent of any single model. Not scratch paper, not frozen weights, not platform-owned. A store of facts, decisions, preferences, and context that any agent can read and write through a standard interface.
This is what ctxstore does. You paste your API key into Claude Desktop, Cursor, VS Code, or any MCP-compatible client. Your agent gets a small set of tools: store_fact, search_facts, load_context, resume_session. Facts persist across sessions and across tools. When you open a new Claude chat next week, your agent calls load_context and picks up where you left off. Same context in Cursor. Same context on your phone. One memory. Every model.
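To make that tool surface concrete, here is a toy in-process sketch of the four tools. The signatures, the save_session helper, and the word-overlap ranking are assumptions for illustration only; they are not ctxstore's actual API, and a real deployment would sit behind an MCP server and use vector search rather than word matching.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class MemoryStore:
    """Toy stand-in for a model-independent memory layer (hypothetical API)."""

    facts: List[str] = field(default_factory=list)
    sessions: Dict[str, str] = field(default_factory=dict)

    def store_fact(self, text: str) -> int:
        """Persist a fact and return its index."""
        self.facts.append(text)
        return len(self.facts) - 1

    def search_facts(self, query: str, limit: int = 3) -> List[str]:
        """Rank facts by words shared with the query.

        Assumption: word overlap stands in for real vector similarity.
        """
        words = set(query.lower().split())
        scored = [(len(words & set(f.lower().split())), f) for f in self.facts]
        matches = [(score, f) for score, f in scored if score > 0]
        matches.sort(key=lambda pair: pair[0], reverse=True)
        return [f for _, f in matches[:limit]]

    def load_context(self, limit: int = 5) -> List[str]:
        """Return the most recent facts, newest first."""
        return list(reversed(self.facts[-limit:]))

    def save_session(self, session_id: str, summary: str) -> None:
        """Hypothetical helper: record where a session left off."""
        self.sessions[session_id] = summary

    def resume_session(self, session_id: str) -> Optional[str]:
        """Return the saved summary, or None for an unknown session."""
        return self.sessions.get(session_id)


# One client writes facts during a session...
store = MemoryStore()
store.store_fact("Project uses Postgres 16")
store.store_fact("Team prefers tabs over spaces")
store.save_session("refactor-auth", "Stopped after migrating the login route")

# ...and a different client, later, reads from the same store.
print(store.search_facts("which postgres version"))
print(store.resume_session("refactor-auth"))
```

The point of the sketch is the shape, not the storage: the store lives outside any one chat, so whichever client connects next sees the same facts.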
The technical pattern underneath is boring on purpose. Qdrant for vector search. Postgres for session state. An MCP server that exposes the tools. One API key per tenant. Layered storage, so facts you set last year do not crowd out decisions you made yesterday. The real design choices are in what we do not do: we do not train on your data, we do not share memory between tenants, and we do not require you to pick a frontend. Your agent brings its own intelligence. We just remember for it.
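A minimal sketch of two of those ideas, per-tenant isolation and layered storage, might look like the following. The layer names ("recent" and "durable"), the per-layer quota, and the class itself are hypothetical; the real system uses Qdrant and Postgres, while this version keeps everything in a dictionary so it runs on its own.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Fact:
    text: str
    layer: str       # hypothetical layer names: "recent" or "durable"
    created_at: int  # any monotonically increasing timestamp


class TenantMemory:
    """Toy in-memory store; not the actual ctxstore implementation."""

    def __init__(self) -> None:
        # Facts are keyed by API key, so one tenant never sees another's.
        self._by_tenant: Dict[str, List[Fact]] = defaultdict(list)

    def store_fact(self, api_key: str, fact: Fact) -> None:
        self._by_tenant[api_key].append(fact)

    def load_context(self, api_key: str, per_layer: int = 2) -> List[str]:
        """Return up to `per_layer` newest facts from each layer.

        Giving each layer its own quota means a pile of old durable facts
        cannot push yesterday's decisions out of the result.
        """
        facts = self._by_tenant.get(api_key, [])
        context: List[str] = []
        for layer in ("recent", "durable"):
            in_layer = sorted(
                (f for f in facts if f.layer == layer),
                key=lambda f: f.created_at,
                reverse=True,
            )
            context.extend(f.text for f in in_layer[:per_layer])
        return context


mem = TenantMemory()
mem.store_fact("key-a", Fact("Prefers TypeScript", "durable", 1))
mem.store_fact("key-a", Fact("Chose Qdrant for search", "recent", 100))
mem.store_fact("key-b", Fact("Other tenant's note", "durable", 50))

print(mem.load_context("key-a"))
```

Here the tenant behind key-a gets its recent decision first and its older preference second, and never sees the fact stored under key-b.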
For builders, the implication is direct. If you are shipping agents, assume your users will move between clients. Assume your agents will be rehydrated in sessions you did not originate. Build your context layer as a first-class component, not a side effect of the chat interface you happened to pick. The agents of the next two years are going to be judged on what they remember, not what model they run on.
We are in open beta now. The free tier gets you 1,000 vectors and 1,000 queries a day, with no credit card required. If you want to try it, visit ctxstore.ai and paste the config into your MCP client. The whole setup takes under a minute.
One memory. Every model. The context window was always scratch paper. Time for the real memory layer to show up.