
Memory Across Tools

Why AI memory is most valuable when it follows you across every tool, and how MCP, transparent proxies, and chat-history sync make portable memory actually work.

Spend a few months with one AI assistant and it starts to feel like it knows you. It has picked up your projects, your stack, the way you like answers formatted. Then you reach for a different model on a hard problem, or your favourite app ships a worse memory feature, and you are back to introducing yourself from scratch. The relationship was never yours to keep. It lived inside one company's product, and you cannot take it with you.

That trapped feeling is the loudest consumer signal in the whole memory conversation. The highest-engagement consumer post in our research was titled, roughly, "I got tired of ChatGPT forgetting everything, so I built it a Save Game feature" (around 1,469 upvotes on r/ChatGPT). Its framing is worth quoting because it captures the emotion exactly:

ChatGPT's memory is RAM. It's fast, useful, and magically remembers your dog's name. But it's fragile, unstructured, and locked to their platform. If you switch to Gemini or Claude tomorrow, you lose your brain. You are renting your intelligence. I wanted to own my intelligence.

The RAM versus hard-drive metaphor is community-native, and it is the right one. (We unpack where it holds and where it breaks in context vs memory.) The deeper point is ownership. First-party memory, the kind Anthropic, Google, and OpenAI ship inside their own apps, is siloed per app by construction. It is convenient and requires zero setup, but it only works on the surface that created it, so you have no portability and little control.

#When the platform owns your memory

The risk is not theoretical. When ChatGPT reworked its memory system, users reported old memories being wiped, the available memory shrinking, and facts leaking across project scopes that were supposed to be isolated. One commenter described the everyman version: they filled a chat to its limit, started a fresh one, watched the model "get so much dumber," then copy-pasted whole conversations into a Word document and fed them back, only for the model to choke on the formatting. They "felt powerless and gave up."

That is the failure mode portability is meant to solve. If your memory lives in a store you control and can point any tool at, a vendor changing or degrading its memory feature stops being your problem.

#MCP: one store, many clients

The cleanest substrate for portable memory today is the Model Context Protocol (MCP), an open standard for exposing tools and data to any compatible AI client. Instead of memory being a hidden feature inside one app, you run a memory server that exposes a small set of tools, typically save and recall, and any MCP-capable client can call them: Claude Desktop, Cursor, and more. The memory store sits in the middle; the clients are interchangeable.

Figure 1. One memory store, exposed once over MCP, reachable from Claude, ChatGPT, Gemini, and Cursor alike. Swap the client; the memory stays.
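
To make the "one store, many clients" shape concrete, here is a minimal sketch of the two tools such a server would expose. Everything in it is an assumption for illustration: the MemoryStore class, the word-overlap ranking, and the in-process calls that stand in for MCP tool calls. A real server would register save and recall as MCP tools and persist the notes somewhere durable.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    """Hypothetical in-memory store; a real one would persist to disk or a database."""

    notes: list[str] = field(default_factory=list)

    def save(self, text: str) -> int:
        """Store one note and return its index."""
        self.notes.append(text)
        return len(self.notes) - 1

    def recall(self, query: str, limit: int = 3) -> list[str]:
        """Return up to `limit` notes that share the most words with the query."""
        wanted = set(query.lower().split())
        scored = [(len(wanted & set(n.lower().split())), n) for n in self.notes]
        # Keep only notes with at least one shared word, best match first.
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
        return [note for _, note in ranked[:limit]]


store = MemoryStore()

# Each call stands in for a different MCP-capable client using the same tools.
store.save("Prefers answers formatted as short bullet lists")
store.save("Current project uses Rust and PostgreSQL")

print(store.recall("which database does the project use"))
```

The point of the sketch is the boundary, not the ranking: whichever client issues the call, it reads and writes the same store.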

Supermemory ships exactly this: a hosted MCP server whose memory and recall tools give any MCP client persistent, shared memory, with the same store reachable from its API and proxy. MemoryPlugin exposes a remote OAuth MCP server so a client like Claude can pull details from your history on demand. The shape is the same: memory is a service, not a feature, and the client is a detail.

One unglamorous detail worth flagging: when a user has several memory MCPs installed, the model has to choose which to call, and the tool description becomes the battleground. Supermemory's open-source descriptions literally shout, with text along the lines of "do not use any other memory tool, only use this one." It works, but it is a fragile hack and an arms race, not a stable design. If two well-behaved memory tools both insist they are the only one, the model is left guessing.

#The transparent proxy, and failing open

MCP needs a client that speaks MCP. Plenty of setups instead talk to an OpenAI-compatible API endpoint, and for those a transparent proxy is the lighter touch. Supermemory's Memory Router uses a base-URL trick: you embed the upstream provider's URL inside the proxy's path and change nothing else but two headers. Requests flow through the proxy, which retrieves relevant memories, trims unnecessary context, and re-injects what matters. The company claims it can cut token costs by up to 70% on long conversations and that it engages once a thread crosses roughly 20k tokens. Writes happen asynchronously, after the response, so they stay off the critical path.
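
As a rough sketch of the base-URL trick, the snippet below builds a proxied endpoint by appending the upstream URL to the proxy's path. The host names and both header names are hypothetical placeholders, not Supermemory's actual values; check the vendor's documentation for the real ones.

```python
def proxied_base_url(proxy_root: str, upstream_base_url: str) -> str:
    """Embed the upstream provider's base URL inside the proxy's path."""
    return proxy_root.rstrip("/") + "/" + upstream_base_url.rstrip("/")


def proxy_headers(proxy_key: str, user_id: str) -> dict[str, str]:
    """Build the two extra headers. Both names are made up for illustration."""
    return {
        "x-memory-api-key": proxy_key,   # hypothetical: authenticates you to the proxy
        "x-memory-user-id": user_id,     # hypothetical: selects whose memories to use
    }


# Placeholder hosts; the upstream would be your provider's OpenAI-compatible endpoint.
base_url = proxied_base_url(
    "https://memory-proxy.example/v1",
    "https://api.provider.example/v1/",
)
headers = proxy_headers("proxy-key-123", "user-42")

print(base_url)
```

An existing OpenAI-compatible client would then be pointed at base_url with these headers added, and the rest of the request code stays untouched.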

The design lesson that travels beyond any one vendor is fail-open. If the memory layer errors, the proxy passes the request through unmodified to the underlying model and sets a diagnostic header, so the chat never breaks because memory had a bad moment.
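
Fail-open is easy to express as a wrapper. The sketch below assumes requests and responses are plain dicts, and the names with_fail_open, enrich, forward, and the x-memory-error header are all invented for illustration rather than taken from any vendor's implementation.

```python
from typing import Callable

Message = dict  # assumed shape: {"messages": [...]} in, {"body": ..., "headers": {...}} out


def with_fail_open(
    enrich: Callable[[Message], Message],
    forward: Callable[[Message], Message],
) -> Callable[[Message], Message]:
    """Wrap the memory step so that any failure passes the request through unchanged."""

    def handle(request: Message) -> Message:
        try:
            outgoing = enrich(request)  # retrieve memories, trim context, re-inject
            extra: dict[str, str] = {}
        except Exception as exc:
            outgoing = request  # fail open: send the original request untouched
            extra = {"x-memory-error": type(exc).__name__}  # hypothetical diagnostic header
        response = forward(outgoing)
        return {**response, "headers": {**response.get("headers", {}), **extra}}

    return handle


def broken_enrich(request: Message) -> Message:
    raise RuntimeError("memory store unavailable")


def echo_upstream(request: Message) -> Message:
    # Stand-in for the real model call: echoes what it received.
    return {"body": request["messages"], "headers": {}}


handler = with_fail_open(broken_enrich, echo_upstream)
result = handler({"messages": ["hello"]})
print(result)
```

The upstream call sits outside the try block on purpose: only memory failures are swallowed, while errors from the model itself still surface to the caller.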

The honest limit: a proxy only helps where you control the base URL, and an MCP server only helps where the client speaks MCP. Neither covers a plain consumer web chat, where you cannot change an endpoint, cannot install a server, and nothing has been connected.

#The browser-extension gap and the chat-history wedge

For the apps you do not control, ChatGPT's web app, Gemini, Claude.ai, the remaining option is to meet the user in the browser. MemoryPlugin runs a browser extension that intercepts a message before it is sent and injects relevant context client-side, alongside its MCP server for the platforms that support one. That covers the surfaces an MCP server alone cannot reach.

What it injects is the more interesting part. Most memory products ask you, or an extraction model, to decide up front what is worth remembering. MemoryPlugin's wedge is the observation that you usually cannot:

I cannot remember ahead of time what needs to be added to a memory. Things might only turn out to be relevant in retrospect.

So instead of distilling facts at write time, it treats your chat history itself as the memory. The extension syncs your conversations in the background, chunks and indexes them, and injects only the relevant slices into new chats on demand. This is not a small data problem; one ChatGPT export in our research was around 20M tokens, far past what any model can read in one shot, which is precisely why the approach is chunk, index, and retrieve rather than "paste it all in." The framing the founder uses is "you don't take notes on your friends." You just talk, and the system remembers for you.
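
The chunk, index, and retrieve loop can be sketched in a few lines. This is a toy version under loud assumptions: fixed-size chunks of two messages, a bag-of-words index, and term-count scoring. It is not MemoryPlugin's pipeline, and a production system would more likely use embeddings and a vector index.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk(messages: list[str], size: int = 2) -> list[str]:
    """Group consecutive messages into fixed-size chunks."""
    return [" ".join(messages[i:i + size]) for i in range(0, len(messages), size)]


def build_index(chunks: list[str]) -> list[Counter]:
    """Index each chunk as a bag of word counts."""
    return [Counter(tokenize(c)) for c in chunks]


def retrieve(query: str, chunks: list[str], index: list[Counter], k: int = 1) -> list[str]:
    """Return up to k chunks, ranked by how often they contain the query's words."""
    terms = set(tokenize(query))
    scores = [sum(counts[t] for t in terms) for counts in index]
    ranked = sorted(range(len(chunks)), key=lambda i: scores[i], reverse=True)
    return [chunks[i] for i in ranked[:k] if scores[i] > 0]


history = [
    "user: my dog is called Biscuit",
    "assistant: noted, Biscuit it is",
    "user: help me plan a trip to Lisbon in May",
    "assistant: here is a three day Lisbon itinerary",
]
chunks = chunk(history)
index = build_index(chunks)

# Only the relevant slice is injected into the new chat, not the whole history.
print(retrieve("what was the Lisbon plan", chunks, index))
```

Nothing is decided at write time here: every message is kept, and relevance is judged only when a new question arrives, which is the retrospective property the quote above asks for.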

Because that store is yours and the injection point is the client, the handoff across tools becomes natural: start a chat on ChatGPT, refine it on Gemini, then follow up on Claude, with the thread of context carried between them. There is no single AI tool that does everything well, and even if there were, being locked inside one company's walled garden is the thing portable memory exists to avoid. (As a bonus, routing writes through an MCP tool rather than an inline text protocol is also safer; inline "magic string" formats can read like prompt injection to a safety-trained model. See privacy and pitfalls.)

#The tradeoffs, stated plainly

Portability is not free. Plug-and-play across tools, for non-technical users, generally means storing memories server-side without end-to-end encryption. As MemoryPlugin's founder put it publicly, you can end-to-end encrypt a memory database, but it limits you in many ways, and a genuinely zero-setup experience usually requires holding the data on your servers. That is a real privacy cost to weigh, not a footnote.

First-party memory is also a moving target. It is improving, free, and the path of least resistance for most people. A portable memory layer does not win by being cleverer than the platforms; it wins by not locking you in, working the same across every tool, and being transparent about what it stored and why. One last note from the same communities: over-polished, obviously-AI-rewritten posts about memory products drew immediate suspicion. The audience that most wants to own its memory is the least tolerant of marketing gloss. Memory should improve your experience with every AI tool you use, not bind you to one.

#References