Memory that forgets to tell anyone
“I mentioned that last week.” It’s the kind of thing you’d say to a friend, and increasingly the kind of thing you expect from an AI assistant. Memory – the ability to carry context across conversations – is what turns a chatbot into something that feels like it actually knows you.
The best version of this is an assistant that knows everything about you but tells no one. That’s not how AI products work today: they remember you by building a profile that sits in plaintext on someone else’s server. Confer now has memory, and it works differently.
How memory typically works
To understand what’s different, it helps to know how memory usually works. At a high level, memory systems have two phases: extraction and retrieval.
Extraction happens after each conversation. The system reviews the exchange and decides whether anything worth remembering was said. “The user is a software engineer in Seattle” passes the test. “The user asked about sorting algorithms” does not – it’s a task, not a fact about the person.
Each extracted fact is turned into an embedding – a vector of numbers that captures the semantic meaning of the text. The fact and its embedding are stored together.
Retrieval happens at the start of each new message. The user’s prompt is embedded, and the system searches stored memories for ones that are semantically close to what the user is asking about. The relevant memories are injected into the conversation so the model has context.
This is the basic loop: extract facts, embed them, store them, retrieve them when relevant. It makes the assistant genuinely more useful. And in most implementations, every part of this loop is visible to the service operator in plaintext.
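The loop above can be sketched in a few lines. This is a toy illustration, not Confer’s implementation: the bag-of-words `embed` stands in for a real embedding model, and retrieval is plain cosine similarity.

```python
# Minimal sketch of the extract -> embed -> store -> retrieve loop.
# The "embedding" is a sparse bag-of-words vector; production systems
# use a learned embedding model instead.
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: a sparse bag-of-words vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

memory_store = []  # list of (fact, embedding) pairs

def remember(fact: str) -> None:
    """Extraction phase: store a durable fact alongside its embedding."""
    memory_store.append((fact, embed(fact)))

def retrieve(prompt: str, k: int = 2) -> list[str]:
    """Retrieval phase: return the k memories closest to the prompt."""
    q = embed(prompt)
    ranked = sorted(memory_store, key=lambda m: cosine(q, m[1]), reverse=True)
    return [fact for fact, _ in ranked[:k]]

remember("The user is a software engineer in Seattle")
remember("The user prefers concise answers")
print(retrieve("any good engineering meetups near Seattle?", k=1))
```

Note that “The user asked about sorting algorithms” would never reach `remember` at all: the extraction step filters out tasks before anything is embedded.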
Encrypting the loop
Confer’s memory works the same way – except it’s private to you. Instead of running on the server, every part of this pipeline runs on the client.
Here’s how each piece works:
Extraction - The client extracts facts from a conversation and embeds them as vectors, all over the same encrypted channel used for inference.
Storage - Uses the same passkey-derived encryption as everything else in Confer. Each memory is encrypted before it leaves the client, so the server only stores opaque blobs.
Retrieval - When you type a message, the client embeds your query, then searches your local decrypted memories using a hybrid scoring algorithm – cosine similarity for semantic relevance and BM25 for keyword matching. The top results are injected into your prompt before it’s sent for inference.
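The retrieval step can be sketched as follows. The 50/50 blend of the two scores, the tokenizer, and the BM25 parameters are illustrative assumptions, not Confer’s actual tuning; the point is only how a semantic score and a keyword score combine over already-decrypted local memories.

```python
# Hybrid scoring over decrypted memories: cosine similarity on embedding
# vectors plus BM25 on raw text. Weights and parameters here are
# assumptions for illustration.
import math
import re

def tokens(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def bm25(query: str, doc: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> float:
    """Standard BM25 score of one document against a query."""
    n = len(docs)
    avgdl = sum(len(tokens(d)) for d in docs) / n
    dtoks = tokens(doc)
    score = 0.0
    for term in tokens(query):
        df = sum(1 for d in docs if term in tokens(d))
        if df == 0:
            continue
        idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
        tf = dtoks.count(term)
        score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(dtoks) / avgdl))
    return score

def hybrid_rank(query_vec, query_text, memories, top_k=3):
    """memories: list of (text, vector) pairs. Returns texts by blended score.
    BM25 is unbounded while cosine is in [0, 1]; a real system would
    normalize before blending."""
    docs = [t for t, _ in memories]
    scored = [
        (0.5 * cosine(query_vec, v) + 0.5 * bm25(query_text, t, docs), t)
        for t, v in memories
    ]
    return [t for _, t in sorted(scored, reverse=True)[:top_k]]
```

BM25 catches exact terms (names, project identifiers) that embeddings can blur together, while cosine similarity catches paraphrases that share no keywords – which is the usual motivation for combining the two.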
The server never touches any of this. It stores encrypted blobs. When you load your memories on a new device, it hands back ciphertext. Your client decrypts everything locally.
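The storage model – passkey-derived key on the client, opaque blobs on the server – looks roughly like this round trip. Everything here is a stand-in: the SHA-256 counter keystream with an HMAC tag illustrates the shape of authenticated encryption, but a real client would use a vetted cipher such as AES-GCM, and Confer’s actual construction isn’t specified in this post.

```python
# Illustrative round trip: derive a key from a passphrase, encrypt a
# memory before upload, decrypt after download. Toy cipher for
# illustration only -- do not use this construction for real data.
import hashlib
import hmac
import os

def derive_key(passphrase: bytes, salt: bytes) -> bytes:
    # Slow, salted key derivation so the passphrase can't be brute-forced cheaply.
    return hashlib.pbkdf2_hmac("sha256", passphrase, salt, 600_000)

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = os.urandom(16)
    ct = bytes(p ^ k for p, k in zip(plaintext, keystream(key, nonce, len(plaintext))))
    tag = hmac.new(key, nonce + ct, "sha256").digest()
    return nonce + ct + tag  # the opaque blob: all the server ever stores

def decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, ct, tag = blob[:16], blob[16:-32], blob[-32:]
    if not hmac.compare_digest(tag, hmac.new(key, nonce + ct, "sha256").digest()):
        raise ValueError("blob failed authentication")
    return bytes(c ^ k for c, k in zip(ct, keystream(key, nonce, len(ct))))

key = derive_key(b"correct horse battery staple", salt=b"per-account-salt")
blob = encrypt(key, b"The user is a software engineer in Seattle")
assert decrypt(key, blob) == b"The user is a software engineer in Seattle"
```

The key never leaves the client, so syncing to a new device means downloading blobs and re-deriving the same key from the passkey locally – the server can replicate your memories without ever being able to read them.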
An assistant that knows you, a service that doesn’t
Confer remembers your name, your preferences, what you’re working on, and how you like to communicate. We don’t know any of it.
This is the same principle behind everything we’ve built at Confer. Your conversations are encrypted so that we can’t read them. Your folders are encrypted blobs with sort keys hidden inside. And now your memory – the most personal data an AI assistant produces – is encrypted too.
Don’t have Confer? Give it a try!