Free Tool Arena


How to Use Letta (MemGPT)

Installing Letta, server vs cloud, ADE visual tool, building agents with long-term memory, archival storage.

Updated April 2026 · 6 min read

Letta (formerly MemGPT) is an open-source framework for stateful agents — LLMs that manage their own long-term memory across conversations.


MemGPT started as a Berkeley research project that gave LLMs an operating-system-style memory hierarchy: a small in-context working set, a larger archival store, and tools to page between them. It rebranded as Letta in 2024 and now ships a server, a Python/TypeScript SDK, and the Agent Development Environment (ADE) — a visual debugger for stateful agents.
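The hierarchy can be illustrated with a toy sketch: a small in-context working set, a larger archival store, and explicit "paging" between them. This is purely illustrative — Letta's real archival memory uses vector search over embeddings, not the substring matching used here.

```python
class MemoryHierarchy:
    """Toy MemGPT-style hierarchy: small working set, large archive."""

    def __init__(self, working_set_limit=3):
        self.working_set = []          # small, always "in context"
        self.archive = []              # large, out-of-context store
        self.limit = working_set_limit

    def remember(self, fact):
        """Add a fact, evicting the oldest working-set entry to archive."""
        self.working_set.append(fact)
        if len(self.working_set) > self.limit:
            self.archive.append(self.working_set.pop(0))

    def recall(self, query):
        """Search the working set first, then page in from the archive."""
        for fact in self.working_set:
            if query in fact:
                return fact
        for fact in self.archive:
            if query in fact:
                self.working_set.append(fact)  # page back into context
                return fact
        return None

mem = MemoryHierarchy()
for f in ["user is Jay", "likes Python", "builds SEO tools", "based in NYC"]:
    mem.remember(f)
print(mem.recall("Jay"))  # "user is Jay", paged back in from the archive
```

The key idea carried over from the paper: the model never sees the whole archive, only what it has paged into the working set.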

What it is

Letta runs a persistent server that owns agent state: core memory blocks (persona, human), archival memory (vector store), and message history. Agents are addressable by ID and survive restarts. You talk to them over REST or WebSocket; they call tools, update their own memory blocks, and keep learning across sessions.
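Sending a message over REST looks roughly like the stdlib-only sketch below. The `/v1/agents/{id}/messages` route and payload shape are assumptions based on the Letta server API — routes vary between versions, so check your server's OpenAPI docs before relying on them.

```python
import json
import urllib.request

LETTA_BASE = "http://localhost:8283"  # the local server from the install step

def build_message_request(agent_id, text):
    """Build the URL and JSON body for one user message to an agent.
    The route and body shape are assumptions; verify against your
    Letta version's API reference."""
    url = f"{LETTA_BASE}/v1/agents/{agent_id}/messages"
    body = {"messages": [{"role": "user", "content": text}]}
    return url, json.dumps(body).encode()

if __name__ == "__main__":
    # Requires a running Letta server and a real agent ID.
    url, body = build_message_request("agent-123", "Hi, I'm Jay.")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp))
```

Because the agent is addressable by ID, the same request works after a server restart — the state lives server-side, not in your client.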

Install / sign up

# Docker (recommended)
docker run -it -p 8283:8283 \
  -v ~/.letta:/root/.letta \
  letta/letta:latest

# Or pip
pip install letta
letta server

# Cloud option: https://app.letta.com (managed)

First session

Open the ADE at http://localhost:8283, create an agent, and start chatting. Watch the memory panel on the right — when you mention your name, you’ll see the agent update its “human” block in real time.

$ letta run
> Hi, I'm Jay and I build SEO tools.
# agent writes to core memory:
#   human: "Name is Jay. Builds SEO tools."
> What do I work on?
# agent recalls from core memory, not context window

Everyday workflows

  1. Build a personal assistant that remembers preferences across weeks of chats.
  2. Give agents tools (Python functions or MCP servers) so they can act, not just remember.
  3. Use the ADE to inspect memory edits and step through tool calls when debugging.
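For workflow 2, a Letta tool is an ordinary Python function — the type hints and docstring become the schema the agent sees. The function below is runnable as-is; the commented-out registration call reflects the `letta-client` SDK and is an assumption that may differ across versions.

```python
def get_word_count(text: str) -> int:
    """Count the words in a piece of text.

    Args:
        text: The text to count words in.

    Returns:
        The number of whitespace-separated words.
    """
    return len(text.split())

# Registration against a running server (assumption -- check your SDK version):
# from letta_client import Letta
# client = Letta(base_url="http://localhost:8283")
# tool = client.tools.upsert_from_function(func=get_word_count)

print(get_word_count("Letta agents can call tools"))  # 5
```

Write the docstring for the model, not for humans: it is the only description of the tool the agent gets.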

Gotchas and tips

Archival memory uses a vector store (pgvector by default) — point it at a durable Postgres in production, not the in-container SQLite, or you’ll lose memories on restart. Letta supports any OpenAI-compatible endpoint, so local models via Ollama or vLLM work fine for privacy-sensitive deployments.
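In practice that means overriding the database connection when you start the container. A hedged sketch — `LETTA_PG_URI` is the environment variable the Docker image has used for an external Postgres, but verify the name and the required pgvector extension against the current self-hosting docs:

```shell
# Point the container at a durable Postgres (with pgvector installed)
# instead of the bundled database. LETTA_PG_URI is an assumption --
# check the self-hosting docs for your Letta version.
docker run -it -p 8283:8283 \
  -v ~/.letta:/root/.letta \
  -e LETTA_PG_URI="postgresql://letta:secret@db.example.com:5432/letta" \
  letta/letta:latest
```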

Core memory blocks are small (a few KB) on purpose — they’re always in context. Push larger facts into archival and let the agent retrieve them. The agent’s self-editing of memory is powerful but occasionally overwrites useful info; version your memory blocks via the API if that matters.
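One lightweight way to version memory blocks, sketched below: snapshot them (fetched via the API in a real deployment) before each session, so a bad self-edit is recoverable. The block contents here are illustrative; only the snapshot pattern is the point.

```python
import copy
import time

def snapshot_blocks(blocks, history):
    """Append a timestamped deep copy of the current memory blocks."""
    history.append({"ts": time.time(), "blocks": copy.deepcopy(blocks)})
    return history

history = []
blocks = {"human": "Name is Jay. Builds SEO tools.", "persona": "Helpful assistant."}
snapshot_blocks(blocks, history)

# The agent's self-edit overwrites part of the human block...
blocks["human"] = "Name is Jay."

# ...but the earlier version is still recoverable:
print(history[0]["blocks"]["human"])  # Name is Jay. Builds SEO tools.
```

Deep-copying matters: a shallow copy would share the dict with the live blocks and the "snapshot" would mutate along with them.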

Who it’s for

Builders of long-lived assistants, companion apps, customer-support bots, and any product where “the agent remembers you” is the core value prop.

