Chat Memory Search
Plan Metadata
- Plan type: plan
- Parent plan: N/A
- Depends on: N/A
- Status: documentation
System Intent
- What is being built: The end-to-end flow that answers a user's chat question by searching three types of memory (episodic, semantic, visual) using a multi-round reasoning agent and a GPU worker. Includes persistent chat threads stored in DynamoDB, a chat history list in the hamburger menu, and paginated message loading.
- Primary consumer(s): Chat screen in the mobile app (chat.tsx), side menu (side-menu.tsx)
- Boundary: User submits a natural-language question → system retrieves relevant memories across three retrieval backends → GPU LLM synthesizes an answer → answer and messages stored in DynamoDB → answer returned to the app. Users can browse past chats from the hamburger menu and load previous messages with pagination.
Stage Gate Tracker
- [x] Stage 1 Mermaid approved
- [x] Stage 2 I/O contracts approved
- [x] Stage 3 pseudocode/technical details approved
1. Mermaid Diagram
flowchart TD
subgraph APP["App"]
CHAT["chat.tsx\napp/app/chat.tsx"]:::done
SIDE_MENU["SideMenu\ncomponents/side-menu.tsx"]:::unchanged
CWM["chatWithMemory.ts\nlib/api/memory/chatWithMemory.ts"]:::done
CWMV["chatWithMemoryVerbose.ts\nlib/api/memory/chatWithMemoryVerbose.ts"]:::done
FETCH_CHATS["fetchChats.ts\nlib/api/chats/fetchChats.ts"]:::unchanged
USE_CHATS["useChatsApi.ts\nlib/api/chats/useChatsApi.ts"]:::done
GET_MSGS["getChatMessages.ts\nlib/api/chats/getChatMessages.ts"]:::done
end
subgraph LAMBDA["Backend Lambda"]
CHAT_LAMBDA["chat — app.py\napi/memories/chat/app.py"]:::done
AGENT["ReasoningAgent\nretrieval/agent.py"]:::unchanged
DB_LOADER["db_loader.py\nretrieval/db_loader.py"]:::unchanged
CHATS_LAMBDA["chats_list — app.py\napi/chats/list/app.py"]:::done
MSGS_LAMBDA["chat_messages — app.py\napi/chats/messages/app.py"]:::done
end
subgraph CHAT_STORE["Chat Store"]
CHAT_REPO["ChatRepository\nchat/repository.py"]:::done
DDB[("DynamoDB\nMessages: PK=chat_id SK=ulid\nChats: PK=user_id SK=chat_id")]:::unchanged
end
subgraph RETRIEVERS["Retrievers"]
EP["episodic_retriever.py\nretrieval/episodic_retriever.py"]:::unchanged
SEM["semantic_retriever.py\nretrieval/semantic_retriever.py"]:::unchanged
VIS["visual_retriever.py\nretrieval/visual_retriever.py"]:::unchanged
end
subgraph EMBEDDINGS["Embedders and LLM"]
EMB["TextEmbedder\nmemory/text_embedder.py"]:::unchanged
VLM["VLM2VecClient\nmemory/visual/encoder.py"]:::unchanged
LLM["GPULLMClient\nllm/client.py"]:::unchanged
end
subgraph PGDB["PostgreSQL + pgvector"]
ORM["worldmm_orm.py\nshared/orm/worldmm_orm.py"]:::unchanged
PG[("PostgreSQL + pgvector")]:::unchanged
end
subgraph GPU["GPU Worker EC2"]
GPU_SRV["FastAPI server\ngpu_worker/server.py"]:::unchanged
end
SIDE_MENU -->|"chat list + pagination"| USE_CHATS
SIDE_MENU -->|"selected chatId"| CHAT
USE_CHATS -->|"useChatsFeed paginated fetch"| FETCH_CHATS
USE_CHATS -->|"useChatMessages paginated fetch"| GET_MSGS
FETCH_CHATS -->|"POST /chats/list"| CHATS_LAMBDA
CHAT -->|"useChatWithMemory mutation"| USE_CHATS
CHAT -->|"paginated messages"| USE_CHATS
USE_CHATS -->|"useChatWithMemory"| CWM
USE_CHATS -->|"useChatWithMemory verbose"| CWMV
GET_MSGS -->|"GET /chats/messages?chat_id and cursor"| MSGS_LAMBDA
CWM -->|"POST /memories/chat"| CHAT_LAMBDA
CWMV -->|"POST /memories/chat/verbose"| CHAT_LAMBDA
CHAT_LAMBDA -->|"load graphs"| DB_LOADER
CHAT_LAMBDA -->|"question + context"| AGENT
CHAT_LAMBDA -->|"user and assistant messages"| CHAT_REPO
CHATS_LAMBDA -->|"user_id + cursor"| CHAT_REPO
MSGS_LAMBDA -->|"chat_id + cursor"| CHAT_REPO
CHAT_REPO -->|"DynamoDB queries"| DDB
DB_LOADER -->|"ORM queries"| ORM
ORM -->|"SQL"| PG
AGENT -->|"search query"| EP
AGENT -->|"search query"| SEM
AGENT -->|"search query"| VIS
AGENT -->|"question + memory"| LLM
EP -->|"text query"| EMB
SEM -->|"text query"| EMB
VIS -->|"video query"| VLM
EMB -->|"text"| VLM
LLM -->|"generate prompt"| VLM
VLM -->|"HTTP encode-text encode-video generate"| GPU_SRV
SEM -->|"pgvector search"| ORM
VIS -->|"pgvector search"| ORM
classDef unchanged fill:#d3d3d3,stroke:#666,stroke-width:1px
classDef updated fill:#ffe58a,stroke:#666,stroke-width:1px
classDef deleted fill:#f4a6a6,stroke:#666,stroke-width:1px
classDef created fill:#a8e6a3,stroke:#666,stroke-width:1px
classDef done fill:#b0b0b0,stroke:#666,stroke-width:1px
2. Black-Box Inputs and Outputs
Global Types
ChatMessage {
message_id: string (ULID — sort key, encodes timestamp)
chat_id: string
role: "user" | "assistant"
content: string
created_at: string (ISO 8601, for display)
}
ChatSummary {
id: string (chat_id)
title: string (first user message, truncated to 64 chars)
last_message: string
updated_at: string (ISO 8601)
}
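The two shapes above could be mirrored on the backend roughly as follows. This is a minimal sketch assuming plain dataclasses; `make_title` and `TITLE_MAX_CHARS` are hypothetical names that only illustrate the 64-character title rule, not code from the repository.

```python
from dataclasses import dataclass
from typing import Literal

TITLE_MAX_CHARS = 64  # from the ChatSummary.title rule above


@dataclass(frozen=True)
class ChatMessage:
    message_id: str  # ULID; doubles as the sort key
    chat_id: str
    role: Literal["user", "assistant"]
    content: str
    created_at: str  # ISO 8601, for display only


@dataclass(frozen=True)
class ChatSummary:
    id: str  # chat_id
    title: str
    last_message: str
    updated_at: str  # ISO 8601


def make_title(first_user_message: str) -> str:
    """Hypothetical helper: truncate the first user message to the title limit."""
    return first_user_message[:TITLE_MAX_CHARS]
```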
Flow: chatWithMemory (updated)
- Test files:
main/app/__tests__/chat-with-memory.test.ts
Request (POST /memories/chat)
{
question: string (required, non-empty)
chat_id?: string (optional UUID — omit to start a new chat)
}
Response
| path-name | input | output | path-type | updated |
|---|---|---|---|---|
| chat.new | valid question, no chat_id | { answer, chat_id }, messages saved to DynamoDB | happy path | |
| chat.existing | valid question + owned chat_id | { answer, chat_id }, messages saved to DynamoDB | happy path | |
| chat.unauthorized | valid question + unowned chat_id | 403 | error | |
| chat.gpu-starting | any valid question | { answer: "Chat is not available…", chat_id: null }, no DB write | subpath | |
| chat.missing-question | question="" | 400 | error | |
| chat.unauthenticated | no JWT | 401 | error | |
Side effect: on success, lambda saves two ChatMessage rows to DynamoDB — one role=user, one role=assistant — and upserts the Chats table entry for ownership.
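The request rules for this flow (and for the verbose variant, which shares the same body) can be sketched as a small parser. `BadRequest` and `parse_chat_request` are hypothetical names. Two behaviours here are assumptions the table does not state: whitespace-only questions are treated as empty, and a malformed chat_id is rejected with the same 400 as a missing question.

```python
import uuid
from typing import Optional, Tuple


class BadRequest(Exception):
    """Hypothetical error type; the handler would map it to HTTP 400."""


def parse_chat_request(body: dict) -> Tuple[str, Optional[str]]:
    """Return (question, chat_id); chat_id is None when a new chat should start."""
    question = body.get("question")
    if not isinstance(question, str) or not question.strip():
        raise BadRequest("question is required and must be non-empty")

    chat_id = body.get("chat_id")
    if chat_id is None:
        return question, None  # chat.new path

    if not isinstance(chat_id, str):
        raise BadRequest("chat_id must be a UUID string")
    try:
        # Round-trip through uuid.UUID to validate and normalise the value.
        chat_id = str(uuid.UUID(chat_id))
    except ValueError:
        raise BadRequest("chat_id must be a UUID string")
    return question, chat_id
```

Ownership of a well-formed chat_id is a separate check against the Chats table; this parser only validates shape.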
Flow: chatWithMemoryVerbose (updated)
- Test files:
main/app/__tests__/chat-with-memory-verbose.test.ts
Request (POST /memories/chat/verbose)
{
question: string (required, non-empty)
chat_id?: string (optional UUID — omit to start a new chat)
}
Response
| path-name | input | output | path-type | updated |
|---|---|---|---|---|
| chatVerbose.new | valid question, no chat_id | { answer, chat_id, trace }, messages saved to DynamoDB | happy path | |
| chatVerbose.existing | valid question + owned chat_id | { answer, chat_id, trace }, messages saved to DynamoDB | happy path | |
| chatVerbose.unauthorized | valid question + unowned chat_id | 403 | error | |
Flow: listChats (new backend — frontend already exists)
- Test files:
main/server/tests/unit/test_chats_list.py
Request (POST /chats/list)
{
cursor?: string | null (DynamoDB LastEvaluatedKey, base64 encoded)
limit?: number (default 20, max 50)
}
Response
| path-name | input | output | path-type | updated |
|---|---|---|---|---|
| listChats.success | valid JWT | paginated ChatSummary[] for authenticated user | happy path | |
| listChats.empty | valid JWT, no chats | { conversations: [], next_cursor: null } | subpath | |
| listChats.unauthenticated | no JWT | 401 | error | |
Security: user_id from JWT is the only filter — users can only ever see their own chats.
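The cursor and limit rules for this request can be sketched with the standard library alone. The helper names are hypothetical. The sketch assumes the LastEvaluatedKey is a flat dict of strings (JSON-serialisable) and that the key attributes are named `user_id` and `chat_id`, following the diagram.

```python
import base64
import json
from typing import Optional

DEFAULT_LIMIT = 20  # from the request spec above
MAX_LIMIT = 50


def clamp_limit(limit: Optional[int]) -> int:
    """Apply the default and keep the page size within 1..MAX_LIMIT."""
    if limit is None:
        return DEFAULT_LIMIT
    return max(1, min(int(limit), MAX_LIMIT))


def encode_cursor(last_evaluated_key: Optional[dict]) -> Optional[str]:
    """Turn a LastEvaluatedKey dict into an opaque, URL-safe cursor string."""
    if not last_evaluated_key:
        return None  # no more pages
    raw = json.dumps(last_evaluated_key, sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_cursor(cursor: Optional[str]) -> Optional[dict]:
    """Inverse of encode_cursor; None or empty means start from the first page."""
    if not cursor:
        return None
    return json.loads(base64.urlsafe_b64decode(cursor.encode("ascii")))
```

Because the cursor round-trips through the client, it may be worth confirming that the decoded key's `user_id` matches the JWT user before handing it to the query.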
Flow: getChatMessages (new)
- Test files:
main/server/tests/unit/test_chat_messages.py
Request (GET /chats/messages?chat_id=&cursor=&limit=)
{
chat_id: string (required)
cursor?: string | null (ULID of last fetched message — ExclusiveStartKey)
limit?: number (default 20, max 50)
}
Response
| path-name | input | output | path-type | updated |
|---|---|---|---|---|
| getMessages.success | valid JWT + owned chat_id | paginated ChatMessage[] newest-first | happy path | |
| getMessages.unauthorized | valid JWT + chat_id not owned by user | 403 | error | |
| getMessages.not-found | valid JWT + unknown chat_id | 403 (same as unauthorized; don't leak existence) | error | |
| getMessages.unauthenticated | no JWT | 401 | error | |
Security: Lambda checks Chats table for (user_id, chat_id) pair before reading Messages. Returns 403 (not 404) to avoid leaking whether a chat exists.
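The request describes the cursor as a bare ULID, whereas a DynamoDB ExclusiveStartKey is a full key dict, so some translation step is presumably needed. The sketch below shows one way to do that, along with an in-memory stand-in for newest-first paging. Function names are hypothetical, and the attribute names `chat_id` and `message_id` are assumptions based on the ChatMessage type.

```python
from typing import List, Optional, Tuple


def start_key_from_cursor(chat_id: str, cursor: Optional[str]) -> Optional[dict]:
    """Build the full start-key dict from the bare ULID the client sends."""
    if not cursor:
        return None
    return {"chat_id": chat_id, "message_id": cursor}


def page_newest_first(
    ulids: List[str], cursor: Optional[str], limit: int
) -> Tuple[List[str], Optional[str]]:
    """In-memory stand-in for a descending query with an exclusive start key."""
    ordered = sorted(ulids, reverse=True)  # ULIDs sort lexicographically by time
    if cursor is not None:
        ordered = [u for u in ordered if u < cursor]  # strictly older than cursor
    page = ordered[:limit]
    has_more = len(ordered) > limit
    return page, (page[-1] if has_more else None)
```

The stand-in returns no cursor on the final page. A real query may still return a key when a page ends exactly on the limit, in which case the client simply receives one empty page at the end.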
3. Technical Details
Security Invariants
- user_id always sourced from JWT auth context; never trusted from request body
- message_id (ULID) always generated server-side; never accepted from client
- Ownership check required on both read and write when chat_id is provided
- All chat/message endpoints return 403 (never 404) for unowned or unknown chat_id; do not leak existence
- Chats table acts as the ownership registry: (user_id, chat_id) must exist before Messages table is touched
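The first invariant can be illustrated with a helper that reads the caller's identity only from the authorizer context and ignores anything in the body. This is a sketch under assumptions: it presumes an API Gateway HTTP API with a JWT authorizer (claims under `requestContext.authorizer.jwt.claims`) and that the `sub` claim is the user id. The plan confirms neither, and `Unauthorized` and `user_id_from_event` are hypothetical names.

```python
class Unauthorized(Exception):
    """Hypothetical error type; the handler would map it to HTTP 401."""


def user_id_from_event(event: dict) -> str:
    """Read the caller's id from the authorizer context, never from the body."""
    claims = (
        event.get("requestContext", {})
        .get("authorizer", {})
        .get("jwt", {})
        .get("claims", {})
    )
    user_id = claims.get("sub")  # assumption: "sub" carries the user id
    if not user_id:
        raise Unauthorized()
    return user_id
```

A body field such as `"user_id": "someone-else"` has no effect because the helper never reads `event["body"]`.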
Pseudocode: chat/app.py — handle_chat()
handle_chat(user_id, question, chat_id=None):
    is_new_chat = chat_id is None
    if not is_new_chat:
        # ownership check: 403 if not found
        if not repo.chat_owned_by(user_id, chat_id):
            raise Forbidden()
    # run reasoning agent (unchanged)
    answer, trace = agent.answer(question)
    # persist: repo handles ULID generation + dual-table write
    if is_new_chat:
        chat_id = repo.create_chat(user_id)
    repo.save_message(chat_id, role="user", content=question)
    repo.save_message(chat_id, role="assistant", content=answer)
    return {"answer": answer, "chat_id": chat_id}
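The flow above can be exercised without DynamoDB or the GPU worker by substituting test doubles. In this sketch `InMemoryChatRepo` and the `answer_fn` parameter are hypothetical stand-ins for ChatRepository and the reasoning agent, and the trace is omitted for brevity.

```python
import uuid
from typing import Callable, Dict, List, Optional, Set, Tuple


class Forbidden(Exception):
    """Maps to HTTP 403."""


class InMemoryChatRepo:
    """Test double standing in for ChatRepository; nothing touches DynamoDB."""

    def __init__(self) -> None:
        self.chats: Set[Tuple[str, str]] = set()  # (user_id, chat_id) pairs
        self.messages: Dict[str, List[Tuple[str, str]]] = {}

    def create_chat(self, user_id: str) -> str:
        chat_id = str(uuid.uuid4())
        self.chats.add((user_id, chat_id))
        return chat_id

    def chat_owned_by(self, user_id: str, chat_id: str) -> bool:
        return (user_id, chat_id) in self.chats

    def save_message(self, chat_id: str, role: str, content: str) -> None:
        self.messages.setdefault(chat_id, []).append((role, content))


def handle_chat(
    repo: InMemoryChatRepo,
    answer_fn: Callable[[str], str],
    user_id: str,
    question: str,
    chat_id: Optional[str] = None,
) -> dict:
    # Ownership is checked before the (expensive) agent call.
    if chat_id is not None and not repo.chat_owned_by(user_id, chat_id):
        raise Forbidden()
    answer = answer_fn(question)
    if chat_id is None:
        chat_id = repo.create_chat(user_id)
    repo.save_message(chat_id, role="user", content=question)
    repo.save_message(chat_id, role="assistant", content=answer)
    return {"answer": answer, "chat_id": chat_id}
```

Checking ownership first means a rejected request never reaches the agent and never writes a message.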
Pseudocode: ChatRepository
create_chat(user_id) -> chat_id:
    chat_id = uuid4()
    chats_table.put_item({
        PK: user_id,
        SK: chat_id,
        last_activity: now_iso(),
    })
    return chat_id

chat_owned_by(user_id, chat_id) -> bool:
    item = chats_table.get_item(PK=user_id, SK=chat_id)
    return item is not None

save_message(chat_id, role, content):
    messages_table.put_item({
        PK: chat_id,
        SK: ulid(),  # lexicographically sortable, encodes timestamp
        role: role,
        content: content,
        created_at: now_iso(),
    })

get_messages(user_id, chat_id, cursor=None, limit=20) -> (messages, next_cursor):
    if not chat_owned_by(user_id, chat_id):
        raise Forbidden()
    result = messages_table.query(
        PK=chat_id,
        ScanIndexForward=False,  # newest first
        Limit=limit,
        ExclusiveStartKey=cursor,
    )
    return result.items, result.LastEvaluatedKey

list_chats(user_id, cursor=None, limit=20) -> (chats, next_cursor):
    result = chats_table.query(
        PK=user_id,
        ScanIndexForward=False,
        Limit=limit,
        ExclusiveStartKey=cursor,
    )
    return result.items, result.LastEvaluatedKey
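The `ulid()` call in save_message relies on ULIDs sorting by creation time. The stdlib-only sketch below shows why that holds: a 48-bit millisecond timestamp occupies the high bits, followed by 80 random bits, encoded as 26 Crockford base32 characters. `new_ulid` is a hypothetical name and a real deployment would more likely use an existing ULID library. IDs minted within the same millisecond are ordered randomly here, so strict ordering inside one millisecond would need a monotonic generator.

```python
import os
import time
from typing import Optional

CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"  # base32 without I, L, O, U


def new_ulid(timestamp_ms: Optional[int] = None) -> str:
    """Return a 26-character ULID: 48-bit ms timestamp + 80 random bits."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    randomness = int.from_bytes(os.urandom(10), "big")  # 80 bits
    value = (timestamp_ms << 80) | randomness

    # Emit 5 bits per character, least-significant first, then reverse.
    chars = []
    for _ in range(26):
        chars.append(CROCKFORD[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))
```

Because the timestamp fills the leading characters, a descending sort-key query returns messages newest first without a separate timestamp index.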
Implementation Checklist
Frontend
- [x] app/app/chat.tsx: add chatId state (null on new chat, set from first response or from route params on existing), pass to useChatsApi
- [x] lib/api/memory/chatWithMemory.ts: add optional chat_id to request body; handle chat_id in response for new chats
- [x] lib/api/memory/chatWithMemoryVerbose.ts: add optional chat_id to request body; handle chat_id in response for new chats
- [x] lib/api/chats/useChatsApi.ts: add useChatWithMemory mutation + useChatMessages infinite query
- [x] lib/api/chats/getChatMessages.ts: new fetch function for GET /chats/messages
Backend
- [x] api/memories/chat/app.py: accept chat_id, save user + assistant messages to DynamoDB via ChatRepository
- [x] api/memories/chat_verbose/app.py: accept chat_id, same DynamoDB save
- [x] api/chats/list/app.py: new lambda, list chats for authenticated user
- [x] api/chats/messages/app.py: new lambda, get messages with ownership check + pagination
- [x] chat/repository.py: new ChatRepository wrapping boto3 DynamoDB calls
Infrastructure
- [ ] DynamoDB Messages table (PK: chat_id, SK: ulid)
- [ ] DynamoDB Chats table (PK: user_id, SK: chat_id)
- [ ] Wire new lambdas into API Gateway + Terraform