Chat Memory Search

Plan Metadata

  • Plan type: plan
  • Parent plan: N/A
  • Depends on: N/A
  • Status: documentation

System Intent

  • What is being built: The end-to-end flow that answers a user's chat question by searching three types of memory (episodic, semantic, visual) using a multi-round reasoning agent and a GPU worker. Includes persistent chat threads stored in DynamoDB, a chat history list in the hamburger menu, and paginated message loading.
  • Primary consumer(s): Chat screen in the mobile app (chat.tsx), side menu (side-menu.tsx)
  • Boundary: User submits a natural-language question → system retrieves relevant memories across three retrieval backends → GPU LLM synthesizes an answer → answer and messages stored in DynamoDB → answer returned to the app. Users can browse past chats from the hamburger menu and load previous messages with pagination.

Stage Gate Tracker

  • [x] Stage 1 Mermaid approved
  • [x] Stage 2 I/O contracts approved
  • [x] Stage 3 pseudocode/technical details approved

1. Mermaid Diagram

flowchart TD
    subgraph APP["App"]
        CHAT["chat.tsx\napp/app/chat.tsx"]:::done
        SIDE_MENU["SideMenu\ncomponents/side-menu.tsx"]:::unchanged
        CWM["chatWithMemory.ts\nlib/api/memory/chatWithMemory.ts"]:::done
        CWMV["chatWithMemoryVerbose.ts\nlib/api/memory/chatWithMemoryVerbose.ts"]:::done
        FETCH_CHATS["fetchChats.ts\nlib/api/chats/fetchChats.ts"]:::unchanged
        USE_CHATS["useChatsApi.ts\nlib/api/chats/useChatsApi.ts"]:::done
        GET_MSGS["getChatMessages.ts\nlib/api/chats/getChatMessages.ts"]:::done
    end

    subgraph LAMBDA["Backend Lambda"]
        CHAT_LAMBDA["chat — app.py\napi/memories/chat/app.py"]:::done
        AGENT["ReasoningAgent\nretrieval/agent.py"]:::unchanged
        DB_LOADER["db_loader.py\nretrieval/db_loader.py"]:::unchanged
        CHATS_LAMBDA["chats_list — app.py\napi/chats/list/app.py"]:::done
        MSGS_LAMBDA["chat_messages — app.py\napi/chats/messages/app.py"]:::done
    end

    subgraph CHAT_STORE["Chat Store"]
        CHAT_REPO["ChatRepository\nchat/repository.py"]:::done
        DDB[("DynamoDB\nMessages: PK=chat_id SK=ulid\nChats: PK=user_id SK=chat_id")]:::unchanged
    end

    subgraph RETRIEVERS["Retrievers"]
        EP["episodic_retriever.py\nretrieval/episodic_retriever.py"]:::unchanged
        SEM["semantic_retriever.py\nretrieval/semantic_retriever.py"]:::unchanged
        VIS["visual_retriever.py\nretrieval/visual_retriever.py"]:::unchanged
    end

    subgraph EMBEDDINGS["Embedders and LLM"]
        EMB["TextEmbedder\nmemory/text_embedder.py"]:::unchanged
        VLM["VLM2VecClient\nmemory/visual/encoder.py"]:::unchanged
        LLM["GPULLMClient\nllm/client.py"]:::unchanged
    end

    subgraph PGDB["PostgreSQL + pgvector"]
        ORM["worldmm_orm.py\nshared/orm/worldmm_orm.py"]:::unchanged
        PG[("PostgreSQL + pgvector")]:::unchanged
    end

    subgraph GPU["GPU Worker EC2"]
        GPU_SRV["FastAPI server\ngpu_worker/server.py"]:::unchanged
    end

    SIDE_MENU -->|"chat list + pagination"| USE_CHATS
    SIDE_MENU -->|"selected chatId"| CHAT
    USE_CHATS -->|"useChatsFeed paginated fetch"| FETCH_CHATS
    USE_CHATS -->|"useChatMessages paginated fetch"| GET_MSGS
    FETCH_CHATS -->|"POST /chats/list"| CHATS_LAMBDA

    CHAT -->|"useChatWithMemory mutation"| USE_CHATS
    CHAT -->|"paginated messages"| USE_CHATS

    USE_CHATS -->|"useChatWithMemory"| CWM
    USE_CHATS -->|"useChatWithMemory verbose"| CWMV

    GET_MSGS -->|"GET /chats/messages?chat_id and cursor"| MSGS_LAMBDA

    CWM -->|"POST /memories/chat"| CHAT_LAMBDA
    CWMV -->|"POST /memories/chat/verbose"| CHAT_LAMBDA

    CHAT_LAMBDA -->|"load graphs"| DB_LOADER
    CHAT_LAMBDA -->|"question + context"| AGENT
    CHAT_LAMBDA -->|"user and assistant messages"| CHAT_REPO

    CHATS_LAMBDA -->|"user_id + cursor"| CHAT_REPO
    MSGS_LAMBDA -->|"chat_id + cursor"| CHAT_REPO
    CHAT_REPO -->|"DynamoDB queries"| DDB

    DB_LOADER -->|"ORM queries"| ORM
    ORM -->|"SQL"| PG

    AGENT -->|"search query"| EP
    AGENT -->|"search query"| SEM
    AGENT -->|"search query"| VIS
    AGENT -->|"question + memory"| LLM

    EP -->|"text query"| EMB
    SEM -->|"text query"| EMB
    VIS -->|"video query"| VLM
    EMB -->|"text"| VLM
    LLM -->|"generate prompt"| VLM
    VLM -->|"HTTP encode-text encode-video generate"| GPU_SRV

    SEM -->|"pgvector search"| ORM
    VIS -->|"pgvector search"| ORM

classDef unchanged fill:#d3d3d3,stroke:#666,stroke-width:1px
classDef updated fill:#ffe58a,stroke:#666,stroke-width:1px
classDef deleted fill:#f4a6a6,stroke:#666,stroke-width:1px
classDef created fill:#a8e6a3,stroke:#666,stroke-width:1px
classDef done fill:#b0b0b0,stroke:#666,stroke-width:1px

2. Black-Box Inputs and Outputs

Global Types

ChatMessage {
  message_id: string  (ULID — sort key, encodes timestamp)
  chat_id:    string
  role:       "user" | "assistant"
  content:    string
  created_at: string  (ISO 8601, for display)
}

ChatSummary {
  id:           string  (chat_id)
  title:        string  (first user message, truncated to 64 chars)
  last_message: string
  updated_at:   string (ISO 8601)
}
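
The two global types could be mirrored on the backend roughly as follows. This is a minimal sketch, not the project's actual models: the dataclass layout and the `make_title` helper are assumptions for illustration, and only the field names and the 64-character title limit come from the definitions above.

```python
from dataclasses import dataclass
from typing import Literal

TITLE_MAX_CHARS = 64  # from the ChatSummary definition above


@dataclass(frozen=True)
class ChatMessage:
    message_id: str  # ULID; doubles as the Messages table sort key
    chat_id: str
    role: Literal["user", "assistant"]
    content: str
    created_at: str  # ISO 8601, for display only


@dataclass(frozen=True)
class ChatSummary:
    id: str  # chat_id
    title: str  # first user message, truncated
    last_message: str
    updated_at: str  # ISO 8601


def make_title(first_user_message: str) -> str:
    """Hypothetical helper: truncate the first user message to the title limit."""
    return first_user_message[:TITLE_MAX_CHARS]
```

Ordering relies on `message_id`, so `created_at` can stay a display-only field.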

Flow: chatWithMemory (updated)

  • Test files: main/app/__tests__/chat-with-memory.test.ts

Request (POST /memories/chat)

{
  question: string   (required, non-empty)
  chat_id?: string   (optional UUID — omit to start a new chat)
}

Response

{ answer: string, chat_id: string | null }   (chat_id is null only on the chat.gpu-starting path)

| path-name | input | output | path-type | updated |
| --- | --- | --- | --- | --- |
| chat.new | valid question, no chat_id | { answer, chat_id }, messages saved to DynamoDB | happy path | |
| chat.existing | valid question + owned chat_id | { answer, chat_id }, messages saved to DynamoDB | happy path | |
| chat.unauthorized | valid question + unowned chat_id | 403 | error | |
| chat.gpu-starting | any valid question | { answer: "Chat is not available…", chat_id: null }, no DB write | subpath | |
| chat.missing-question | question="" | 400 | error | |
| chat.unauthenticated | no JWT | 401 | error | |

Side effect: on success, lambda saves two ChatMessage rows to DynamoDB — one role=user, one role=assistant — and upserts the Chats table entry for ownership.
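
The request contract above (non-empty question, optional UUID chat_id) could be validated before any ownership check or agent call. The sketch below is illustrative only: `BadRequest` and `parse_chat_request` are hypothetical names, and treating a whitespace-only question as empty is an assumption the contract does not state.

```python
import uuid
from typing import Optional, Tuple


class BadRequest(Exception):
    """Hypothetical error type that the Lambda wrapper would map to HTTP 400."""


def parse_chat_request(body: dict) -> Tuple[str, Optional[str]]:
    """Return (question, chat_id); chat_id is None when a new chat should be started."""
    question = body.get("question")
    # Assumption: whitespace-only input counts as an empty question.
    if not isinstance(question, str) or not question.strip():
        raise BadRequest("question is required and must be non-empty")

    chat_id = body.get("chat_id")
    if chat_id is None:
        return question, None  # chat.new path

    try:
        uuid.UUID(str(chat_id))  # raises ValueError on malformed input
    except ValueError:
        raise BadRequest("chat_id must be a UUID")
    return question, str(chat_id)
```

A well-formed chat_id that belongs to another user passes this step; the 403 for that case comes from the ownership check, not from validation.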


Flow: chatWithMemoryVerbose (updated)

  • Test files: main/app/__tests__/chat-with-memory-verbose.test.ts

Request (POST /memories/chat/verbose)

{
  question: string   (required, non-empty)
  chat_id?: string   (optional UUID — omit to start a new chat)
}

Response

{ answer: string, chat_id: string, trace: Trace }

| path-name | input | output | path-type | updated |
| --- | --- | --- | --- | --- |
| chatVerbose.new | valid question, no chat_id | { answer, chat_id, trace }, messages saved to DynamoDB | happy path | |
| chatVerbose.existing | valid question + owned chat_id | { answer, chat_id, trace }, messages saved to DynamoDB | happy path | |
| chatVerbose.unauthorized | valid question + unowned chat_id | 403 | error | |

Flow: listChats (new backend — frontend already exists)

  • Test files: main/server/tests/unit/test_chats_list.py

Request (POST /chats/list)

{
  cursor?: string | null   (DynamoDB LastEvaluatedKey, base64 encoded)
  limit?:  number          (default 20, max 50)
}

Response

{
  conversations: ChatSummary[]
  next_cursor:   string | null
}

| path-name | input | output | path-type | updated |
| --- | --- | --- | --- | --- |
| listChats.success | valid JWT | paginated ChatSummary[] for authenticated user | happy path | |
| listChats.empty | valid JWT, no chats | { conversations: [], next_cursor: null } | subpath | |
| listChats.unauthenticated | no JWT | 401 | error | |

Security: user_id from JWT is the only filter — users can only ever see their own chats.
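
The cursor and limit handling for this flow might look like the sketch below. The contract only says the cursor is a base64-encoded LastEvaluatedKey; serializing the key as JSON, using the URL-safe base64 alphabet, and the attribute names `user_id` / `chat_id` are assumptions, as are the helper names.

```python
import base64
import json
from typing import Optional

DEFAULT_LIMIT = 20  # from the request contract above
MAX_LIMIT = 50


def clamp_limit(limit: Optional[int]) -> int:
    """Apply the default and keep the page size within 1..MAX_LIMIT."""
    if limit is None:
        return DEFAULT_LIMIT
    return max(1, min(int(limit), MAX_LIMIT))


def encode_cursor(last_evaluated_key: Optional[dict]) -> Optional[str]:
    """Turn a LastEvaluatedKey dict into an opaque string, or None on the last page."""
    if not last_evaluated_key:
        return None
    raw = json.dumps(last_evaluated_key, sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_cursor(cursor: Optional[str]) -> Optional[dict]:
    """Inverse of encode_cursor; None means start from the first page."""
    if not cursor:
        return None
    return json.loads(base64.urlsafe_b64decode(cursor.encode("ascii")))
```

Because the cursor round-trips through the client, a handler would likely want to confirm that the decoded partition key equals the JWT user_id before using it, in keeping with the security note above.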


Flow: getChatMessages (new)

  • Test files: main/server/tests/unit/test_chat_messages.py

Request (GET /chats/messages?chat_id=&cursor=&limit=)

{
  chat_id: string  (required)
  cursor?: string | null  (ULID of last fetched message — ExclusiveStartKey)
  limit?:  number  (default 20, max 50)
}

Response

{
  messages:    ChatMessage[]  (newest-first)
  next_cursor: string | null
}

| path-name | input | output | path-type | updated |
| --- | --- | --- | --- | --- |
| getMessages.success | valid JWT + owned chat_id | paginated ChatMessage[] newest-first | happy path | |
| getMessages.unauthorized | valid JWT + chat_id not owned by user | 403 | error | |
| getMessages.not-found | valid JWT + unknown chat_id | 403 (same as unauthorized; don't leak existence) | error | |
| getMessages.unauthenticated | no JWT | 401 | error | |

Security: Lambda checks Chats table for (user_id, chat_id) pair before reading Messages. Returns 403 (not 404) to avoid leaking whether a chat exists.
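
In this flow the cursor is a bare ULID, while DynamoDB's ExclusiveStartKey expects the full primary key of the last item, so the handler has to rebuild it from chat_id plus the cursor. A minimal sketch, assuming the Messages table attributes are named `chat_id` and `message_id` (the plan only gives PK=chat_id, SK=ulid) and using hypothetical helper names:

```python
import re
from typing import Optional

# Crockford base32, 26 characters; the first is at most 7 because a ULID holds 128 bits.
ULID_PATTERN = re.compile(r"^[0-7][0-9A-HJKMNP-TV-Z]{25}$")


def start_key_from_cursor(chat_id: str, cursor: Optional[str]) -> Optional[dict]:
    """Rebuild the composite key DynamoDB needs from the ULID cursor."""
    if not cursor:
        return None  # first page
    if not ULID_PATTERN.match(cursor):
        raise ValueError("cursor must be a ULID")
    return {"chat_id": chat_id, "message_id": cursor}


def query_kwargs(chat_id: str, cursor: Optional[str], limit: int = 20) -> dict:
    """Keyword arguments for a Query call, minus the key condition."""
    kwargs = {"ScanIndexForward": False, "Limit": min(limit, 50)}  # newest first
    start_key = start_key_from_cursor(chat_id, cursor)
    if start_key is not None:
        kwargs["ExclusiveStartKey"] = start_key  # left out entirely on the first page
    return kwargs


def next_cursor_from(last_evaluated_key: Optional[dict]) -> Optional[str]:
    """Expose only the ULID part of LastEvaluatedKey to the client."""
    return last_evaluated_key["message_id"] if last_evaluated_key else None
```

Taking chat_id from the already-authorized request rather than from the cursor means a tampered cursor cannot point the query at another chat's partition.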

3. Technical Details

Security Invariants

  • user_id always sourced from JWT auth context — never trusted from request body
  • message_id (ULID) always generated server-side — never accepted from client
  • Ownership check required on both read and write when chat_id is provided
  • All chat/message endpoints return 403 (never 404) for unowned or unknown chat_id — do not leak existence
  • Chats table acts as the ownership registry — (user_id, chat_id) must exist before Messages table is touched
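
The "403, never 404" invariant amounts to a single lookup with a single failure response. The sketch below illustrates it with an in-memory set standing in for the Chats table. The response body and the helper name are assumptions; the `statusCode` / `body` shape follows the API Gateway Lambda proxy format.

```python
from typing import Optional, Set, Tuple

# One shared response for "unknown chat" and "someone else's chat".
FORBIDDEN_RESPONSE = {"statusCode": 403, "body": '{"error": "forbidden"}'}


def check_chat_access(
    owned_pairs: Set[Tuple[str, str]], user_id: str, chat_id: str
) -> Optional[dict]:
    """Return None when (user_id, chat_id) is registered, else the shared 403."""
    if (user_id, chat_id) in owned_pairs:
        return None
    # No branch distinguishes "does not exist" from "not yours".
    return FORBIDDEN_RESPONSE
```

Because the lookup is keyed on the (user_id, chat_id) pair, the two failure cases are indistinguishable to the code as well as to the caller.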

Pseudocode: chat/app.py — handle_chat()

handle_chat(user_id, question, chat_id=None):
    is_new_chat = chat_id is None

    if not is_new_chat:
        # ownership check — 403 if not found
        if not repo.chat_owned_by(user_id, chat_id):
            raise Forbidden()

    # run reasoning agent (unchanged)
    answer, trace = agent.answer(question)

    # persist — repo handles ULID generation + dual-table write
    if is_new_chat:
        chat_id = repo.create_chat(user_id)
    repo.save_message(chat_id, role="user",      content=question)
    repo.save_message(chat_id, role="assistant", content=answer)

    return {"answer": answer, "chat_id": chat_id}
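
The pseudocode above can be exercised without AWS by swapping in test doubles. The sketch below is a runnable approximation: `InMemoryChatRepo` and `EchoAgent` are hypothetical stand-ins for ChatRepository and ReasoningAgent, and passing them as arguments is a choice made here for testability, not something the plan specifies.

```python
import uuid
from typing import Dict, List, Optional, Set, Tuple


class Forbidden(Exception):
    """Mapped to HTTP 403 by the (hypothetical) Lambda wrapper."""


class InMemoryChatRepo:
    """Stand-in for ChatRepository; a set and a dict replace the two tables."""

    def __init__(self) -> None:
        self.chats: Set[Tuple[str, str]] = set()   # (user_id, chat_id)
        self.messages: Dict[str, List[dict]] = {}  # chat_id -> rows in write order

    def create_chat(self, user_id: str) -> str:
        chat_id = str(uuid.uuid4())
        self.chats.add((user_id, chat_id))
        return chat_id

    def chat_owned_by(self, user_id: str, chat_id: str) -> bool:
        return (user_id, chat_id) in self.chats

    def save_message(self, chat_id: str, role: str, content: str) -> None:
        self.messages.setdefault(chat_id, []).append({"role": role, "content": content})


class EchoAgent:
    """Placeholder for ReasoningAgent: canned answer, empty trace."""

    def answer(self, question: str) -> Tuple[str, list]:
        return f"(stub answer to: {question})", []


def handle_chat(repo, agent, user_id: str, question: str,
                chat_id: Optional[str] = None) -> dict:
    is_new_chat = chat_id is None
    # Ownership is checked before the agent runs, so a rejected request does no retrieval work.
    if not is_new_chat and not repo.chat_owned_by(user_id, chat_id):
        raise Forbidden()

    answer, _trace = agent.answer(question)

    # Persist only after the agent has produced an answer; user message first.
    if is_new_chat:
        chat_id = repo.create_chat(user_id)
    repo.save_message(chat_id, role="user", content=question)
    repo.save_message(chat_id, role="assistant", content=answer)
    return {"answer": answer, "chat_id": chat_id}
```

Creating the chat only after the agent returns means a failed agent call leaves no empty chat behind in the history list.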

Pseudocode: ChatRepository

create_chat(user_id) -> chat_id:
    chat_id = uuid4()
    chats_table.put_item({
        PK: user_id,
        SK: chat_id,
        last_activity: now_iso(),
    })
    return chat_id

chat_owned_by(user_id, chat_id) -> bool:
    item = chats_table.get_item(PK=user_id, SK=chat_id)
    return item is not None

save_message(chat_id, role, content):
    messages_table.put_item({
        PK: chat_id,
        SK: ulid(),          # encodes timestamp, sorts by creation time; same-millisecond writes need a monotonic generator
        role: role,
        content: content,
        created_at: now_iso(),
    })

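get_messages(user_id, chat_id, cursor=None, limit=20) -> (messages, next_cursor):
    if not chat_owned_by(user_id, chat_id):
        raise Forbidden()
    # cursor is the ULID of the last fetched message; DynamoDB needs the full key
    start_key = {PK: chat_id, SK: cursor} if cursor else None
    result = messages_table.query(
        PK=chat_id,
        ScanIndexForward=False,   # newest first
        Limit=limit,
        ExclusiveStartKey=start_key,   # omitted on the first page
    )
    next_cursor = result.LastEvaluatedKey.SK if result.LastEvaluatedKey else None
    return result.items, next_cursor

list_chats(user_id, cursor=None, limit=20) -> (chats, next_cursor):
    # cursor is the base64-encoded LastEvaluatedKey from the previous page
    start_key = base64_decode(cursor) if cursor else None
    result = chats_table.query(
        PK=user_id,
        ScanIndexForward=False,
        Limit=limit,
        ExclusiveStartKey=start_key,   # omitted on the first page
    )
    next_cursor = base64_encode(result.LastEvaluatedKey) if result.LastEvaluatedKey else None
    return result.items, next_cursor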
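
The repository relies on `ulid()` producing sort keys that order by creation time. The sketch below shows how such an identifier is laid out, following the public ULID format (48-bit millisecond timestamp plus 80 random bits, Crockford base32). It is for illustration only; a real implementation would more likely use an existing ULID library, and `new_ulid` / `encode_base32` are hypothetical names.

```python
import os
import time
from typing import Optional

CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"  # base32 alphabet without I, L, O, U


def encode_base32(value: int, length: int) -> str:
    """Encode value as exactly `length` Crockford base32 characters, most significant first."""
    chars = []
    for _ in range(length):
        chars.append(CROCKFORD[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))


def new_ulid(timestamp_ms: Optional[int] = None) -> str:
    """Return a 26-character ULID: 10 timestamp characters followed by 16 random ones."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    randomness = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    return encode_base32(timestamp_ms, 10) + encode_base32(randomness, 16)
```

Two ULIDs generated in the same millisecond by this sketch sort in random order. Since the user and assistant messages are saved back to back, they could share a millisecond, so a monotonic generator (or an explicit later timestamp for the assistant row) may be needed to keep the pair in order.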

Implementation Checklist

Frontend

  • [x] app/app/chat.tsx: add chatId state (null on new chat, set from first response or from route params on existing), pass to useChatsApi
  • [x] lib/api/memory/chatWithMemory.ts: add optional chat_id to request body; handle chat_id in response for new chats
  • [x] lib/api/memory/chatWithMemoryVerbose.ts: add optional chat_id to request body; handle chat_id in response for new chats
  • [x] lib/api/chats/useChatsApi.ts: add useChatWithMemory mutation + useChatMessages infinite query
  • [x] lib/api/chats/getChatMessages.ts: new fetch function for GET /chats/messages

Backend

  • [x] api/memories/chat/app.py: accept chat_id, save user + assistant messages to DynamoDB via ChatRepository
  • [x] api/memories/chat_verbose/app.py: accept chat_id, same DynamoDB save
  • [x] api/chats/list/app.py: new lambda, list chats for authenticated user
  • [x] api/chats/messages/app.py: new lambda, get messages with ownership check + pagination
  • [x] chat/repository.py: new ChatRepository wrapping boto3 DynamoDB calls

Infrastructure

  • [ ] DynamoDB Messages table: PK chat_id, SK ulid
  • [ ] DynamoDB Chats table: PK user_id, SK chat_id
  • [ ] Wire new lambdas into API Gateway + Terraform