Memory Fetch and Display
Plan Metadata
- Plan type: plan
- Parent plan: N/A
- Depends on: N/A
- Status: documentation
System Intent
- What is being built: The end-to-end flow that stores processed memory segments in PostgreSQL, serves them through a paginated feed API with presigned S3 image URLs, and renders them as a 2-column photo grid with a full-screen viewer in the mobile app.
- Primary consumer(s): Home screen (app/app/index.tsx), MemoryFeed, MemoryCard, MemoryViewerModal components.
- Boundary: WorldMM pipeline writes memory segments (captions, frame S3 keys, knowledge graph triples, visual embeddings) to PostgreSQL → feed API paginates segments and generates presigned S3 URLs → mobile app fetches via infinite query → renders grid → user taps to open full-screen viewer.
Stage Gate Tracker
- [x] Stage 1 Mermaid approved
- [x] Stage 2 I/O contracts approved
- [x] Stage 3 pseudocode/technical details approved
Revision: renamed url → thumbnail (now optional), added type: "text" | "audio" | "visual" field to MemoryFeedItem.
1. Mermaid Diagram
flowchart TD
subgraph PIPELINE["WorldMM Ingest Pipeline"]
GPU["GPU Worker EC2\ngpu_worker/server.py"]:::unchanged
INGEST["ingest_window.py\nworldmm/pipeline/ingest_window.py"]:::unchanged
end
subgraph DB["PostgreSQL + pgvector"]
SEG[("worldmm_segments\nid, user_id, start_time, end_time,\ncaption, s3_frames_key, transcript,\nsource_session_id, source_window_index")]:::unchanged
ENT[("worldmm_entities\nid, user_id, surface_form,\ncanonical_name, embedding_json")]:::unchanged
TRP[("worldmm_triples\nid, segment_id, user_id,\nmemory_type, subject, predicate,\nobject, invalidated_at")]:::unchanged
EMB[("worldmm_visual_embeddings\nid, segment_id, user_id,\ntimestamp, embedding_json")]:::unchanged
ORM["worldmm_orm.py\nshared/orm/worldmm_orm.py"]:::unchanged
end
subgraph S3["S3"]
FRAMES[("Frame images\nsessions/session_id/window_NNN/frame_000.jpg")]:::unchanged
end
subgraph FEED_API["Feed Lambda"]
FEED["feed app.py\napi/memories/feed/app.py"]:::unchanged
end
subgraph APP["Mobile App"]
HOOK["useMemoriesFeed\nlib/api/memory/useMemoryApi.ts"]:::unchanged
LIST["listMemories\nlib/api/memory/listMemories.ts"]:::unchanged
SCREEN["index.tsx\napp/app/index.tsx"]:::unchanged
MFEED["MemoryFeed\ncomponents/memory/MemoryFeed.tsx"]:::unchanged
MCARD["MemoryCard\ncomponents/memory/MemoryCard.tsx"]:::unchanged
MROW["MemoryRow\ncomponents/memory/MemoryRow.tsx"]:::unchanged
MODAL["MemoryViewerModal\ncomponents/memory/memory-viewer-modal.tsx"]:::unchanged
end
GPU -->|"caption, NER entities, triples, 1536-dim visual embedding"| INGEST
INGEST -->|"create_segment, create_entity, create_triple"| ORM
INGEST -->|"store_visual_embedding"| ORM
ORM -->|"SQL INSERT"| SEG
ORM -->|"SQL INSERT"| ENT
ORM -->|"SQL INSERT"| TRP
ORM -->|"SQL INSERT"| EMB
INGEST -->|"frame_000.jpg at s3_frames_key"| FRAMES
SCREEN -->|"infinite query"| HOOK
HOOK -->|"cursor + limit"| LIST
LIST -->|"POST /memories/feed"| FEED
FEED -->|"SELECT worldmm_segments ORDER BY start_time DESC LIMIT n+1"| ORM
ORM -->|"segment rows"| FEED
FEED -->|"presign s3_frames_key"| S3
S3 -->|"presigned URL string"| FEED
FEED -->|"MemoryFeedItem[] + next_cursor"| LIST
LIST -->|"ListMemoriesResponse"| HOOK
HOOK -->|"flattened pages of MemoryFeedItem[]"| SCREEN
SCREEN -->|"memories + pagination callbacks"| MFEED
MFEED -->|"type=visual MemoryFeedItem"| MCARD
MFEED -->|"type=audio or text MemoryFeedItem"| MROW
MCARD -->|"onPress index"| SCREEN
MROW -->|"onPress index"| SCREEN
SCREEN -->|"memories + initialIndex"| MODAL
classDef unchanged fill:#d3d3d3,stroke:#666,stroke-width:1px
classDef updated fill:#ffe58a,stroke:#666,stroke-width:1px
classDef deleted fill:#f4a6a6,stroke:#666,stroke-width:1px
classDef created fill:#a8e6a3,stroke:#666,stroke-width:1px
2. Black-Box Inputs and Outputs
Global Types
MemoryFeedItem {
id: string (UUID — worldmm_segments PK)
time: string (ISO 8601 — segment start_time)
type: "text" | "audio" | "visual" (NEW — derived from segment content; see type derivation rules below)
thumbnail?: string (CHANGED from url — presigned S3 URL, 1hr TTL; omitted if no frame stored)
featured: boolean (always false currently)
}
ListMemoriesResponse {
memories: MemoryFeedItem[]
next_cursor: string | null (string offset; null = no more pages)
}
Type derivation rules (server-side, no new DB columns needed):
| condition | type value |
|---|---|
| s3_frames_key is non-null | "visual" |
| s3_frames_key is null and transcript is non-null | "audio" |
| s3_frames_key is null and transcript is null | "text" |
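The derivation rules above can be sketched as a small pure function. The function name and type alias are illustrative and not taken from the feed handler; only the three rules come from the table.

```python
from typing import Literal, Optional

MemoryType = Literal["text", "audio", "visual"]


def derive_memory_type(
    s3_frames_key: Optional[str], transcript: Optional[str]
) -> MemoryType:
    """Map a segment's stored columns to the feed `type` value."""
    # A stored frame wins regardless of whether a transcript exists.
    if s3_frames_key is not None:
        return "visual"
    # No frame, but speech was transcribed.
    if transcript is not None:
        return "audio"
    # Neither frame nor transcript.
    return "text"
```

Because the frame check comes first, a segment with both a frame and a transcript is reported as "visual", which matches the first row of the table.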
Pipeline Output: what the ingest pipeline writes
The pipeline is a black box to the feed system. Its only observable output is rows in these four tables:
worldmm_segments — one row per 30-second window
| column | type | notes |
|---|---|---|
| id | string UUID | PK |
| user_id | string | owner |
| start_time | string ISO 8601 | window start |
| end_time | string ISO 8601 | window end |
| duration_seconds | int | always 30 |
| caption | text \| null | GPU-generated description of the window |
| s3_frames_key | string \| null | S3 path to representative frame (sessions/{session_id}/window_{index:03d}/frame_000.jpg) |
| transcript | text \| null | speech-to-text from window audio |
| source_session_id | string \| null | originating session ID |
| source_window_index | int \| null | 0-based index within session |
Unique constraint: (user_id, source_session_id, source_window_index) — prevents duplicate ingestion.
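As an illustration of the s3_frames_key path format and the uniqueness rule, here is a minimal sketch. Both helper names are hypothetical, and the in-memory set only stands in for the database constraint. The sketch treats rows with a null session ID or window index as never conflicting, mirroring PostgreSQL's default behaviour of treating NULLs as distinct in unique constraints.

```python
from typing import Optional, Set, Tuple


def frame_key(session_id: str, window_index: int) -> str:
    """Build the S3 key of a window's representative frame."""
    # The window index is zero-padded to three digits, as in the path template.
    return f"sessions/{session_id}/window_{window_index:03d}/frame_000.jpg"


def is_new_window(
    seen: Set[Tuple[str, str, int]],
    user_id: str,
    source_session_id: Optional[str],
    source_window_index: Optional[int],
) -> bool:
    """Return True if this window has not been ingested before (in-memory stand-in)."""
    # Rows with a NULL component never collide under a default unique constraint.
    if source_session_id is None or source_window_index is None:
        return True
    key = (user_id, source_session_id, source_window_index)
    if key in seen:
        return False
    seen.add(key)
    return True
```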
worldmm_entities — knowledge graph nodes extracted from captions
| column | type | notes |
|---|---|---|
| id | string UUID | PK |
| user_id | string | owner |
| surface_form | string | raw text from caption (e.g. "John") |
| canonical_name | string \| null | normalized form (e.g. "John Smith") |
| embedding_json | text \| null | 1536-dim vector as JSON |
worldmm_triples — episodic and semantic facts
| column | type | notes |
|---|---|---|
| id | string UUID | PK |
| segment_id | string UUID FK | links to worldmm_segments |
| user_id | string | owner |
| memory_type | string | "episodic" or "semantic" |
| subject_entity_id | string UUID FK | subject entity |
| predicate | string | relation (e.g. "is_doing", "is_at") |
| object_entity_id | string UUID FK \| null | object as entity (mutually exclusive with literal) |
| object_literal | string \| null | object as literal value (mutually exclusive with entity) |
| invalidated_at | string ISO 8601 \| null | soft-delete; NULL = active |
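A consumer reading worldmm_triples has to resolve which of the two object columns is populated and skip soft-deleted rows. The sketch below shows one way to do that; the helper names are hypothetical. The table does not say whether both object columns may be null at once, so the sketch lets that case through as "none" rather than rejecting it.

```python
from typing import Optional, Tuple


def triple_object(
    object_entity_id: Optional[str], object_literal: Optional[str]
) -> Tuple[str, Optional[str]]:
    """Return (kind, value) for a triple's object, enforcing mutual exclusivity."""
    if object_entity_id is not None and object_literal is not None:
        raise ValueError("object_entity_id and object_literal are mutually exclusive")
    if object_entity_id is not None:
        return ("entity", object_entity_id)
    if object_literal is not None:
        return ("literal", object_literal)
    # Assumption: a triple with no object is allowed and reported as "none".
    return ("none", None)


def is_active(invalidated_at: Optional[str]) -> bool:
    """A triple is active while invalidated_at is NULL (soft-delete not applied)."""
    return invalidated_at is None
```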
worldmm_visual_embeddings — per-window semantic search index
| column | type | notes |
|---|---|---|
| id | string UUID | PK |
| segment_id | string UUID FK | links to worldmm_segments |
| user_id | string | owner |
| timestamp | string ISO 8601 | frame capture time |
| embedding_json | text \| null | 1536-dim vector as JSON (pgvector on PostgreSQL) |
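For readers consuming embedding_json directly, the following sketch decodes and validates a stored vector. It assumes the column holds a flat JSON array of numbers; the exact serialization is not specified here. The function name is hypothetical.

```python
import json
from typing import List, Optional

EMBEDDING_DIM = 1536  # dimension stated in the table notes


def decode_embedding(embedding_json: Optional[str]) -> Optional[List[float]]:
    """Parse embedding_json into a list of floats, or None if the column is NULL."""
    if embedding_json is None:
        return None
    vector = json.loads(embedding_json)
    # Assumption: the JSON payload is a flat array with exactly 1536 entries.
    if not isinstance(vector, list) or len(vector) != EMBEDDING_DIM:
        raise ValueError(f"expected a {EMBEDDING_DIM}-element JSON array")
    return [float(x) for x in vector]
```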
Flow: listMemories — paginated memory feed
- Test files:
main/server/tests/integration/test_memories_feed_pagination.py
Request (POST /memories/feed)
{
cursor?: string | null (string integer offset; null or omit for first page)
limit?: number (default 20, max 50)
}
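The request contract can be sketched as a small parser that turns the body into an integer offset and a bounded limit. The function name is hypothetical. Clamping an oversized limit to 50 (rather than rejecting the request) is an assumption, since the contract only states the default and the maximum.

```python
from typing import Any, Dict, Tuple

DEFAULT_LIMIT = 20
MAX_LIMIT = 50


def parse_feed_request(body: Dict[str, Any]) -> Tuple[int, int]:
    """Return (offset, limit) from a POST /memories/feed request body."""
    cursor = body.get("cursor")
    # A null or omitted cursor means the first page.
    offset = 0 if cursor is None else int(cursor)
    if offset < 0:
        raise ValueError("cursor must be a non-negative integer string")
    limit = body.get("limit")
    limit = DEFAULT_LIMIT if limit is None else int(limit)
    # Assumption: out-of-range limits are clamped into [1, MAX_LIMIT].
    limit = max(1, min(limit, MAX_LIMIT))
    return offset, limit
```

A non-numeric cursor raises ValueError from int(); a real handler would presumably map that to a 400 response, which is not covered by the path table.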
Response
| path-name | input | output | path-type | updated |
|---|---|---|---|---|
| feed.first-page | valid JWT, no cursor | first limit items newest-first, next_cursor set if more exist | happy path | |
| feed.paginated | valid JWT + cursor | next page of items, next_cursor null on last page | happy path | |
| feed.empty | valid JWT, no segments | { memories: [], next_cursor: null } | subpath | |
| feed.no-thumbnail | segment has no s3_frames_key | item returned without thumbnail field, type = "audio" or "text" | subpath | |
| feed.type-visual | segment has s3_frames_key non-null | type: "visual", thumbnail present | subpath | |
| feed.type-audio | no s3_frames_key, transcript non-null | type: "audio", no thumbnail | subpath | |
| feed.type-text | no s3_frames_key, no transcript | type: "text", no thumbnail | subpath | |
| feed.unauthenticated | no JWT | 401 | error | |
| feed.user-isolation | valid JWT user A | never returns segments owned by user B | security | |
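- Ordering: DESC(start_time), DESC(id). Newest segments come first; ties are broken by segment UUID.
- Pagination: the server fetches limit + 1 rows starting at the cursor offset (0 when cursor is null or omitted). If more than limit rows come back, it returns the first limit rows and sets next_cursor = str(offset + limit); otherwise next_cursor is null.
- Presigned URLs: generated per item via boto3 at response time, TTL = 3600s. A null s3_frames_key means the thumbnail field is omitted entirely.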
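The ordering, limit + 1 pagination, and thumbnail-omission rules can be sketched together over an in-memory list of rows. This is not the feed handler itself: the function name is hypothetical, the sort stands in for the SQL ORDER BY, and presign is an injected callable. In the real service that callable would presumably wrap boto3's generate_presigned_url("get_object", ..., ExpiresIn=3600). The sketch also assumes start_time strings share one ISO 8601 format, so that string order matches time order.

```python
from typing import Any, Callable, Dict, List


def build_feed_page(
    rows: List[Dict[str, Any]],
    offset: int,
    limit: int,
    presign: Callable[[str], str],
) -> Dict[str, Any]:
    """Return a ListMemoriesResponse-shaped dict for one page."""
    # Stand-in for ORDER BY start_time DESC, id DESC.
    ordered = sorted(rows, key=lambda r: (r["start_time"], r["id"]), reverse=True)
    # Fetch one extra row to learn whether another page exists.
    window = ordered[offset : offset + limit + 1]
    has_more = len(window) > limit

    memories = []
    for row in window[:limit]:
        key = row.get("s3_frames_key")
        if key is not None:
            kind = "visual"
        elif row.get("transcript") is not None:
            kind = "audio"
        else:
            kind = "text"
        item = {"id": row["id"], "time": row["start_time"], "type": kind, "featured": False}
        # The thumbnail field is omitted entirely when no frame is stored.
        if key is not None:
            item["thumbnail"] = presign(key)
        memories.append(item)

    next_cursor = str(offset + limit) if has_more else None
    return {"memories": memories, "next_cursor": next_cursor}
```

Offset cursors are simple, but rows ingested between two requests shift later pages, so a client may see a repeated item; the client-side sort over all pages does not remove such duplicates by itself.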
3. Technical Details
Display: how MemoryFeedItem[] becomes UI
index.tsx
└─ useMemoriesFeed({ limit: 20 }) ← React Query infinite query
└─ listMemories({ cursor, limit }) ← POST /memories/feed
└─ sortedMemories ← client-side sort newest-first over all pages
└─ <MemoryFeed memories={sortedMemories}> ← mixed layout FlatList
└─ type = "visual" → <MemoryCard> ← tile in 2-column grid, renders memory.thumbnail
└─ type = "audio" → <MemoryRow> ← full-width row, shows transcript/audio indicator
└─ type = "text" → <MemoryRow> ← full-width row, shows caption text
onPress(index) → setViewerIndex(index)
└─ <MemoryViewerModal
memories={sortedMemories}
initialIndex={viewerIndex}> ← horizontal paged FlatList, full-screen
image source = memory.thumbnail
fallback = placeholder if thumbnail absent (type = "text" or "audio")
MemoryFeed layout rules:
- FlatList with mixed layout; render mode is determined by the type field
- type: "visual" → <MemoryCard> tile in 2-column grid (same as before)
- type: "audio" → <MemoryRow> full-width row showing transcript or audio indicator
- type: "text" → <MemoryRow> full-width row showing caption text
- featured: true visual items span full width (currently unused)
- Infinite scroll: onEndReachedThreshold={0.4} triggers fetchNextPage()
- Pull-to-refresh: calls refetch()
MemoryViewerModal:
- Horizontal FlatList with pagingEnabled
- Full-screen image from memory.thumbnail (absent for text/audio type; show a placeholder instead)
- Close button with safe-area inset
Security Invariants
- user_id always sourced from JWT; never trusted from request body
- Feed query always filters by user_id; no cross-user data leakage
- Presigned URLs are time-scoped (1hr) and generated server-side per request
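The first two invariants can be illustrated with a minimal handler sketch. It assumes the verified JWT claims arrive as a dict and that the user ID is carried in the sub claim; both are assumptions, as is every name below. The fetch_segments callable is a hypothetical stand-in for the user-scoped ORM query.

```python
import json
from typing import Any, Callable, Dict, List, Optional


def resolve_user_id(claims: Optional[Dict[str, Any]]) -> Optional[str]:
    """Extract the caller's user ID from verified JWT claims only."""
    if not claims:
        return None
    sub = claims.get("sub")  # assumption: `sub` holds the user ID
    return sub if isinstance(sub, str) and sub else None


def handle_feed(
    claims: Optional[Dict[str, Any]],
    body: Dict[str, Any],
    fetch_segments: Callable[[str], List[Dict[str, Any]]],
) -> Dict[str, Any]:
    """Return 401 without a valid identity; otherwise query only that user's rows."""
    user_id = resolve_user_id(claims)
    if user_id is None:
        return {"statusCode": 401, "body": json.dumps({"error": "unauthorized"})}
    # Any `user_id` key in the request body is deliberately never read.
    rows = fetch_segments(user_id)
    return {"statusCode": 200, "body": json.dumps({"memories": rows, "next_cursor": None})}
```

Passing the user ID into the query function as its only identity argument keeps the filter in one place, which makes the feed.user-isolation path straightforward to test.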