Bug: Memory Feed Shows Multiple Tiles for a Single Recording Session
Summary
The memory feed displays one tile per 30-second window segment, even though all windows in a recording session should share the same source_session_id and collapse into a single tile. PR #397 (feature/group-feed-by-session) fixed the feed API grouping logic correctly, but two code paths in the ingest layer write segments without source_session_id. The feed's group key, COALESCE(source_session_id, CAST(id AS VARCHAR)), then falls back to each segment's own UUID, so each segment appears as a separate tile.
Symptoms
- A single 5-minute recording (10 windows) shows 10 tiles in the feed.
- Each tile displays a different thumbnail corresponding to one 30-second window.
- The bug reproduces consistently for sessions ingested via ingest_session.py (the offline batch pipeline) and for any multiscale/semantic synthetic segments.
- Sessions ingested via ingest_window.py (the Lambda path) group correctly because that path does pass source_session_id to create_segment.
Root Cause
Three call sites in ingest_session.py call create_segment without source_session_id. They fall into two groups:
Root Cause 1 — _ingest_window (line 189)
# ingest_session.py:189
segment_id = create_segment(
    user_id=user_id,
    start_time=start_iso,
    end_time=end_iso,
    duration_seconds=30,
    caption=caption,
    transcript=window_transcript or None,
    # source_session_id MISSING: defaults to None
)
source_session_id is never passed. Every window segment gets source_session_id=NULL in the DB. The feed's COALESCE falls back to CAST(id AS VARCHAR) for each segment, so each window becomes an independent group key and a separate feed tile.
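The fallback described above can be reproduced in isolation. The following is a minimal sketch using an in-memory SQLite database, not the project's actual schema or query: the segments table, its two columns, and the sample IDs are all assumptions made for illustration, and SQLite's TEXT stands in for VARCHAR.

```python
import sqlite3

# Hypothetical, simplified stand-in for the segments table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE segments (id TEXT PRIMARY KEY, source_session_id TEXT)"
)

# Ten windows written without source_session_id (the ingest_session.py behavior).
conn.executemany(
    "INSERT INTO segments VALUES (?, ?)",
    [(f"buggy-{i}", None) for i in range(10)],
)
# Ten windows written with source_session_id (the ingest_window.py behavior).
conn.executemany(
    "INSERT INTO segments VALUES (?, ?)",
    [(f"ok-{i}", "session-A") for i in range(10)],
)

# Same group key shape as the feed: fall back to the row's own id when
# source_session_id is NULL.
rows = conn.execute(
    """
    SELECT COALESCE(source_session_id, CAST(id AS TEXT)) AS session_key,
           COUNT(*) AS n_segments
    FROM segments
    GROUP BY COALESCE(source_session_id, CAST(id AS TEXT))
    """
).fetchall()

# Map each group key (one feed tile) to the number of segments it covers.
tiles = dict(rows)
```

In this sketch the ten rows with a session ID collapse into one group, while each NULL row forms its own single-segment group, which matches the 10-tiles-per-recording symptom.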
Root Cause 2 — _run_multiscale (line 372) and _run_semantic_extraction (line 433)
# ingest_session.py:372
create_segment(
    user_id=user_id,
    start_time=...,
    end_time=...,
    duration_seconds=output_duration,
    caption=merged,
    # source_session_id MISSING
)
Merged/semantic segments also omit source_session_id, so they appear as separate tiles as well.
Non-issue confirmed
The ingest_window.py (Lambda) path correctly passes source_session_id=session_id at line 199. The re-delivery (existing) branch at line 177 reuses the existing row (which already has source_session_id) and doesn't need to write it again — this path is correct.
The feed API's session_key_expr.in_(visible_keys) filter is also correct: when a segment has source_session_id=NULL, the COALESCE produces the segment UUID as the key, and that UUID is what goes into visible_keys, so the second query retrieves it correctly (just as a singleton, not grouped).
Reproduction Test
tests/unit/test_ingest_session_missing_source_session_id.py
The test calls _ingest_window with a mock session and verifies that the created WorldMMSegment row has source_session_id set to the session ID (not NULL). It fails before the fix because create_segment is called without the argument.
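The shape of such a test can be sketched as follows. This is a simplified illustration, not the contents of the actual test file: it checks the keyword arguments passed to a mocked create_segment rather than a WorldMMSegment row, and the two ingest_window_* functions, their signatures, and the injected create_segment argument are hypothetical stand-ins for the real _ingest_window.

```python
from unittest.mock import Mock


def ingest_window_buggy(create_segment, user_id, session_id,
                        start_iso, end_iso, caption):
    """Hypothetical stand-in mirroring the call before the fix."""
    return create_segment(
        user_id=user_id,
        start_time=start_iso,
        end_time=end_iso,
        duration_seconds=30,
        caption=caption,
        # source_session_id is not passed here
    )


def ingest_window_fixed(create_segment, user_id, session_id,
                        start_iso, end_iso, caption):
    """Hypothetical stand-in mirroring the call after the fix."""
    return create_segment(
        user_id=user_id,
        start_time=start_iso,
        end_time=end_iso,
        duration_seconds=30,
        caption=caption,
        source_session_id=session_id,
    )


def recorded_source_session_id(ingest_fn):
    """Run ingest_fn against a mock and return the source_session_id it passed."""
    fake_create = Mock(return_value="segment-1")
    ingest_fn(
        fake_create,
        "user-1",
        "session-A",
        "2024-01-01T00:00:00Z",
        "2024-01-01T00:00:30Z",
        "a caption",
    )
    # call_args.kwargs holds the keyword arguments of the most recent call.
    return fake_create.call_args.kwargs.get("source_session_id")
```

Under these assumptions, recorded_source_session_id returns None for the pre-fix variant and the session ID for the post-fix variant, which is the distinction the reproduction test asserts on.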
Fix
Pass source_session_id in every create_segment call inside ingest_session.py:
- _ingest_window: add a source_session_id parameter to the function signature and thread it through to create_segment.
- ingest_session caller: pass session_id from the metadata dict.
- _run_multiscale: add a source_session_id parameter and pass it through.
- _run_semantic_extraction: add a source_session_id parameter and pass it through.
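The threading described above might look roughly like the sketch below. Everything beyond the function names and the create_segment keyword arguments quoted in this report is an assumption: the stub create_segment, the helper signatures, the metadata keys, and the window dict layout are hypothetical, and _run_semantic_extraction is omitted because it would follow the same pattern as _run_multiscale.

```python
def create_segment(*, user_id, start_time, end_time, duration_seconds,
                   caption, transcript=None, source_session_id=None):
    """Hypothetical stub: returns the row it would write instead of hitting a DB."""
    return {
        "user_id": user_id,
        "start_time": start_time,
        "end_time": end_time,
        "duration_seconds": duration_seconds,
        "caption": caption,
        "transcript": transcript,
        "source_session_id": source_session_id,
    }


def _ingest_window(user_id, start_iso, end_iso, caption, window_transcript,
                   *, source_session_id):
    # source_session_id is keyword-only and has no default in this sketch,
    # so a caller that forgets it gets a TypeError instead of a silent NULL.
    return create_segment(
        user_id=user_id,
        start_time=start_iso,
        end_time=end_iso,
        duration_seconds=30,
        caption=caption,
        transcript=window_transcript or None,
        source_session_id=source_session_id,
    )


def _run_multiscale(user_id, start_iso, end_iso, output_duration, merged,
                    *, source_session_id):
    return create_segment(
        user_id=user_id,
        start_time=start_iso,
        end_time=end_iso,
        duration_seconds=output_duration,
        caption=merged,
        source_session_id=source_session_id,
    )


def ingest_session(metadata, windows):
    # Assumption: the metadata dict exposes "session_id" and "user_id" keys.
    session_id = metadata["session_id"]
    user_id = metadata["user_id"]
    rows = [
        _ingest_window(user_id, w["start"], w["end"], w["caption"],
                       w.get("transcript"), source_session_id=session_id)
        for w in windows
    ]
    if windows:
        merged = " ".join(w["caption"] for w in windows)
        rows.append(
            _run_multiscale(user_id, windows[0]["start"], windows[-1]["end"],
                            30 * len(windows), merged,
                            source_session_id=session_id)
        )
    return rows
```

Making the parameter required in the helpers is one possible design choice, not necessarily what the applied fix does; it trades a slightly stricter signature for an immediate error if a future call site omits the session ID.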
Evidence
- ingest_session.py:189 - create_segment call missing source_session_id.
- ingest_session.py:372 - create_segment call missing source_session_id.
- ingest_session.py:433 - create_segment call missing source_session_id.
- ingest_window.py:199 - create_segment correctly passes source_session_id.
- Feed API app.py:157 - COALESCE grouping is correct.
Verification
After the fix:
- test_ingest_session_missing_source_session_id.py passes.
- Existing test_memories_feed_session_grouping.py continues to pass.
- Manual feed query for a multi-window session returns exactly 1 tile.
Status
Fixed. Root cause confirmed via failing reproduction test; fix applied; test now passes.