Bug: Audio Session Missing Subsequent Segment Transcriptions
Metadata
- Date reported: 2026-05-25
- Status: Fixed
- Root cause: .first() query returning only the first segment instead of all
- Affected files: main/server/api/memories/transcript/app.py
- Affected components: Transcript API, Transcript Display
- Symptom: Only the first segment's transcript appears; all subsequent segments are missing
Symptom
When a user views an audio session in the mobile app, the TranscriptDisplay component shows only the first ~30-second segment's transcript. All subsequent segments (windows) of the session have no transcription visible, even though they were transcribed during ingest.
Example:
- Session has 3 audio windows (90 seconds total)
- Window 0: "Hello, how are you?" ✓ (visible)
- Window 1: "I'm doing well thanks." ✗ (missing)
- Window 2: "Let's grab lunch soon." ✗ (missing)
Root Cause
The transcript API endpoint (/memories/{memory_id}/transcript) uses .first() to retrieve transcripts:
# BUGGY CODE:
segs = (
    db.query(WorldMMSegment)
    .filter(
        WorldMMSegment.user_id == user_id,
        WorldMMSegment.source_session_id == memory_id,
        WorldMMSegment.transcript.isnot(None),
    )
    .order_by(WorldMMSegment.start_time, WorldMMSegment.id)
    .first()  # ← Returns only the first matching segment
)
When a session is grouped (multiple windows sharing the same source_session_id), the filter matches every segment in the session, but .first() limits the query to one row and returns only the earliest segment (or None if nothing matches). The remaining segments are never returned to the caller.
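The difference between the two calls can be reproduced without the application code. The sketch below uses the standard-library sqlite3 module and a simplified, hypothetical segments table (not the real WorldMMSegment schema) to contrast a one-row fetch, which is roughly what .first() does, with fetching every matching row, which is roughly what .all() does.

```python
import sqlite3

# Hypothetical, simplified stand-in for the WorldMMSegment table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE segments ("
    " id INTEGER PRIMARY KEY,"
    " source_session_id TEXT,"
    " source_window_index INTEGER,"
    " start_time REAL,"
    " transcript TEXT)"
)
conn.executemany(
    "INSERT INTO segments"
    " (source_session_id, source_window_index, start_time, transcript)"
    " VALUES (?, ?, ?, ?)",
    [
        ("abc-123", 0, 0.0, "Hello, how are you?"),
        ("abc-123", 1, 30.0, "I'm doing well thanks."),
        ("abc-123", 2, 60.0, "Let's grab lunch soon."),
    ],
)

BASE_QUERY = (
    "SELECT transcript FROM segments"
    " WHERE source_session_id = ? AND transcript IS NOT NULL"
    " ORDER BY start_time, id"
)

# Roughly what .first() does: limit to one row and return it (or None).
first_row = conn.execute(BASE_QUERY + " LIMIT 1", ("abc-123",)).fetchone()

# Roughly what .all() does: return every matching row as a list.
all_rows = conn.execute(BASE_QUERY, ("abc-123",)).fetchall()

print(first_row[0])   # transcript of window 0 only
print(len(all_rows))  # number of matching windows
```

With three windows in the session, the one-row fetch yields only window 0's transcript, which matches the symptom described above.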
Evidence
- Schema: WorldMMSegment has a source_window_index field, indicating a session can have multiple segments:
  - source_session_id: "abc-123"
  - source_window_index: 0, 1, 2 (three windows)
- Ingest pipeline: ingest_window.py correctly creates one segment per window with the same source_session_id
- Feed API: api/memories/feed/app.py retrieves all segments for a session and concatenates transcripts
- Transcript API inconsistency: Returns only the first segment, contradicting the feed's multi-segment support
Fix
Changed .first() to .all() and concatenated all transcripts in chronological order:
# FIXED CODE:
segs = (
    db.query(WorldMMSegment)
    .filter(
        WorldMMSegment.user_id == user_id,
        WorldMMSegment.source_session_id == memory_id,
        WorldMMSegment.transcript.isnot(None),
    )
    .order_by(WorldMMSegment.start_time, WorldMMSegment.id)
    .all()  # ← Returns all matching segments
)
# Concatenate all transcripts in order
full_transcript = " ".join(
    seg.transcript.strip() for seg in segs if seg.transcript
)
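One edge case may be worth checking in the concatenation step: a whitespace-only transcript passes the `if seg.transcript` test, strips to an empty string, and leaves a double space in the joined output. The sketch below shows one way to guard against that. The Segment dataclass and the join_transcripts helper are hypothetical names introduced for illustration; they are not part of the actual codebase.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Segment:
    """Hypothetical stand-in for a WorldMMSegment row."""
    start_time: float
    transcript: Optional[str]


def join_transcripts(segs: Iterable[Segment]) -> str:
    """Join non-empty transcripts in the given order with single spaces."""
    parts = []
    for seg in segs:
        text = (seg.transcript or "").strip()
        if text:  # skip None, empty, and whitespace-only transcripts
            parts.append(text)
    return " ".join(parts)


segments = [
    Segment(0.0, "Hello, how are you?"),
    Segment(30.0, "   "),  # whitespace-only window
    Segment(60.0, None),   # window with no transcript
    Segment(90.0, " Let's grab lunch soon. "),
]
full_transcript = join_transcripts(segments)
print(full_transcript)
```

The helper assumes the segments are already sorted, as they are after the order_by clause in the query above. An empty input yields an empty string.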
Files Changed
- main/server/api/memories/transcript/app.py
  - _find_transcript_segment(): Changed from returning a single segment to returning a list
  - implementation(): Updated to concatenate all transcripts and use the first segment's metadata
Verification
- Query a session with multiple transcribed windows
- Call GET /memories/{session_id}/transcript
- Verify response contains concatenated text from all windows, in chronological order
- Spot check: first window's phrase + second window's phrase should both appear in response
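The spot check in the last step can be automated with a small ordering check. The following is a minimal sketch: phrases_in_order is a hypothetical helper, and the response text is written out by hand here instead of being fetched from the endpoint, since the response format is not described in this report.

```python
from typing import Sequence


def phrases_in_order(text: str, phrases: Sequence[str]) -> bool:
    """Return True if every phrase occurs in text, in the given order."""
    pos = 0
    for phrase in phrases:
        idx = text.find(phrase, pos)
        if idx == -1:
            return False
        pos = idx + len(phrase)  # continue searching after this match
    return True


# Hand-written example of what a concatenated transcript might look like.
response_text = (
    "Hello, how are you? I'm doing well thanks. Let's grab lunch soon."
)

ok = phrases_in_order(
    response_text,
    ["Hello, how are you?", "I'm doing well thanks.", "Let's grab lunch soon."],
)
print(ok)
```

The check fails both when a window's phrase is missing and when the windows appear out of chronological order, which covers the two verification criteria above.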
Related Systems
- Transcript display: main/app/components/memory/TranscriptDisplay.tsx
- Feed API: main/server/api/memories/feed/app.py (already supports multi-window sessions)
- Ingest pipeline: main/server/worldmm/pipeline/ingest_window.py (correctly creates per-window segments)
Follow-up
Consider adding an integration test that requests the transcript for a multi-window session, to catch regressions in transcript concatenation.