Bug: Audio Session Missing Subsequent Segment Transcriptions

Metadata

Date reported: 2026-05-25
Status: Fixed
Root cause: .first() query returning only the first segment instead of all
Affected files: main/server/api/memories/transcript/app.py
Affected components: Transcript API, Transcript Display
Symptom: Only the first segment's transcript appears; all subsequent segments are missing

Symptom

When a user views an audio session in the mobile app, the TranscriptDisplay component shows only the first ~30-second segment's transcript. All subsequent segments (windows) of the session have no transcription visible, even though they were transcribed during ingest.

Example:
- Session has 3 audio windows (90 seconds total)
- Window 0: "Hello, how are you?" ✓ (visible)
- Window 1: "I'm doing well thanks." ✗ (missing)
- Window 2: "Let's grab lunch soon." ✗ (missing)

Root Cause

The transcript API endpoint (/memories/{memory_id}/transcript) uses .first() to retrieve transcripts:

# BUGGY CODE:
segs = (
    db.query(WorldMMSegment)
    .filter(
        WorldMMSegment.user_id == user_id,
        WorldMMSegment.source_session_id == memory_id,
        WorldMMSegment.transcript.isnot(None),
    )
    .order_by(WorldMMSegment.start_time, WorldMMSegment.id)
    .first()  # ← Returns only the first matching segment
)

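When a session is grouped (multiple windows sharing the same source_session_id), several segments match the filter, but .first() returns a single object: the earliest matching segment, or None if nothing matches. The remaining segments never reach the response, so their transcripts are silently dropped.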
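
The difference can be reproduced without the application code. The sketch below uses the standard-library sqlite3 module and a hypothetical segments table (not the real WorldMMSegment schema) to contrast a one-row fetch with a full fetch. It assumes .first() behaves like a query limited to one row, which matches SQLAlchemy's documented behavior, while .all() corresponds to fetching every row.

```python
import sqlite3

# Hypothetical stand-in table; column names mirror the fields used in the query above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE segments ("
    "id INTEGER PRIMARY KEY, source_session_id TEXT, start_time REAL, transcript TEXT)"
)
conn.executemany(
    "INSERT INTO segments (source_session_id, start_time, transcript) VALUES (?, ?, ?)",
    [
        ("abc-123", 0.0, "Hello, how are you?"),
        ("abc-123", 30.0, "I'm doing well thanks."),
        ("abc-123", 60.0, "Let's grab lunch soon."),
    ],
)

base_query = (
    "SELECT transcript FROM segments "
    "WHERE source_session_id = ? AND transcript IS NOT NULL "
    "ORDER BY start_time, id"
)

# Roughly what .first() asks for: at most one row.
first_row = conn.execute(base_query + " LIMIT 1", ("abc-123",)).fetchone()

# Roughly what .all() asks for: every matching row.
all_rows = conn.execute(base_query, ("abc-123",)).fetchall()

print(first_row[0])   # Hello, how are you?
print(len(all_rows))  # 3
```

Only the one-row fetch loses windows 1 and 2, which matches the symptom reported above.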

Evidence

Schema: WorldMMSegment has a source_window_index field, indicating that a session can have multiple segments:

source_session_id: "abc-123"
source_window_index: 0, 1, 2 (three windows)

Ingest pipeline: ingest_window.py correctly creates one segment per window, each with the same source_session_id

Feed API: api/memories/feed/app.py retrieves all segments for a session and concatenates transcripts:

for seg in session_segments:
    if seg.transcript:
        raw = seg.transcript.strip()

Transcript API inconsistency: Returns only the first segment, contradicting the feed's multi-segment support

Fix

Changed .first() to .all() and concatenated all transcripts in chronological order:

# FIXED CODE:
segs = (
    db.query(WorldMMSegment)
    .filter(
        WorldMMSegment.user_id == user_id,
        WorldMMSegment.source_session_id == memory_id,
        WorldMMSegment.transcript.isnot(None),
    )
    .order_by(WorldMMSegment.start_time, WorldMMSegment.id)
    .all()  # ← Returns all matching segments
)

# Concatenate all transcripts in order
full_transcript = " ".join(
    seg.transcript.strip() for seg in segs if seg.transcript
)
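
The concatenation step can be exercised in isolation. The following is a minimal sketch, not the code in app.py: Segment is a hypothetical stand-in for a WorldMMSegment row, and join_transcripts is an illustrative helper name. Unlike the snippet above, it also skips whitespace-only transcripts, which would otherwise contribute empty strings and produce double spaces in the joined text.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Segment:
    """Hypothetical stand-in for a WorldMMSegment row (only the fields used here)."""
    id: int
    start_time: float
    transcript: Optional[str]


def join_transcripts(segments: Iterable[Segment]) -> str:
    """Join non-blank transcripts in (start_time, id) order, separated by single spaces."""
    ordered = sorted(segments, key=lambda s: (s.start_time, s.id))
    parts = [s.transcript.strip() for s in ordered if s.transcript and s.transcript.strip()]
    return " ".join(parts)


segments = [
    Segment(3, 60.0, "Let's grab lunch soon."),
    Segment(1, 0.0, "Hello, how are you? "),
    Segment(2, 30.0, "I'm doing well thanks."),
    Segment(4, 90.0, "   "),   # whitespace-only: skipped
    Segment(5, 120.0, None),   # never transcribed: skipped
]

print(join_transcripts(segments))
# Hello, how are you? I'm doing well thanks. Let's grab lunch soon.
```

Sorting inside the helper is redundant when the query already orders by start_time and id, but it keeps the helper correct if it is ever called with unordered input.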

Files Changed

main/server/api/memories/transcript/app.py
_find_transcript_segment(): Changed from returning a single segment to returning a list
implementation(): Updated to concatenate all transcripts and use the first segment's metadata

Verification

Query a session with multiple transcribed windows
Call GET /memories/{session_id}/transcript
Verify response contains concatenated text from all windows, in chronological order
Spot check: the first window's phrase and the second window's phrase should both appear in the response
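
The spot check can be scripted once the transcript text has been pulled out of the response. The sketch below assumes the text is already available as a plain string; how it is extracted depends on the response payload, which this report does not describe. phrases_in_order is a hypothetical helper name.

```python
from typing import Sequence


def phrases_in_order(text: str, phrases: Sequence[str]) -> bool:
    """Return True if every phrase occurs in text, each one after the previous."""
    position = 0
    for phrase in phrases:
        found = text.find(phrase, position)
        if found == -1:
            return False
        # Continue searching after this match so ordering is enforced.
        position = found + len(phrase)
    return True


# Example transcript text, built from the three windows in the Symptom section.
response_text = "Hello, how are you? I'm doing well thanks. Let's grab lunch soon."

print(phrases_in_order(response_text, ["Hello, how are you?", "I'm doing well thanks."]))  # True
print(phrases_in_order(response_text, ["I'm doing well thanks.", "Hello, how are you?"]))  # False
```

Checking order as well as presence also catches a regression in the order_by clause, not only a return to single-segment results.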

Related Files

Transcript display: main/app/components/memory/TranscriptDisplay.tsx
Feed API: main/server/api/memories/feed/app.py (already supports multi-window sessions)
Ingest pipeline: main/server/worldmm/pipeline/ingest_window.py (correctly creates per-window segments)

Follow-up

Consider adding an integration test to catch regressions in transcript concatenation.