Ingest Window: Naive Datetime in _get_session_created_at + Retry Timestamp Drift
Metadata
- Date: 2026-04-23
- Status: fixed
- Severity: major
- Related issue/ticket: N/A
- Owner: N/A
About
Overview:
Two related bugs in main/server/worldmm/pipeline/ingest_window.py:
- _get_session_created_at returns a timezone-naive datetime when DynamoDB stores a value without a timezone suffix (e.g. "2026-04-23T10:00:00" instead of "2026-04-23T10:00:00+00:00"). datetime.fromisoformat does not attach a timezone in that case. When .timestamp() is then called on the naive datetime, Python treats it as local time, not UTC, shifting window_start_s by the server's UTC offset and producing wrong start_iso/end_iso for the segment and visual embedding.
- On a retry delivery (existing is not None and processing_status != "complete"), start_iso/end_iso are recomputed from _get_session_created_at before the idempotency branch. If the DynamoDB lookup fails and falls back to now(), the recomputed timestamps diverge from the already-persisted existing.start_time/existing.end_time. store_visual_embedding then receives a timestamp that no longer matches the persisted row, breaking time-based retrieval for the retried segment.
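The first bug can be reproduced with the standard library alone. This is a minimal sketch, not the production code: the UTC-7 server offset is an assumption chosen for illustration, and attaching it explicitly mimics what .timestamp() does implicitly with the host's local zone.

```python
from datetime import datetime, timedelta, timezone

# Same wall-clock value, with and without a timezone suffix.
aware = datetime.fromisoformat("2026-04-23T10:00:00+00:00")
naive = datetime.fromisoformat("2026-04-23T10:00:00")

naive_is_naive = naive.tzinfo is None      # no tzinfo attached by fromisoformat
aware_is_aware = aware.tzinfo is not None  # offset preserved

# Hypothetical server zone (assumption): UTC-7. Calling .timestamp() on a
# naive datetime interprets it in the host's local zone; attaching the zone
# explicitly makes that interpretation deterministic for this demo.
server_tz = timezone(timedelta(hours=-7))
interpreted_as_local = naive.replace(tzinfo=server_tz)

# Seconds by which the window start would shift on such a server.
drift_s = interpreted_as_local.timestamp() - aware.timestamp()
print(naive_is_naive, aware_is_aware, drift_s)  # True True 25200.0
```

On a host that already runs in UTC the drift is zero, which is one way a bug of this kind can go unnoticed in some environments.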
Technical Questions:
- Why does fromisoformat return a naive datetime? DynamoDB stores the value as written by the client; if the client omits the timezone designator, the stored string carries no offset and fromisoformat parses it into a naive datetime.
- Why is the retry path affected? Lines 153-156 run unconditionally before the if existing: branch, so both new and retry paths share the same (possibly wrong) timestamp calculation.
Reproduction Test
main/server/tests/unit/test_ingest_window_datetime.py — added as part of fix.
Root Cause
- Naive datetime: datetime.fromisoformat(raw) does not normalise to UTC when the raw string lacks a +00:00 suffix. Fix: call .replace(tzinfo=timezone.utc) when the result is naive.
- Retry drift: start_iso/end_iso must come from the existing DB row on retry, not be recomputed. Fix: after the idempotency check, override start_iso/end_iso from existing.start_time/existing.end_time when existing is set.
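The naive-datetime fix can be sketched as a small helper. The name parse_created_at_utc is hypothetical and is not taken from ingest_window.py; the sketch assumes that a value stored without a suffix was meant as UTC, which is the same assumption the fix makes.

```python
from datetime import datetime, timezone


def parse_created_at_utc(raw: str) -> datetime:
    """Parse an ISO-8601 string and return a timezone-aware datetime.

    Hypothetical helper: a string without a timezone designator is
    assumed to be UTC and is tagged accordingly.
    """
    parsed = datetime.fromisoformat(raw)
    if parsed.tzinfo is None:
        # replace() attaches tzinfo without shifting the wall-clock value.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


with_suffix = parse_created_at_utc("2026-04-23T10:00:00+00:00")
without_suffix = parse_created_at_utc("2026-04-23T10:00:00")

# Both forms now map to the same instant, whatever the host's local zone is.
same_instant = with_suffix.timestamp() == without_suffix.timestamp()
```

Using .replace(tzinfo=...) rather than .astimezone(...) matters here: astimezone on a naive value would first interpret it as local time, which is the behaviour the fix is meant to avoid.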
Fix Summary
- _get_session_created_at: normalise naive result to UTC via .replace(tzinfo=timezone.utc), add a logger warning on fallback.
- lambda_handler: move start_iso/end_iso derivation so the retry branch reads from existing rather than recomputing.
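Both changes can be sketched roughly as follows. Everything here is illustrative: ExistingSegment, created_at_or_now, resolve_window_iso, the fetch_raw callable, and the "pending" status value are hypothetical stand-ins, not the actual names or structure of ingest_window.py.

```python
import logging
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional, Tuple

logger = logging.getLogger(__name__)


@dataclass
class ExistingSegment:
    """Hypothetical stand-in for the already-persisted segment row."""
    start_time: str
    end_time: str
    processing_status: str


def created_at_or_now(fetch_raw: Callable[[], str]) -> datetime:
    """Return an aware UTC datetime; warn and fall back to now() on failure."""
    try:
        parsed = datetime.fromisoformat(fetch_raw())
    except Exception:  # broad on purpose in this sketch
        logger.warning("session created_at lookup failed; falling back to now()")
        return datetime.now(timezone.utc)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def resolve_window_iso(
    existing: Optional[ExistingSegment],
    computed_start_iso: str,
    computed_end_iso: str,
) -> Tuple[str, str]:
    """On retry, reuse the persisted window instead of the recomputed one."""
    if existing is not None:
        return existing.start_time, existing.end_time
    return computed_start_iso, computed_end_iso


# Retry delivery: the persisted row wins even if the recomputed values drifted.
row = ExistingSegment("2026-04-23T10:00:00+00:00", "2026-04-23T10:00:30+00:00", "pending")
retry_window = resolve_window_iso(row, "2026-04-23T11:15:00+00:00", "2026-04-23T11:15:30+00:00")
```

With this shape, a failed lookup on a retry can still log a warning, but it no longer changes the timestamps attached to a segment that was already written.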
Verification
- New unit tests pass: test_ingest_window_datetime.py.
- Existing tests unchanged.