Audio API

Metadata

  • System type: library

System Intent

  • What this is: The audio API covers two layers. The frontend client (main/app/lib/api/memory/audio.ts) provides two TypeScript helper functions, getMemoryAudioMetadata and getMemoryAudio, which call the backend endpoints via axios with a 120 s timeout and validate responses with zod. The backend (main/server/api/memories/audio/app.py) serves both POST /memories/audio-metadata and POST /memories/audio from a single shared handler, deployed as two Lambda functions. The handler looks up the WorldMMSegment for the given memory_id, derives the S3 audio key from the segment's source_session_id and source_window_index, and returns the segment metadata plus, for complete segments only, a presigned S3 URL.

Mermaid Diagram

flowchart TD
  AudioPlayer[AudioPlayer] -->|getMemoryAudioMetadata memory_id| MetaEndpoint[POST /memories/audio-metadata]
  MetaEndpoint -->|queries WorldMMSegment| DB[(PostgreSQL)]
  DB -->|duration_seconds, processing_status| MetaEndpoint
  MetaEndpoint -->|duration_seconds, processing_status| AudioPlayer
  AudioPlayer -->|if status complete: getMemoryAudio memory_id| AudioEndpoint[POST /memories/audio]
  AudioEndpoint -->|head_object + generate_presigned_url| S3[(S3 encache-raw-memory)]
  S3 -->|presigned_url| AudioEndpoint
  AudioEndpoint -->|presigned_url, duration_seconds, processing_status| AudioPlayer
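
The call sequence in the diagram can be sketched as a small function. This is an illustrative Python sketch, not the TypeScript client: resolve_audio_url and the two injected callables are hypothetical stand-ins for getMemoryAudioMetadata and getMemoryAudio, and the sketch assumes the caller only requests the audio URL once the metadata reports a complete segment.

```python
from typing import Callable, Optional


def resolve_audio_url(
    memory_id: str,
    get_metadata: Callable[[str], dict],
    get_audio: Callable[[str], dict],
) -> Optional[str]:
    """Return a presigned URL for a complete segment, otherwise None."""
    metadata = get_metadata(memory_id)
    if metadata["processing_status"] != "complete":
        # pending / failed: the caller shows an overlay instead of a player,
        # so the audio endpoint is never called.
        return None
    return get_audio(memory_id)["presigned_url"]
```

Passing the two API helpers in as arguments keeps the sketch testable without a network connection.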

Flows

Flow: getMemoryAudioMetadata

  • Core files: main/app/lib/api/memory/audio.ts
  • Test files: none currently

Types

GetMemoryAudioMetadataParams {
  memory_id: string
}

GetMemoryAudioMetadataResponse {
  duration_seconds: number
  processing_status: "complete" | "pending" | "failed"
  error?: string
}

Paths

path | input | output | path-type | notes
getMemoryAudioMetadata.success | valid memory_id | GetMemoryAudioMetadataResponse | happy path |
getMemoryAudioMetadata.timeout | request exceeds 120 s | axios timeout error thrown | error |
getMemoryAudioMetadata.zodFail | unexpected response shape | zod parse error thrown | error |

Flow: getMemoryAudio

  • Core files: main/app/lib/api/memory/audio.ts
  • Test files: none currently

Types

GetMemoryAudioParams {
  memory_id: string
}

GetMemoryAudioResponse {
  presigned_url: string   (validated as URL by zod)
}

Paths

path | input | output | path-type | notes
getMemoryAudio.success | valid memory_id | GetMemoryAudioResponse with presigned_url | happy path |
getMemoryAudio.timeout | request exceeds 120 s | axios timeout error thrown | error | audio processing Lambda timeout is 120 s
getMemoryAudio.zodFail | presigned_url missing or not a URL | zod parse error thrown | error |

Pseudocode

const _AUDIO_REQUEST_TIMEOUT_MS = 120_000

getMemoryAudio({ memory_id }):
  api = getApi()
  response = await api.post("/memories/audio", { memory_id }, { timeout: _AUDIO_REQUEST_TIMEOUT_MS })
  return GetMemoryAudioResponseSchema.parse(response.data.data)

getMemoryAudioMetadata({ memory_id }):
  api = getApi()
  response = await api.post("/memories/audio-metadata", { memory_id }, { timeout: _AUDIO_REQUEST_TIMEOUT_MS })
  return GetMemoryAudioMetadataResponseSchema.parse(response.data.data)

Flow: memoriesAudioLambda

  • Core files: main/server/api/memories/audio/app.py
  • Test files: none currently

Both POST /memories/audio and POST /memories/audio-metadata are handled by the same code. Two SAM resources (MemoriesAudioFunction, MemoriesAudioMetadataFunction) share the same CodeUri and Handler, so two separate Lambda functions are deployed and each runs the same implementation() function.

Types

LambdaRequest {
  memory_id: string   (required, stripped of whitespace)
}
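
The request rule above (memory_id is required and whitespace is stripped) can be sketched as follows. The helper name normalize_memory_id is hypothetical, and the InvalidInputError class defined here is a minimal stand-in for the error type named in the handler pseudocode, not its actual definition.

```python
class InvalidInputError(Exception):
    """Stand-in for the handler's error type; carries a machine-readable code."""

    def __init__(self, code: str) -> None:
        super().__init__(code)
        self.code = code


def normalize_memory_id(payload: dict) -> str:
    """Return the stripped memory_id, or raise INVALID_REQUEST if missing or blank."""
    raw = payload.get("memory_id")
    if not isinstance(raw, str) or not raw.strip():
        raise InvalidInputError("INVALID_REQUEST")
    return raw.strip()
```

Using payload.get rather than payload["memory_id"] means a missing key is reported the same way as a blank value instead of surfacing as a KeyError.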

LambdaResponse {
  presigned_url: string | null   (null when processing_status != "complete")
  duration_seconds: float
  processing_status: "complete" | "pending" | "failed"
}

StandardError {
  code: "INVALID_REQUEST"   — memory_id missing or blank
  code: "MEMORY_NOT_FOUND"  — no segment row found for (memory_id, user_id)
  code: "AUDIO_NOT_FOUND"   — segment lacks source_session_id/source_window_index,
                               or processing_status=="complete" but S3 object missing
}

Paths

path | input | output | path-type | notes
memoriesAudioLambda.complete | memory_id, segment.processing_status = complete, S3 object exists | presigned_url + metadata | happy path | presigned URL TTL = 3600 s
memoriesAudioLambda.pending | memory_id, segment.processing_status = pending | presigned_url=null + metadata | happy path | client shows processing overlay
memoriesAudioLambda.failed | memory_id, segment.processing_status = failed | presigned_url=null + metadata | happy path | client shows processing-failed overlay
memoriesAudioLambda.missingId | memory_id blank | INVALID_REQUEST 400 | error |
memoriesAudioLambda.notFound | memory_id not in DB for authenticated user | MEMORY_NOT_FOUND 404 | error |
memoriesAudioLambda.noAudioPath | segment.source_session_id or source_window_index null | AUDIO_NOT_FOUND 404 | error | segment not linked to a session window
memoriesAudioLambda.s3Missing | processing_status=complete but S3 object absent | AUDIO_NOT_FOUND 404 | error | indicates pipeline gap
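
The error rows above pair each code with an HTTP status. A minimal sketch of that mapping follows; the error_response helper and the {"error": {...}} body shape are assumptions made for illustration, since the source does not show the error envelope. Only the code-to-status pairs come from the table.

```python
import json

# Code-to-status pairs taken from the paths table.
ERROR_STATUS = {
    "INVALID_REQUEST": 400,
    "MEMORY_NOT_FOUND": 404,
    "AUDIO_NOT_FOUND": 404,
}


def error_response(code: str, message: str) -> dict:
    """Build a Lambda proxy-style response; the body shape is hypothetical."""
    return {
        "statusCode": ERROR_STATUS[code],
        "body": json.dumps({"error": {"code": code, "message": message}}),
    }
```

Both AUDIO_NOT_FOUND cases (segment not linked to a session window, and S3 object absent) share one code and status, so a client cannot tell them apart from the response alone.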

Pseudocode

implementation(payload, auth):
  memory_id = (payload.get("memory_id") or "").strip()   # a missing key is treated like a blank value
  if not memory_id: raise InvalidInputError("INVALID_REQUEST")

  configure_database()

  # Lookup: try source_session_id first (grouped feed tiles), then segment id
  segment = db.query(WorldMMSegment)
    .filter(user_id=auth.user_id, source_session_id=memory_id)
    .order_by(start_time, id).first()
  if not segment:
    segment = db.query(WorldMMSegment)
      .filter(user_id=auth.user_id, id=memory_id).first()
  if not segment: raise NotFoundError("MEMORY_NOT_FOUND")

  if not segment.source_session_id or segment.source_window_index is None:
    raise NotFoundError("AUDIO_NOT_FOUND")

  audio_key = f"sessions/{segment.source_session_id}/window_{segment.source_window_index:03d}/audio.wav"

  presigned_url = None
  if segment.processing_status == "complete":
    s3.head_object(Bucket=BUCKET, Key=audio_key)   # a 404 here is surfaced as NotFoundError("AUDIO_NOT_FOUND")
    presigned_url = s3.generate_presigned_url("get_object", ..., ExpiresIn=3600)

  return {
    "presigned_url": presigned_url,
    "duration_seconds": float(segment.duration_seconds),
    "processing_status": segment.processing_status,
  }
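
The key derivation and the status gating in the pseudocode can be exercised in isolation. The sketch below assumes a hypothetical Segment dataclass in place of the WorldMMSegment ORM row and takes the presigning step as an injected callable rather than a real S3 client. The key format and the response fields are taken from the pseudocode above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Segment:
    """Hypothetical stand-in for the fields read from WorldMMSegment."""
    source_session_id: str
    source_window_index: int
    duration_seconds: float
    processing_status: str


def audio_key(source_session_id: str, source_window_index: int) -> str:
    # :03d zero-pads to at least three digits; larger indexes are not truncated.
    return f"sessions/{source_session_id}/window_{source_window_index:03d}/audio.wav"


def build_response(segment: Segment, presign: Callable[[str], str]) -> dict:
    """Presign only for complete segments; otherwise presigned_url stays None."""
    key = audio_key(segment.source_session_id, segment.source_window_index)
    url = presign(key) if segment.processing_status == "complete" else None
    return {
        "presigned_url": url,
        "duration_seconds": float(segment.duration_seconds),
        "processing_status": segment.processing_status,
    }
```

For example, audio_key("abc123", 7) yields "sessions/abc123/window_007/audio.wav", while an index of 1234 yields "sessions/abc123/window_1234/audio.wav" because the format sets a minimum width, not a fixed one.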

Logs

Source | Location
React Native | Metro / device console
Lambda (audio) | CloudWatch: /aws/lambda/MemoriesAudioFunction
Lambda (metadata) | CloudWatch: /aws/lambda/MemoriesAudioMetadataFunction

Deployment

  • Mechanism: SAM
  • Deploy command:
    cd main/server && sam build && sam deploy
    
  • Notes: Both Lambda functions share the same CodeUri (api/memories/audio/) and handler (app.lambda_handler). Each is configured with Timeout: 120, MemorySize: 512, a VPC config for database access, and the DatabaseSsmPolicy and S3AccessPolicy policies. The 120 s timeout, set on both the Lambda functions and the frontend axios calls, is required to handle cold-start latency when the audio processing pipeline has been idle.