Audio Upload 502 — NoSuchBucket on Record Start

Metadata

Date: 2026-04-20
Status: deployed
Severity: critical
Related issue/ticket: N/A
Owner: lewibs

About

Overview:
- Pressing the record button immediately fails: every POST /sessions/{id}/audio?windowIndex=0 returns HTTP 502.
- The app retries twice (retriesExhausted: true), then stops capture (capture_stop_frames, capture_stop_stream). Recording is completely broken.
- The 502 is an API Gateway Bad Gateway: the Lambda threw an unhandled exception.

Technical Questions:
- Why does AudioPostFunction target the wrong bucket? The SAM parameter BucketName defaults to raw-memory-data-ef5syyis8b, but the bucket created by Terraform is encache-raw-memory. The parameter default was never updated to match the Terraform-managed bucket name.
- Why didn't other endpoints fail? The Lambdas that access DynamoDB, SSM, and RDS don't call S3 on the hot path, so the wrong bucket name went unnoticed until audio recording was tested.

Resources:
- main/server/api/sessions/audio/app.py:13: BUCKET = os.environ.get("BUCKET_NAME", "raw-memory-data-ef5syyis8b")
- main/server/template.yaml:9: BucketName parameter default
- main/server/template.yaml:46: global BUCKET_NAME env var (injected into all Lambdas)
- main/app/lib/capture-session.ts:82: uploadAudioChunk calls /sessions/${sessionId}/audio?windowIndex=${windowIndex}
- CloudWatch log group: /aws/lambda/server-AudioPostFunction-uSd4CVCWdfrc

Steps to cause failure

flowchart LR
    A[User presses Record] --> B["App calls POST /sessions/{id}/audio?windowIndex=0"]
    B --> C[API Gateway → AudioPostFunction Lambda]
    C --> D[s3.put_object to raw-memory-data-ef5syyis8b]
    D --> E[NoSuchBucket exception — bucket is encache-raw-memory]
    E --> F[Unhandled exception → API Gateway returns 502]
    F --> G[App retries 2x, all 502 → retriesExhausted]
    G --> H[capture_stop_frames / capture_stop_stream]
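
The unhandled-exception step in the flow above is what turns an S3 error into a bare 502 with no detail for the client. A minimal sketch of how the write could be wrapped so the failure is logged with the bucket name and returned as a structured error, assuming an API Gateway proxy-style response. The function name `put_audio_chunk`, the response body fields, and the 500 status are illustrative assumptions, not the project's actual handler.

```python
import json


def put_audio_chunk(s3_client, bucket, key, body):
    """Write one audio window to S3 and return a proxy-style response dict.

    s3_client is any object exposing put_object(Bucket=..., Key=..., Body=...),
    for example boto3.client("s3"). Name and response shape are hypothetical.
    """
    try:
        s3_client.put_object(Bucket=bucket, Key=key, Body=body)
    except Exception as exc:
        # botocore's ClientError carries the AWS error code in
        # exc.response["Error"]["Code"]; other exceptions have no .response,
        # so fall back to the exception class name.
        error = (getattr(exc, "response", None) or {}).get("Error", {})
        code = error.get("Code", type(exc).__name__)
        # Include the bucket name so a misconfiguration is obvious in CloudWatch.
        print(f"[ERROR] put_object failed: code={code} bucket={bucket} key={key}")
        return {
            "statusCode": 500,
            "body": json.dumps({"error": "storage_write_failed", "code": code}),
        }
    return {"statusCode": 200, "body": json.dumps({"ok": True})}
```

This would not have prevented the outage, since the bucket name was still wrong, but the response and log line would have pointed at the misconfigured bucket directly.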

System

flowchart TD
    App[Mobile App\ncapture-session.ts] -->|"POST /sessions/{id}/audio?windowIndex=N"| APIGW[API Gateway]
    APIGW --> Lambda[AudioPostFunction\nBUCKET_NAME=raw-memory-data-ef5syyis8b]
    Lambda -->|PutObject — wrong bucket| S3Wrong["raw-memory-data-ef5syyis8b\n(does not exist)"]
    Lambda -. should target .-> S3Real["encache-raw-memory\n(actual bucket)"]
    Lambda --> DynamoDB[encache-sessions table\ncompletedAudioWindows ADD]

Reproduction Details

Build and deploy SAM stack without overriding BucketName parameter (uses default raw-memory-data-ef5syyis8b).
Press the Record button in the app.
App sends POST /sessions/{id}/audio?windowIndex=0.
Lambda logs: [ERROR] NoSuchBucket: ... PutObject ... The specified bucket does not exist.
API Gateway returns 502. App retries twice, all fail.

Reproduction test: N/A — infrastructure mismatch only reproducible in live AWS environment.
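
Although the mismatch only shows up against live AWS, a post-deploy smoke check could surface it before anyone presses Record. A minimal sketch, assuming the check runs with credentials that can call HeadBucket. `bucket_is_reachable` is a hypothetical helper; the client is passed in so the logic can be exercised without AWS access, and in practice it would be `boto3.client("s3")`.

```python
def bucket_is_reachable(s3_client, bucket_name):
    """Return True if HeadBucket succeeds, False if the bucket does not exist.

    s3_client is any object exposing head_bucket(Bucket=...), for example
    boto3.client("s3"). Hypothetical helper, not part of the project.
    """
    try:
        s3_client.head_bucket(Bucket=bucket_name)
        return True
    except Exception as exc:
        # botocore's ClientError exposes the error code under
        # exc.response["Error"]["Code"]; HeadBucket reports a missing
        # bucket as "404".
        error = (getattr(exc, "response", None) or {}).get("Error", {})
        if error.get("Code") in {"404", "NoSuchBucket"}:
            return False
        # Anything else (for example "403" for missing permissions) is a
        # different problem and should not be reported as "bucket missing".
        raise
```

Run against the value of BUCKET_NAME that the deployed Lambda actually receives, this would have returned False for raw-memory-data-ef5syyis8b.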

Notes for PR

Root cause: the SAM BucketName parameter (template.yaml:9) defaults to raw-memory-data-ef5syyis8b, but the actual S3 bucket is encache-raw-memory (created and named by Terraform). Because BUCKET_NAME is injected globally, every Lambda receives the wrong value, and any Lambda that writes to S3 fails with NoSuchBucket.

Fix: update the BucketName default in template.yaml from raw-memory-data-ef5syyis8b to encache-raw-memory, then redeploy with sam deploy.
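
To keep the default from drifting again, a static check could compare the parameter default in the template against the Terraform-managed name. A rough sketch, assuming the parameter is written in block-style YAML with `Default:` on its own line under `BucketName:`. The helper `sam_parameter_default` and the constant `EXPECTED_BUCKET` are hypothetical, and the scan is line-based because SAM templates often contain tags such as `!Ref` that a plain YAML safe-loader rejects.

```python
import re

# Bucket name taken from this report; assumed to stay Terraform-managed.
EXPECTED_BUCKET = "encache-raw-memory"


def sam_parameter_default(template_text, parameter_name):
    """Return the Default value of a template parameter, or None if absent.

    Hypothetical helper: a line-based scan, not a full YAML parser.
    """
    lines = template_text.splitlines()
    header = re.compile(rf"^(\s*){re.escape(parameter_name)}:\s*$")
    for i, line in enumerate(lines):
        match = header.match(line)
        if not match:
            continue
        indent = len(match.group(1))
        for child in lines[i + 1:]:
            if not child.strip():
                continue  # skip blank lines inside the block
            child_indent = len(child) - len(child.lstrip())
            if child_indent <= indent:
                break  # dedent: we have left this parameter's block
            default = re.match(r"^\s*Default:\s*(.+?)\s*$", child)
            if default:
                return default.group(1).strip("'\"")
    return None
```

A CI step could read main/server/template.yaml and fail when `sam_parameter_default(text, "BucketName") != EXPECTED_BUCKET`.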

The hardcoded fallback default in app.py:13 (os.environ.get("BUCKET_NAME", "raw-memory-data-ef5syyis8b")) is also stale but is not load-bearing — the Lambda always gets the env var from the SAM global, so the Python default is never reached in production.
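
One way to retire the stale fallback is to drop the hardcoded default and fail fast when the variable is missing. A minimal sketch, assuming the handler module resolves the bucket once at import time; `resolve_bucket_name` is a hypothetical name, not the current code in app.py.

```python
import os


def resolve_bucket_name(environ=os.environ):
    """Return BUCKET_NAME from the environment, or raise if it is unset.

    Hypothetical replacement for a hardcoded fallback: a missing variable
    becomes a loud startup error instead of a silently wrong bucket.
    """
    bucket = environ.get("BUCKET_NAME", "").strip()
    if not bucket:
        raise RuntimeError(
            "BUCKET_NAME is not set; refusing to fall back to a hardcoded bucket"
        )
    return bucket
```

This only guards against a missing variable. It would not have caught this incident, where the variable was set but pointed at a bucket that does not exist.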

Audit Log

ID	Action	Note	Context
1	Create audit log	Record button fails; logs show 502 on every audio upload attempt	user logs
2	Search existing bugs	No prior bug for audio 502 / NoSuchBucket	docs/bugs scan
3	Read CloudWatch logs	`[ERROR] NoSuchBucket: PutObject ... raw-memory-data-ef5syyis8b` on both invocations matching session `8fc83173`	/aws/lambda/server-AudioPostFunction-uSd4CVCWdfrc
4	Root cause confirmed	Lambda env `BUCKET_NAME=raw-memory-data-ef5syyis8b`, actual bucket is `encache-raw-memory` — names mismatch since Terraform named it differently from the SAM default	`aws s3 ls` + Lambda env check
5	Apply fix	Updated `BucketName` default in `template.yaml:9` from `raw-memory-data-ef5syyis8b` to `encache-raw-memory`	template.yaml

Verification

[x] Reproduced failure before fix — CloudWatch confirms NoSuchBucket on every audio upload
[ ] Reproduction test fails before fix — N/A (infrastructure-only; no unit-testable path)
[x] Root cause identified with evidence — CloudWatch logs + aws s3 ls + Lambda env var
[x] Fix applied at source — corrected BucketName default, not a workaround
[x] Reproduction test passes after fix — N/A (infrastructure-only)
[ ] Reproduction path now passes — pending live device re-test

TODO (Benjamin): Run terraform import aws_s3_bucket.raw_data encache-raw-memory from main/devops/ to sync Terraform state with the actual bucket. Then verify recording works on device.

[x] Regression test added/updated: N/A (S3 bucket name is infrastructure config, not code logic)
[x] Verified no duplicate solved-bug log exists for same root cause