Phone-Only Recording

Plan Metadata

Plan type: plan
Parent plan: N/A
Depends on: N/A
Status: approved

Status semantics:
- draft: Plan is being created or updated and is not final.
- approved: Plan is approved but not yet applied in code.
- documentation: Code currently exists and matches the plan contract.

Update rule:
- When an existing plan is edited, set status to draft until re-approved.

System Intent

What is being built: Phone-only audio recording path in capture-session.ts. When the user selects "phone" as their recording device, the phone's built-in microphone captures audio in 30-second windowed WAV chunks (matching the glasses audio-only flow) and uploads them through the existing PersistentUploadQueue and /sessions/{sessionId}/audio?windowIndex=<n> API.
Primary consumer(s): startCapture / stopCapture in main/app/lib/capture-session.ts, triggered via CaptureButton → recording-control-orchestrator.
Boundary (black-box scope only): The phone-only recording path is entirely inside capture-session.ts and a new phone-audio-capture.ts helper. No changes to the server API, PersistentUploadQueue, recording-control-orchestrator, or UI components.
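
To make the upload target concrete, here is a minimal sketch of how the /sessions/{sessionId}/audio?windowIndex=<n> path could be built. The helper name buildAudioUploadPath and its validation rules are assumptions for illustration; the existing PersistentUploadQueue may construct the path differently.

```typescript
// Hypothetical helper: the name and validation rules are illustrative, not part of the plan.
function buildAudioUploadPath(sessionId: string, windowIndex: number): string {
  if (sessionId.length === 0) {
    throw new Error("sessionId must not be empty");
  }
  if (!Number.isInteger(windowIndex) || windowIndex < 0) {
    throw new RangeError(`windowIndex must be a non-negative integer, got ${windowIndex}`);
  }
  // Encode the session id so unusual characters cannot alter the path.
  return `/sessions/${encodeURIComponent(sessionId)}/audio?windowIndex=${windowIndex}`;
}
```

For example, buildAudioUploadPath("abc123", 0) yields "/sessions/abc123/audio?windowIndex=0".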

Stage Gate Tracker

[x] Stage 1 Mermaid approved
[x] Stage 2 I/O contracts approved
[x] Stage 3 pseudocode/technical details approved or skipped

1. Mermaid Diagram

flowchart TD
  UserRecord([User — Record]) -->|tap record| RecordingOrchestrator["recording-control-orchestrator\nmain/app/lib/recording-control-orchestrator.ts"]:::unchanged
  RecordingOrchestrator -->|startCapture call| CaptureSession["capture-session.ts\nmain/app/lib/capture-session.ts"]:::updated
  CaptureSession -->|startPhoneAudioCapture call| PhoneAudioCapture["phone-audio-capture.ts\nmain/app/lib/phone-audio-capture.ts"]:::created
  PhoneAudioCapture -->|requestPermissionsAsync| ExpoAV["expo-av Audio.Recording\nexternal SDK"]:::unchanged
  ExpoAV -->|30-second WAV chunk URI| PhoneAudioCapture
  PhoneAudioCapture -->|AudioChunkEvent filePath and windowIndex| CaptureSession
  CaptureSession -->|enqueue audio/wav with sessionId and windowIndex| UploadQueue["PersistentUploadQueue\nmain/app/lib/persistent-upload-queue.ts"]:::unchanged
  UploadQueue -->|POST audio/wav windowIndex=n| SessionsAPI["Server API\n/sessions/id/audio"]:::unchanged
  SessionsAPI -->|stored| S3["S3 — external storage"]:::unchanged
  SessionsAPI -->|triggers on window complete| IngestLambda["Ingest Lambda — external"]:::unchanged

  UserStop([User — Stop]) -->|tap stop| RecordingOrchestrator
  RecordingOrchestrator -->|stopCapture call| CaptureSession
  CaptureSession -->|stopPhoneAudioCapture call| PhoneAudioCapture
  PhoneAudioCapture -->|final partial WAV chunk via callback| CaptureSession
  CaptureSession -->|flush queue then POST end| SessionsAPI

classDef unchanged fill:#d3d3d3,stroke:#666,stroke-width:1px;
classDef updated fill:#ffe58a,stroke:#666,stroke-width:1px;
classDef created fill:#a8e6a3,stroke:#666,stroke-width:1px;

2. Black-Box Inputs and Outputs

Global Types

RecordingDevice {
  value: "glasses" | "glasses-audio" | "phone"
}

AudioChunkEvent {
  filePath: string   (absolute file:// URI to WAV file on device)
  windowIndex: number (0-based sequential chunk index)
  durationMs: number  (actual duration of this chunk)
  sizeBytes: number   (file size in bytes)
}

UploadItem {
  id: string
  type: "audio"
  uri: string          (file:// URI)
  sessionId: string
  windowIndex: number
  retryCount: number
  sizeBytes: number
}
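
The two contract shapes above can be sketched as TypeScript interfaces, together with one possible mapping from a chunk event to a queue item. The toUploadItem helper, the id scheme, and the initial retryCount of 0 are assumptions; the real PersistentUploadQueue may assign these itself.

```typescript
interface AudioChunkEvent {
  filePath: string;    // absolute file:// URI to the WAV file on device
  windowIndex: number; // 0-based sequential chunk index
  durationMs: number;  // actual duration of this chunk
  sizeBytes: number;   // file size in bytes
}

interface UploadItem {
  id: string;
  type: "audio";
  uri: string;
  sessionId: string;
  windowIndex: number;
  retryCount: number;
  sizeBytes: number;
}

// Hypothetical mapping; the id scheme and retryCount default are assumptions.
function toUploadItem(sessionId: string, event: AudioChunkEvent): UploadItem {
  return {
    id: `${sessionId}:audio:${event.windowIndex}`,
    type: "audio",
    uri: event.filePath,
    sessionId,
    windowIndex: event.windowIndex,
    retryCount: 0,
    sizeBytes: event.sizeBytes,
  };
}
```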

Flow: `startPhoneAudioCapture`

Test files: main/app/__tests__/phone-audio-capture.test.ts
Core files: main/app/lib/phone-audio-capture.ts

Type Definitions

StartPhoneAudioCaptureInput {
  onChunkReady: (event: AudioChunkEvent) => void   (callback invoked per 30-second WAV chunk)
}

StartPhoneAudioCaptureOutput {
  void   (resolves when recording starts; rejects on permission denied or hardware error)
}

Paths

path-name	input	output/expected state change	path-type	notes	updated
`startPhoneAudioCapture.success`	`StartPhoneAudioCaptureInput`	void; expo-av recording active, 30-second chunking timer running	`happy path`	requests microphone permission; creates expo-av Audio.Recording with WAV recording options; starts 30-second interval to finalize and restart recording	Y
`startPhoneAudioCapture.permission-denied`	`StartPhoneAudioCaptureInput`	throws Error("Microphone permission denied")	`error`	no recording started; caller (startCapture) catches and calls cleanupSession	Y
`startPhoneAudioCapture.already-active`	`StartPhoneAudioCaptureInput`	throws Error("Phone audio capture already active")	`error`	guard against double-start	Y

Flow: `stopPhoneAudioCapture`

Test files: main/app/__tests__/phone-audio-capture.test.ts
Core files: main/app/lib/phone-audio-capture.ts

Type Definitions

StopPhoneAudioCaptureInput {
  void
}

StopPhoneAudioCaptureOutput {
  void   (resolves after final partial chunk is finalized and onChunkReady fired)
}

Paths

path-name	input	output/expected state change	path-type	notes	updated
`stopPhoneAudioCapture.success`	void	void; chunk interval cleared; current recording stopped; final partial WAV chunk delivered via onChunkReady	`happy path`	partial chunk (< 30s) is still delivered so no audio is lost at session end	Y
`stopPhoneAudioCapture.not-active`	void	void (no-op)	`subpath`	idempotent; safe to call when not recording	Y

Flow: `startCapture` (phone-mode branch)

Test files: main/app/__tests__/capture-session.test.ts
Core files: main/app/lib/capture-session.ts

Type Definitions

StartCaptureInput {
  selectedDevice: RecordingDevice   (read from recording-device-preference)
  sdkAvailable: boolean             (WearablesModule.isAnyDeviceAvailable())
}

StartCaptureOutput {
  void   (session created, phone audio capture running, upload queue active)
}

Paths

path-name	input	output/expected state change	path-type	notes	updated
`startCapture.phone.success`	`selectedDevice=phone`	void; server session created with captureMode=audio_only; PersistentUploadQueue initialized; phone audio capture active	`happy path`	mirrors glasses-audio flow: same captureMode, same upload queue, same session lifecycle	Y
`startCapture.phone.permission-denied`	`selectedDevice=phone`	throws; server session cleaned up via cleanupSession()	`error`	session is opened before capture starts (same as glasses flow); must be ended on failure	Y
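
The permission-denied path above depends on ordering: the server session exists before capture starts, so a capture failure has to end it. A minimal sketch of that ordering follows, with all collaborators injected. The StartDeps interface and the startPhoneSession name are hypothetical and do not mirror the real startCapture signature.

```typescript
// Hypothetical dependency bundle standing in for the real session and capture calls.
interface StartDeps {
  createSession: () => Promise<string>; // resolves to the new sessionId
  startPhoneAudioCapture: () => Promise<void>;
  cleanupSession: (sessionId: string) => Promise<void>;
}

async function startPhoneSession(deps: StartDeps): Promise<string> {
  const sessionId = await deps.createSession();
  try {
    await deps.startPhoneAudioCapture();
    return sessionId;
  } catch (err) {
    // The server session is already open, so end it before surfacing the error.
    try {
      await deps.cleanupSession(sessionId);
    } catch {
      // Cleanup failures are swallowed so the original error reaches the caller.
    }
    throw err;
  }
}
```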

Flow: `stopCapture` (phone-mode branch)

Test files: main/app/__tests__/capture-session.test.ts
Core files: main/app/lib/capture-session.ts

Type Definitions

StopCaptureInput {
  activeRecordingDevice: "phone"   (captured at startCapture time)
}

StopCaptureOutput {
  void   (phone audio stopped, final chunk enqueued, queue flushed, session ended)
}

Paths

path-name	input	output/expected state change	path-type	notes	updated
`stopCapture.phone.success`	`activeRecordingDevice=phone`	void; phone audio stopped; final chunk enqueued; upload queue flushed; /sessions/{id}/end called	`happy path`	mirrors glasses-audio teardown path: stop audio → flush queue → end session	Y
`stopCapture.phone.stop-error`	`activeRecordingDevice=phone` + stopPhoneAudioCapture throws	partial stop; error logged; queue still flushed; session still ended	`error`	same resilience pattern as glasses teardown — errors are caught and logged, not re-thrown before session end	Y
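
The stop-error path above can be sketched as follows: a failure while stopping audio is logged, and the queue flush and session end still run. The TeardownDeps interface and the teardownPhoneSession name are assumptions for illustration; only the log step names come from this plan.

```typescript
// Hypothetical dependency bundle; the real stopCapture calls these directly.
interface TeardownDeps {
  stopPhoneAudioCapture: () => Promise<void>;
  flushQueue: () => Promise<void>;
  endSession: () => Promise<void>;
  log: (step: string, error?: string) => void;
}

async function teardownPhoneSession(deps: TeardownDeps): Promise<void> {
  try {
    await deps.stopPhoneAudioCapture();
    deps.log("capture_phone_audio_stopped");
  } catch (err) {
    // Log and continue: the session must still be flushed and ended.
    const message = err instanceof Error ? err.message : String(err);
    deps.log("capture_phone_audio_stop_error", message);
  }
  await deps.flushQueue();
  await deps.endSession();
}
```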

3. Pseudocode / Technical Details for Critical Flows (Optional)

New file: `main/app/lib/phone-audio-capture.ts`

import { Audio } from "expo-av";

const CHUNK_DURATION_MS = 30_000

let activeRecording: Audio.Recording | null = null
let chunkTimer: ReturnType<typeof setInterval> | null = null
let chunkStartedAt = 0           // Date.now() when the active chunk started
let windowIndex = 0
let chunkCallback: ((event: AudioChunkEvent) => void) | null = null

// getFileSize(uri) is assumed to be an existing helper that returns the file size in bytes.
// PHONE_RECORDING_OPTIONS stands for the options chosen under "Audio recording options" below.

async function finalizeChunk(recording: Audio.Recording, startedAt: number): Promise<void>
  await recording.stopAndUnloadAsync()
  const durationMs = Date.now() - startedAt   // actual duration; the final chunk is usually < 30 s
  const uri = recording.getURI()              // file:// URI of the on-device audio file
  if uri:
    const sizeBytes = await getFileSize(uri)
    chunkCallback?.({ filePath: uri, windowIndex, durationMs, sizeBytes })
    windowIndex++

async function startNewChunk(): Promise<void>
  const recording = new Audio.Recording()
  await recording.prepareToRecordAsync(PHONE_RECORDING_OPTIONS)
  await recording.startAsync()
  activeRecording = recording
  chunkStartedAt = Date.now()

export async function startPhoneAudioCapture(onChunkReady): Promise<void>
  if activeRecording !== null: throw new Error("Phone audio capture already active")
  const { granted } = await Audio.requestPermissionsAsync()
  if !granted: throw new Error("Microphone permission denied")
  await Audio.setAudioModeAsync({ allowsRecordingIOS: true, playsInSilentModeIOS: true })
  windowIndex = 0
  chunkCallback = onChunkReady
  await startNewChunk()
  chunkTimer = setInterval(async () =>
    const currentRecording = activeRecording
    if !currentRecording: return
    // expo-av allows only one prepared Recording at a time, so the current chunk
    // is finalized before the next one starts; expect a brief gap between chunks.
    await finalizeChunk(currentRecording, chunkStartedAt)
    await startNewChunk()
  , CHUNK_DURATION_MS)
  // NOTE: a stop that arrives while a rotation is in flight must wait for it;
  // the two need to be serialized (not shown here).

export async function stopPhoneAudioCapture(): Promise<void>
  if !activeRecording: return
  if chunkTimer: clearInterval(chunkTimer)
  chunkTimer = null
  const last = activeRecording
  activeRecording = null
  try:
    await finalizeChunk(last, chunkStartedAt)   // flush partial final chunk
  finally:
    chunkCallback = null            // cleared only after the final chunk is delivered
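
The rotation logic can be exercised without expo-av by injecting a recorder. The sketch below is one possible shape, not the planned implementation: ChunkRecorder, createPhoneChunker, and the injected clock are all hypothetical. It assumes only one recorder may be open at a time, so each rotation stops the current recorder before starting the next, and it serializes start, rotate, and stop so a timer tick cannot overlap a stop call.

```typescript
interface ChunkEvent {
  filePath: string;
  windowIndex: number;
  durationMs: number;
}

// Hypothetical stand-in for expo-av's Audio.Recording.
interface ChunkRecorder {
  start(): Promise<void>;
  stop(): Promise<string>; // resolves to the recorded file URI
}

function createPhoneChunker(
  makeRecorder: () => ChunkRecorder,
  onChunk: (event: ChunkEvent) => void,
  now: () => number = Date.now,
) {
  let current: ChunkRecorder | null = null;
  let startedAt = 0;
  let windowIndex = 0;
  let queue: Promise<void> = Promise.resolve();

  // Run tasks strictly one after another; a failed task does not block later ones.
  function enqueue(task: () => Promise<void>): Promise<void> {
    const run = queue.then(task);
    queue = run.catch(() => {});
    return run;
  }

  async function begin(): Promise<void> {
    const recorder = makeRecorder();
    await recorder.start();
    current = recorder;
    startedAt = now();
  }

  async function finalize(recorder: ChunkRecorder): Promise<void> {
    current = null;
    const filePath = await recorder.stop();
    // Report the measured duration so a short final chunk is not labeled as 30 s.
    onChunk({ filePath, windowIndex: windowIndex++, durationMs: now() - startedAt });
  }

  return {
    start: () =>
      enqueue(async () => {
        if (current) throw new Error("Phone audio capture already active");
        windowIndex = 0;
        await begin();
      }),
    // Intended to be called from a 30-second interval timer.
    rotate: () =>
      enqueue(async () => {
        if (!current) return;
        await finalize(current);
        await begin();
      }),
    // Idempotent: resolves without effect when nothing is recording.
    stop: () =>
      enqueue(async () => {
        if (!current) return;
        await finalize(current);
      }),
  };
}
```

In the real module, rotate would be driven by setInterval(..., 30_000) and the recorder factory would wrap Audio.Recording. Stopping before starting leaves a brief gap between chunks, which is the trade-off for never holding two recorders at once.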

Changes to `main/app/lib/capture-session.ts`

// In startCapture():
if useWearablesCapture:
  // existing glasses path unchanged
else:
  // PHONE PATH (was empty stub)
  logger({ step: "capture_phone_mode_starting" })
  startAudioChunkListener()   // reuse same callback → enqueues to uploadQueue
  // But instead of WearablesModule, use phone-audio-capture:
  await startPhoneAudioCapture((event) =>
    uploadQueue?.enqueue("audio", toFileUri(event.filePath), {
      windowIndex: event.windowIndex,
      sizeBytes: event.sizeBytes,
    })
  )
  logger({ step: "capture_phone_mode_started" })

// In stopCapture():
if startedWithPhone:         // activeRecordingDevice === "phone"
  try:
    await stopPhoneAudioCapture()  // flushes last chunk via callback
    logger({ step: "capture_phone_audio_stopped" })
  catch e:
    logger({ step: "capture_phone_audio_stop_error", additional: { error: e.message } })
  // flush queue (same as glasses-audio path)
  if uploadQueue:
    await uploadQueue.flush()
  // then fall through to /sessions/{id}/end (existing code)
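
The enqueue callback above relies on toFileUri and on the queue's enqueue signature, neither of which is defined in this plan. The sketch below shows one plausible shape for both. The UploadQueueLike interface, the makeChunkHandler name, and this toFileUri body are assumptions inferred from the call shown in the pseudocode.

```typescript
interface ChunkInfo {
  filePath: string;
  windowIndex: number;
  sizeBytes: number;
}

// Assumed subset of the PersistentUploadQueue surface used by the phone path.
interface UploadQueueLike {
  enqueue(type: "audio", uri: string, meta: { windowIndex: number; sizeBytes: number }): void;
}

// Hypothetical version of toFileUri: leaves file:// URIs alone, prefixes bare paths.
function toFileUri(path: string): string {
  if (path.startsWith("file://")) return path;
  return path.startsWith("/") ? `file://${path}` : `file:///${path}`;
}

// Builds the onChunkReady callback; the queue is looked up lazily because it
// may already be torn down when a late chunk arrives.
function makeChunkHandler(getQueue: () => UploadQueueLike | null) {
  return (event: ChunkInfo): void => {
    getQueue()?.enqueue("audio", toFileUri(event.filePath), {
      windowIndex: event.windowIndex,
      sizeBytes: event.sizeBytes,
    });
  };
}
```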

`captureMode` sent to server

Phone mode sends captureMode: "audio_only" (same as glasses-audio), since no frames are captured. The existing code currently sends audio_video as a fallback for phone; this changes to audio_only.
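
A minimal sketch of the resulting device-to-mode mapping follows. The captureModeFor name is hypothetical, and the "glasses" to audio_video row is an assumption; this plan only states the modes for glasses-audio and phone.

```typescript
type RecordingDevice = "glasses" | "glasses-audio" | "phone";
type CaptureMode = "audio_video" | "audio_only";

function captureModeFor(device: RecordingDevice): CaptureMode {
  switch (device) {
    case "glasses":
      return "audio_video"; // assumption: full glasses capture includes frames
    case "glasses-audio":
    case "phone":
      return "audio_only"; // no frames are captured on these paths
  }
}
```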

expo-av dependency

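expo-av must be added to main/app/package.json via npx expo install expo-av. It is a first-party Expo package and needs no manual native linking, but recording requires the microphone permission to be declared for both platforms (NSMicrophoneUsageDescription on iOS, RECORD_AUDIO on Android). The Audio.Recording API is available on iOS and Android.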

Audio recording options

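The target format is 16-bit 44.1 kHz mono WAV, compatible with the existing /sessions/{id}/audio endpoint, which accepts audio/wav. Audio.RecordingOptionsPresets.HIGH_QUALITY cannot be used as-is for this: in expo-av that preset records AAC into an .m4a container with 2 channels, not WAV. Pass custom recording options instead. On iOS, request the linear PCM output format with a .wav extension, 16-bit depth, a 44.1 kHz sample rate, and 1 channel. On Android, expo-av records through MediaRecorder, whose output formats do not include WAV, so the Android path needs either a different capture mechanism or an on-device conversion step before upload; this should be settled before implementation.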
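
Because the endpoint expects audio/wav, a cheap guard is to read a chunk's RIFF header before enqueueing it and compare the format fields with the target. The sketch below assumes the canonical layout in which the fmt chunk immediately follows the RIFF/WAVE tags; files with extra leading chunks would need a real chunk walker. The function names are hypothetical.

```typescript
interface WavFormat {
  audioFormat: number; // 1 means uncompressed PCM
  channels: number;
  sampleRate: number;
  bitsPerSample: number;
}

// Parses the fixed-offset fields of a canonical WAV header; returns null otherwise.
function parseWavHeader(bytes: Uint8Array): WavFormat | null {
  if (bytes.byteLength < 36) return null;
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const tag = (offset: number): string =>
    String.fromCharCode(
      view.getUint8(offset),
      view.getUint8(offset + 1),
      view.getUint8(offset + 2),
      view.getUint8(offset + 3),
    );
  if (tag(0) !== "RIFF" || tag(8) !== "WAVE" || tag(12) !== "fmt ") return null;
  return {
    audioFormat: view.getUint16(20, true),
    channels: view.getUint16(22, true),
    sampleRate: view.getUint32(24, true),
    bitsPerSample: view.getUint16(34, true),
  };
}

// Expected PCM payload size, excluding the header.
function expectedPcmBytes(
  durationMs: number,
  sampleRate: number,
  channels: number,
  bitsPerSample: number,
): number {
  return Math.round((durationMs / 1000) * sampleRate * channels * (bitsPerSample / 8));
}
```

For a full 30-second window at 16-bit 44.1 kHz mono, expectedPcmBytes(30000, 44100, 1, 16) gives 2,646,000 bytes, roughly 2.5 MiB per chunk before the header.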

Implementation notes

activeRecordingDevice is captured at startCapture time and used in stopCapture — ensures the correct teardown path runs even if the user changes their preference mid-session. This is consistent with the existing glasses teardown guard.
stopPhoneAudioCapture must finalize the last partial chunk before resolving so the queue can flush it. This mirrors WearablesModule.stopAudioCapture() which fires one last onAudioChunkReady event for the partial buffer before resolving.
No frames are captured in phone mode. startFramePolling / stopFramesTeardown are not called.
The cleanupSession function (used on error) currently calls WearablesModule.stopAudioCapture() and WearablesModule.stopStreamSession(). It must also call stopPhoneAudioCapture() when the device was phone to avoid a leak.
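
The first and last notes both depend on remembering which device a session started with. A minimal sketch of that snapshot follows, assuming a hypothetical createDeviceSnapshot helper; the real code keeps activeRecordingDevice as module state in capture-session.ts.

```typescript
type RecordingDevice = "glasses" | "glasses-audio" | "phone";

// Hypothetical helper: captures the device once at start so teardown and
// cleanup never consult the live preference.
function createDeviceSnapshot(readPreference: () => RecordingDevice) {
  let active: RecordingDevice | null = null;
  return {
    onStart(): RecordingDevice {
      active = readPreference();
      return active;
    },
    // stopCapture and cleanupSession both branch on this value.
    teardownTarget(): RecordingDevice | null {
      return active;
    },
    onEnd(): void {
      active = null;
    },
  };
}
```

With this shape, cleanupSession would call stopPhoneAudioCapture() when teardownTarget() returns "phone" and the WearablesModule stop calls otherwise.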

4. Handoff to Related Plan Reconciliation

After all stages are approved, apply .agent/skills/reconcile-plans/SKILL.md to propagate contract updates across linked plans.
