Phone-Only Recording
Plan Metadata
- Plan type: plan
- Parent plan: N/A
- Depends on: N/A
- Status: approved

Status semantics:
- draft: Plan is being created or updated and is not final.
- approved: Plan is approved but not yet applied in code.
- documentation: Code currently exists and matches the plan contract.

Update rule:
- When an existing plan is edited, set status to draft until re-approved.
System Intent
- What is being built: Phone-only audio recording path in `capture-session.ts`. When the user selects "phone" as their recording device, the phone's built-in microphone captures audio in 30-second windowed WAV chunks (matching the glasses audio-only flow) and uploads them through the existing `PersistentUploadQueue` and the `/sessions/{sessionId}/audio?windowIndex=<n>` API.
- Primary consumer(s): `startCapture` / `stopCapture` in `main/app/lib/capture-session.ts`, triggered via `CaptureButton` → `recording-control-orchestrator`.
- Boundary (black-box scope only): The phone-only recording path is entirely inside `capture-session.ts` and a new `phone-audio-capture.ts` helper. No changes to the server API, `PersistentUploadQueue`, `recording-control-orchestrator`, or UI components.
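To make the upload target concrete, the sketch below builds the request path for a single chunk. Only the path shape `/sessions/{sessionId}/audio?windowIndex=<n>` comes from this plan; the helper name `buildAudioUploadPath` and its validation rules are hypothetical.

```typescript
// Hypothetical helper: builds the audio upload path for one 30-second window.
// Only the path shape is taken from the plan; the function name is illustrative.
function buildAudioUploadPath(sessionId: string, windowIndex: number): string {
  if (!Number.isInteger(windowIndex) || windowIndex < 0) {
    throw new RangeError(`windowIndex must be a non-negative integer, got ${windowIndex}`);
  }
  // Encode the session id so unexpected characters cannot alter the path.
  return `/sessions/${encodeURIComponent(sessionId)}/audio?windowIndex=${windowIndex}`;
}

const firstChunkPath = buildAudioUploadPath("abc123", 0);
// firstChunkPath is "/sessions/abc123/audio?windowIndex=0"
```

Rejecting negative or fractional indices early keeps a malformed chunk from reaching the queue, though the real queue may already enforce this.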
Stage Gate Tracker
- [x] Stage 1 Mermaid approved
- [x] Stage 2 I/O contracts approved
- [x] Stage 3 pseudocode/technical details approved or skipped
1. Mermaid Diagram
flowchart TD
UserRecord([User — Record]) -->|tap record| RecordingOrchestrator["recording-control-orchestrator\nmain/app/lib/recording-control-orchestrator.ts"]:::unchanged
RecordingOrchestrator -->|startCapture call| CaptureSession["capture-session.ts\nmain/app/lib/capture-session.ts"]:::updated
CaptureSession -->|startPhoneAudioCapture call| PhoneAudioCapture["phone-audio-capture.ts\nmain/app/lib/phone-audio-capture.ts"]:::created
PhoneAudioCapture -->|requestPermissionsAsync| ExpoAV["expo-av Audio.Recording\nexternal SDK"]:::unchanged
ExpoAV -->|30-second WAV chunk URI| PhoneAudioCapture
PhoneAudioCapture -->|AudioChunkEvent filePath and windowIndex| CaptureSession
CaptureSession -->|enqueue audio/wav with sessionId and windowIndex| UploadQueue["PersistentUploadQueue\nmain/app/lib/persistent-upload-queue.ts"]:::unchanged
UploadQueue -->|POST audio/wav windowIndex=n| SessionsAPI["Server API\n/sessions/id/audio"]:::unchanged
SessionsAPI -->|stored| S3["S3 — external storage"]:::unchanged
SessionsAPI -->|triggers on window complete| IngestLambda["Ingest Lambda — external"]:::unchanged
UserStop([User — Stop]) -->|tap stop| RecordingOrchestrator
RecordingOrchestrator -->|stopCapture call| CaptureSession
CaptureSession -->|stopPhoneAudioCapture call| PhoneAudioCapture
PhoneAudioCapture -->|final partial WAV chunk via callback| CaptureSession
CaptureSession -->|flush queue then POST end| SessionsAPI
classDef unchanged fill:#d3d3d3,stroke:#666,stroke-width:1px;
classDef updated fill:#ffe58a,stroke:#666,stroke-width:1px;
classDef created fill:#a8e6a3,stroke:#666,stroke-width:1px;
2. Black-Box Inputs and Outputs
Global Types
RecordingDevice {
value: "glasses" | "glasses-audio" | "phone"
}
AudioChunkEvent {
filePath: string (absolute file:// URI to WAV file on device)
windowIndex: number (0-based sequential chunk index)
durationMs: number (actual duration of this chunk)
sizeBytes: number (file size in bytes)
}
UploadItem {
id: string
type: "audio"
uri: string (file:// URI)
sessionId: string
windowIndex: number
retryCount: number
sizeBytes: number
}
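As an illustration of how these shapes relate, the sketch below declares `AudioChunkEvent` and `UploadItem` as TypeScript interfaces and maps one to the other. The mapping function `toUploadItem` and the id scheme `<sessionId>-audio-<windowIndex>` are assumptions made for the example; the plan does not say how `PersistentUploadQueue` assigns ids.

```typescript
interface AudioChunkEvent {
  filePath: string;    // absolute file:// URI to the WAV file on device
  windowIndex: number; // 0-based sequential chunk index
  durationMs: number;  // actual duration of this chunk
  sizeBytes: number;   // file size in bytes
}

interface UploadItem {
  id: string;
  type: "audio";
  uri: string;         // file:// URI
  sessionId: string;
  windowIndex: number;
  retryCount: number;
  sizeBytes: number;
}

// Hypothetical mapping; the id scheme is an assumption, not the queue's real one.
function toUploadItem(event: AudioChunkEvent, sessionId: string): UploadItem {
  return {
    id: `${sessionId}-audio-${event.windowIndex}`,
    type: "audio",
    uri: event.filePath,
    sessionId,
    windowIndex: event.windowIndex,
    retryCount: 0, // a freshly enqueued item has not been retried
    sizeBytes: event.sizeBytes,
  };
}
```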
Flow: startPhoneAudioCapture
- Test files: `main/app/__tests__/phone-audio-capture.test.ts`
- Core files: `main/app/lib/phone-audio-capture.ts`
Type Definitions
StartPhoneAudioCaptureInput {
onChunkReady: (event: AudioChunkEvent) => void (callback invoked per 30-second WAV chunk)
}
StartPhoneAudioCaptureOutput {
void (resolves when recording starts; rejects on permission denied or hardware error)
}
Paths
| path-name | input | output/expected state change | path-type | notes | updated |
|---|---|---|---|---|---|
| startPhoneAudioCapture.success | StartPhoneAudioCaptureInput | void; expo-av recording active, 30-second chunking timer running | happy path | requests microphone permission; creates expo-av Audio.Recording with WAV recording options; starts 30-second interval to finalize and restart recording | Y |
| startPhoneAudioCapture.permission-denied | StartPhoneAudioCaptureInput | throws Error("Microphone permission denied") | error | no recording started; caller (startCapture) catches and calls cleanupSession | Y |
| startPhoneAudioCapture.already-active | StartPhoneAudioCaptureInput | throws Error("Phone audio capture already active") | error | guard against double-start | Y |
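The three paths above can be exercised without a device if the hardware calls are injected. The following is a minimal sketch under that assumption: `RecorderDeps`, `createPhoneCaptureStarter`, and both dependency names are hypothetical stand-ins for the expo-av calls, and only the two error messages are taken from the table.

```typescript
// Dependencies are injected so the guard logic can run in a unit test.
// Both members are hypothetical stand-ins for the real expo-av calls.
interface RecorderDeps {
  requestPermission: () => Promise<boolean>;
  startRecording: () => Promise<void>;
}

function createPhoneCaptureStarter(deps: RecorderDeps): () => Promise<void> {
  let active = false;
  return async function start(): Promise<void> {
    // already-active path: reject a double start before touching hardware.
    if (active) throw new Error("Phone audio capture already active");
    active = true; // claim the slot first so concurrent starts are rejected too
    try {
      // permission-denied path: nothing is recorded.
      if (!(await deps.requestPermission())) {
        throw new Error("Microphone permission denied");
      }
      await deps.startRecording(); // success path
    } catch (err) {
      active = false; // roll back so a later start can retry
      throw err;
    }
  };
}
```

Setting the flag before the first `await` is a design choice of this sketch: it closes the window in which two overlapping calls could both pass the guard.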
Flow: stopPhoneAudioCapture
- Test files: `main/app/__tests__/phone-audio-capture.test.ts`
- Core files: `main/app/lib/phone-audio-capture.ts`
Type Definitions
StopPhoneAudioCaptureInput {
void
}
StopPhoneAudioCaptureOutput {
void (resolves after final partial chunk is finalized and onChunkReady fired)
}
Paths
| path-name | input | output/expected state change | path-type | notes | updated |
|---|---|---|---|---|---|
| stopPhoneAudioCapture.success | void | void; chunk interval cleared; current recording stopped; final partial WAV chunk delivered via onChunkReady | happy path | partial chunk (< 30s) is still delivered so no audio is lost at session end | Y |
| stopPhoneAudioCapture.not-active | void | void (no-op) | subpath | idempotent; safe to call when not recording | Y |
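To illustrate why the partial chunk matters, the sketch below computes the chunk durations a recording of a given length would produce under ideal timing. The function `planChunkDurations` is hypothetical and ignores timer drift and the time spent restarting the recorder, so real durations will differ slightly.

```typescript
const CHUNK_DURATION_MS = 30_000;

// Idealized model: returns the duration of each chunk for a recording that
// lasts totalMs. Timer drift and recorder restart time are not modeled.
function planChunkDurations(totalMs: number, chunkMs: number = CHUNK_DURATION_MS): number[] {
  if (chunkMs <= 0) throw new RangeError("chunkMs must be positive");
  const durations: number[] = [];
  let remaining = Math.max(0, totalMs);
  while (remaining >= chunkMs) {
    durations.push(chunkMs); // a full window
    remaining -= chunkMs;
  }
  if (remaining > 0) durations.push(remaining); // final partial chunk
  return durations;
}

const seventyFiveSeconds = planChunkDurations(75_000);
// seventyFiveSeconds is [30000, 30000, 15000]
```

Without the final partial chunk, the last 15 seconds of a 75-second session would never reach the upload queue.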
Flow: startCapture (phone-mode branch)
- Test files: `main/app/__tests__/capture-session.test.ts`
- Core files: `main/app/lib/capture-session.ts`
Type Definitions
StartCaptureInput {
selectedDevice: RecordingDevice (read from recording-device-preference)
sdkAvailable: boolean (WearablesModule.isAnyDeviceAvailable())
}
StartCaptureOutput {
void (session created, phone audio capture running, upload queue active)
}
Paths
| path-name | input | output/expected state change | path-type | notes | updated |
|---|---|---|---|---|---|
| startCapture.phone.success | selectedDevice=phone | void; server session created with captureMode=audio_only; PersistentUploadQueue initialized; phone audio capture active | happy path | mirrors glasses-audio flow: same captureMode, same upload queue, same session lifecycle | Y |
| startCapture.phone.permission-denied | selectedDevice=phone | throws; server session cleaned up via cleanupSession() | error | session is opened before capture starts (same as glasses flow); must be ended on failure | Y |
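The ordering constraint in the error path (session opened first, then cleaned up if capture fails) could be sketched as follows. `StartDeps` and `startPhoneSession` are hypothetical names, and the dependencies are injected only so the ordering can be checked in isolation; the real `startCapture` is not structured this way as far as this plan states.

```typescript
// Hypothetical dependency bundle; each member stands in for existing app code.
interface StartDeps {
  createSession: () => Promise<string>; // resolves to a sessionId
  startPhoneAudioCapture: () => Promise<void>;
  cleanupSession: () => Promise<void>;
}

async function startPhoneSession(deps: StartDeps): Promise<string> {
  // The server session is opened before capture starts.
  const sessionId = await deps.createSession();
  try {
    await deps.startPhoneAudioCapture();
  } catch (err) {
    // Capture failed (for example, permission denied): end the orphaned session.
    await deps.cleanupSession();
    throw err; // still surface the failure to the caller
  }
  return sessionId;
}
```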
Flow: stopCapture (phone-mode branch)
- Test files: `main/app/__tests__/capture-session.test.ts`
- Core files: `main/app/lib/capture-session.ts`
Type Definitions
StopCaptureInput {
activeRecordingDevice: "phone" (captured at startCapture time)
}
StopCaptureOutput {
void (phone audio stopped, final chunk enqueued, queue flushed, session ended)
}
Paths
| path-name | input | output/expected state change | path-type | notes | updated |
|---|---|---|---|---|---|
| stopCapture.phone.success | activeRecordingDevice=phone | void; phone audio stopped; final chunk enqueued; upload queue flushed; /sessions/{id}/end called | happy path | mirrors glasses-audio teardown path: stop audio → flush queue → end session | Y |
| stopCapture.phone.stop-error | activeRecordingDevice=phone + stopPhoneAudioCapture throws | partial stop; error logged; queue still flushed; session still ended | error | same resilience pattern as glasses teardown: errors are caught and logged, not re-thrown before session end | Y |
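The resilience pattern in the stop-error path can be shown with a small sketch in which a failing stop still leads to a flush and a session end. `StopDeps` and `stopPhoneSession` are hypothetical; the two log step names are the ones this plan uses in its pseudocode.

```typescript
// Hypothetical dependency bundle; each member stands in for existing app code.
interface StopDeps {
  stopPhoneAudioCapture: () => Promise<void>;
  flushQueue: () => Promise<void>;
  endSession: () => Promise<void>;
  log: (step: string, error?: string) => void;
}

async function stopPhoneSession(deps: StopDeps): Promise<void> {
  try {
    await deps.stopPhoneAudioCapture();
    deps.log("capture_phone_audio_stopped");
  } catch (err) {
    // Log and continue: a failed stop must not block flush or session end.
    const message = err instanceof Error ? err.message : String(err);
    deps.log("capture_phone_audio_stop_error", message);
  }
  await deps.flushQueue(); // upload whatever chunks were enqueued
  await deps.endSession(); // always reached, even after a stop error
}
```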
3. Pseudocode / Technical Details for Critical Flows (Optional)
New file: main/app/lib/phone-audio-capture.ts
import { Audio } from "expo-av";

const CHUNK_DURATION_MS = 30_000;

// RECORDING_OPTIONS: see "Audio recording options" below.
// getFileSize(uri): helper that resolves to the file size in bytes (not specified in this plan).

let activeRecording: Audio.Recording | null = null;
let chunkTimer: ReturnType<typeof setInterval> | null = null;
let windowIndex = 0;
let chunkCallback: ((event: AudioChunkEvent) => void) | null = null;

// Stop one recording and deliver it to the callback as a chunk.
async function finalizeChunk(recording: Audio.Recording): Promise<void> {
  const status = await recording.stopAndUnloadAsync();
  const uri = recording.getURI(); // file:// URI of the on-device audio file
  if (!uri) return;
  const sizeBytes = await getFileSize(uri);
  // Report the measured duration so a partial final chunk is not labeled as 30 s.
  chunkCallback?.({ filePath: uri, windowIndex, durationMs: status.durationMillis, sizeBytes });
  windowIndex++;
}

async function startNewChunk(): Promise<void> {
  const recording = new Audio.Recording();
  await recording.prepareToRecordAsync(RECORDING_OPTIONS);
  await recording.startAsync();
  activeRecording = recording;
}

export async function startPhoneAudioCapture(
  onChunkReady: (event: AudioChunkEvent) => void,
): Promise<void> {
  if (activeRecording !== null) throw new Error("Phone audio capture already active");
  const { granted } = await Audio.requestPermissionsAsync();
  if (!granted) throw new Error("Microphone permission denied");
  await Audio.setAudioModeAsync({ allowsRecordingIOS: true, playsInSilentModeIOS: true });
  windowIndex = 0;
  chunkCallback = onChunkReady;
  await startNewChunk();
  chunkTimer = setInterval(async () => {
    const current = activeRecording;
    if (!current) return;
    // expo-av allows only one prepared Recording at a time, so the current
    // chunk is finalized before the next one starts (a brief gap is possible).
    await finalizeChunk(current);
    if (chunkTimer !== null) await startNewChunk(); // skip restart if stopped meanwhile
  }, CHUNK_DURATION_MS);
}

export async function stopPhoneAudioCapture(): Promise<void> {
  if (!activeRecording) return; // idempotent no-op when not recording
  if (chunkTimer !== null) clearInterval(chunkTimer);
  chunkTimer = null;
  const last = activeRecording;
  activeRecording = null;
  try {
    await finalizeChunk(last); // flush the partial final chunk while the callback is still set
  } finally {
    chunkCallback = null; // cleared only after the final chunk has been delivered
  }
}
Changes to main/app/lib/capture-session.ts
// In startCapture():
if (useWearablesCapture) {
  // existing glasses path unchanged
} else {
  // PHONE PATH (was empty stub)
  logger({ step: "capture_phone_mode_starting" });
  startAudioChunkListener(); // reuse same callback → enqueues to uploadQueue
  // But instead of WearablesModule, use phone-audio-capture:
  await startPhoneAudioCapture((event) => {
    uploadQueue?.enqueue("audio", toFileUri(event.filePath), {
      windowIndex: event.windowIndex,
      sizeBytes: event.sizeBytes,
    });
  });
  logger({ step: "capture_phone_mode_started" });
}

// In stopCapture():
if (startedWithPhone) { // activeRecordingDevice === "phone"
  try {
    await stopPhoneAudioCapture(); // flushes last chunk via callback
    logger({ step: "capture_phone_audio_stopped" });
  } catch (e) {
    logger({ step: "capture_phone_audio_stop_error", additional: { error: (e as Error).message } });
  }
  // flush queue (same as glasses-audio path)
  if (uploadQueue) {
    await uploadQueue.flush();
  }
}
// then fall through to /sessions/{id}/end (existing code)
captureMode sent to server
Phone mode sends captureMode: "audio_only" (same as glasses-audio), since there are no frames. The existing code currently sends audio_video as a fallback for phone; this will change to audio_only.
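One way to keep this rule in a single place is a lookup table keyed by device, sketched below. The names `CAPTURE_MODE_BY_DEVICE` and `captureModeFor` are hypothetical, and the `"glasses"` → `"audio_video"` entry is an assumption; this plan only states the mode for glasses-audio and phone.

```typescript
type RecordingDevice = "glasses" | "glasses-audio" | "phone";
type CaptureMode = "audio_video" | "audio_only";

// Record<...> makes the compiler flag any device that lacks a mode.
const CAPTURE_MODE_BY_DEVICE: Record<RecordingDevice, CaptureMode> = {
  "glasses": "audio_video",       // assumption: not stated in this plan
  "glasses-audio": "audio_only",
  "phone": "audio_only",          // no frames are captured in phone mode
};

function captureModeFor(device: RecordingDevice): CaptureMode {
  return CAPTURE_MODE_BY_DEVICE[device];
}
```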
expo-av dependency
expo-av must be added to main/app/package.json. It is a first-party Expo package (no native linking ceremony beyond npx expo install). The Audio.Recording API is available on iOS and Android.
Audio recording options
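Record 16-bit 44.1 kHz mono WAV, which is compatible with the existing /sessions/{id}/audio endpoint (it accepts audio/wav). Note that Audio.RecordingOptionsPresets.HIGH_QUALITY does not produce WAV: it records stereo AAC in an .m4a container. WAV output needs custom recording options (on iOS, IOSOutputFormat.LINEARPCM with a .wav extension, 16-bit depth, and one channel). expo-av's Android output formats do not include WAV, so the Android path must be verified before implementation.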
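For sizing the upload queue, it helps to estimate how large one chunk is at the target format of 16-bit 44.1 kHz mono PCM. The sketch below does that arithmetic, assuming the canonical 44-byte WAV header; recorders may write larger headers, so treat the result as an approximation. The function name `pcmWavSizeBytes` is hypothetical.

```typescript
const WAV_HEADER_BYTES = 44; // canonical PCM WAV header; real files may differ

// Estimated file size for uncompressed PCM audio of the given duration.
function pcmWavSizeBytes(
  durationMs: number,
  sampleRateHz: number = 44_100,
  bitsPerSample: number = 16,
  channels: number = 1,
): number {
  const bytesPerSecond = sampleRateHz * (bitsPerSample / 8) * channels;
  return WAV_HEADER_BYTES + Math.round((bytesPerSecond * durationMs) / 1000);
}

const fullChunkBytes = pcmWavSizeBytes(30_000);
// 44100 samples/s * 2 bytes * 1 channel * 30 s + 44 = 2646044 bytes (about 2.5 MiB)
```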
Implementation notes
- `activeRecordingDevice` is captured at `startCapture` time and used in `stopCapture`. This ensures the correct teardown path runs even if the user changes their preference mid-session, and is consistent with the existing glasses teardown guard.
- `stopPhoneAudioCapture` must finalize the last partial chunk before resolving so the queue can flush it. This mirrors `WearablesModule.stopAudioCapture()`, which fires one last `onAudioChunkReady` event for the partial buffer before resolving.
- No frames are captured in phone mode. `startFramePolling` / `stopFramesTeardown` are not called.
- The `cleanupSession` function (used on error) currently calls `WearablesModule.stopAudioCapture()` and `WearablesModule.stopStreamSession()`. It must also call `stopPhoneAudioCapture()` when the device was phone to avoid a leak.
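The first and last notes can be combined into one small sketch: snapshot the device when capture starts, then derive the cleanup steps from that snapshot rather than from the current preference. `ActiveDeviceTracker` and `cleanupSteps` are hypothetical, and the steps are returned as names only so the selection logic can be checked without the native modules.

```typescript
type RecordingDevice = "glasses" | "glasses-audio" | "phone";

// Remembers which device a session started with, independent of later
// preference changes.
class ActiveDeviceTracker {
  private active: RecordingDevice | null = null;

  begin(preference: RecordingDevice): void {
    this.active = preference; // snapshot at startCapture time
  }

  end(): RecordingDevice | null {
    const device = this.active;
    this.active = null;
    return device;
  }
}

// Names of the teardown calls cleanupSession would make for a given device.
function cleanupSteps(device: RecordingDevice | null): string[] {
  if (device === null) return [];
  const steps = ["WearablesModule.stopAudioCapture", "WearablesModule.stopStreamSession"];
  if (device === "phone") steps.push("stopPhoneAudioCapture"); // avoids a leaked recording
  return steps;
}
```

Because `end()` returns the snapshot, a preference change from "phone" to "glasses" during a session still tears down the phone recorder.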
4. Handoff to Related Plan Reconciliation
After all stages are approved, apply .agent/skills/reconcile-plans/SKILL.md to propagate contract updates across linked plans.