Multi-Camera Streaming from Mobile Device
Simultaneous capture from the front and main cameras became technically feasible on iOS with AVCaptureMultiCamSession in iOS 13. Before that, any "multi-camera" stream was a simulation: camera switching with a delay, or a pre-recorded second source. On iPhone XS and newer you can now capture both streams simultaneously, with the limitations described below.
Architectural Limits You Need to Know First
AVCaptureMultiCamSession is not supported on all devices, so a check before initialization is mandatory:
guard AVCaptureMultiCamSession.isMultiCamSupported else {
// fallback to AVCaptureSession with one camera
return
}
With an active multi-cam session the maximum resolution per camera is reduced: on iPhone 13 Pro it is 1920×1440 from the main camera and 1280×960 from the front camera simultaneously. Attempting 4K on both triggers AVCaptureSessionRuntimeErrorNotification with an AVError.outOfMemory code. In production we fix 1280×720 for both cameras, which is sufficient for a live stream.
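A minimal sketch of pinning a camera to 720p, using the fact that AVCaptureDevice.Format exposes an isMultiCamSupported flag on iOS 13+ (the function name is illustrative):

```swift
import AVFoundation

// Sketch: pick the first 1280×720 format that is explicitly flagged as
// multi-cam capable, instead of relying on session presets.
func configure720p(_ device: AVCaptureDevice) throws {
    guard let format = device.formats.first(where: { format in
        let dims = CMVideoFormatDescriptionGetDimensions(format.formatDescription)
        return format.isMultiCamSupported && dims.width == 1280 && dims.height == 720
    }) else { return }

    try device.lockForConfiguration()
    device.activeFormat = format
    device.unlockForConfiguration()
}
```

Selecting activeFormat directly avoids the runtime error above: only formats flagged as multi-cam capable are legal while the multi-cam session is running.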
Thermal management. Running two ISPs (Image Signal Processors) plus GPU composition heats the device quickly. On an iPhone 12 mini, thermal throttling kicks in around the 30-minute mark of a dual-camera stream, and the system forces the frame rate down to 20 fps. The solution is to monitor ProcessInfo.ThermalState and switch to a single camera on .serious.
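The thermal check above can be sketched with the standard thermal-state notification; `switchToSingleCamera()` is a hypothetical hook into the capture pipeline, not an API:

```swift
import Foundation

// Sketch: react to thermal pressure before the system throttles the stream.
final class ThermalMonitor {
    private var observer: NSObjectProtocol?

    func start() {
        // Fires whenever ProcessInfo.processInfo.thermalState changes.
        observer = NotificationCenter.default.addObserver(
            forName: ProcessInfo.thermalStateDidChangeNotification,
            object: nil,
            queue: .main
        ) { [weak self] _ in
            switch ProcessInfo.processInfo.thermalState {
            case .serious, .critical:
                // Drop to one camera; dual ISP load is the main heat source.
                self?.switchToSingleCamera()
            default:
                break
            }
        }
    }

    private func switchToSingleCamera() {
        // Remove the front-camera input/output from the session here.
    }
}
```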
Android with the Camera2 API: simultaneous capture is supported via CameraManager.getCameraCharacteristics with LOGICAL_MULTI_CAMERA, or by explicitly opening two physical cameras. In practice, support depends on the OEM: a Samsung Galaxy S22+ delivers two streams, while a budget Xiaomi may return ERROR_CAMERA_IN_USE. Testing on the target device pool before release is mandatory.
How We Build Pipeline
On iOS the scheme is: AVCaptureMultiCamSession → two AVCaptureDeviceInput → two AVCaptureVideoDataOutput → Metal compositor → VideoToolbox encoder → RTMP/SRT.
The key point is composition. Two video streams can't feed one encoder directly; the frames have to be mixed via MTKView or CIFilter. We use Metal with a custom shader: the main camera fills the frame, and the front camera is rendered as a PiP rectangle in a corner.
// When using AVCaptureDataOutputSynchronizer, do NOT also set per-output
// sample buffer delegates; the synchronizer delivers matched frames from
// both cameras through its own delegate on a single queue.
let syncQueue = DispatchQueue(label: "camera.sync")
let synchronizer = AVCaptureDataOutputSynchronizer(
    dataOutputs: [backOutput, frontOutput]
)
synchronizer.setDelegate(self, queue: syncQueue)
AVCaptureDataOutputSynchronizer is a mandatory element. Without it, frames from the two cameras arrive desynchronized by up to 33 ms (one frame at 30 fps), and the PiP window jitters relative to the main stream.
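A sketch of the synchronizer callback. Core Image composition is shown here as a simpler stand-in for the Metal shader described above; `StreamCompositor` and the PiP geometry constants are illustrative assumptions:

```swift
import AVFoundation
import CoreImage

// Sketch: receive matched frame pairs and composite front over back as PiP.
final class StreamCompositor: NSObject, AVCaptureDataOutputSynchronizerDelegate {
    let backOutput: AVCaptureVideoDataOutput
    let frontOutput: AVCaptureVideoDataOutput

    init(backOutput: AVCaptureVideoDataOutput, frontOutput: AVCaptureVideoDataOutput) {
        self.backOutput = backOutput
        self.frontOutput = frontOutput
    }

    func dataOutputSynchronizer(_ synchronizer: AVCaptureDataOutputSynchronizer,
                                didOutput collection: AVCaptureSynchronizedDataCollection) {
        guard
            let back = collection.synchronizedData(for: backOutput)
                as? AVCaptureSynchronizedSampleBufferData,
            let front = collection.synchronizedData(for: frontOutput)
                as? AVCaptureSynchronizedSampleBufferData,
            !back.sampleBufferWasDropped, !front.sampleBufferWasDropped,
            let backPixels = CMSampleBufferGetImageBuffer(back.sampleBuffer),
            let frontPixels = CMSampleBufferGetImageBuffer(front.sampleBuffer)
        else { return } // dropped or mismatched frames are skipped

        let backImage = CIImage(cvPixelBuffer: backPixels)
        let frontImage = CIImage(cvPixelBuffer: frontPixels)

        // Scale the front frame to a quarter of the back frame's width
        // and place it near the bottom-right corner.
        let scale = (backImage.extent.width * 0.25) / frontImage.extent.width
        let pip = frontImage
            .transformed(by: CGAffineTransform(scaleX: scale, y: scale))
            .transformed(by: CGAffineTransform(translationX: backImage.extent.width * 0.72,
                                               y: backImage.extent.height * 0.05))
        let composed = pip.composited(over: backImage)
        _ = composed // render into a CVPixelBuffer and hand it to the encoder
    }
}
```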
For SRT broadcasting (a more stable alternative to RTMP on mobile), use libsrt compiled for iOS/Android, or HaishinKit 2.x with built-in SRT support.
Managing PiP Position During Stream
Make the PiP position draggable via UIPanGestureRecognizer with animated snap-to-corners. Save the coordinates in UserDefaults so the user doesn't have to reposition the window on every launch.
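The drag-and-snap behavior can be sketched as follows; `pipView` and the UserDefaults key are illustrative assumptions:

```swift
import UIKit

// Sketch: drag the PiP window, snap it to the nearest corner on release,
// and persist the chosen position across launches.
final class PiPDragController {
    let pipView: UIView
    private let defaultsKey = "pip.center"

    init(pipView: UIView) {
        self.pipView = pipView
        let pan = UIPanGestureRecognizer(target: self, action: #selector(handlePan(_:)))
        pipView.addGestureRecognizer(pan)
    }

    @objc private func handlePan(_ pan: UIPanGestureRecognizer) {
        guard let container = pipView.superview else { return }
        let translation = pan.translation(in: container)
        pipView.center = CGPoint(x: pipView.center.x + translation.x,
                                 y: pipView.center.y + translation.y)
        pan.setTranslation(.zero, in: container)

        if pan.state == .ended {
            let corner = nearestCorner(in: container.bounds)
            // Persist so the window comes back in the same corner next launch.
            UserDefaults.standard.set([corner.x, corner.y], forKey: defaultsKey)
            UIView.animate(withDuration: 0.25) { self.pipView.center = corner }
        }
    }

    private func nearestCorner(in bounds: CGRect) -> CGPoint {
        let dx = pipView.bounds.midX + 16, dy = pipView.bounds.midY + 16
        let candidates = [
            CGPoint(x: bounds.minX + dx, y: bounds.minY + dy),
            CGPoint(x: bounds.maxX - dx, y: bounds.minY + dy),
            CGPoint(x: bounds.minX + dx, y: bounds.maxY - dy),
            CGPoint(x: bounds.maxX - dx, y: bounds.maxY - dy),
        ]
        return candidates.min {
            hypot($0.x - pipView.center.x, $0.y - pipView.center.y) <
            hypot($1.x - pipView.center.x, $1.y - pipView.center.y)
        }!
    }
}
```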
On orientation change the Metal shader receives new PiP coordinates automatically: a CADisplayLink recalculates the layout on each frame.
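A minimal sketch of that per-frame layout pass; `updatePiPLayout()` is a hypothetical hook that would feed the recomputed rectangle to the shader uniforms:

```swift
import UIKit

// Sketch: a display link that recomputes the PiP rectangle every frame,
// so orientation changes are picked up without explicit rotation handling.
final class LayoutDriver {
    private var displayLink: CADisplayLink?

    func start() {
        displayLink = CADisplayLink(target: self, selector: #selector(tick))
        displayLink?.add(to: .main, forMode: .common)
    }

    func stop() {
        displayLink?.invalidate()
        displayLink = nil
    }

    @objc private func tick() {
        updatePiPLayout() // read current bounds/orientation, update shader uniforms
    }

    private func updatePiPLayout() { /* ... */ }
}
```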
Common Mistakes
- Not adding the audio background mode (UIBackgroundModes: audio) for an AVCaptureMultiCamSession app: on suspend, iOS kills the session after about 30 seconds
- Ignoring sessionWasInterrupted on an incoming call: the stream must be paused and resumed on sessionInterruptionEnded
- Processing CMSampleBuffer on DispatchQueue.main: decoding and Metal rendering on the main thread delay the UI by 8–12 ms per frame
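The interruption handling from the second point can be sketched with the standard session notifications; `pauseStream()`/`resumeStream()` are hypothetical hooks into the broadcasting layer:

```swift
import AVFoundation

// Hypothetical hooks: stop pushing frames to the encoder, keep the
// network connection alive, then resume when the interruption ends.
func pauseStream() { /* ... */ }
func resumeStream() { /* ... */ }

// Sketch: pause on interruption (e.g. an incoming call), resume after.
func observeInterruptions(for session: AVCaptureMultiCamSession) {
    let center = NotificationCenter.default

    center.addObserver(forName: .AVCaptureSessionWasInterrupted,
                       object: session, queue: .main) { note in
        // The reason (phone call, camera in use by another client, etc.)
        // arrives in userInfo as a raw InterruptionReason value.
        if let raw = note.userInfo?[AVCaptureSessionInterruptionReasonKey] as? Int,
           let reason = AVCaptureSession.InterruptionReason(rawValue: raw) {
            print("Session interrupted: \(reason)")
        }
        pauseStream()
    }

    center.addObserver(forName: .AVCaptureSessionInterruptionEnded,
                       object: session, queue: .main) { _ in
        resumeStream()
    }
}
```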
Timeline
An iOS implementation with the Metal compositor, PiP, SRT/RTMP streaming, and thermal testing takes 4–6 weeks. Android with the Camera2 API adds another 2–3 weeks due to device fragmentation. Cost is calculated individually after requirements analysis.