360-Degree Panorama Viewer in Mobile App
360° panoramas are unlike static 360° photos: they are either video in equirectangular format or interactive scenes with points of interest (hotspots), transitions between locations, and an audio guide. The technical requirements are significantly higher than for a single photo: you need real-time 4K/8K video decoding, or you need to manage a set of high-resolution tiles with smooth transitions between detail levels.
Video Panoramas: Where Everything Breaks
360° video in equirectangular format decodes on iOS via AVPlayer, but AVPlayerLayer renders only in 2D. For spherical projection you need AVPlayerItemVideoOutput plus Metal or SceneKit.
Key problem: AVPlayerItemVideoOutput.copyPixelBuffer(forItemTime:itemTimeForDisplay:) blocks the calling thread while the frame is copied out. On an iPhone 12, pulling a 4K H.265 frame takes 8–15 ms, which is half the frame budget at 60 fps. Calling it from a CADisplayLink callback on the main thread produces visible frame drops.
The correct approach: CADisplayLink only triggers the Metal render pass; copying the CVPixelBuffer happens on a separate DispatchQueue(qos: .userInteractive), and the result is handed to Metal via CVMetalTextureCacheCreateTextureFromImage.
let displayLink = CADisplayLink(target: self, selector: #selector(renderFrame))
displayLink.preferredFrameRateRange = CAFrameRateRange(minimum: 60, maximum: 120, preferred: 120)

@objc func renderFrame() {
    videoDecodeQueue.async { [weak self] in
        guard let self else { return }
        let time = self.playerItem.currentTime()
        // Copy only when a new frame is ready — avoids redundant copies
        // (and the force-unwrapped self of the naive version)
        guard self.videoOutput.hasNewPixelBuffer(forItemTime: time),
              let pixelBuffer = self.videoOutput.copyPixelBuffer(forItemTime: time,
                                                                 itemTimeForDisplay: nil)
        else { return }
        self.metalRenderer.render(pixelBuffer: pixelBuffer)
    }
}
Tiled Panoramas (Virtual Tours)
High-quality static panoramas (hotels, real estate, museums) use not a single image but tiles, the same way map services do: several detail levels, each split into a grid. As the user zooms in (FOV decreases), higher-resolution tiles are loaded.
The de facto standards are the Krpano tile format and Marzipano. For mobile rendering use a custom Metal/OpenGL ES pipeline or PanoramaGL (Android). Tiles load via URLSession with priorities: first the visible quadrant of the sphere, then the 4 neighboring tiles (prefetch).
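A minimal sketch of that prioritization. The tile URL layout and the `visible` flag are assumptions for illustration; the standard mechanism is URLSessionTask.priority, which is a hint to the connection scheduler rather than a hard guarantee:

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking
#endif

/// Hypothetical tile loader: visible-quadrant tiles get high priority,
/// prefetched neighbors get low priority.
final class TileLoader {
    private let session = URLSession(configuration: .default)

    /// Builds (but does not start) a download task for one tile.
    func makeTask(for tileURL: URL, visible: Bool,
                  completion: @escaping (Data?) -> Void) -> URLSessionDataTask {
        let task = session.dataTask(with: tileURL) { data, _, _ in
            completion(data)
        }
        // Visible-quadrant tiles before neighbor prefetch
        task.priority = visible ? URLSessionTask.highPriority
                                : URLSessionTask.lowPriority
        return task
    }
}
```

The caller enqueues visible tiles first and calls `resume()` on each returned task.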
An LRU cache for tiles is mandatory. Without one, navigating a virtual tour (10+ locations × 6 cube faces × 4 detail levels) grows memory to 800 MB within 15 minutes.
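A minimal sketch of such a cache (entry-count capacity for brevity; production code would count cost in bytes, use a linked list instead of an array for O(1) eviction, and react to memory-pressure notifications):

```swift
/// Minimal LRU cache for decoded tiles, keyed by tile path.
final class LRUTileCache<Key: Hashable, Value> {
    private let capacity: Int
    private var storage: [Key: Value] = [:]
    private var order: [Key] = []             // least recently used first

    init(capacity: Int) { self.capacity = max(1, capacity) }

    func value(forKey key: Key) -> Value? {
        guard let value = storage[key] else { return nil }
        touch(key)                            // mark as most recently used
        return value
    }

    func insert(_ value: Value, forKey key: Key) {
        // Evict the least recently used entry when inserting a new key at capacity
        if storage[key] == nil, storage.count >= capacity, let evicted = order.first {
            order.removeFirst()
            storage[evicted] = nil
        }
        storage[key] = value
        touch(key)
    }

    private func touch(_ key: Key) {
        if let i = order.firstIndex(of: key) { order.remove(at: i) }
        order.append(key)
    }
}
```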
Hotspots and Interactivity
A hotspot is a point in the 3D space of the sphere that becomes a 2D element on screen during rendering (icon, tooltip, navigation button). Converting from spherical coordinates (yaw/pitch) to screen coordinates:
import CoreGraphics
import simd

func sphericalToScreen(yaw: Float, pitch: Float,
                       cameraYaw: Float, cameraPitch: Float,
                       fov: Float, screenSize: CGSize) -> CGPoint? {
    // Spherical → Cartesian direction on the unit sphere
    let direction = SIMD3<Float>(cos(pitch) * sin(yaw),
                                 sin(pitch),
                                 cos(pitch) * cos(yaw))
    // Inverse camera rotation: world space → camera space
    let rotY = simd_quatf(angle: -cameraYaw, axis: SIMD3<Float>(0, 1, 0))
    let rotX = simd_quatf(angle: -cameraPitch, axis: SIMD3<Float>(1, 0, 0))
    let v = rotX.act(rotY.act(direction))
    guard v.z > 0 else { return nil }   // behind the camera
    // Perspective projection → NDC → screen (origin top-left)
    let aspect = Float(screenSize.width / screenSize.height)
    let f = 1 / tan(fov / 2)
    let x = CGFloat((v.x * f / aspect / v.z + 1) / 2) * screenSize.width
    let y = CGFloat((1 - v.y * f / v.z) / 2) * screenSize.height
    return CGPoint(x: x, y: y)
}
If dot(direction, cameraForward) < 0, the hotspot is behind the camera and should not be rendered (the guard in the function above).
Animate hotspot appearance when it enters the FOV with a fade-in via CABasicAnimation, not a SwiftUI animation: a SwiftUI overlay above a Metal/SCNView adds a 2–3 ms layout pass per frame.
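A sketch of that fade-in; the duration and easing are assumptions, the point is animating the hotspot's CALayer directly and bypassing SwiftUI layout:

```swift
import QuartzCore

/// Fade-in animation for a hotspot layer entering the FOV.
func makeFadeIn(duration: CFTimeInterval = 0.25) -> CABasicAnimation {
    let fade = CABasicAnimation(keyPath: "opacity")
    fade.fromValue = 0.0
    fade.toValue = 1.0
    fade.duration = duration
    fade.timingFunction = CAMediaTimingFunction(name: .easeOut)
    return fade
}

// Usage:
//   hotspotLayer.add(makeFadeIn(), forKey: "fadeIn")
//   hotspotLayer.opacity = 1   // also set the model value, or the layer snaps back
```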
Gyroscope and Compass
CMDeviceMotion via CMMotionManager provides the device-orientation quaternion at up to 100 Hz. Convert it to Euler angles for the scene camera and apply a smoothing filter (low-pass or Kalman) against jitter. Without filtering, gyroscope drift accumulates on older devices: over 5 minutes "north" can drift by ~15°.
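A sketch of the quaternion-to-Euler step plus a simple exponential low-pass filter as a lightweight alternative to Kalman. Axis conventions vary with device orientation and reference frame; this assumes yaw about Y and pitch about X:

```swift
import Foundation

/// Convert an orientation quaternion (x, y, z, w) — e.g. from
/// CMDeviceMotion.attitude.quaternion — to yaw/pitch for the panorama camera.
func yawPitch(x: Double, y: Double, z: Double, w: Double) -> (yaw: Double, pitch: Double) {
    let yaw = atan2(2 * (w * y + x * z), 1 - 2 * (y * y + x * x))
    let sinPitch = 2 * (w * x - y * z)
    let pitch = asin(max(-1, min(1, sinPitch)))  // clamp against FP noise
    return (yaw, pitch)
}

/// Exponential low-pass filter: alpha near 1 = more responsive, near 0 = smoother.
func lowPass(previous: Double, new: Double, alpha: Double = 0.15) -> Double {
    previous + alpha * (new - previous)
}
```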
The north direction at startup comes from CLLocationManager's heading; bind zero yaw to magnetic or true north.
Timeline
A 4K video panorama with a Metal renderer (iOS): 2–3 weeks. A virtual tour with tiles, hotspots, and transitions between locations, iOS + Android: 6–10 weeks depending on the number of interactive elements. Cost is calculated individually.







