Integrating ARKit into an iOS Application
ARKit 6 is not just overlaying 3D objects on the camera feed. It combines the LiDAR scanner, visual-inertial odometry, and neural-network scene understanding, and when misused it produces unstable tracking, occlusion artifacts, and crashes when an ARSession runs in the background.
What Breaks Most Often
Tracking drops in poor lighting. ARWorldTrackingConfiguration builds its world map from feature points; in dark rooms or on monochrome surfaces, tracking degrades to ARCamera.TrackingState.limited(.insufficientFeatures). You cannot ignore this state: if the user isn't told the scene needs more light, anchors start to drift and the 3D object flies off. The correct approach is to implement session(_:cameraDidChangeTrackingState:) and show appropriate instructions.
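A minimal sketch of that delegate callback; the hint-display helpers are hypothetical placeholders for your own overlay UI:

```swift
import ARKit

final class TrackingStateHandler: NSObject, ARSessionDelegate {
    // Called whenever tracking quality changes.
    func session(_ session: ARSession, cameraDidChangeTrackingState camera: ARCamera) {
        switch camera.trackingState {
        case .normal:
            hideHint()
        case .limited(.insufficientFeatures):
            showHint("Point the camera at a well-lit, textured surface")
        case .limited(.excessiveMotion):
            showHint("Move the device more slowly")
        case .limited:
            // .initializing, .relocalizing, and any future reasons
            showHint("Hold still while tracking initializes")
        case .notAvailable:
            showHint("Tracking is unavailable")
        }
    }

    // Hypothetical UI hooks; wire these to your coaching overlay.
    private func showHint(_ text: String) { /* update overlay */ }
    private func hideHint() { /* hide overlay */ }
}
```

Apple's ARCoachingOverlayView provides similar built-in guidance; a custom handler like this is useful when you need app-specific messaging.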
Planes are detected with a delay. With planeDetection = .horizontal, ARKit finds the floor in 2–4 seconds under good lighting. If the user tries to place an object earlier, the raycast misses. A common solution: make plane detection optional and raycast against the LiDAR mesh (ARMeshAnchor); on iPhone 12 Pro and later this gives hit accuracy of roughly 1 cm without the wait.
Occlusion works only with LiDAR. ARView.environment.sceneUnderstanding.options with .occlusion works exclusively on LiDAR devices (Pro models from the iPhone 12 Pro onward). On iPhones without LiDAR, the virtual object always renders on top of the real world. Check ARWorldTrackingConfiguration.supportsSceneReconstruction(.mesh) at startup and degrade gracefully.
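A sketch of that startup check, enabling mesh-based occlusion only when the hardware supports it (the fallback behavior is an assumption about your app's design):

```swift
import ARKit
import RealityKit

func configureSceneUnderstanding(for arView: ARView) {
    let config = ARWorldTrackingConfiguration()
    config.planeDetection = [.horizontal]

    if ARWorldTrackingConfiguration.supportsSceneReconstruction(.mesh) {
        // LiDAR device: reconstruct the scene mesh and let real
        // geometry occlude virtual objects.
        config.sceneReconstruction = .mesh
        arView.environment.sceneUnderstanding.options.insert(.occlusion)
    } else {
        // No LiDAR: occlusion is unavailable; rely on grounding
        // shadows and placement cues to anchor the object visually.
        arView.environment.sceneUnderstanding.options.remove(.occlusion)
    }

    arView.session.run(config)
}
```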
Stack and Approaches
For most AR tasks, use RealityKit on top of ARKit: it abstracts away low-level Metal rendering and supports physics, USDZ animations, and occlusion. For complex cases with custom shaders or existing SceneKit code, work directly through ARSCNView.
Basic session configuration for object placement:
let config = ARWorldTrackingConfiguration()
config.planeDetection = [.horizontal, .vertical]
config.environmentTexturing = .automatic
if ARWorldTrackingConfiguration.supportsSceneReconstruction(.mesh) {
    config.sceneReconstruction = .meshWithClassification  // LiDAR devices only
}
arView.session.run(config, options: [.resetTracking, .removeExistingAnchors])
environmentTexturing = .automatic enables environment-map capture and applies it to PBR materials, so objects reflect the real scene's lighting.
Object placement via raycast (the modern API; hitTest(_:types:) is deprecated):
let query = arView.makeRaycastQuery(
    from: arView.center,
    allowing: .estimatedPlane,
    alignment: .horizontal
)
if let query = query,
   let result = arView.session.raycast(query).first {
    let anchor = ARAnchor(name: "placed-object", transform: result.worldTransform)
    arView.session.add(anchor: anchor)
    // Avoid try! in production: the asset may be missing from the bundle.
    if let modelEntity = try? ModelEntity.loadModel(named: "model.usdz") {
        let anchorEntity = AnchorEntity(anchor: anchor)
        anchorEntity.addChild(modelEntity)
        arView.scene.addAnchor(anchorEntity)
    }
}
Working with Image Tracking
ARImageTrackingConfiguration with maximumNumberOfTrackedImages = 4 is the configuration for marker-based AR. Create reference images via ARReferenceImage with a physical size:
let referenceImage = ARReferenceImage(
    cgImage,
    orientation: .up,
    physicalWidth: 0.15  // 15 cm in the real world
)
config.trackingImages = [referenceImage]
physicalWidth is critical: ARKit uses it to compute distance and scale. A wrong size puts the object at the wrong distance or the wrong scale.
Face Tracking and TrueDepth
ARFaceTrackingConfiguration works only on devices with a TrueDepth camera (those with Face ID). 52 blend-shape coefficients are available through ARFaceAnchor.blendShapes, from .jawOpen to .eyeBlinkLeft. This is what powers virtual try-ons, AR avatars, and liveness detection.
func session(_ session: ARSession, didUpdate anchors: [ARAnchor]) {
    // The face anchor isn't guaranteed to be first in the array.
    guard let faceAnchor = anchors.compactMap({ $0 as? ARFaceAnchor }).first else { return }
    let jawOpen = faceAnchor.blendShapes[.jawOpen]?.floatValue ?? 0  // 0.0–1.0
    // Drive character animation from the coefficient
}
Performance and Profiling
ARKit + RealityKit load the GPU and CPU simultaneously. Diagnostic tools: Reality Composer Pro (bundled with Xcode) for scene preview, GPU Frame Capture for analyzing draw calls, and Metal System Trace in Instruments for finding GPU stalls.
Typical bottlenecks:
- USDZ models with >100K polygons on mid-range devices: FPS drops to 30
- Multiple ModelEntity instances without instancing: geometry is duplicated in memory; load once and clone()
- arView.renderOptions: inserting .disableDepthOfField and .disableMotionBlur gives roughly +15% FPS on older devices
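A sketch of that render-options tuning (the FPS gain cited above varies by device):

```swift
import RealityKit

func tuneRenderOptions(for arView: ARView) {
    // Each inserted option turns the corresponding effect OFF, saving GPU time.
    arView.renderOptions.insert(.disableDepthOfField)
    arView.renderOptions.insert(.disableMotionBlur)
    // On the oldest supported devices, consider also dropping HDR
    // and grounding shadows:
    // arView.renderOptions.insert(.disableHDR)
    // arView.renderOptions.insert(.disableGroundingShadows)
}
```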
Development Stages
- Requirements audit: which devices to support, whether LiDAR is needed, tracking type (world/image/face/body)
- Degradation design for incompatible devices
- AR core development: session configuration and object placement
- 3D content integration (FBX/OBJ to USDZ conversion via Reality Converter)
- Testing on physical devices (the simulator doesn't support AR)
- Performance optimization for the target iPhone/iPad models
Timeline
Basic ARKit integration with USDZ object placement on a plane: 3–5 days. Image tracking with markers and content overlay: 4–7 days. A complex solution with face tracking, LiDAR occlusion, and custom Metal shaders: 3–6 weeks. Cost depends on 3D-content complexity and the range of supported devices.