Implementing Image Tracking (AR Marker Recognition) in AR Applications
Image tracking attaches AR content to a physical image: product packaging, a poster, a business card, a book page. The user points the camera at the image and it "comes alive". Technically this is a well-researched problem, but in production the same pitfalls keep recurring: tracking jitters on glossy surfaces, is lost under partial occlusion, and doesn't scale to large marker catalogs.
ARKit Image Tracking: How It Works
ARImageTrackingConfiguration tracks images without world tracking. It initializes faster and puts less load on the CPU, but provides no plane detection or world anchors.
ARWorldTrackingConfiguration with detectionImages tracks markers within a full world tracking context. It is needed when AR content must persist in world space from frame to frame, or when plane detection is required alongside image tracking.
Reference image preparation: create an ARReferenceImage with its physical dimensions (exposed as physicalSize; the initializer takes the physical width in meters). The size is mandatory: ARKit uses it to compute distance and scale. A wrong size puts the object at the wrong scale.
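The choice between the two configurations can be sketched as a small factory function. This is only an illustration: the helper name makeConfiguration and its needsWorldTracking flag are hypothetical, and the horizontal plane detection and the tracked-image count of 1 are arbitrary example values.

```swift
import ARKit

/// Hypothetical helper: picks a configuration depending on whether the scene
/// needs world-space anchors or plane detection alongside image tracking.
func makeConfiguration(markers: Set<ARReferenceImage>,
                       needsWorldTracking: Bool) -> ARConfiguration {
    if needsWorldTracking {
        // Full world tracking: markers are supplied as detection images.
        let config = ARWorldTrackingConfiguration()
        config.detectionImages = markers
        config.planeDetection = [.horizontal]      // example value
        config.maximumNumberOfTrackedImages = 1    // example value
        return config
    } else {
        // Image-only tracking: lighter, but no planes or world anchors.
        let config = ARImageTrackingConfiguration()
        config.trackingImages = markers
        config.maximumNumberOfTrackedImages = 1    // example value
        return config
    }
}
```

The result would be passed to the session, for example session.run(makeConfiguration(markers: markers, needsWorldTracking: false)).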
import ARKit
guard let image = UIImage(named: "marker"),
      let cgImage = image.cgImage else { return }
// The initializer takes the printed width in meters; the height follows from the image's aspect ratio.
let referenceImage = ARReferenceImage(cgImage, orientation: .up, physicalWidth: 0.15)
referenceImage.name = "product_label"
let config = ARWorldTrackingConfiguration()
config.detectionImages = [referenceImage]
config.maximumNumberOfTrackedImages = 4
maximumNumberOfTrackedImages is a critical parameter. On A12 and later chips, ARKit can detect up to 100 images simultaneously, but active position tracking is limited to 4 images on older chips and 8 on A14 and later. The difference: a detected marker is one ARKit knows is present; a tracked marker is one whose exact position is updated in real time.
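The detected/tracked distinction shows up in the session delegate. A minimal sketch, assuming a hypothetical delegate class named MarkerSessionDelegate that only logs what it sees: an ARImageAnchor is added when a marker is first detected, and its isTracked flag then reports whether the pose is currently being updated.

```swift
import ARKit

/// Hypothetical delegate used only for illustration.
final class MarkerSessionDelegate: NSObject, ARSessionDelegate {
    // An image anchor is added when ARKit first detects a reference image.
    func session(_ session: ARSession, didAdd anchors: [ARAnchor]) {
        for case let imageAnchor as ARImageAnchor in anchors {
            print("Detected:", imageAnchor.referenceImage.name ?? "unnamed")
        }
    }

    // Anchor updates carry the current pose and the tracking flag.
    func session(_ session: ARSession, didUpdate anchors: [ARAnchor]) {
        for case let imageAnchor as ARImageAnchor in anchors {
            let name = imageAnchor.referenceImage.name ?? "unnamed"
            if imageAnchor.isTracked {
                // The last column of the transform holds the translation.
                let position = imageAnchor.transform.columns.3
                print("Tracking \(name) at", position.x, position.y, position.z)
            } else {
                print("\(name) is known but not currently tracked")
            }
        }
    }
}
```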
Marker Quality and Why "Any Image" Doesn't Work
ARKit evaluates the quality of each reference image. Images with low quality track unstably or are not detected at all. To check, add the image to an AR Resources group in Xcode; the inspector shows a warning if the quality is low.
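For images that arrive at runtime and never pass through the Xcode inspector, ARReferenceImage offers a validate(completionHandler:) method on iOS 13 and later. The sketch below assumes a hypothetical wrapper named checkMarker and simply logs the outcome; how the errors map to the Xcode warnings is not something this example claims.

```swift
import ARKit

/// Hypothetical wrapper: asks ARKit to validate a reference image built at runtime.
func checkMarker(_ referenceImage: ARReferenceImage) {
    if #available(iOS 13.0, *) {
        referenceImage.validate { error in
            if let error = error {
                // ARKit considers the image unsuitable as a marker.
                print("Marker rejected: \(error.localizedDescription)")
            } else {
                print("Marker passed validation")
            }
        }
    } else {
        print("Runtime validation is unavailable before iOS 13")
    }
}
```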
Bad markers:
- Monochromatic or with large monochromatic areas (logo on white background)
- Symmetric patterns (ARKit gets confused about orientation)
- Low-contrast, faded images
- Text without other visual elements
Good markers:
- High-contrast, heterogeneous patterns (magazine covers, detailed illustrations)
- Asymmetric — ARKit unambiguously determines orientation
- Physical size of at least 10 cm; smaller markers are tracked only from within 30 cm
Tracking on Glossy Surfaces
Packaging with a glossy finish, holographic stickers, and foil elements all produce glare that changes the marker's appearance depending on the lighting angle. ARKit loses tracking because the feature points "float".
Solution at the product level: matte lamination instead of gloss on the marker zone. In code: hysteresis on tracking loss. Don't hide AR content the moment the image anchor's isTracked flag turns false; wait 0.5–1 second first. Most brief losses recover on their own.
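The hysteresis idea can be sketched independently of ARKit. The type TrackingHysteresis, its method name, and the default grace period of 0.75 seconds are hypothetical choices for illustration; the caller is assumed to pass in the anchor's tracking flag and a timestamp in seconds.

```swift
import Foundation

/// Hypothetical helper: keeps content visible until tracking has been lost
/// for at least `gracePeriod` seconds.
struct TrackingHysteresis {
    let gracePeriod: TimeInterval
    private var lostSince: TimeInterval?

    init(gracePeriod: TimeInterval = 0.75) {
        self.gracePeriod = gracePeriod
    }

    /// Feed the latest tracking flag and a timestamp in seconds.
    /// Returns true while the AR content should remain visible.
    mutating func shouldShowContent(isTracked: Bool, at time: TimeInterval) -> Bool {
        if isTracked {
            lostSince = nil          // tracking recovered, reset the timer
            return true
        }
        let start = lostSince ?? time
        lostSince = start            // remember when the loss began
        return time - start < gracePeriod
    }
}
```

In an ARKit app this could be called from the session's anchor update callback with imageAnchor.isTracked and the current frame timestamp, hiding the content node only when the function returns false.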
ARCore Image Tracking
AugmentedImageDatabase is the analogue of ARKit detection images. Compile the database in advance with the arcoreimg command-line utility, or build it at runtime by adding images with AugmentedImageDatabase.addImage(). A pre-compiled database loads faster.
ARCore additionally provides AugmentedImage.getTrackingMethod(), which distinguishes FULL_TRACKING (the image is actively tracked) from LAST_KNOWN_POSE (the pose is held at the last known position). LAST_KNOWN_POSE makes it possible to keep AR content in place even when the marker temporarily leaves the frame.
Marker Catalog and Content Management
For applications with large catalogs (100+ markers, e.g., the SKUs of an entire product line), packing all reference images into the app bundle is impractical. Architecture:
- The server stores reference images and AR content
- When a new marker is identified (by an external ID in a QR code or by cloud recognition), the client loads content for that specific marker
- Cloud Image Target (Vuforia Cloud, Wikitude Cloud): the client sends a frame to the server, and the server returns the marker ID and transform. Works for catalogs of 100k+ images
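The client side of the server-driven variant can be sketched as a manifest lookup. Everything here is an assumption made for illustration: the JSON shape, the field names, and the MarkerEntry and MarkerCatalog types are hypothetical and do not correspond to any particular backend or SDK.

```swift
import Foundation

/// Hypothetical manifest entry the server returns for one marker.
struct MarkerEntry: Codable {
    let id: String
    let imageURL: URL                // where to download the reference image
    let physicalWidthMeters: Double  // printed width, needed to build the reference image
    let contentURL: URL              // AR content to load once the marker is identified
}

/// Hypothetical in-memory catalog keyed by marker ID.
struct MarkerCatalog {
    private var entries: [String: MarkerEntry] = [:]

    /// Decodes a JSON array of entries as served by the (assumed) backend.
    init(manifestJSON: Data) throws {
        let decoded = try JSONDecoder().decode([MarkerEntry].self, from: manifestJSON)
        for entry in decoded {
            entries[entry.id] = entry
        }
    }

    var count: Int { entries.count }

    /// Looks up a marker by the external ID read from a QR code or cloud response.
    func entry(forID id: String) -> MarkerEntry? {
        entries[id]
    }
}
```

Once an entry is found, the app would download the image at imageURL, create a reference image using physicalWidthMeters, and fetch the content at contentURL only for that marker.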
Timeline
Basic image tracking with 1–10 markers and static 3D content: 3–5 days. Animated content, video overlay, and catalog management via a CMS: 2–3 weeks. A cloud solution for 100k+ markers: separate estimate. Cost is calculated individually.







