AI Virtual Hairstyle Try-On in Mobile Apps
Virtual hairstyle try-on is more complex than makeup try-on — hair isn't a flat texture on the face but a 3D object with thousands of strands that moves when the head turns, falls correctly over the shoulders, and responds to lighting. Implementations that paste a 2D hairstyle image onto the head look obviously fake — you need either a 3D hair mesh or neural synthesis.
Two Fundamentally Different Approaches
3D Hair Mesh + Face Tracking
Works in real time. Face tracking (ARKit ARFaceAnchor or MediaPipe) determines head position and orientation. The 3D hairstyle model is a mesh with a skeleton, anchored to the head transform; when the head rotates, the mesh rotates with it.
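Anchoring to the head transform is a matrix composition: the tracked 4×4 head pose times a fixed offset in head space. A minimal numpy sketch (Python standing in for the Swift/SceneKit code; the offset values are hypothetical):

```python
import numpy as np

def head_to_world(head_transform: np.ndarray, local_offset: np.ndarray) -> np.ndarray:
    """Compose the tracked head pose (4x4, e.g. ARFaceAnchor.transform)
    with the hairstyle's fixed offset in head space."""
    return head_transform @ local_offset

# Hypothetical example: head rotated 90 degrees around Y,
# hair origin sitting 0.08 m above the head origin.
theta = np.pi / 2
head = np.array([[np.cos(theta), 0, np.sin(theta), 0],
                 [0,             1, 0,             0],
                 [-np.sin(theta), 0, np.cos(theta), 0],
                 [0,             0, 0,             1]])
offset = np.eye(4)
offset[1, 3] = 0.08
world = head_to_world(head, offset)  # hair pose in world space
```

In SceneKit this composition is what you get for free by parenting the hair node to the face anchor's node, so no per-frame matrix math is needed in app code.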
Technical challenges:
- Hair physics. A static mesh looks plastic — strand motion needs simulation. `SCNPhysicsBody` per strand is a performance disaster. Solution: vertex-shader simulation via Metal — each strand is a spline with control points, and spring dynamics are simulated on the GPU.
- Hair-to-face occlusion. Bangs should cover the forehead. Create a mask from the depth face geometry — pixels behind the face plane are culled. On ARKit with TrueDepth: use `ARFaceGeometry` as occluder geometry with `SCNMaterial.colorBufferWriteMask = []`.
- Size fitting. The hairstyle should fit the specific user's head. Use interpupillary distance as the baseline for model scaling.
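The spring dynamics the vertex shader would run can be sketched per strand — one GPU thread per strand, each control point pulled toward its rest distance from the previous point. A CPU sketch in Python (all constants hypothetical, not tuned values):

```python
import numpy as np

def step_strand(points, velocities, rest_lengths, root, dt=1/60,
                stiffness=40.0, damping=0.92,
                gravity=np.array([0.0, -9.8, 0.0])):
    """One semi-implicit Euler step for a strand of control points.
    points[0] is pinned to the scalp (root follows the head transform);
    the remaining points follow spring forces plus gravity.
    Adjacent points must not be coincident (length > 0)."""
    points = points.copy()
    velocities = velocities.copy()
    points[0] = root
    for i in range(1, len(points)):
        d = points[i] - points[i - 1]
        length = np.linalg.norm(d)
        # Hooke's law toward the rest length, plus gravity
        force = -stiffness * (length - rest_lengths[i - 1]) * (d / length) + gravity
        velocities[i] = damping * (velocities[i] + dt * force)
        points[i] = points[i] + dt * velocities[i]
    return points, velocities
```

Running thousands of strands in parallel maps directly onto a Metal compute or vertex shader: the loop body becomes the kernel, and the strand index becomes the thread index.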
Neural Image Synthesis Approach
Not real-time — for static photos or a "try on from photo" mode. An image-to-image model: input photo + selected hairstyle → synthesized photo with the new hairstyle. Quality is significantly higher than with a 3D mesh, but latency is 1–5 seconds.
Implementation: Core ML (iOS) or TFLite (Android) with converted model. Models: SAM (Segment Anything Model) for head segmentation + Stable Diffusion Inpainting to generate new hairstyle in segmented zone. Or specialized models — HairCLIP, HairstyleGAN.
Server inference: models of 500 MB–2 GB don't fit on-device. Send the photo to a GPU server (A100 / H100) and get the result back in 1–3 seconds. Streaming partial results doesn't apply (the output is a whole image), but a progress indicator is mandatory.
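The client side of that flow is a submit-then-poll loop driving the progress indicator. A minimal sketch with the HTTP calls abstracted behind callables (the job/status shape is a hypothetical API, not a specific service):

```python
import time

def run_with_progress(submit, poll, on_progress, interval=0.25):
    """Submit an inference job and poll until it finishes,
    reporting progress to the UI on every poll.
    `submit`/`poll` wrap the actual HTTP calls; the status dict
    shape (state/progress/result) is a hypothetical contract."""
    job_id = submit()
    while True:
        status = poll(job_id)  # e.g. {"state": "running", "progress": 0.4}
        on_progress(status.get("progress", 0.0))
        if status["state"] == "done":
            return status["result"]
        if status["state"] == "error":
            raise RuntimeError(status.get("message", "inference failed"))
        time.sleep(interval)
```

With 1–3 second jobs a 250 ms poll interval gives the indicator enough updates to feel alive without hammering the server; long-polling or WebSocket push is the usual next step if polling load becomes an issue.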
Hair Segmentation
Both approaches need accurate hair segmentation of the input frame — separating hair from background and face. Models: MediaPipe Hair Segmentation, DeepLabV3+ fine-tuned on a hair dataset, BiSeNet. The quality metric is mIoU on a test dataset; 85%+ is acceptable.
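For a binary hair mask, mIoU is the intersection-over-union averaged over the two classes (hair / not-hair). A straightforward implementation:

```python
import numpy as np

def miou(pred: np.ndarray, gt: np.ndarray, num_classes: int = 2) -> float:
    """Mean intersection-over-union between a predicted and a
    ground-truth label mask, averaged over classes."""
    ious = []
    for c in range(num_classes):
        p, g = pred == c, gt == c
        union = np.logical_or(p, g).sum()
        if union == 0:
            continue  # class absent in both masks; don't count it
        ious.append(np.logical_and(p, g).sum() / union)
    return float(np.mean(ious))
```

Run it over the whole test set and average per-image scores (or accumulate intersections/unions globally — state which convention you use, since the numbers differ).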
On iOS — convert to Core ML via coremltools, run inference via VNCoreMLRequest. Important: in real-time mode the segmentation mask must update every frame, so the model needs inference under 20 ms on an A15. A lightweight U-Net or MobileNetV3-based segmenter handles it.
UX and Hairstyle Catalog
Hairstyle catalog — 3D models (for the mesh approach) or reference images (for the neural approach). Filters: length, color, type (straight/curly/wavy). Hairstyle color selection: change the vertex color / albedo texture via an HSV transformation. Same mesh, different colors — no separate model per variant.
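The HSV recolor amounts to replacing each texel's hue while keeping its brightness, so shading baked into the albedo survives. A per-pixel sketch using the standard library's colorsys (in production this runs as a shader or a one-time texture bake):

```python
import colorsys

def recolor(rgb, target_hue, sat_scale=1.0):
    """Shift one albedo texel (r, g, b in 0..1) to the target hue
    while preserving value (brightness), so one texture serves
    every color variant in the catalog."""
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    return colorsys.hsv_to_rgb(target_hue, min(1.0, s * sat_scale), v)

# Hypothetical example: a reddish-brown texel recolored toward blue (hue 2/3)
out = recolor((0.5, 0.25, 0.25), target_hue=2 / 3)
```

Keeping V untouched is what preserves the strand-level shading; scaling S lets the same function produce both saturated fantasy colors and natural muted ones.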
Unconventional hair coloring (ombre, highlights) is a separate task: gradient vertex color along the strand length, plus a special UV layout for the gradient map.
Timeline
Real-time 3D hair-mesh try-on with face tracking for iOS: 8–12 weeks. Neural synthesis with server inference for static photos plus on-device segmentation: 6–10 weeks. A combined product (real-time 3D plus neural synthesis for the final result) with cross-platform support: 4–7 months. Cost is calculated individually.