AI Virtual Makeup Try-On in Mobile Apps
Sephora Virtual Artist, Perfect Corp YouCam Makeup, MAC Virtual Try-On — these are the competitors. The user points the front-facing camera at their face, and the system overlays eye shadow, lipstick, and concealer in real time, fitted precisely to that face's anatomy. Technically this means a face mesh, lips/eyes/skin segmentation, and correct AR rendering that accounts for lighting and skin texture.
Face Mesh as Foundation
iOS ARKit. ARFaceTrackingConfiguration with the TrueDepth camera (iPhone X and later) builds a face mesh of 1,220 vertices. It returns an ARFaceAnchor with geometry (ARFaceGeometry), 52 blend-shape coefficients, and the face transform. The depth camera keeps the geometry accurate even during movement. For makeup try-on, the face geometry serves as the "canvas": makeup textures are UV-mapped onto the mesh.
Android ML Kit Face Mesh. 468 points (MediaPipe Face Mesh underneath), RGB camera without depth. Accuracy drops on side angles and fast movement, but it is sufficient for makeup — lips, eyes, and cheekbones are covered by landmarks at the needed resolution.
MediaPipe Face Landmarker (cross-platform). Use it directly via native C++ (Android NDK / iOS framework). 478 points including the irises — useful for eye makeup and contact lenses. A good choice for cross-platform projects.
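A minimal sketch of working with those landmarks, assuming the canonical MediaPipe Face Mesh topology (the inner-lip midpoints and mouth-corner indices below should be verified against the official mesh diagram before use):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Normalized 2D landmark as returned by MediaPipe Face Landmarker ([0,1] range).
struct Landmark { double x; double y; };

// Indices assumed from the canonical MediaPipe Face Mesh topology.
constexpr int kUpperInnerLip = 13;
constexpr int kLowerInnerLip = 14;
constexpr int kLeftMouthCorner = 61;
constexpr int kRightMouthCorner = 291;

// Mouth openness = lip gap normalized by mouth width: a scale-invariant signal
// for, e.g., switching between closed- and open-mouth lipstick masks.
double MouthOpenness(const std::vector<Landmark>& pts) {
  double gap = std::fabs(pts[kUpperInnerLip].y - pts[kLowerInnerLip].y);
  double width = std::fabs(pts[kLeftMouthCorner].x - pts[kRightMouthCorner].x);
  return width > 0 ? gap / width : 0.0;
}
```

Normalizing by mouth width makes the signal independent of how close the face is to the camera.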
Rendering Makeup Over Face
The main technical challenge is not finding the lip contour but making the lipstick look real, not like a colored rectangle pasted over the face.
Texture UV-mapping. The face mesh has UV coordinates (documented for ARKit on Apple Developer). A makeup texture, drawn by an artist on a neutral UV layout of the face, is overlaid as an MTLTexture with alpha blending. The product color is changed via an HSV transformation of the texture in the fragment shader.
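A CPU-side sketch of what that fragment-shader transformation does: keep the value channel of the artist's texture (shading, lip texture detail) and replace hue and saturation with the product shade. The function names are illustrative, not a real shader API:

```cpp
#include <algorithm>
#include <cmath>

struct RGB { double r, g, b; };   // all components in [0,1]
struct HSV { double h, s, v; };   // h in [0,1)

HSV RgbToHsv(RGB c) {
  double mx = std::max({c.r, c.g, c.b});
  double mn = std::min({c.r, c.g, c.b});
  double d = mx - mn;
  double h = 0.0;
  if (d > 0) {
    if (mx == c.r)      h = std::fmod((c.g - c.b) / d, 6.0);
    else if (mx == c.g) h = (c.b - c.r) / d + 2.0;
    else                h = (c.r - c.g) / d + 4.0;
    h /= 6.0;
    if (h < 0) h += 1.0;
  }
  return {h, mx == 0 ? 0.0 : d / mx, mx};
}

RGB HsvToRgb(HSV c) {
  int i = static_cast<int>(c.h * 6) % 6;
  double f = c.h * 6 - std::floor(c.h * 6);
  double p = c.v * (1 - c.s);
  double q = c.v * (1 - f * c.s);
  double t = c.v * (1 - (1 - f) * c.s);
  switch (i) {
    case 0: return {c.v, t, p};
    case 1: return {q, c.v, p};
    case 2: return {p, c.v, t};
    case 3: return {p, q, c.v};
    case 4: return {t, p, c.v};
    default: return {c.v, p, q};
  }
}

// Recolor one texel: hue/saturation from the product shade, value from the texture.
RGB Recolor(RGB texel, RGB shade) {
  HSV tex = RgbToHsv(texel);
  HSV tgt = RgbToHsv(shade);
  return HsvToRgb({tgt.h, tgt.s, tex.v});
}
```

Because value is preserved per texel, the natural lip texture survives any shade swap — only the paint color changes.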
Physically correct rendering. Lipstick is glossy and needs a specular component; eye shadow is matte and mostly diffuse. The PBR material parameters metallic and roughness change with the product type. Estimate scene lighting via ARLightEstimate.ambientIntensity and ARDirectionalLightEstimate, and adapt the specular response to the real lighting in the frame.
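A sketch of the finish-to-parameters mapping; the finish names and numeric values here are illustrative assumptions that would be tuned per product line, not brand data:

```cpp
// Illustrative product finishes and their assumed PBR parameters.
enum class Finish { kMatte, kSatin, kGloss, kMetallic };

struct PbrParams { double metallic; double roughness; };

PbrParams ParamsFor(Finish f) {
  switch (f) {
    case Finish::kMatte:    return {0.0, 0.9};
    case Finish::kSatin:    return {0.0, 0.6};
    case Finish::kGloss:    return {0.0, 0.2};
    case Finish::kMetallic: return {0.7, 0.35};
  }
  return {0.0, 0.9};
}

// Scale specular strength by ARKit's ambientIntensity (lumens; ~1000 for a
// well-lit scene), clamped so a dark room doesn't kill the gloss entirely.
double SpecularScale(double ambient_lumens) {
  double s = ambient_lumens / 1000.0;
  return s < 0.3 ? 0.3 : (s > 1.5 ? 1.5 : s);
}
```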
Segmentation for accurate application. Lipstick goes only on the lips and shadow only on the eyelids — face-mesh vertices alone are not enough, so pixel-level segmentation is needed. Use a CoreML model (convert MediaPipe Selfie Segmentation or train a custom DeepLab/U-Net) to segment face zones. Run inference every frame: an A15 Bionic handles it in 8–15 ms; on older devices, run it every third frame and interpolate the mask in between.
Color Accuracy
Key requirement from beauty brands: the color in the app must match the color of the real product. The problem: the smartphone camera applies auto white balance and ISO normalization, so colors on screen are not accurate. The solution:
- Color calibration via AVCaptureDevice.whiteBalanceGains — lock white balance for the duration of the try-on
- ColorChecker-based calibration (optional, for professional cases)
- Store product colors in the database in the Lab color space (perceptually uniform); convert to sRGB for display, accounting for the display profile (UIScreen.traitCollection.displayGamut)
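The database-to-display step can be sketched as a standard CIELAB (D65) to sRGB conversion using the usual CIE / IEC 61966-2-1 constants; wide-gamut (P3) handling for displays reported by UIScreen.traitCollection.displayGamut is left out of this sketch:

```cpp
#include <algorithm>
#include <cmath>

struct Srgb { double r, g, b; };

Srgb LabToSrgb(double L, double a, double b) {
  // Lab -> XYZ (D65 reference white: 0.95047, 1.0, 1.08883)
  auto finv = [](double t) {
    double t3 = t * t * t;
    return t3 > 0.008856 ? t3 : (t - 16.0 / 116.0) / 7.787;
  };
  double fy = (L + 16.0) / 116.0;
  double X = 0.95047 * finv(fy + a / 500.0);
  double Y = 1.00000 * finv(fy);
  double Z = 1.08883 * finv(fy - b / 200.0);

  // XYZ -> linear sRGB
  double rl =  3.2406 * X - 1.5372 * Y - 0.4986 * Z;
  double gl = -0.9689 * X + 1.8758 * Y + 0.0415 * Z;
  double bl =  0.0557 * X - 0.2040 * Y + 1.0570 * Z;

  // Gamma-encode and clamp to [0,1]
  auto enc = [](double c) {
    double v = c <= 0.0031308 ? 12.92 * c : 1.055 * std::pow(c, 1.0 / 2.4) - 0.055;
    return std::clamp(v, 0.0, 1.0);
  };
  return {enc(rl), enc(gl), enc(bl)};
}
```

Storing shades in Lab also makes shade-similarity search trivial: Euclidean distance in Lab approximates perceived color difference.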
Recording and Sharing Results
The user wants to record a video of the try-on. Use ReplayKit (iOS) or MediaProjection (Android) for screen recording, or implement custom recording by writing each rendered frame to AVAssetWriter (iOS) / MediaCodec (Android). MP4 with H.264 at 1080p@30fps is sufficient.
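Back-of-envelope sizing for the custom recording path, assuming roughly 6 Mbit/s average bitrate for 1080p@30fps H.264 (a typical mid-quality encoder setting, not a measured value):

```cpp
// Estimated file size in megabytes for a given duration and average bitrate.
double EstimatedSizeMB(double duration_sec, double bitrate_mbps) {
  return duration_sec * bitrate_mbps / 8.0;  // megabits -> megabytes
}
```

Under that assumption, a 60-second try-on clip lands around 45 MB — worth surfacing to the user before an upload or share.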
Performance and Supported Devices
| Scenario | Minimum Device |
|---|---|
| Basic face mesh + lipstick | iPhone 8 / Android 2018 mid-range |
| ARKit depth + full makeup | iPhone X+ |
| Realtime PBR + segmentation | iPhone 12+ / Snapdragon 888+ |
For devices without TrueDepth, use the RGB-only pipeline with MediaPipe — visually slightly worse on sharp movements, but acceptable for the majority of users.
Timeline: an MVP with lipstick + eye shadow on iOS via ARKit — 6–9 weeks. A cross-platform system with a full product catalog, PBR rendering, color accuracy, and sharing — 4–7 months. Cost is calculated individually.