Implementing AI Face Aging/Rejuvenation in a Mobile App
Aging and rejuvenation effects are one of the few AI features where on-device processing genuinely competes with a server. Specialized models (SAM, Style-based Age Manipulation; FRAN, Face Re-Aging Network) have compact distilled versions. FaceApp historically built on exactly this kind of on-device inference, hence its instant UI response.
On-device: FRAN via CoreML
FRAN (Face Re-Aging Network, from Netflix Research) is an open-source model trained on synthetic data. It takes a face image plus a target age and returns the re-aged result. A CoreML-converted version weighs about 45 MB in FLOAT16.
import CoreML
import Vision

final class FaceAgingProcessor {
    private let model: FRAN  // class auto-generated by Xcode from the .mlmodel

    init() throws {
        let config = MLModelConfiguration()
        config.computeUnits = .all  // allow the Neural Engine
        model = try FRAN(configuration: config)
    }

    func process(faceImage: CGImage, targetAge: Int) async throws -> CGImage {
        // FRAN expects a normalized 256x256 face crop
        let resized = try resize(image: faceImage, to: CGSize(width: 256, height: 256))
        let input = FRANInput(
            face_image: try MLMultiArray(from: resized),              // CGImage -> tensor helper
            target_age: try MLMultiArray([Float(targetAge) / 100.0])  // normalize to 0...1
        )
        let output = try await model.prediction(input: input)
        return try cgImage(from: output.output_face)
        // resize(image:to:) and cgImage(from:) are app-side helpers, omitted here
    }
}
On iPhone 13 and newer with the Neural Engine, inference takes 60–90 ms, which is fast enough for a live preview while dragging the age slider. On iPhone X (A11 Bionic) it is around 200 ms, still acceptable for an interactive slider with a 150 ms debounce.
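The 150 ms debounce can be implemented with a cancellable work item: every slider change cancels the pending inference and schedules a new one, so the model runs only when the user pauses. A minimal sketch (the Debouncer name, default delay, and queue are illustrative, not part of any framework):

```swift
import Foundation

/// Cancels the previously scheduled action each time `call` fires again
/// within the delay window, so only the last slider value reaches the model.
final class Debouncer {
    private var pending: DispatchWorkItem?
    private let queue: DispatchQueue
    private let delay: TimeInterval

    init(delay: TimeInterval = 0.15, queue: DispatchQueue = .main) {
        self.delay = delay
        self.queue = queue
    }

    func call(_ action: @escaping () -> Void) {
        pending?.cancel()
        let item = DispatchWorkItem(block: action)
        pending = item
        queue.asyncAfter(deadline: .now() + delay, execute: item)
    }
}
```

In the slider callback this would wrap the `process(faceImage:targetAge:)` call, keeping only the most recent target age.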
Detection and alignment — critical step
FRAN result quality strongly depends on precise face alignment before inference. The standard pipeline:
- Detect landmarks: VNDetectFaceLandmarksRequest on iOS (76 points) or MediaPipe Face Mesh on Android (468 points)
- Compute an affine transformation from 5 key points (eye centers, nose tip, mouth corners)
- Apply the warp via vImage (iOS) or OpenCV (Android)
- After inference, apply the inverse transformation plus Poisson blending on a face mask
Without alignment, the model produces visible artifacts at any head tilt over 15°. This is the most common cause of poor results in cheap implementations.
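The affine step can be sketched as a similarity transform (rotation, scale, translation) computed from the two eye centers alone; the Point type and template coordinates below are illustrative placeholders, and a production version would fit all 5 key points by least squares:

```swift
import Foundation

struct Point { var x: Double; var y: Double }

/// Builds a row-major 2x3 similarity matrix [a -b tx; b a ty] that maps the
/// detected eye centers onto canonical template positions for the 256x256 crop.
func similarityTransform(leftEye: Point, rightEye: Point,
                         templateLeft: Point, templateRight: Point) -> [Double] {
    // Angle and scale of the detected eye pair relative to the template pair
    let dx = rightEye.x - leftEye.x, dy = rightEye.y - leftEye.y
    let tdx = templateRight.x - templateLeft.x, tdy = templateRight.y - templateLeft.y
    let angle = atan2(tdy, tdx) - atan2(dy, dx)
    let scale = sqrt(tdx * tdx + tdy * tdy) / sqrt(dx * dx + dy * dy)

    let a = scale * cos(angle), b = scale * sin(angle)
    // Translate so the left eye lands exactly on its template position
    let tx = templateLeft.x - (a * leftEye.x - b * leftEye.y)
    let ty = templateLeft.y - (b * leftEye.x + a * leftEye.y)
    return [a, -b, tx, b, a, ty]
}
```

The same matrix (inverted) is what gets applied after inference to paste the 256×256 result back into the original photo.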
Poisson Blending on iOS
The standard CIBlendWithMask filter gives a hard mask edge. For a smooth transition, use Poisson Image Editing. iOS has no built-in implementation, so the options are a custom Metal kernel or the Accelerate framework with a linear-system solve. The second option is slower but doesn't require writing shader code.
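The idea behind the Poisson step can be shown on a toy grayscale grid: solve the discrete Poisson equation with Jacobi iterations so the interior keeps the gradients of the source (the aged face) while the boundary stays pinned to the target (the original photo). That is what removes the seam a plain masked blend leaves. This is a didactic sketch with a rectangular mask, not the production vImage/Metal solver:

```swift
import Foundation

/// Gradient-domain blend on a small 2D grayscale grid. The interior of the
/// grid is treated as the mask; boundary pixels keep their target values.
func poissonBlend(source: [[Double]], target: [[Double]],
                  iterations: Int = 500) -> [[Double]] {
    let h = source.count, w = source[0].count
    var f = target  // start from the target image; boundary stays fixed
    for _ in 0..<iterations {
        var next = f
        for y in 1..<(h - 1) {
            for x in 1..<(w - 1) {
                // Laplacian of the source is the guidance field
                let div = 4 * source[y][x] - source[y-1][x] - source[y+1][x] - source[y][x-1] - source[y][x+1]
                // Jacobi update toward Laplacian(f) = Laplacian(source)
                next[y][x] = (f[y-1][x] + f[y+1][x] + f[y][x-1] + f[y][x+1] + div) / 4
            }
        }
        f = next
    }
    return f
}
```

The Accelerate-based production path solves the same linear system directly instead of iterating, and a Metal kernel runs the iterations on the GPU per color channel.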
Server path: when higher quality needed
For apps where photorealism matters (e.g., age prediction in a medical or insurance context), server-side models such as SAM or other StyleGAN-based approaches give significantly better results:
- Replicate: yuval-alaluf/sam, 10–20 seconds, high quality
- Custom backend on an A100: ~2–3 seconds, full model control
The API call is standard: multipart/form-data with the image and a target_age parameter. The result is a link to the processed file.
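Building that request body by hand is straightforward; the sketch below assumes the field names image and target_age and a JPEG payload, which would need to match the actual backend contract:

```swift
import Foundation

/// Assembles a multipart/form-data body with a numeric field and a file part.
/// Field and file names here are assumptions, not a specific API's contract.
func multipartBody(imageData: Data, targetAge: Int, boundary: String) -> Data {
    var body = Data()
    func append(_ s: String) { body.append(s.data(using: .utf8)!) }

    // Plain form field with the target age
    append("--\(boundary)\r\n")
    append("Content-Disposition: form-data; name=\"target_age\"\r\n\r\n")
    append("\(targetAge)\r\n")

    // File part with the face image
    append("--\(boundary)\r\n")
    append("Content-Disposition: form-data; name=\"image\"; filename=\"face.jpg\"\r\n")
    append("Content-Type: image/jpeg\r\n\r\n")
    body.append(imageData)
    append("\r\n--\(boundary)--\r\n")
    return body
}
```

The request itself then sets the header `Content-Type: multipart/form-data; boundary=<boundary>` and sends the body through URLSession.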
Combination: on-device preview + server export
The best UX: an instant on-device preview at 256×256 while the user moves the age slider, plus a "Save" button that launches server processing at the original resolution. While the server works, show an animation. The result is saved to the Camera Roll via PHPhotoLibrary.
On Android, run the server request through WorkManager so it survives the app being minimized, and post a notification on completion.
Privacy and App Store
Apps with age transformations pass review without issues; there are no special restrictions like those around face swap. But if the photo is uploaded to a server, a Privacy Nutrition Label is mandatory: data type Photos, purpose App Functionality. Add NSPhotoLibraryUsageDescription and NSCameraUsageDescription with a specific description of the feature.
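The usage-description keys go in Info.plist; the strings below are placeholders and should describe the actual feature in the app's own words:

```xml
<!-- Illustrative Info.plist entries; wording is a placeholder -->
<key>NSPhotoLibraryUsageDescription</key>
<string>Used to pick a photo for the age-transformation effect.</string>
<key>NSCameraUsageDescription</key>
<string>Used to take a photo for the age-transformation effect.</string>
```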
Delete original photos from server immediately after processing.
Timeline
On-device FRAN integration with alignment and blending: 5–8 days. Hybrid mode (on-device preview plus server export): 2–3 weeks. Cost is calculated once requirements are clarified.







