Face Recognition Implementation in Mobile Applications
Face recognition in a mobile app is one of the most technically complex computer vision tasks and one of the most legally sensitive. Incorrectly implemented anti-spoofing leaves the system open to bypass with a photograph; non-compliance with GDPR or local regulations turns it into a regulatory problem. We address both.
Face Recognition System Architecture
The recognition pipeline consists of three independent steps:
- Detection — find the face in the frame, get a bounding box and landmarks.
- Verification/Identification — compute a face embedding (a 128- or 512-dimensional vector) and compare it against a reference database.
- Anti-Spoofing — confirm it's a live person, not a photo, video, or mask.
Skipping the third step yields a system that can be bypassed with any printed photograph.
Detection and Landmarks
On iOS: VNDetectFaceLandmarksRequest from Vision framework returns VNFaceObservation with landmarks (76 points: face contour, eyebrows, nose, lips, eyes) and boundingBox. On-device, no network, ~8–15 ms on iPhone 12.
On Android: ML Kit Face Detection with PERFORMANCE_MODE_ACCURATE returns Face objects; enabling contour mode via setContourMode(FaceDetectorOptions.CONTOUR_MODE_ALL) adds 133 contour points. A full 468-point face mesh requires the separate ML Kit Face Mesh Detection API. Heavier, but needed for precise face alignment before embedding.
Face alignment before embedding inference is critical: without aligning on the eye line, face recognition accuracy drops 15–25%. Geometrically: find the eye centers, compute the rotation angle, and apply an affine transform to a standard position (eyes on a horizontal line about 1/3 from the top, symmetric about the center).
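The alignment geometry above can be sketched in pure Python. This is a hypothetical helper (the function names and the 160-px crop size are assumptions, not from any framework): it builds a 2×3 affine matrix that levels the eyes and maps their midpoint to the canonical position.

```python
import math

def eye_alignment_transform(left_eye, right_eye, out_size=160):
    """Hypothetical helper: given eye centers (x, y) in pixels, return a
    2x3 affine matrix that rotates the face so the eyes lie on a horizontal
    line, with the eye midpoint mapped to a canonical position
    (centered horizontally, 1/3 from the top of the output crop)."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    angle = math.atan2(dy, dx)              # tilt of the eye line
    cos_a, sin_a = math.cos(-angle), math.sin(-angle)  # rotate by -angle to level it
    # Rotate the eye midpoint, then translate it to (out_size/2, out_size/3).
    mx = (left_eye[0] + right_eye[0]) / 2
    my = (left_eye[1] + right_eye[1]) / 2
    tx = out_size / 2 - (cos_a * mx - sin_a * my)
    ty = out_size / 3 - (sin_a * mx + cos_a * my)
    return [[cos_a, -sin_a, tx],
            [sin_a,  cos_a, ty]]

def apply_affine(m, p):
    """Apply a 2x3 affine matrix to a point (x, y)."""
    return (m[0][0] * p[0] + m[0][1] * p[1] + m[0][2],
            m[1][0] * p[0] + m[1][1] * p[1] + m[1][2])
```

In production you would feed this matrix to the platform's image-warp API (Core Image on iOS, a Matrix/Canvas on Android) rather than transforming points by hand.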
Embedding and Comparison
Standard choices: FaceNet (128D) or ArcFace (512D). FaceNet is available as a TFLite model; ArcFace is more accurate but heavier. For mobile: FaceNet quantized to INT8 — ~12 MB, ~35 ms inference on a Pixel 6.
Cosine similarity between embedding vectors is the main metric. A typical "same face" threshold is similarity > 0.75, but the threshold is dataset- and model-specific, not universal — calibrate it on your own data.
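The comparison step is a few lines of arithmetic. A minimal sketch, with the 0.75 threshold kept as the illustrative default from above (the function names are assumptions):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embeddings (plain lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def is_same_face(emb1, emb2, threshold=0.75):
    """Accept the match when similarity exceeds the calibrated threshold."""
    return cosine_similarity(emb1, emb2) > threshold
```

Note that many FaceNet/ArcFace exports already L2-normalize their output; on unit vectors, cosine similarity reduces to a plain dot product.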
Store reference embeddings in the Keychain (iOS) or EncryptedSharedPreferences backed by the Android Keystore (Android). Never store the original photos: a photo is the raw biometric, while an embedding is far harder to invert back into a face.
Anti-Spoofing: Critical
Two approaches:
Passive — model analyzes skin texture and optical artifacts of photo/screen. MiniFASNet, Silent-Face-Anti-Spoofing. Works without user action but weaker against 3D masks.
Active — challenge-response: "blink," "turn head left." Implementation: sequence of VNDetectFaceLandmarksRequest with Eye Aspect Ratio (EAR) change analysis for blink detection, or Head Pose Estimation via VNFaceObservation.yaw/roll/pitch.
For banking and fintech apps, combine: passive anti-spoofing + active challenge. For corporate access, passive suffices.
Regulatory Requirements
Biometric data (face embedding is biometrics per GDPR article 9 and local regulations) requires explicit user consent separate from general ToS. Storing embeddings in cloud requires encryption in transit and at rest, plus DPA with provider. Apps in regulated jurisdictions face additional data localization requirements.
App Store Review Guidelines section 5.1.1 explicitly forbids collecting biometrics without permission. Review rejection on this point is common.
Timeline
Detection + identification on-device without anti-spoofing: 1–2 weeks. Full pipeline with anti-spoofing, encrypted reference storage, and compliance audit: 3–4 weeks. Cost calculated individually.