Machine Learning Development (ML Kit) in Mobile Applications
ML Kit is Google's mobile machine-learning SDK for Android and iOS (the on-device APIs are now standalone, while cloud-based APIs live in Firebase ML). Ready-made APIs cover the most common tasks: OCR, face detection, barcode scanning, translation. But behind the apparent simplicity lie nuances that surface not in the documentation but in production.
Common Problems with ML Kit
Ready-made APIs (Text Recognition v2, Face Detection, Barcode Scanning) work reliably when the input-image requirements are met. Face Detection with PERFORMANCE_MODE_ACCURATE returns results in 80–150 ms on a Pixel 6, but on budget devices with a Snapdragon 680 it takes 400+ ms. FAST mode cuts latency, but accuracy drops once head rotation exceeds 30°.
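Configuring the detector for the accurate mode is a one-line builder option. A minimal Kotlin sketch using the standard ML Kit Face Detection API (latency figures in the comments are the measurements above, not guarantees):

```kotlin
import com.google.mlkit.vision.face.FaceDetection
import com.google.mlkit.vision.face.FaceDetectorOptions

// ACCURATE: ~80–150 ms on a Pixel 6, 400+ ms on budget SoCs.
// FAST: lower latency, but accuracy degrades past ~30° of head rotation.
val options = FaceDetectorOptions.Builder()
    .setPerformanceMode(FaceDetectorOptions.PERFORMANCE_MODE_ACCURATE)
    .build()

val detector = FaceDetection.getClient(options)
```

On latency-sensitive screens (e.g., a live camera preview), the same builder with PERFORMANCE_MODE_FAST is usually the better trade-off.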
On iOS, ML Kit Face Detection through VisionImage(image:) loses the image orientation unless the orientation property is set explicitly from UIImage.imageOrientation. There is no crash; faces simply aren't detected when the phone is held horizontally.
With custom TFLite models via CustomImageLabelerOptions, proper metadata packing is critical. Without embedded TFLite metadata, ML Kit doesn't know the input normalization: either embed the metadata (a flatbuffers structure, written with the TFLite metadata tooling) or apply the normalization manually in your preprocessing.
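Wiring a bundled custom model into the image labeler looks roughly like this; a sketch using the standard ML Kit custom-model API, where the asset path and thresholds are placeholders:

```kotlin
import com.google.mlkit.common.model.LocalModel
import com.google.mlkit.vision.label.ImageLabeling
import com.google.mlkit.vision.label.custom.CustomImageLabelerOptions

// Bundled model; the asset path is a placeholder for your own file.
val localModel = LocalModel.Builder()
    .setAssetFilePath("model.tflite")
    .build()

// If the model carries TFLite metadata, ML Kit reads the input
// normalization from it; without metadata, results will be off.
val options = CustomImageLabelerOptions.Builder(localModel)
    .setConfidenceThreshold(0.5f)
    .setMaxResultCount(5)
    .build()

val labeler = ImageLabeling.getClient(options)
```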
Our Approach
Choosing between on-device and cloud APIs is the first question. On-device works offline, runs faster, and has no per-call costs. Cloud is more accurate for complex cases (multilingual OCR, non-standard fonts). For most B2C apps a hybrid approach is optimal: on-device first, with the cloud as a fallback when confidence is low.
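The hybrid routing can be reduced to a small decision helper. The names and the threshold below are illustrative assumptions, not part of the ML Kit API:

```kotlin
// Illustrative hybrid routing: keep the on-device result when it is
// confident enough, otherwise escalate the request to the cloud API.
// The 0.75 threshold is an assumption to tune per use case.
data class OcrResult(val text: String, val confidence: Float)

const val CLOUD_FALLBACK_THRESHOLD = 0.75f

// Returns true to keep the on-device result, false to fall back to cloud.
fun keepOnDevice(result: OcrResult): Boolean =
    result.confidence >= CLOUD_FALLBACK_THRESHOLD

fun main() {
    println(keepOnDevice(OcrResult("TOTAL 12.99", 0.93f))) // stays on-device
    println(keepOnDevice(OcrResult("T0T4L !2.??", 0.41f))) // goes to cloud
}
```

In practice the threshold should be tuned against labeled production samples, since per-block confidence distributions differ between document types.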
Real case: a receipt-scanning app. ML Kit Text Recognition v2 on-device gave 94% accuracy on standard receipts, but only 67% on thermal paper with faded text. We added preprocessing via CIFilter (contrast boost, binarization) before building the VisionImage; accuracy jumped to 89% without switching to the Cloud API.
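The binarization step itself is simple pixel math and can be applied on any platform before the image reaches the recognizer. A minimal pure-Kotlin sketch; the fixed threshold is an assumption, and a production pipeline would use an adaptive method such as Otsu's:

```kotlin
// Minimal binarization sketch: maps 8-bit grayscale values to pure
// black/white before OCR. The fixed threshold (128) is an assumption;
// production code would pick it adaptively (e.g., Otsu's method).
fun binarize(gray: IntArray, threshold: Int = 128): IntArray =
    IntArray(gray.size) { i -> if (gray[i] >= threshold) 255 else 0 }

fun main() {
    val fadedRow = intArrayOf(90, 140, 200, 60)
    println(binarize(fadedRow).joinToString()) // 0, 255, 255, 0
}
```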
On Android, integration goes through BarcodeScanning.getClient() or TextRecognition.getClient(TextRecognizerOptions.DEFAULT_OPTIONS). Models are downloaded automatically via Google Play services on first run; factor this into the UX, since the initial inference may take seconds while the model loads. Use ModuleInstallClient for an explicit preload during onboarding.
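An explicit preload during onboarding looks roughly like this; a sketch using the Google Play services ModuleInstall API, where `context` and the listener bodies are placeholders:

```kotlin
import android.content.Context
import com.google.android.gms.common.moduleinstall.ModuleInstall
import com.google.android.gms.common.moduleinstall.ModuleInstallRequest
import com.google.mlkit.vision.barcode.BarcodeScanning

fun preloadBarcodeModule(context: Context) {
    val request = ModuleInstallRequest.newBuilder()
        .addApi(BarcodeScanning.getClient()) // module to fetch ahead of time
        .build()

    ModuleInstall.getClient(context)
        .installModules(request)
        .addOnSuccessListener { /* model ready before first scan */ }
        .addOnFailureListener { /* fall back to lazy download on first use */ }
}
```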
For custom models, use FirebaseModelDownloader with DownloadType.LOCAL_MODEL_UPDATE_IN_BACKGROUND. The model updates in the background; the app keeps using the current version until the next launch.
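A sketch of that download call using the standard Firebase ML Model Downloader API; the model name is a placeholder, and the Wi-Fi-only condition is an assumption:

```kotlin
import com.google.firebase.ml.modeldownloader.CustomModelDownloadConditions
import com.google.firebase.ml.modeldownloader.DownloadType
import com.google.firebase.ml.modeldownloader.FirebaseModelDownloader

val conditions = CustomModelDownloadConditions.Builder()
    .requireWifi() // assumption: only update the model over Wi-Fi
    .build()

FirebaseModelDownloader.getInstance()
    .getModel("your-model-name", DownloadType.LOCAL_MODEL_UPDATE_IN_BACKGROUND, conditions)
    .addOnSuccessListener { model ->
        // Returns the locally cached model immediately; a newer version
        // downloads in the background and is picked up on the next launch.
        val modelFile = model.file
    }
```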
Supported ML Kit APIs
| API | Mode | Platforms |
|---|---|---|
| Text Recognition v2 | On-Device | Android, iOS |
| Face Detection | On-Device | Android, iOS |
| Barcode Scanning | On-Device | Android, iOS |
| Image Labeling | On-Device + Cloud | Android, iOS |
| Object Detection & Tracking | On-Device | Android, iOS |
| Translation | On-Device | Android, iOS |
| Custom Model (TFLite) | On-Device | Android, iOS |
Process and Timeline
Requirements audit → API selection (ready-made vs custom) → SDK integration → preprocessing configuration → testing on target devices → production accuracy monitoring.
Integrating one ready-made API (e.g., Barcode Scanning or Face Detection): 2–4 business days. Custom TFLite model with preprocessing and fallback logic: 1–2 weeks. Cost calculated individually.