Object Classification Implementation in Mobile Applications
Object classification differs from detection in one key way: the model answers "what is this?", not "where is it?" Its single output is a probability vector across classes. That seems simpler than detection, yet this is exactly where confidence-threshold and UX problems most often arise.
Model Selection for the Task
For general-purpose classification over the standard ~1,000 ImageNet-style classes (products, animals, household objects), use MobileNetV3 or EfficientNet-B0/B1. These work out of the box via ML Kit Image Labeling or Core ML with models from the Apple Model Gallery. For a narrow domain (a specific product type, manufacturing defects), you need a fine-tuned model.
Fine-tuning on a custom dataset: take a pre-trained backbone (MobileNetV2, EfficientNetB0), freeze the lower layers, and train only the upper layers on your data. For 10–50 classes, 200–500 examples per class with proper augmentation suffice; with fewer, you need a few-shot approach (Prototypical Networks).
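A minimal transfer-learning sketch of the recipe above, assuming TensorFlow/Keras; the class count of 30 is illustrative and dataset loading is omitted:

```python
import tensorflow as tf

NUM_CLASSES = 30  # hypothetical narrow-domain class count

# Pre-trained backbone without the ImageNet classification head.
# In practice use weights="imagenet"; None keeps this sketch offline-friendly.
backbone = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights=None, pooling="avg"
)
backbone.trainable = False  # freeze the backbone; train only the new head

model = tf.keras.Sequential([
    backbone,
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=10)  # your augmented data
```

After the head converges, a common second pass is to unfreeze the top few backbone layers and continue at a lower learning rate.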
After training: convert to .mlmodel (iOS) or .tflite (Android), add metadata with class names and normalization parameters.
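The Android half of that conversion can be sketched with the TensorFlow Lite converter (the tiny model here is a stand-in for the fine-tuned classifier; the filename is illustrative):

```python
import tensorflow as tf

# Tiny stand-in model; in practice this is the fine-tuned classifier.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224, 224, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(30, activation="softmax"),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # optional weight quantization
tflite_bytes = converter.convert()
with open("classifier.tflite", "wb") as f:
    f.write(tflite_bytes)
```

Class names and normalization parameters are then attached with the tflite-support metadata writers; on the iOS side, coremltools produces the .mlmodel.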
Confidence Thresholds: Where Products Get Lost
The most common UX mistake: showing classification results without considering confidence. The model always returns a probability distribution, and argmax always yields a "winner", even when the model isn't confident. If the top-1 class scores 0.23 with the runner-up at 0.21, that's not classification; it's effectively random.
Correct approach: set a threshold (typically 0.5–0.7 depending on the task). If top-1 falls below it, show "couldn't determine" or ask the user to retake the photo. For critical tasks (medicine, legal documents), additionally check the entropy of the distribution.
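Both checks combined can be sketched platform-agnostically; the function name and both thresholds are illustrative and should be tuned per task:

```python
import math

def classify_decision(probs, threshold=0.65, max_entropy_ratio=0.5):
    """Return the winning class index, or None when the model is unsure.

    probs: softmax output (per-class probabilities summing to 1).
    """
    top1 = max(range(len(probs)), key=probs.__getitem__)
    if probs[top1] < threshold:
        return None  # low confidence: show "couldn't determine"
    # Normalized Shannon entropy: 0 = fully certain, 1 = uniform.
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    if entropy / math.log(len(probs)) > max_entropy_ratio:
        return None  # distribution too flat even though top-1 passed
    return top1

classify_decision([0.8, 0.05, 0.05, 0.05, 0.05])   # -> 0 (confident)
classify_decision([0.23, 0.21, 0.20, 0.19, 0.17])  # -> None (near-uniform)
```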
On iOS via VNCoreMLRequest:
let request = VNCoreMLRequest(model: model) { req, _ in
    let observations = req.results as? [VNClassificationObservation] ?? []
    let confident = observations.filter { $0.confidence > 0.65 }
}
request.imageCropAndScaleOption = .centerCrop
On Android via ML Kit ImageLabeling:
val options = ImageLabelerOptions.Builder()
    .setConfidenceThreshold(0.65f)
    .build()
val labeler = ImageLabeling.getClient(options)
Top-N and Result Display
Showing top-3 classes with percentages is right for educational and consumer apps. For business apps (automation, warehouse), one confident result or nothing.
Case: a warehouse inventory app classifying 87 SKUs via a custom EfficientNetB0. The initial threshold of 0.5 gave 12% false positives. Analyzing the confusion matrix revealed that 80% of errors were between SKUs with similar packaging. A second level was added: if top-2 and top-3 combined exceed 0.4, ask the operator to confirm. False positives dropped to 2.1%.
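The two-level rule from the case can be sketched as a small routing function; the name and threshold defaults mirror the illustrative values above:

```python
def route_prediction(probs, accept=0.5, confusion_mass=0.4):
    """Route a softmax output to auto-accept, operator confirmation, or reject.

    A confident top-1 is accepted, but when the runners-up jointly hold too
    much probability mass the result goes to a human instead.
    """
    ranked = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    top1, top2, top3 = ranked[0], ranked[1], ranked[2]
    if probs[top1] < accept:
        return ("reject", None)           # ask for a re-shot
    if probs[top2] + probs[top3] > confusion_mass:
        return ("confirm", top1)          # operator confirms the SKU
    return ("accept", top1)

route_prediction([0.55, 0.25, 0.18, 0.02])  # -> ("confirm", 0)
route_prediction([0.90, 0.05, 0.03, 0.02])  # -> ("accept", 0)
```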
When displaying the result, don't overload users with numbers. A progress bar or color coding (green/yellow/red) reads better than "73.4%". Animating the result's appearance via withAnimation (SwiftUI) or ObjectAnimator (Android) softens the feeling of a "cold" model response.
Timeline and Process
Integrating a ready-made model into an existing app takes 3–5 days. Fine-tuning a custom model plus integration: 1–2 weeks. Cost is estimated individually.