AI-Powered Defect Detection on Production Lines via Mobile Camera
Visual quality control on production floors is an area where mobile devices compete directly with stationary machine vision systems (Cognex, Keyence). Mobile inspection costs less, is more flexible, and requires no integration into the production line, but it introduces distinct engineering challenges: unstable lighting, variable distance to the object, and camera shake from handheld operation.
Challenge: Industrial Quality Control Specifics
Manufacturing defects vary dramatically by industry and product type:
| Industry | Typical Defects | Critical Size |
|---|---|---|
| Printed Circuit Boards (PCB) | Missing component, incorrect orientation, solder defects | 0.5–2 mm |
| Textiles | Snagging, pinholes, thread breakage | 1–5 mm |
| Steel & Metal Products | Scratches, porosity, inclusions | 0.1–3 mm |
| Glass/Ceramics | Chips, cracks, bubbles | 0.5–10 mm |
| Packaging | Missing label, print defects | >5 mm |
Each industry requires its own specialized model. Universal defect models don't work: what constitutes a defect on a PCB may be acceptable on metal.
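The critical sizes in the table translate directly into a resolution requirement: a defect can only be detected if it spans enough pixels in the image passed to the model. The sketch below works through that arithmetic. The 640 px input width, the 100 mm field of view, and the 8-pixel minimum are illustrative assumptions, not values from a specific deployment.

```python
def pixels_per_defect(defect_mm: float, fov_width_mm: float, image_width_px: int) -> float:
    """Number of pixels a defect of the given size spans along the image width."""
    mm_per_pixel = fov_width_mm / image_width_px
    return defect_mm / mm_per_pixel


def max_fov_width_mm(defect_mm: float, image_width_px: int, min_pixels: float) -> float:
    """Widest field of view at which the defect still spans `min_pixels` pixels."""
    return defect_mm * image_width_px / min_pixels


if __name__ == "__main__":
    # Assumed setup: 640 px model input covering a 100 mm wide field of view.
    for label, size_mm in [("PCB", 0.5), ("Textiles", 1.0), ("Packaging", 5.0)]:
        px = pixels_per_defect(size_mm, fov_width_mm=100.0, image_width_px=640)
        print(f"{label}: a {size_mm} mm defect spans {px:.1f} px")

    # Assumed rule of thumb: the smallest defect should cover at least 8 px.
    print(max_fov_width_mm(0.5, image_width_px=640, min_pixels=8))
```

Under these assumptions a 0.5 mm PCB defect covers only about 3 pixels at a 100 mm field of view, so the operator would need to hold the camera closer, or the frame would need to be tiled before inference.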
On-Device Inference Architecture
For production deployment, on-device processing is paramount. Internet dependency is unacceptable on the factory floor. Model size is constrained by device RAM, and inference speed must meet production throughput requirements.
// iOS: industrial defect detection via CoreML
import CoreMedia
import Vision

class DefectDetectionEngine {
    private let model: VNCoreMLModel
    private var confidenceThreshold: Float = 0.5 // adjusted on-site
    private var iouThreshold: Float = 0.45 // for non-maximum suppression (not applied in this snippet)

    // Dedicated queue for stable FPS
    private let inferenceQueue = DispatchQueue(
        label: "defect.inference",
        qos: .userInteractive
    )

    init(model: VNCoreMLModel) {
        self.model = model
    }

    func analyze(sampleBuffer: CMSampleBuffer) async throws -> [DefectDetection] {
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else {
            throw DefectError.invalidFrame
        }
        return try await withCheckedThrowingContinuation { continuation in
            inferenceQueue.async {
                // No completion handler: perform() is synchronous, so results are read
                // afterwards and the continuation is resumed exactly once on every path.
                let request = VNCoreMLRequest(model: self.model)
                request.imageCropAndScaleOption = .scaleFill
                let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer)
                do {
                    try handler.perform([request])
                    let detections = (request.results as? [VNRecognizedObjectObservation] ?? [])
                        .filter { $0.confidence >= self.confidenceThreshold }
                        .map { obs in
                            DefectDetection(
                                type: DefectType(rawValue: obs.labels.first?.identifier ?? "") ?? .unknown,
                                confidence: obs.confidence,
                                boundingBox: obs.boundingBox, // normalized [0,1]
                                severity: self.classifySeverity(obs)
                            )
                        }
                    continuation.resume(returning: detections)
                } catch {
                    // Previously swallowed by try?, which could leave the caller waiting forever
                    continuation.resume(throwing: error)
                }
            }
        }
    }
}
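The engine above declares both `confidenceThreshold` and `iouThreshold`, but only the confidence filter appears in the snippet. As a minimal sketch of how the two thresholds usually work together, the following applies a confidence cut followed by greedy, class-agnostic non-maximum suppression. The `Box` type and `filter_detections` function are hypothetical names for illustration. Where suppression actually runs (inside the exported model or in app code) depends on how the model was converted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    x: float      # top-left corner, normalized [0, 1]
    y: float
    w: float
    h: float
    score: float  # detection confidence


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a.x + a.w, b.x + b.w) - max(a.x, b.x))
    iy = max(0.0, min(a.y + a.h, b.y + b.h) - max(a.y, b.y))
    inter = ix * iy
    union = a.w * a.h + b.w * b.h - inter
    return inter / union if union > 0 else 0.0


def filter_detections(boxes, conf_threshold=0.5, iou_threshold=0.45):
    """Drop low-confidence boxes, then suppress boxes that overlap a stronger one."""
    candidates = sorted(
        (b for b in boxes if b.score >= conf_threshold),
        key=lambda b: b.score,
        reverse=True,
    )
    kept = []
    for box in candidates:
        # Keep the box only if it does not overlap any already-kept box too much.
        if all(iou(box, k) <= iou_threshold for k in kept):
            kept.append(box)
    return kept
```

Raising `conf_threshold` reduces false alarms at the cost of missed defects, while `iou_threshold` only controls how aggressively duplicate boxes around the same defect are merged.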
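Performance Benchmarks. YOLOv8n in CoreML on an iPhone 14 runs at approximately 15 ms per inference (about 66 FPS potential); YOLOv8s takes about 25 ms (about 40 FPS). Both figures cover inference only, so for production lines requiring more than 30 frames per second with handheld operation, the nano variant is recommended because it leaves headroom for capture and post-processing.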
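The relationship between per-frame latency and achievable frame rate can be checked with a few lines of arithmetic. In the sketch below the 10 ms overhead for capture, preprocessing, and post-processing is an assumed figure for illustration; it should be measured on the target device.

```python
def max_fps(inference_ms: float, overhead_ms: float = 0.0) -> float:
    """Upper bound on frames per second for a sequential per-frame pipeline."""
    return 1000.0 / (inference_ms + overhead_ms)


def meets_target(inference_ms: float, target_fps: float, overhead_ms: float = 0.0) -> bool:
    """True if the pipeline can sustain at least `target_fps`."""
    return max_fps(inference_ms, overhead_ms) >= target_fps


if __name__ == "__main__":
    # Inference times from the benchmark paragraph; the overhead is an assumption.
    for name, ms in [("YOLOv8n", 15.0), ("YOLOv8s", 25.0)]:
        ok = meets_target(ms, target_fps=30.0, overhead_ms=10.0)
        print(f"{name}: {max_fps(ms, 10.0):.1f} FPS with overhead, meets 30 FPS: {ok}")
```

With that assumed overhead, the small variant falls just under 30 FPS while the nano variant stays comfortably above it.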
Image Stabilization for Handheld Scanning
Hand tremor is detrimental to detecting small defects. Several techniques mitigate this:
// Frame buffering + selection of the sharpest frame
class StabilizedFrameSelector {
    private let capacity = 8
    // Each frame is stored with its score so the two can never get out of step.
    // Note: holding many CMSampleBuffers can starve the capture pipeline of buffers.
    private var frames: [(buffer: CMSampleBuffer, sharpness: Float)] = []

    func addFrame(_ buffer: CMSampleBuffer) {
        let sharpness = computeLaplacianVariance(buffer)
        if frames.count == capacity {
            frames.removeFirst() // drop the oldest frame together with its score
        }
        frames.append((buffer: buffer, sharpness: sharpness))
    }

    // For analysis, select the frame with peak sharpness from the last N frames
    var bestFrame: CMSampleBuffer? {
        frames.max(by: { $0.sharpness < $1.sharpness })?.buffer
    }
}
Additionally: configure AVCaptureDevice.activeVideoMinFrameDuration, set exposureMode = .continuousAutoExposure, and enable video stabilization on the AVCaptureConnection via preferredVideoStabilizationMode = .cinematic.
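The frame selector relies on `computeLaplacianVariance`, which the snippet does not define. A common sharpness measure with that name is the variance of the Laplacian response. The sketch below shows the idea in pure Python on a small grayscale image given as a list of rows. It is an illustration of the metric, not the implementation used on the device, and `box_blur` exists only to simulate a blurred frame.

```python
from statistics import pvariance


def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over the interior pixels.

    `gray` is a list of equal-length rows of numeric intensities.
    Sharp edges produce large Laplacian responses, so a higher variance
    indicates a sharper frame.
    """
    height, width = len(gray), len(gray[0])
    if height < 3 or width < 3:
        raise ValueError("image must be at least 3x3")
    responses = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1] - 4 * gray[y][x]
        for y in range(1, height - 1)
        for x in range(1, width - 1)
    ]
    return pvariance(responses)


def box_blur(gray):
    """3x3 mean filter on interior pixels, used here only to simulate blur."""
    height, width = len(gray), len(gray[0])
    out = [row[:] for row in gray]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = [gray[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = sum(window) / 9
    return out


if __name__ == "__main__":
    sharp = [[255 if (x + y) % 2 else 0 for x in range(8)] for y in range(8)]
    blurred = box_blur(sharp)
    print(laplacian_variance(sharp) > laplacian_variance(blurred))
```

On a device the same computation would normally run on a downscaled luminance plane with a vectorized or GPU-backed filter, since per-pixel loops are too slow for full-resolution video frames.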
Model Training and Fine-Tuning on Production Data
Off-the-shelf datasets rarely cover the defects of a specific production process, so custom annotation is required.
The workflow:
- Capture 200–500 samples on the production floor (normal + defective specimens)
- Annotate in Label Studio or CVAT (bounding boxes + defect classes)
- Apply augmentation: brightness ±30%, rotation ±15°, horizontal flip, Gaussian noise to simulate real acquisition conditions
- Train YOLOv8n/s/m depending on speed requirements (the nano variant for handheld operation above 30 FPS)
- Convert to CoreML (.mlpackage) or TFLite format
- Perform iterative fine-tuning on production errors (false alarms and missed defects), typically every 2–4 weeks
# Fine-tuning on new production data
from ultralytics import YOLO

model = YOLO("defect_detection_v2.pt")  # previous version as base
results = model.train(
    data="production_defects.yaml",
    epochs=50,
    imgsz=640,
    batch=16,
    lr0=0.001,      # lower learning rate for fine-tuning
    freeze=10,      # freeze the first 10 layers (the YOLOv8 backbone)
    hsv_h=0.015,
    hsv_s=0.7,
    hsv_v=0.3,      # brightness ±30%
    degrees=15.0,   # rotation ±15°
    translate=0.1,
    scale=0.5,
    fliplr=0.5,     # horizontal flip with probability 0.5
    mosaic=1.0,
)
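The iterative fine-tuning step depends on collecting the right samples between training rounds. One possible selection rule, sketched below, keeps images where the operator overruled the model plus images where the model was uncertain. The `ReviewedSample` type, its field names, and the 0.4–0.6 uncertainty band are assumptions for illustration, not part of the workflow described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewedSample:
    image_path: str
    model_verdict: str     # "PASS" or "FAIL", as predicted on the device
    operator_verdict: str  # "PASS" or "FAIL", as confirmed by a human
    confidence: float      # highest detection confidence in the image (0.0 if none)


def select_for_retraining(samples, low_conf=0.4, high_conf=0.6):
    """Return samples worth annotating for the next fine-tuning round."""
    selected = []
    for sample in samples:
        disagreement = sample.model_verdict != sample.operator_verdict
        uncertain = low_conf <= sample.confidence <= high_conf
        if disagreement or uncertain:
            selected.append(sample)
    return selected


if __name__ == "__main__":
    batch = [
        ReviewedSample("img_001.jpg", "PASS", "PASS", 0.00),  # correct, confident
        ReviewedSample("img_002.jpg", "PASS", "FAIL", 0.00),  # missed defect
        ReviewedSample("img_003.jpg", "FAIL", "PASS", 0.85),  # false alarm
        ReviewedSample("img_004.jpg", "FAIL", "FAIL", 0.52),  # correct but uncertain
    ]
    for sample in select_for_retraining(batch):
        print(sample.image_path)
```

Selected images would then go back through annotation before being added to the training set for the next round.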
Integration with Production Systems
The mobile inspector must log results to the facility's MES/ERP system:
// Android: submitting inspection results
import java.time.Instant

data class InspectionResult(
    val productId: String,
    val batchId: String,
    val inspectorId: String,
    val timestamp: Instant,
    val detections: List<DefectDetection>,
    val verdict: InspectionVerdict, // PASS, FAIL, REVIEW
    val imageUrl: String,           // saved photo with annotations
    val deviceId: String
)

suspend fun submitInspection(result: InspectionResult) {
    // First, persist to the local queue (the factory floor may lack Wi-Fi)
    localQueue.enqueue(result)
    // Synchronize when the network becomes available
    syncManager.triggerSync()
}
Offline-first architecture is critical: factory floor Wi-Fi is often unreliable, and losing inspection results is unacceptable.
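The offline-first pattern can be sketched independently of the mobile platform: persist every result before attempting any network call, and delete a queued result only after its upload has succeeded. The example below uses SQLite from the Python standard library. The `InspectionQueue` class, its table layout, and the `upload` callback are hypothetical names for illustration, not the MES/ERP client itself.

```python
import json
import sqlite3


class InspectionQueue:
    """Durable FIFO queue for inspection results awaiting upload."""

    def __init__(self, path=":memory:"):
        # Use a file path in practice so results survive an app restart.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS pending ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
        )
        self.db.commit()

    def enqueue(self, result: dict) -> None:
        self.db.execute("INSERT INTO pending (payload) VALUES (?)", (json.dumps(result),))
        self.db.commit()

    def pending_count(self) -> int:
        return self.db.execute("SELECT COUNT(*) FROM pending").fetchone()[0]

    def sync(self, upload) -> int:
        """Upload queued results oldest-first; stop at the first network failure."""
        sent = 0
        rows = self.db.execute("SELECT id, payload FROM pending ORDER BY id").fetchall()
        for row_id, payload in rows:
            try:
                upload(json.loads(payload))
            except OSError:
                break  # network unavailable: keep this and later rows for the next attempt
            # Delete only after the upload call returned without error.
            self.db.execute("DELETE FROM pending WHERE id = ?", (row_id,))
            self.db.commit()
            sent += 1
        return sent
```

Because a result is removed only after a successful upload, a crash between upload and delete can cause the same result to be sent twice. The receiving system should therefore deduplicate, for example on product ID, device ID, and timestamp.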
Timeline Estimates
An MVP with a basic model (200–300 annotated samples), on-device inference, and local inspection history takes 3–4 weeks. A complete system with a fine-tuned model for the specific production process, image stabilization, offline-first MES/ERP synchronization, defect statistics dashboard, and iOS + Android support requires 2–3 months.







