AI-Based Object Measurement from Photos in Mobile Apps
Measuring cabinet width without leaving your chair sounds like marketing hyperbole. Behind it lies non-trivial geometry: camera calibration, depth estimation, reference object detection. Measurement accuracy without special acquisition conditions is 5–15%, with proper methodology it reaches 2–5%.
Two Fundamentally Different Approaches
ARKit/ARCore (LiDAR or SLAM) — precise but requires device support. iPhone 12 Pro and newer with LiDAR achieve 1–3 cm accuracy at distances up to 5 meters. ARCore on Android without LiDAR performs worse, with 3–8 cm error.
Monocular depth estimation — works on any device without LiDAR, using CNN to estimate depth from a single frame. MiDaS, DPT, Depth Anything V2 are current models. Accuracy is notably lower than LiDAR approaches, but sufficient for many applications.
// iOS: method selection based on device capabilities
func selectMeasurementMethod() -> MeasurementMethod {
if ARWorldTrackingConfiguration.supportsSceneReconstruction(.mesh) {
return .lidarARKit // iPhone 12 Pro+, iPad Pro
} else if ARWorldTrackingConfiguration.isSupported {
return .slamARKit // ARKit without LiDAR
} else {
return .monocularDepth // fallback to CoreML model
}
}
Implementation via ARKit
// Measuring distance between two points in AR
class ARMeasurementSession: NSObject, ARSessionDelegate {
var arView: ARSCNView!
private var startAnchor: ARAnchor?
private var endAnchor: ARAnchor?
func placePoint(at screenPoint: CGPoint) -> MeasurementPoint? {
// Raycast from screen into 3D world space
guard let query = arView.raycastQuery(
from: screenPoint,
allowing: .estimatedPlane,
alignment: .any
) else { return nil }
guard let result = arView.session.raycast(query).first else { return nil }
let worldPosition = result.worldTransform.columns.3 // position in meters
return MeasurementPoint(
position: SIMD3(worldPosition.x, worldPosition.y, worldPosition.z),
confidence: result.targetAlignment == .horizontal ? .high : .medium
)
}
func calculateDistance(from start: MeasurementPoint, to end: MeasurementPoint) -> Measurement<UnitLength> {
let diff = end.position - start.position
let distanceMeters = Double(simd_length(diff))
return Measurement(value: distanceMeters, unit: .meters)
}
}
A common mistake is not accounting for the fact that raycast performs better on well-textured surfaces. A white wall produces poor SLAM tracking results, causing AR markers to drift.
Displaying Measurements in AR
func addMeasurementLine(from start: SIMD3<Float>, to end: SIMD3<Float>,
distance: String) {
let midpoint = (start + end) / 2
// Line between points
let lineNode = SCNNode(geometry: createCylinder(from: start, to: end))
// Label with distance at midpoint
let labelNode = SCNNode(geometry: SCNText(string: distance, extrusionDepth: 0.001))
labelNode.position = SCNVector3(midpoint.x, midpoint.y + 0.02, midpoint.z)
labelNode.scale = SCNVector3(0.005, 0.005, 0.005)
labelNode.constraints = [SCNBillboardConstraint()] // always face camera
sceneRoot.addChildNode(lineNode)
sceneRoot.addChildNode(labelNode)
}
Reference Object Approach for Photo-Based Measurement
Without AR, a known-size object in the frame is required. A bank card (85.6 × 53.98 mm) serves as a convenient reference:
// Android: measurement via reference object
class ReferenceObjectMeasurer {
fun measureWithCard(bitmap: Bitmap, cardBoundingBox: RectF,
objectBoundingBox: RectF): MeasurementResult {
// Real card dimensions
val cardRealWidth = 85.6f // mm
val cardRealHeight = 53.98f
// Pixels → mm
val pixelsPerMmHorizontal = cardBoundingBox.width() / cardRealWidth
val pixelsPerMmVertical = cardBoundingBox.height() / cardRealHeight
// Perspective distortion correction (simplified)
val correctionFactor = estimatePerspectiveCorrection(
cardBoundingBox, imageDimensions = bitmap.width to bitmap.height
)
return MeasurementResult(
widthMm = (objectBoundingBox.width() / pixelsPerMmHorizontal) * correctionFactor,
heightMm = (objectBoundingBox.height() / pixelsPerMmVertical) * correctionFactor,
accuracy = MeasurementAccuracy.MODERATE // ±5-10% without calibration
)
}
}
Card detection in the frame uses ML Kit Object Detection or a custom YOLOv8 model (straightforward to train on 500 card images in various conditions).
Timeline Estimates
ARKit/ARCore measurement with basic UI (two points, distance) on a single platform takes 3–5 days. Complete implementation—LiDAR + SLAM fallback + monocular depth for older devices, area and perimeter measurement, measurement history, export, iOS + Android—requires 1–2 weeks.







