# On-Device ML Model Integration (Core ML) for Offline AI in iOS Apps
Core ML is not simply "run the model on the iPhone." It is a specific path from PyTorch/TensorFlow weights to calling .prediction() in a SwiftUI app, and each step has nuances that can cost a week of work if you don't know them in advance.
## Model Conversion: coremltools
Most modern models arrive as a PyTorch checkpoint or an ONNX file. Convert via coremltools (Apple's Python package):
```python
import coremltools as ct
import torch

# Suppose we have a PyTorch image-classification model
model = MyModel()
model.load_state_dict(torch.load("model.pth"))
model.eval()

# Tracing: pass an example input
example_input = torch.zeros(1, 3, 224, 224)
traced = torch.jit.trace(model, example_input)

# Convert
mlmodel = ct.convert(
    traced,
    inputs=[ct.ImageType(
        name="input_image",
        shape=(1, 3, 224, 224),
        color_layout=ct.colorlayout.RGB,
        # ImageNet normalization: bias is per-channel, but scale is a
        # single scalar, so the red-channel std (0.229) stands in for all three
        bias=[-0.485/0.229, -0.456/0.224, -0.406/0.225],
        scale=1/(255.0 * 0.229),  # baked into the model, no preprocessing needed in Swift
    )],
    outputs=[ct.TensorType(name="class_probabilities")],
    compute_precision=ct.precision.FLOAT16,  # for ANE
    minimum_deployment_target=ct.target.iOS16,
)
mlmodel.save("MyClassifier.mlpackage")
```
ct.precision.FLOAT16 together with minimum_deployment_target=iOS16 means Core ML actively uses the ANE (Apple Neural Engine). On iPhone 14 this is 4–8× faster than GPU inference, with much lower battery consumption. On iOS 15 the same model runs on the Metal GPU instead.
ct.ImageType with built-in normalization means there is no need to convert a UIImage to a normalized float array in Swift; Core ML handles it.
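The bias and scale values above come from folding the standard ImageNet preprocessing, (pixel / 255 - mean) / std, into Core ML's linear input transform, scale * pixel + bias. A quick stdlib check of that arithmetic (the variable names here are illustrative, not part of the coremltools API):

```python
# Core ML applies: output = scale * pixel + bias (bias is per channel).
# Standard ImageNet preprocessing: (pixel / 255 - mean) / std.
# Folding the two gives: scale = 1 / (255 * std), bias = -mean / std.
means = [0.485, 0.456, 0.406]
stds = [0.229, 0.224, 0.225]

bias = [-m / s for m, s in zip(means, stds)]
# ct.ImageType accepts only one scalar scale, so the red-channel std
# has to approximate all three channels:
scale = 1 / (255.0 * stds[0])

print([round(b, 3) for b in bias])  # [-2.118, -2.036, -1.804]
print(round(scale, 5))              # 0.01712
```

The per-channel mismatch introduced by the single scale (0.229 vs 0.224/0.225) is small, but it is one more reason to compare outputs against the original PyTorch model after conversion.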
## Common Conversion Problems
Dynamic shapes: torch.jit.trace hard-codes the example input's shape, so models with torch.Size([batch, seq_len, hidden]) where seq_len varies end up accepting only the traced length. Solution: declare the input with ct.RangeDim for variable sizes, or a fixed set of configurations via ct.EnumeratedShapes.
```python
# Variable sequence length via a range
flexible_shape = ct.Shape(shape=(1, ct.RangeDim(1, 512), 768))
mlmodel = ct.convert(model, inputs=[ct.TensorType(shape=flexible_shape)])

# Or a fixed set of allowed shapes (generally faster than RangeDim,
# since Core ML can specialize for each shape)
enumerated = ct.EnumeratedShapes(shapes=[(1, 128, 768), (1, 256, 768), (1, 512, 768)])
mlmodel = ct.convert(model, inputs=[ct.TensorType(shape=enumerated)])
```
Unsupported operations, e.g. custom CUDA kernels: coremltools fails conversion with an op-not-implemented error. The path forward: either rewrite the operation using standard PyTorch primitives, or register a translation for it with coremltools' @register_torch_op decorator, or ship it as a custom layer implemented in Swift (MLCustomLayer).
The "Unsupported model format" error on the x86 simulator: the simulator uses a CPU fallback, and some FLOAT16 operations are unsupported there. Validate accuracy only on a real device.
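When validating, keep the magnitude of FLOAT16 rounding in mind. Python's struct module can round-trip a value through half precision; this is a stdlib illustration of the precision loss, not anything Core ML exposes:

```python
import struct

def to_fp16(x: float) -> float:
    """Round-trip a float through IEEE 754 half precision."""
    return struct.unpack("e", struct.pack("e", x))[0]

# FP16 keeps roughly 3 significant decimal digits, so per-value relative
# error up to ~5e-4 is expected; differences against the FP32 PyTorch
# output much larger than that usually indicate a real conversion bug.
p = 0.123456789
print(abs(to_fp16(p) - p) / p)  # relative error on the order of 1e-4
```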
## Loading and Running on iOS
```swift
import CoreML
import Vision

// Load the model (once, at startup)
let config = MLModelConfiguration()
config.computeUnits = .all  // ANE + GPU + CPU

// Xcode compiles a bundled .mlpackage to .mlmodelc at build time and
// generates a wrapper class; load through the generated initializer
guard let model = try? MyClassifier(configuration: config) else {
    fatalError("Failed to load model")
}

// Inference on a background thread
DispatchQueue.global(qos: .userInitiated).async {
    do {
        // Generated CGImage convenience initializer for the image input
        let input = try MyClassifierInput(input_imageWith: cgImage)
        let output = try model.prediction(input: input)
        let probs = output.class_probabilities
        // probs is an MLMultiArray; read a value with probs[0].doubleValue
    } catch {
        print("Inference error: \(error)")
    }
}
```
Loading the model takes ~100–300 ms depending on size. Don't reload it in every viewDidLoad: load once at app startup or on first use, and keep the instance in memory while it's needed.
## Vision Framework as a Wrapper
For computer-vision tasks, VNCoreMLRequest is more convenient: Vision handles input resizing, image orientation, and coordinate transforms:
```swift
let coreMLModel = try VNCoreMLModel(for: model.model)  // .model is the underlying MLModel of the generated class
let request = VNCoreMLRequest(model: coreMLModel) { request, error in
    guard let results = request.results as? [VNClassificationObservation] else { return }
    let topResult = results.sorted { $0.confidence > $1.confidence }.first
    print("\(topResult?.identifier ?? "?") — \(topResult?.confidence ?? 0)")
}
request.imageCropAndScaleOption = .centerCrop  // or .scaleFit
let handler = VNImageRequestHandler(cgImage: inputCGImage, options: [:])
try handler.perform([request])
```
VNCoreMLRequest automatically handles input-size mismatch: pass an image of any size and Vision resizes it to what the model expects. Without Vision this would have to be done manually via vImage or CIImage.
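For a square model input, .centerCrop keeps the largest centered square of the image and scales it down, discarding the edges of the long dimension. A small sketch of that geometry (plain Python for illustration; this mirrors the documented crop behavior, not Vision's implementation):

```python
def center_crop_region(width: int, height: int):
    """Centered square that .centerCrop keeps before scaling to the model input."""
    side = min(width, height)
    x = (width - side) // 2
    y = (height - side) // 2
    return x, y, side, side

# A 4032x3024 iPhone photo loses 504 px on each horizontal edge:
print(center_crop_region(4032, 3024))  # (504, 0, 3024, 3024)
```

With .scaleFit the whole image is kept instead; the right choice is whichever matches the preprocessing the model was trained with, not whichever looks better.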
## Performance: Benchmarks
| Device | Model | computeUnits | Inference Time |
|---|---|---|---|
| iPhone 14 Pro | MobileNetV3 (5 MB FP16) | .all (ANE) | 2–4 ms |
| iPhone 14 Pro | ResNet-50 (48 MB FP16) | .all (ANE) | 8–15 ms |
| iPhone 12 | BERT-base (350 MB FP16) | .all | 180–250 ms |
| iPhone SE 2nd gen | MobileNetV3 (5 MB FP16) | .cpuOnly | 12–20 ms |
Use the Core ML instrument in Xcode Instruments to profile actual ANE/GPU/CPU usage.
## Updating the Model Without an App Update
Core ML can load a model from any file URL, not just the app bundle, which enables server-side model updates:
```swift
// Load an updated model from the Documents directory
let documentsURL = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask)[0]
let downloadedModelURL = documentsURL.appendingPathComponent("updated_model.mlpackage")

if FileManager.default.fileExists(atPath: downloadedModelURL.path) {
    // A downloaded .mlpackage must first be compiled on-device;
    // contentsOf: expects the compiled .mlmodelc
    let compiledURL = try MLModel.compileModel(at: downloadedModelURL)
    // compileModel writes to a temp directory; move the result to
    // permanent storage to avoid recompiling on every launch
    let model = try MyClassifier(contentsOf: compiledURL, configuration: config)
} else {
    // Fall back to the model shipped in the bundle
}
```
Download the model over the network via URLSession, save it to Documents, and verify its SHA256 hash against a known value before use.
## Process
Get the weights → convert, tuning precision and compute units → profile on target devices → integrate into the app with fallback and error handling → optionally add remote model updates.
## Timeline Estimates
Converting an existing model plus basic iOS integration takes 1–2 weeks. A complex model with non-standard operations, multiple inputs/outputs, or remote updates requires 3–5 weeks.