ML Model Conversion to Core ML Format for iOS
Converting a PyTorch or TensorFlow model to .mlpackage is a multi-step process, and each step can fail for its own reasons. coremltools does not convert every model, and not always correctly. Let's cover the real conversion process, including what goes wrong.
Model Preparation for Conversion
Before conversion, the model must be in eval mode with frozen weights. torch.jit.trace requires an example input: it records the execution graph for that specific shape:
import torch
import coremltools as ct
model = MyModel()
model.load_state_dict(torch.load("weights.pth", map_location="cpu"))
model.eval()
# trace — fixes graph for specific shape
example_input = torch.zeros(1, 3, 224, 224)
traced_model = torch.jit.trace(model, example_input)
# For models with conditional branches (if/else dependent on data) — script:
# scripted_model = torch.jit.script(model) # analyzes code, not data
torch.jit.trace silently bakes in one branch when forward() contains data-dependent control flow: with if x.shape[1] > 512: ..., the trace records only the branch taken for the example input. Use torch.jit.script, which analyzes the code statically and preserves all branches.
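A minimal sketch of the difference, using a hypothetical Branchy module (trace emits a TracerWarning on the data-dependent if):

```python
import torch

class Branchy(torch.nn.Module):
    def forward(self, x):
        if x.sum() > 0:       # data-dependent branch
            return x + 1
        return x - 1

m = Branchy().eval()
traced = torch.jit.trace(m, torch.ones(1, 3))   # records only the x + 1 branch
scripted = torch.jit.script(m)                  # keeps both branches

neg = -torch.ones(1, 3)
print(traced(neg))    # wrong: tensor of zeros, replays the traced branch
print(scripted(neg))  # correct: tensor of -2s
```

The scripted module takes the correct branch for any input, which is why script is the safe choice for models with conditionals.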
Conversion via coremltools
mlmodel = ct.convert(
    traced_model,
    inputs=[ct.ImageType(
        name="input",
        shape=ct.Shape(shape=(1, 3, 224, 224)),
        color_layout=ct.colorlayout.RGB,
        # Normalization is baked into the model, so no need to repeat it in Swift.
        # Note: scale is a single scalar, so per-channel std cannot be expressed
        # exactly; the R-channel std (0.229) is used as an approximation here.
        bias=[-0.485/0.229, -0.456/0.224, -0.406/0.225],
        scale=1/(255.0 * 0.229),
    )],
    outputs=[ct.TensorType(name="logits")],
    compute_precision=ct.precision.FLOAT16,
    minimum_deployment_target=ct.target.iOS16,
    convert_to="mlprogram",  # the newer .mlpackage format
)
# Add metadata
mlmodel.short_description = "Image classifier"
mlmodel.input_description["input"] = "RGB image 224x224"
mlmodel.output_description["logits"] = "Class probabilities"
mlmodel.save("MyModel.mlpackage")
convert_to="mlprogram" vs "neuralnetwork": mlprogram is the newer intermediate representation; it supports typed FP16 tensors and ANE execution and requires iOS 15+. neuralnetwork is the legacy format, supported since iOS 12. For new projects, use mlprogram.
Common Conversion Errors
ValueError: TorchScript conversion failed is usually caused by non-standard operations. Check which PyTorch ops the converter supports:
# List the PyTorch ops the converter knows about (private API; the module
# path and registry shape vary between coremltools versions):
from coremltools.converters.mil.frontend.torch.torch_op_registry import (
    _TORCH_OPS_REGISTRY,
)
print(sorted(_TORCH_OPS_REGISTRY.keys()))
# If an op is missing, it needs a custom implementation
RuntimeError: Only tuple/list outputs are supported means the model returns a dict. Wrap it:
class ModelWrapper(torch.nn.Module):
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, x):
        out = self.model(x)
        if isinstance(out, dict):
            return out["logits"]  # extract the needed key
        return out
Unsupported op: aten::einsum means the model uses torch.einsum. Rewrite it with standard matmul/bmm (recent coremltools versions handle common einsum patterns, but not all of them), or add a custom op.
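The typical rewrite, shown with NumPy for brevity (the torch equivalent of the matmul is torch.bmm): a batched contraction like "bik,bkj->bij" is exactly a batched matrix multiply:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((2, 3, 4))
b = rng.standard_normal((2, 4, 5))

via_einsum = np.einsum("bik,bkj->bij", a, b)
via_matmul = a @ b  # torch: torch.bmm(a, b)

print(np.allclose(via_einsum, via_matmul))  # True
```

More exotic einsum patterns usually decompose into a transpose plus a matmul, which the converter handles natively.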
ONNX as an intermediate step, for when direct conversion fails:
# PyTorch → ONNX → Core ML
torch.onnx.export(model, example_input, "model.onnx", opset_version=17)
mlmodel = ct.converters.onnx.convert(
    model="model.onnx",
    minimum_ios_deployment_target="13",
)
Conversion via ONNX sometimes resolves unsupported PyTorch operations. Note, however, that the ONNX converter was removed in coremltools 6.0, so this path requires coremltools 5.x (for example, in a separate environment).
Verifying Conversion Correctness
import numpy as np
import PIL.Image
import torchvision

# Load a test image
img = PIL.Image.open("test.jpg").convert("RGB").resize((224, 224))

# Original PyTorch output
transform = torchvision.transforms.Compose([
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
tensor = transform(img).unsqueeze(0)
with torch.no_grad():
    pytorch_out = model(tensor).numpy()

# Core ML output (predict() runs only on macOS)
coreml_out = mlmodel.predict({"input": img})["logits"]

# Compare
max_diff = np.max(np.abs(pytorch_out - coreml_out))
print(f"Max difference: {max_diff}")
# Normal for FP16: < 0.01
# If > 0.05, something went wrong in the conversion
If the difference is large, check normalization first: the bias and scale in ct.ImageType may not match the PyTorch transform (in particular, scale is a single scalar and cannot represent per-channel std exactly).
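A quick sanity check of the mapping: Core ML applies scale * x + bias per pixel, so reproducing (x/255 - mean)/std requires scale = 1/(255*std) and bias = -mean/std. Because scale is scalar, per-channel std can only be approximated, which this NumPy sketch makes visible:

```python
import numpy as np

mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])

x = np.array([128.0, 128.0, 128.0])     # one mid-gray RGB pixel
expected = (x / 255.0 - mean) / std      # PyTorch-style normalization

scale = 1.0 / (255.0 * 0.229)            # scalar: uses the R-channel std only
bias = -mean / std
coreml_style = scale * x + bias

print(np.abs(expected - coreml_style))   # ~0 for R, a few hundredths for G and B
```

The residual error on G and B is one source of the PyTorch/Core ML output gap; if it matters, bake the exact normalization into the model itself instead of ImageType preprocessing.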
Variable Input Sizes
For models handling images of different sizes:
# Range of sizes
flexible_shape = ct.Shape(
    shape=(
        1, 3,
        ct.RangeDim(lower_bound=64, upper_bound=1024, default=224),
        ct.RangeDim(lower_bound=64, upper_bound=1024, default=224),
    )
)
# Or a fixed set of sizes
enumerated_shapes = ct.EnumeratedShapes(
    shapes=[
        [1, 3, 224, 224],
        [1, 3, 384, 384],
        [1, 3, 512, 512],
    ],
    default=[1, 3, 224, 224],
)
mlmodel = ct.convert(traced_model, inputs=[ct.TensorType(name="input", shape=enumerated_shapes)])
EnumeratedShapes lets Core ML pre-optimize the graph for each listed size. RangeDim is more flexible, but the graph is optimized less aggressively and performance is usually worse.
Custom Operations
If the model contains an operation coremltools doesn't know, there are two options: register a translation to MIL operations in Python, or implement a custom layer in Swift/Objective-C:
# Python: register a translation for the unsupported op (the function
# name must match the torch op name)
from coremltools.converters.mil import Builder as mb
from coremltools.converters.mil.frontend.torch.torch_op_registry import register_torch_op

@register_torch_op
def my_custom_op(context, node):
    # Build an equivalent from existing MIL operations (mb.*)
    x = context[node.inputs[0]]
    result = mb.relu(x=x, name=node.name)  # placeholder for the real implementation
    context.add(result)
// Swift: custom layer implementation
import CoreML

@objc(MyCustomLayer)
class MyCustomLayer: NSObject, MLCustomLayer {
    required init(parameters: [String: Any]) throws { }
    func setWeightData(_ weights: [Data]) throws { }
    func outputShapes(forInputShapes inputShapes: [[NSNumber]]) throws -> [[NSNumber]] { ... }
    func evaluate(inputs: [MLMultiArray], outputs: [MLMultiArray]) throws { ... }
}
Custom layers run on the CPU by default; GPU execution is possible by additionally implementing encode(commandBuffer:inputs:outputs:), but the ANE is never used, and MLCustomLayer works only with the legacy neuralnetwork format, not mlprogram. If a custom op is the inference bottleneck, it is better to rewrite the model with standard operations.
.mlpackage Integration in Xcode
Add the .mlpackage file to the project and Xcode auto-generates a Swift class with a typed API: MyModel.mlpackage becomes a MyModel class with a prediction(input:) method.
The .mlpackage size in the bundle counts toward the app's download size and install time. For large models (over 50 MB), place the model outside the bundle: download it on first launch via URLSession, compile it on device with MLModel.compileModel(at:), and store it in Application Support.
Process
Model compatibility audit → conversion with parameter tuning → numerical accuracy verification → testing on target iOS versions → profiling in Xcode Core ML Instrument.
Timeline Estimates
A standard model without non-standard operations: 3–7 days. A model with a complex graph, custom operations, or variable input sizes: 2–4 weeks.