Implementing AI Background Removal in a Mobile App
Background removal is one of the few AI tasks that works well on-device without a quality loss. On iOS it runs via Vision + Core ML, on Android via ML Kit or ONNX Runtime, returning a result in 100–300 ms without any network request. Cloud APIs are needed only for edge cases: fine hair detail, transparent objects, complex textures.
On-device: iOS with Vision and Core ML
Apple added class-agnostic background removal to the Vision framework in iOS 17 via VNGenerateForegroundInstanceMaskRequest:
```swift
import UIKit
import Vision
import CoreImage.CIFilterBuiltins

enum BGRemovalError: Error {
    case invalidImage, noResult, filterFailed
}

func removeBackground(from image: UIImage) async throws -> UIImage {
    guard let cgImage = image.cgImage else { throw BGRemovalError.invalidImage }

    let request = VNGenerateForegroundInstanceMaskRequest()
    let handler = VNImageRequestHandler(cgImage: cgImage)
    try handler.perform([request])

    guard let result = request.results?.first else { throw BGRemovalError.noResult }

    // Get the foreground mask, scaled to the input image size
    let maskBuffer = try result.generateScaledMaskForImage(forInstances: result.allInstances,
                                                           from: handler)

    // Apply the mask to the original image via Core Image
    let ciImage = CIImage(cgImage: cgImage)
    let mask = CIImage(cvPixelBuffer: maskBuffer)

    let blendFilter = CIFilter.blendWithMask()
    blendFilter.inputImage = ciImage
    blendFilter.maskImage = mask
    blendFilter.backgroundImage = CIImage.empty() // transparent background

    guard let outputCI = blendFilter.outputImage,
          let outputCG = CIContext().createCGImage(outputCI, from: outputCI.extent) else {
        throw BGRemovalError.filterFailed
    }
    return UIImage(cgImage: outputCG)
}
```
VNGenerateForegroundInstanceMaskRequest runs an Apple-provided neural network optimized for the Neural Engine. On iPhone 13 and newer it takes 80–150 ms on a camera photo. No internet, no API charges.
For iOS 16 and below, use VNGeneratePersonSegmentationRequest (it segments people only, not arbitrary objects):

```swift
let request = VNGeneratePersonSegmentationRequest()
request.qualityLevel = .accurate // or .balanced, .fast
request.outputPixelFormat = kCVPixelFormatType_OneComponent8
```
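The snippet above only configures the request. A minimal sketch of the full fallback path might look like this — it reuses the same Core Image blend as the iOS 17 version and the same `BGRemovalError` cases, and the mask-scaling step is an assumption (the person-segmentation mask comes back at model resolution, not image resolution):

```swift
import UIKit
import Vision
import CoreImage.CIFilterBuiltins

// Sketch: person-only segmentation fallback for older iOS versions
func removePersonBackground(from image: UIImage) throws -> UIImage {
    guard let cgImage = image.cgImage else { throw BGRemovalError.invalidImage }

    let request = VNGeneratePersonSegmentationRequest()
    request.qualityLevel = .accurate
    request.outputPixelFormat = kCVPixelFormatType_OneComponent8

    let handler = VNImageRequestHandler(cgImage: cgImage)
    try handler.perform([request])
    guard let maskBuffer = request.results?.first?.pixelBuffer else {
        throw BGRemovalError.noResult
    }

    // Scale the low-resolution mask up to the image size before blending
    let ciImage = CIImage(cgImage: cgImage)
    var mask = CIImage(cvPixelBuffer: maskBuffer)
    let scaleX = ciImage.extent.width / mask.extent.width
    let scaleY = ciImage.extent.height / mask.extent.height
    mask = mask.transformed(by: CGAffineTransform(scaleX: scaleX, y: scaleY))

    let blend = CIFilter.blendWithMask()
    blend.inputImage = ciImage
    blend.maskImage = mask
    blend.backgroundImage = CIImage.empty()
    guard let out = blend.outputImage,
          let cg = CIContext().createCGImage(out, from: out.extent) else {
        throw BGRemovalError.filterFailed
    }
    return UIImage(cgImage: cg)
}
```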
On-device: Android with ML Kit
```kotlin
import android.content.Context
import android.graphics.*
import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.segmentation.Segmentation
import com.google.mlkit.vision.segmentation.selfie.SelfieSegmenterOptions
import java.nio.ByteBuffer
import kotlin.coroutines.resume
import kotlin.coroutines.resumeWithException
import kotlin.coroutines.suspendCoroutine

class BackgroundRemover(private val context: Context) {

    private val segmenter = Segmentation.getClient(
        SelfieSegmenterOptions.Builder()
            .setDetectorMode(SelfieSegmenterOptions.SINGLE_IMAGE_MODE)
            .enableRawSizeMask()
            .build()
    )

    suspend fun removeBackground(bitmap: Bitmap): Bitmap = suspendCoroutine { continuation ->
        val inputImage = InputImage.fromBitmap(bitmap, 0)
        segmenter.process(inputImage)
            .addOnSuccessListener { mask ->
                // With enableRawSizeMask() the mask is smaller than the input
                // (e.g. 256×256), so convert at mask size, then scale up
                val maskBitmap = mask.buffer.toMaskBitmap(mask.width, mask.height)
                val scaledMask =
                    Bitmap.createScaledBitmap(maskBitmap, bitmap.width, bitmap.height, true)
                continuation.resume(applyMask(bitmap, scaledMask))
            }
            .addOnFailureListener { e -> continuation.resumeWithException(e) }
    }

    private fun applyMask(original: Bitmap, mask: Bitmap): Bitmap {
        val output = Bitmap.createBitmap(original.width, original.height, Bitmap.Config.ARGB_8888)
        val canvas = Canvas(output)
        val paint = Paint(Paint.ANTI_ALIAS_FLAG)
        canvas.drawBitmap(original, 0f, 0f, paint)
        // Keep only the pixels where the mask is opaque
        paint.xfermode = PorterDuffXfermode(PorterDuff.Mode.DST_IN)
        canvas.drawBitmap(mask, 0f, 0f, paint)
        return output
    }

    // ML Kit returns per-pixel Float confidences in [0, 1];
    // convert them to alpha bytes before building an ALPHA_8 bitmap
    private fun ByteBuffer.toMaskBitmap(width: Int, height: Int): Bitmap {
        rewind()
        val floats = asFloatBuffer()
        val alphaBytes = ByteArray(width * height) { i ->
            (floats.get(i) * 255).toInt().toByte()
        }
        val maskBitmap = Bitmap.createBitmap(width, height, Bitmap.Config.ALPHA_8)
        maskBitmap.copyPixelsFromBuffer(ByteBuffer.wrap(alphaBytes))
        return maskBitmap
    }
}
```
ML Kit's Selfie Segmenter is optimized for people and selfies. For arbitrary objects, use ML Kit Subject Segmentation (SubjectSegmenterOptions, currently a beta API):
```kotlin
val options = SubjectSegmenterOptions.Builder()
    .enableForegroundBitmap()
    .build()
val segmenter = SubjectSegmentation.getClient(options)
```
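A sketch of how this client is used: with enableForegroundBitmap(), the result carries the cut-out subject directly, so no manual mask compositing is needed (the suspend wrapper and error handling below are illustrative, not part of the ML Kit API):

```kotlin
import android.graphics.Bitmap
import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.segmentation.subject.SubjectSegmentation
import com.google.mlkit.vision.segmentation.subject.SubjectSegmenterOptions
import kotlin.coroutines.resume
import kotlin.coroutines.resumeWithException
import kotlin.coroutines.suspendCoroutine

// Sketch: extract the subject as a ready-made transparent-background bitmap
suspend fun extractSubject(bitmap: Bitmap): Bitmap = suspendCoroutine { continuation ->
    val segmenter = SubjectSegmentation.getClient(
        SubjectSegmenterOptions.Builder()
            .enableForegroundBitmap()
            .build()
    )
    segmenter.process(InputImage.fromBitmap(bitmap, 0))
        .addOnSuccessListener { result ->
            val foreground = result.foregroundBitmap
            if (foreground != null) continuation.resume(foreground)
            else continuation.resumeWithException(IllegalStateException("No subject found"))
        }
        .addOnFailureListener { e -> continuation.resumeWithException(e) }
}
```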
Cloud APIs for complex cases
When on-device segmentation yields poor results (fine hair, transparency, complex backgrounds):
remove.bg API — a specialized service with the best quality for hair and fine details:
```swift
func removeBackgroundCloud(imageData: Data) async throws -> Data {
    var request = URLRequest(url: URL(string: "https://api.remove.bg/v1.0/removebg")!)
    request.httpMethod = "POST"
    request.setValue(apiKey, forHTTPHeaderField: "X-Api-Key")

    let boundary = UUID().uuidString
    request.setValue("multipart/form-data; boundary=\(boundary)",
                     forHTTPHeaderField: "Content-Type")

    var body = Data()
    body.appendMultipart(boundary: boundary, name: "image_file", filename: "photo.jpg",
                         contentType: "image/jpeg", data: imageData)
    body.appendMultipart(boundary: boundary, name: "size", data: "auto".data(using: .utf8)!)
    body.append("--\(boundary)--\r\n".data(using: .utf8)!)
    request.httpBody = body

    let (data, response) = try await URLSession.shared.data(for: request)
    guard (response as? HTTPURLResponse)?.statusCode == 200 else {
        throw BGRemovalError.filterFailed // or a dedicated network error case
    }
    // Response body is a PNG with a transparent background
    return data
}
```
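appendMultipart is not a Foundation API — it is a helper the snippet above assumes. A minimal Data extension implementing it might look like:

```swift
import Foundation

// Hypothetical helper: appends one multipart/form-data part to the body
extension Data {
    mutating func appendMultipart(boundary: String, name: String,
                                  filename: String? = nil, contentType: String? = nil,
                                  data: Data) {
        append("--\(boundary)\r\n".data(using: .utf8)!)
        var disposition = "Content-Disposition: form-data; name=\"\(name)\""
        if let filename { disposition += "; filename=\"\(filename)\"" }
        append("\(disposition)\r\n".data(using: .utf8)!)
        if let contentType {
            append("Content-Type: \(contentType)\r\n".data(using: .utf8)!)
        }
        append("\r\n".data(using: .utf8)!)
        append(data)
        append("\r\n".data(using: .utf8)!)
    }
}
```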
Clipdrop API (Stability AI) — handles not only people but also products and animals.
PhotoRoom API — specializes in product photography for e-commerce.
Architecture: on-device + cloud fallback
```swift
func removeBackground(_ image: UIImage) async -> UIImage {
    // Try on-device first (e.g. the Vision-based function above)
    if let result = try? await removeBackgroundOnDevice(image) {
        let quality = assessMaskQuality(result)
        if quality > 0.85 { return result } // mask is good enough
    }

    // Fall back to the cloud
    guard let imageData = image.jpegData(compressionQuality: 0.9) else { return image }
    if let cloudResult = try? await removeBackgroundCloud(imageData: imageData) {
        return UIImage(data: cloudResult) ?? image
    }
    return image
}
```
```swift
private func assessMaskQuality(_ image: UIImage) -> Double {
    // Assess mask quality by edge smoothness.
    // Simple heuristic: ratio of semi-transparent to fully transparent pixels.
    // A higher ratio means finer detail survived, i.e. better quality.
    guard let cgImage = image.cgImage else { return 0 }
    // ... pixel mask analysis
    return 0.9 // stub
}
```
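One way to fill in that stub, sketched below: render only the alpha channel into a buffer and build the semi-transparent-to-transparent ratio the comment describes. The thresholds and the final scaling factor are illustrative assumptions, not tuned values.

```swift
import UIKit

// Sketch: estimate mask quality from the alpha-channel histogram
func assessMaskQuality(_ image: UIImage) -> Double {
    guard let cgImage = image.cgImage else { return 0 }
    let width = cgImage.width, height = cgImage.height

    // Draw the image into an alpha-only 8-bit buffer
    var pixels = [UInt8](repeating: 0, count: width * height)
    let context = CGContext(data: &pixels, width: width, height: height,
                            bitsPerComponent: 8, bytesPerRow: width,
                            space: CGColorSpaceCreateDeviceGray(),
                            bitmapInfo: CGImageAlphaInfo.alphaOnly.rawValue)
    context?.draw(cgImage, in: CGRect(x: 0, y: 0, width: width, height: height))

    var transparent = 0, semiTransparent = 0
    for alpha in pixels {
        if alpha < 10 { transparent += 1 }          // cut-out background
        else if alpha < 245 { semiTransparent += 1 } // soft mask edge
    }
    guard transparent > 0 else { return 0 } // nothing was cut out at all

    // More soft-edge pixels relative to the background = finer detail preserved
    let ratio = Double(semiTransparent) / Double(transparent)
    return min(ratio * 20, 1.0) // illustrative scaling into 0...1
}
```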
Mask post-processing
After background removal, the mask often needs refinement:
- Feathering (edge blur) — CIGaussianBlur on the mask, radius 1–2 pixels
- Erosion (mask shrinking) — removes pixel artifacts along the edge
- Hair refinement — cloud APIs handle fine hair better than on-device models
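The first two steps can be sketched with Core Image's built-in filters — morphology minimum for erosion, Gaussian blur for feathering (the radii below are starting points, not tuned values):

```swift
import CoreImage
import CoreImage.CIFilterBuiltins

// Sketch: erode the mask slightly, then feather its edge, before blending
func refineMask(_ mask: CIImage) -> CIImage {
    // Erosion: morphology minimum shrinks the white (foreground) region,
    // trimming halo pixels left along the edge
    let erode = CIFilter.morphologyMinimum()
    erode.inputImage = mask
    erode.radius = 1

    // Feathering: a small Gaussian blur softens the hard mask edge
    let blur = CIFilter.gaussianBlur()
    blur.inputImage = erode.outputImage
    blur.radius = 1.5

    // Blur expands the extent; crop back to the original mask size
    return blur.outputImage?.cropped(to: mask.extent) ?? mask
}
```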
Timeline
On-device background removal (iOS Vision + Android ML Kit) with a basic UI — 3–4 days. Cloud fallback, mask quality assessment, edge post-processing, and PNG export — 8–10 days.