AI Recommendation System Implementation in Mobile Applications
Recommendations in mobile apps aren't one algorithm but a pipeline: collect behavioral events, send them to an ML model, get back a ranked list, and integrate it into the UI without hurting performance. Complexity depends on where the model runs (on-device or server-side) and on the depth of personalization.
Architecture: On-Device vs Server-Side
Server-side recommendation systems (collaborative filtering, matrix factorization, two-tower models) give better quality because the model sees all users' behavior. Downsides: network latency and no offline support. Client-side (Core ML / TFLite) is faster, more private, and works offline. Downsides: limited context (only this device's data) and harder model updates.
A typical hybrid: the server generates a personal list of 100–200 candidates daily; the mobile client stores it locally and re-ranks in real time based on recent session events.
Event Collection — Quality Foundation
A recommendation system is only as good as its data. On the mobile client, log at minimum:
- item_view — viewed object (with dwell time, not just an impression)
- item_click — tap on an object
- item_purchase / item_save — conversion action
- item_skip — scrolled past (an important negative signal)
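The logger below assumes a minimal event model. A sketch of what it could look like (the type and field names here are illustrative, not a fixed schema):

```kotlin
// Minimal event model for the logger; field names are illustrative
enum class RecoEventType { ITEM_VIEW, ITEM_CLICK, ITEM_PURCHASE, ITEM_SAVE, ITEM_SKIP }

data class RecoEvent(
    val type: RecoEventType,
    val itemId: String,
    val dwellTimeMs: Long? = null, // only meaningful for ITEM_VIEW
    val timestamp: Long = 0L       // stamped by the logger at log() time
)
```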
// Android: event logger with batching
class RecoEventLogger(
    private val api: RecoApi,
    private val scope: CoroutineScope // injected app-level scope (not viewModelScope)
) {
    private val buffer = mutableListOf<RecoEvent>()
    private val flushThreshold = 20
    private val flushInterval = 30_000L // 30 seconds

    init {
        // Timer-based flush so small batches don't sit in memory indefinitely
        scope.launch {
            while (isActive) {
                delay(flushInterval)
                flush()
            }
        }
    }

    @Synchronized
    fun log(event: RecoEvent) {
        buffer.add(event.copy(timestamp = System.currentTimeMillis()))
        if (buffer.size >= flushThreshold) flush()
    }

    @Synchronized
    private fun flush() {
        if (buffer.isEmpty()) return
        val batch = buffer.toList()
        buffer.clear()
        scope.launch(Dispatchers.IO) {
            // On failure, persist the batch in Room for later retry
            runCatching { api.sendEvents(batch) }
        }
    }
}
Important: dwell_time is an often-overlooked signal. Track when a card enters the viewport and when it leaves (RecyclerView.OnScrollListener on Android; in Compose, observe LazyListState.layoutInfo.visibleItemsInfo via snapshotFlow). A view under 2 seconds is most likely a scroll-by.
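The viewport bookkeeping itself is UI-framework-specific, but the dwell accounting can live in a small pure class. A sketch (class name and the 2-second threshold are illustrative):

```kotlin
// Tracks per-item dwell time from enter/leave viewport callbacks.
// A view shorter than minDwellMs is treated as a scroll-by.
class DwellTracker(private val minDwellMs: Long = 2_000) {
    private val enteredAt = mutableMapOf<String, Long>()

    fun onItemEntered(itemId: String, nowMs: Long) {
        enteredAt[itemId] = nowMs
    }

    // Returns the dwell time if the view "counts", or null for a scroll-by
    fun onItemLeft(itemId: String, nowMs: Long): Long? {
        val start = enteredAt.remove(itemId) ?: return null
        val dwell = nowMs - start
        return if (dwell >= minDwellMs) dwell else null
    }
}
```

The enter/leave callbacks would be wired to the scroll listener or snapshotFlow mentioned above; a null return maps naturally to an item_skip event.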
CoreML / TFLite Re-ranking On-Device
If the server returns the top 200 candidates, the final ranking can be done on-device. This eliminates an extra network request on screen open.
On iOS with CoreML:
// Load the model (bundled, or fetched via Core ML Model Deployment)
let model = try MLModel(contentsOf: modelURL)
// RerankerInput is the MLFeatureProvider input class generated for the model
let input = RerankerInput(
    userVector: userEmbedding,       // Float32 array, 64-d
    itemVectors: itemEmbeddings,     // [Float32 array, 64-d]
    sessionFeatures: sessionContext  // last 10 actions
)
let output = try model.prediction(from: input)
let scores = output.featureValue(for: "scores")?.multiArrayValue
TensorFlow Lite on Android runs through Interpreter with ByteBuffer inputs. For models over 10 MB, try the GPU delegate (GpuDelegate): roughly a 3–8x speedup on flagship devices.
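Whatever runtime executes the model, the surrounding re-ranking step is plain code: score each candidate, then sort. A simplified sketch where a dot product of user and item embeddings stands in for the real model output (types and names are illustrative):

```kotlin
// A server-provided candidate with its precomputed embedding
data class Candidate(val itemId: String, val embedding: FloatArray)

// Dot product of two equal-length embedding vectors
fun dot(a: FloatArray, b: FloatArray): Float {
    require(a.size == b.size) { "embedding dimensions must match" }
    var s = 0f
    for (i in a.indices) s += a[i] * b[i]
    return s
}

// Re-rank server candidates by similarity to the user embedding
fun rerank(userEmbedding: FloatArray, candidates: List<Candidate>): List<Candidate> =
    candidates.sortedByDescending { dot(userEmbedding, it.embedding) }
```

In a real app the score would come from the Interpreter (or Core ML) call, but the sort-and-slice around it looks the same.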
Updating the model without an app release: on iOS, Core ML Model Deployment via CloudKit, or your own CDN combined with MLModel.compileModel(at:); on Android, Firebase ML with RemoteModel, or a direct .tflite download into filesDir with hash verification.
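The hash-verification step on Android can be done with java.security.MessageDigest from the standard library. A minimal sketch (the expected digest would ship with the release or come from a trusted endpoint, which is an assumption about your setup):

```kotlin
import java.io.File
import java.security.MessageDigest

// Returns true if the downloaded file's SHA-256 matches the expected hex digest
fun verifyModelFile(file: File, expectedSha256Hex: String): Boolean {
    val digest = MessageDigest.getInstance("SHA-256")
    file.inputStream().use { input ->
        val buf = ByteArray(8192)
        while (true) {
            val read = input.read(buf)
            if (read < 0) break
            digest.update(buf, 0, read)
        }
    }
    val actual = digest.digest().joinToString("") { "%02x".format(it) }
    return actual.equals(expectedSha256Hex, ignoreCase = true)
}
```

Only load the .tflite into the Interpreter after this check passes; on mismatch, fall back to the bundled model.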
Cold Start for New Users
The first 5–10 sessions don't have enough data for personalization. The standard approach is a hybrid:
- An onboarding quiz (2–3 preference questions) gives an initial profile
- Popularity-based recommendations as a fallback
- Implicit feedback from the first interactions quickly shifts the profile
Don't label the block "recommendations for you" until a minimum history accumulates; that's more honest and doesn't drag down quality metrics.
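The fallback logic above can be sketched as a simple strategy selector (the thresholds and names are illustrative, to be tuned against your own metrics):

```kotlin
enum class RecoStrategy { ONBOARDING_PROFILE, POPULARITY, PERSONALIZED }

// Pick a recommendation strategy from how much history the user has.
// The 10-session threshold is illustrative, not a recommendation.
fun chooseStrategy(sessionCount: Int, hasOnboardingProfile: Boolean): RecoStrategy = when {
    sessionCount >= 10 -> RecoStrategy.PERSONALIZED
    hasOnboardingProfile -> RecoStrategy.ONBOARDING_PROFILE
    else -> RecoStrategy.POPULARITY
}
```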
Production Quality Metrics
Click-through rate (CTR) and conversion are the basics. For mobile UX, also track "recommendation blindness": a block users have learned to ignore is worse than one with merely low CTR. A/B testing via Firebase Remote Config or Amplitude Experiment is mandatory for algorithm changes. As a rule of thumb, the minimum sample for statistical significance is 1000+ unique users per variant.
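A minimal sketch of the per-variant CTR readout with a minimum-sample guard (the 1000-user floor comes from the text above; the types and names are illustrative):

```kotlin
data class VariantStats(val uniqueUsers: Int, val impressions: Long, val clicks: Long)

// CTR for a variant, or null if it hasn't reached the minimum sample yet
fun ctrIfSignificant(stats: VariantStats, minUsers: Int = 1_000): Double? {
    if (stats.uniqueUsers < minUsers || stats.impressions == 0L) return null
    return stats.clicks.toDouble() / stats.impressions
}
```

Returning null rather than a number makes it harder for a dashboard to compare variants before either has a meaningful sample.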
Implementation Process
- Audit current data and events.
- Choose the architecture (on-device / server / hybrid).
- Develop the event tracker with batching and retry.
- Build the server component, or integrate a managed ML service (Amazon Personalize, Google Cloud Recommendations AI).
- Integrate the model into the mobile client and implement re-ranking.
- Build the UI components for recommendation blocks.
- Set up A/B testing and analytics.
Timeline Guidelines
Integrating a ready-made server-side recommendation service plus the event tracker: 2–3 weeks. A hybrid system with on-device re-ranking, custom events, and A/B testing: 6–10 weeks.