Smart Reply Implementation in Mobile Applications
Smart Reply automatically suggests replies to incoming messages: three buttons under the chat with options like "Okay", "At 6 pm", "Can't make it". Google ships this in Gmail and Android Messages, and Apple added it to iMessage in iOS 17. For a custom app, you can either use a ready-made SDK or implement your own model.
ML Kit Smart Reply: Quick Start
Google ML Kit includes a ready-made Smart Reply model that runs on-device and currently supports English only. For Android:
val smartReply = SmartReply.getClient()

// Build the conversation from the last 10 messages, oldest first.
val conversation = messages.takeLast(10).map { msg ->
    if (msg.isFromUser) {
        TextMessage.createForLocalUser(msg.text, msg.timestamp)
    } else {
        TextMessage.createForRemoteUser(msg.text, msg.timestamp, msg.senderId)
    }
}

smartReply.suggestReplies(conversation)
    .addOnSuccessListener { result ->
        if (result.status == SmartReplySuggestionResult.STATUS_SUCCESS) {
            val suggestions = result.suggestions.map { it.text }
            showSuggestions(suggestions)
        }
    }
    .addOnFailureListener { /* no suggestions: hide the UI */ }
The model doesn't generate text; it picks from a fixed set of pre-trained reply templates. The upside is speed (under 20 ms, fully on-device). The downsides: a limited template set, no awareness of your app's domain, and English only.
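The on-device model ships as a Gradle dependency. The artifact name is from the ML Kit docs; the version shown here is illustrative, so pin the latest one from the release notes:

```kotlin
// build.gradle.kts: bundles the Smart Reply model with the app.
dependencies {
    implementation("com.google.mlkit:smart-reply:17.0.4")
}
```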
On iOS, the closest analogs are the NaturalLanguage framework or the Apple Intelligence APIs (iOS 18+), but dedicated Smart Reply support there is far more modest.
Custom Smart Reply Based on LLM
For Russian-language apps and specialized domains (support, medicine, B2B), ML Kit won't cut it. You need an LLM with a prompt.
func generateReplySuggestions(
    lastMessages: [ChatMessage],
    count: Int = 3
) async -> [String] {
    // Keep the prompt small: only the last 5 messages as context.
    let context = lastMessages.suffix(5)
        .map { "\($0.role): \($0.text)" }
        .joined(separator: "\n")

    let prompt = """
    You help the user quickly reply to a message in a chat.
    Conversation history:
    \(context)
    Suggest \(count) short reply options for the user.
    Each reply is one sentence, max 10 words.
    Format: JSON array of strings.
    """

    // The function doesn't throw, so swallow LLM errors and show no suggestions.
    guard let response = try? await llmClient.complete(prompt: prompt, maxTokens: 100) else {
        return []
    }
    return parseJSONArray(response) ?? []
}
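The sketch above leaves `parseJSONArray` undefined. Keeping to one language for new examples here, a Kotlin equivalent might look like this: a tolerant parser for the expected `["…", "…"]` output that also strips the Markdown code fences models often wrap JSON in (the function name is my own, not a library API):

```kotlin
// Parses an LLM response expected to contain a JSON array of strings.
// Handles escaped quotes (\") but not full JSON escapes like \n,
// which is acceptable for short one-sentence replies.
// Returns null on malformed input so the caller can hide the UI.
fun parseJsonStringArray(raw: String): List<String>? {
    val cleaned = raw.trim()
        .removePrefix("```json")
        .removePrefix("```")
        .removeSuffix("```")
        .trim()
    if (!cleaned.startsWith("[") || !cleaned.endsWith("]")) return null

    val result = mutableListOf<String>()
    val current = StringBuilder()
    var inString = false
    var escaped = false
    for (ch in cleaned) {
        when {
            escaped -> { current.append(ch); escaped = false }
            inString && ch == '\\' -> escaped = true
            inString && ch == '"' -> { inString = false; result.add(current.toString()); current.clear() }
            !inString && ch == '"' -> inString = true
            inString -> current.append(ch)
            // Commas, brackets, and whitespace outside strings are ignored.
        }
    }
    return if (inString) null else result
}
```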
Latency is critical. GPT-4o mini responds in 1–2 seconds, which is acceptable. Pre-generate suggestions while the user is still reading the message: by the time they want to reply, the options are already there.
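Pre-generation can be sketched as a coroutine kicked off on message arrival. This is a hypothetical sketch: `fetchSuggestions` stands in for whatever LLM call you use, and the class name is my own:

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Deferred
import kotlinx.coroutines.async

// Starts suggestion generation as soon as a message arrives, so the result
// is (usually) ready by the time the user looks at the reply bar.
class SmartReplyPrefetcher(
    private val scope: CoroutineScope,
    private val fetchSuggestions: suspend (history: List<String>) -> List<String>,
) {
    private var pending: Deferred<List<String>>? = null

    fun onIncomingMessage(history: List<String>) {
        pending?.cancel() // a newer message invalidates the old suggestions
        pending = scope.async {
            // Swallow network/LLM errors: no suggestions is a valid outcome.
            runCatching { fetchSuggestions(history) }.getOrDefault(emptyList())
        }
    }

    // Called when the reply UI is shown; returns instantly if prefetch finished.
    suspend fun suggestions(): List<String> = pending?.await() ?: emptyList()
}
```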
UX: When to Show and Hide
Smart Reply should appear only for an incoming message and disappear as soon as the user starts typing. Three suggestions is the sweet spot (per Google Research): more overwhelms, fewer restricts choice.
// Android: toggle between Smart Reply chips and the input field
editText.addTextChangedListener(object : TextWatcher {
    override fun onTextChanged(s: CharSequence?, start: Int, before: Int, count: Int) {
        // Hide suggestions as soon as the user types anything.
        smartReplyChips.isVisible = s.isNullOrEmpty()
    }

    override fun afterTextChanged(s: Editable?) {}
    override fun beforeTextChanged(s: CharSequence?, start: Int, count: Int, after: Int) {}
})
Chips in a horizontal scroll are the standard UI for reply suggestions: Material Chip on Android, custom Button / Chip components in SwiftUI.
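In Jetpack Compose, the chip row might be sketched like this (`AssistChip` is a standard Material 3 component; the composable itself is illustrative):

```kotlin
import androidx.compose.foundation.lazy.LazyRow
import androidx.compose.foundation.lazy.items
import androidx.compose.material3.AssistChip
import androidx.compose.material3.Text
import androidx.compose.runtime.Composable

// Horizontally scrolling row of reply-suggestion chips.
@Composable
fun SmartReplyChips(
    suggestions: List<String>,
    onSuggestionClick: (String) -> Unit,
) {
    LazyRow {
        items(suggestions) { suggestion ->
            AssistChip(
                onClick = { onSuggestionClick(suggestion) },
                label = { Text(suggestion) },
            )
        }
    }
}
```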
Context Adaptation
A common mistake is showing the same suggestions for every message type. "Okay", "Got it", "Thanks" are universal replies that users quickly learn to ignore. A context-aware Smart Reply should understand:
- Question → offer direct answer
- Meeting request → "Yes, works", "No, can't", "Suggest another time"
- Thanks → "You're welcome", "No problem"
- Info message → "Got it", "Noted", "Will clarify"
Classify the message type (question / request / info) with a separate lightweight classifier or as part of the LLM prompt.
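A lightweight classifier can start as plain keyword rules. A minimal sketch, where the categories mirror the list above but the keyword lists are illustrative, not exhaustive:

```kotlin
enum class MessageType { QUESTION, MEETING_REQUEST, THANKS, INFO }

// Rule-based first pass; swap in an ML model or LLM prompt when rules run out.
fun classifyMessage(text: String): MessageType {
    val t = text.lowercase()
    return when {
        t.contains("?") -> MessageType.QUESTION
        listOf("meet", "schedule", "call at", "tomorrow at").any { it in t } ->
            MessageType.MEETING_REQUEST
        listOf("thanks", "thank you", "thx").any { it in t } -> MessageType.THANKS
        else -> MessageType.INFO
    }
}

// Per-type fallback templates, used when the LLM is slow or unavailable.
fun fallbackSuggestions(type: MessageType): List<String> = when (type) {
    MessageType.QUESTION -> listOf("Yes", "No", "Let me check")
    MessageType.MEETING_REQUEST -> listOf("Yes, works", "No, can't", "Suggest another time")
    MessageType.THANKS -> listOf("You're welcome", "No problem")
    MessageType.INFO -> listOf("Got it", "Noted", "Will clarify")
}
```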
Implementation Process
- Choose an approach: ML Kit for English vs. an LLM for custom scenarios.
- Implement pre-loading of suggestions on incoming messages.
- Build the UI components: chips, appear/hide animations.
- Add the logic for hiding suggestions when typing starts.
- Add analytics: % of users tapping Smart Reply, and which variant is most popular.
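The analytics in the last step boil down to two counters per impression. An illustrative sketch (the class and method names are my own; wire the `record*` calls into your real analytics pipeline):

```kotlin
// Counters for measuring Smart Reply adoption.
class SmartReplyMetrics {
    private var shown = 0
    private var tapped = 0
    private val tapsPerSuggestion = mutableMapOf<String, Int>()

    // Call once each time the chip row is displayed.
    fun recordShown() { shown++ }

    // Call when the user taps a suggestion instead of typing.
    fun recordTap(suggestion: String) {
        tapped++
        tapsPerSuggestion.merge(suggestion, 1, Int::plus)
    }

    // Share of impressions where a suggestion was picked.
    fun usageRate(): Double = if (shown == 0) 0.0 else tapped.toDouble() / shown

    fun mostPopular(): String? = tapsPerSuggestion.maxByOrNull { it.value }?.key
}
```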
Timeline Guidelines
Smart Reply via ML Kit (Android, English): 1–2 days. A custom LLM-based Smart Reply with context classification and analytics: 5–8 days.