Implementing AI-Powered Search Autocomplete in Mobile Applications
Autocomplete is one of the most latency-sensitive features in mobile apps. Users expect suggestions to appear before they notice any delay: ideally under 100 ms from keystroke to rendered list. At the same time, suggestions should be relevant, not just popular.
Why simple prefix search doesn't work
The naive implementation stores frequent queries in a dictionary and looks them up by prefix. That works for "nike" → "nike sneakers," but breaks for:
- Spelling errors: "naik" instead of "nike"
- Transliteration: "krossovki" vs "кроссовки"
- Semantically close queries: "running shoe" when typing "sneak"
- Personalization: the same query "dress" should suggest different top results for different users
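To make the first failure modes concrete, here is a minimal sketch of exact prefix lookup over a sorted list. The `QUERIES` sample and the `prefix_suggest` name are illustrative assumptions, not part of any real system.

```python
from bisect import bisect_left

# Hypothetical sample of frequent queries; a real list would come from search logs.
QUERIES = sorted(["nike air max", "nike sneakers", "running shoes", "sneakers"])


def prefix_suggest(prefix: str, limit: int = 8) -> list[str]:
    """Return up to `limit` queries that start with `prefix` (exact match only)."""
    start = bisect_left(QUERIES, prefix)  # first query >= prefix in sort order
    out: list[str] = []
    for query in QUERIES[start:]:
        if not query.startswith(prefix) or len(out) == limit:
            break
        out.append(query)
    return out
```

A correctly typed prefix such as "nike" finds both Nike queries, while the typo "naik" finds nothing, and "sneak" never surfaces the semantically related "running shoes".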
Architecture of production-ready autocomplete
Trie + fuzzy search for speed
Base layer: a trie over popular queries, with fuzzy matching via a BK-tree or the Symmetric Delete algorithm (the approach used by SymSpell). Elasticsearch's completion field type is a ready-made alternative with fuzzy matching built in:
{
  "mappings": {
    "properties": {
      "suggest": {
        "type": "completion",
        "analyzer": "standard",
        "contexts": [
          {"name": "category", "type": "category"}
        ]
      },
      "weight": {"type": "integer"}
    }
  }
}
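# Search autocomplete via ES Completion Suggester
from elasticsearch import AsyncElasticsearch

es = AsyncElasticsearch("http://localhost:9200")  # assumed endpoint; adjust for your cluster


async def get_suggestions(prefix: str, category: str, user_id: str) -> list[str]:
    # user_id is not used by ES here; it is kept for the personalized reranking step.
    response = await es.search(
        index="search_suggestions",
        body={
            "suggest": {
                "query_suggest": {
                    "prefix": prefix,
                    "completion": {
                        "field": "suggest",
                        "size": 8,
                        "fuzzy": {"fuzziness": "AUTO"},
                        "contexts": {"category": [category]},
                    },
                }
            }
        },
    )
    options = response["suggest"]["query_suggest"][0]["options"]
    # "text" is the matched suggestion input; the mapping above declares no separate "query" field.
    return [option["text"] for option in options]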
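For teams building the base layer themselves rather than using Elasticsearch, the Symmetric Delete idea mentioned above can be sketched roughly as follows. This is a simplified illustration over single terms, not SymSpell's actual implementation; the function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def deletes(word: str, max_deletes: int = 1) -> set[str]:
    """All strings obtained by deleting up to `max_deletes` characters from `word`."""
    variants = {word}
    for k in range(1, min(max_deletes, len(word)) + 1):
        for keep in combinations(range(len(word)), len(word) - k):
            variants.add("".join(word[i] for i in keep))
    return variants


def build_index(terms: list[str], max_deletes: int = 1) -> dict[str, set[str]]:
    """Map every delete-variant of every dictionary term back to the term."""
    index: dict[str, set[str]] = defaultdict(set)
    for term in terms:
        for variant in deletes(term, max_deletes):
            index[variant].add(term)
    return index


def lookup(index: dict[str, set[str]], word: str, max_deletes: int = 1) -> set[str]:
    """Candidate terms sharing at least one delete-variant with `word`."""
    candidates: set[str] = set()
    for variant in deletes(word, max_deletes):
        candidates |= index.get(variant, set())
    return candidates
```

Both the dictionary and the input are expanded with deletions only, so a lookup is a handful of hash probes. With `max_deletes=1`, "naik" and "nike" meet at the shared variant "nik". The candidate set is deliberately generous, so a production version would normally verify each candidate with a true edit-distance check before ranking.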
Personalized suggestion ranker
Base suggestions from ES are reranked using user history. Ranker features:
- global_frequency: how often all users entered this query
- user_query_history_match: whether this user has entered a similar query before
- user_category_affinity: how close the query's category is to the user's interests
- recency_boost: a boost for queries trending in the last 24 hours
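As a rough illustration of how these features could combine, here is a minimal linear scorer. The weights are hand-set placeholders and every feature is assumed to be pre-normalized to [0, 1]; a real ranker would learn the weights from logged selections, and may not be linear at all.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    global_frequency: float          # assumed normalized to [0, 1]
    user_query_history_match: float  # 1.0 if the user entered a similar query before
    user_category_affinity: float    # assumed normalized to [0, 1]
    recency_boost: float             # assumed normalized to [0, 1]


# Hypothetical weights for illustration only.
WEIGHTS = {
    "global_frequency": 0.4,
    "user_query_history_match": 0.3,
    "user_category_affinity": 0.2,
    "recency_boost": 0.1,
}


def score(candidate: Candidate) -> float:
    """Weighted sum of the four ranker features."""
    return sum(weight * getattr(candidate, name) for name, weight in WEIGHTS.items())


def rerank(candidates: list[Candidate], limit: int = 8) -> list[str]:
    """Return suggestion texts ordered by descending score."""
    ranked = sorted(candidates, key=score, reverse=True)
    return [c.text for c in ranked[:limit]]
```

With these placeholder weights, a globally popular suggestion can be outranked by a less popular one that matches the user's history and category affinity, which is the behavior the "dress" example calls for.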
On-device cache for instant response
The first 3–5 characters cover ~80% of popular prefix combinations. Cache suggestions for these prefixes on the device at app startup (or in the background):
// Android: preload popular prefix suggestions
import android.content.Context
import androidx.room.Room

// AutocompleteDatabase (Room) and AutocompleteApi are assumed to be defined elsewhere.
class AutocompleteCache(
    context: Context,
    private val autocompleteApi: AutocompleteApi,
) {
    private val db = Room.databaseBuilder(
        context.applicationContext, AutocompleteDatabase::class.java, "autocomplete"
    ).build()

    suspend fun preload() {
        val popularPrefixes = autocompleteApi.getPopularPrefixes(limit = 500)
        db.suggestionDao().insertAll(popularPrefixes)
    }

    suspend fun getSuggestions(prefix: String): List<String> {
        // Check the local cache first
        val cached = db.suggestionDao().getSuggestions(prefix)
        if (cached.isNotEmpty()) return cached
        // Not cached: fall back to the server
        return autocompleteApi.getSuggestions(prefix)
    }
}
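On the server side, the payload behind a call like getPopularPrefixes could be precomputed from query counts. The sketch below is one possible approach under assumed inputs; `build_prefix_table` and the shape of `query_counts` are hypothetical, not the API used above.

```python
from collections import defaultdict


def build_prefix_table(
    query_counts: dict[str, int],
    max_prefix_len: int = 5,
    per_prefix: int = 8,
) -> dict[str, list[str]]:
    """Map each short prefix to its most frequent full queries."""
    buckets: dict[str, list[tuple[int, str]]] = defaultdict(list)
    for query, count in query_counts.items():
        for n in range(1, min(max_prefix_len, len(query)) + 1):
            buckets[query[:n]].append((count, query))

    table: dict[str, list[str]] = {}
    for prefix, entries in buckets.items():
        # Highest count first; ties broken alphabetically for stable output.
        entries.sort(key=lambda entry: (-entry[0], entry[1]))
        table[prefix] = [query for _, query in entries[:per_prefix]]
    return table
```

Capping the prefix length keeps the table small enough to ship to the device, and longer prefixes fall through to the server request.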
Debounce and cancellation on client
Not every keystroke should trigger a request. Debounce input by 150–200 ms and cancel the previous in-flight request:
// iOS: debounced autocomplete with cancellation
import SwiftUI

// `autocompleteService` is assumed to be defined elsewhere,
// exposing an async throwing getSuggestions(_:) that returns [String].
class SearchViewModel: ObservableObject {
    @Published var suggestions: [String] = []
    private var searchTask: Task<Void, Never>?

    func onQueryChanged(_ query: String) {
        searchTask?.cancel()
        guard query.count >= 2 else { suggestions = []; return }
        searchTask = Task {
            try? await Task.sleep(nanoseconds: 150_000_000) // 150 ms debounce
            guard !Task.isCancelled else { return }
            let results = try? await autocompleteService.getSuggestions(query)
            // Drop stale results if a newer keystroke cancelled this task mid-request
            guard !Task.isCancelled else { return }
            await MainActor.run {
                self.suggestions = results ?? []
            }
        }
    }
}
Task.isCancelled is checked after the debounce delay: if the user kept typing, the previous task has already been cancelled and returns without sending a request.
Log suggestion selection
When the user taps a suggestion, log its position in the list, the prefix at which it was selected, and the final query. This data trains the next version of the ranker.
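A selection event with those three fields might be modeled as below. The class and field names are illustrative assumptions. The mean reciprocal rank helper is likewise an added suggestion for judging ranker quality from the same logs, not something the pipeline above prescribes.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SuggestionSelected:
    prefix: str        # what the user had typed when they tapped
    position: int      # zero-based index of the tapped suggestion
    final_query: str   # the suggestion text that was submitted


def to_log_line(event: SuggestionSelected) -> str:
    """Serialize one event as a single JSON line for the logging pipeline."""
    return json.dumps(asdict(event), ensure_ascii=False, sort_keys=True)


def mean_reciprocal_rank(events: list[SuggestionSelected]) -> float:
    """Average of 1 / (position + 1); closer to 1.0 means taps land near the top."""
    if not events:
        return 0.0
    return sum(1.0 / (e.position + 1) for e in events) / len(events)
```

Tracking this kind of metric per ranker version gives a quick check on whether a new model moves selected suggestions toward the top of the list.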
Process
Analyze search logs: top 1000 queries, typo patterns, language/transliteration.
Set up Elasticsearch Completion Suggester with fuzzy matching.
Develop personalized suggestion ranker.
Implement on-device cache + debounce logic on iOS/Android.
Timeline estimates
ES Completion Suggester without personalization: 2–3 days. With the personalized ranker and on-device cache: 1.5–2 weeks.







