Predictive Text Input Implementation in Mobile Applications
Predictive text isn't just word autocomplete. In a mobile app context it can mean form autofill from history, predicting the next user action, smart search suggestions, or smart compose in chat. The implementation depends on what exactly needs predicting.
Platform Built-in APIs
The fastest path is to use what the OS already provides.
On iOS, UITextInputTraits and UITextField.autocorrectionType provide basic correction. For word prediction in a custom keyboard, use UILexicon (obtained via requestSupplementaryLexicon) together with UITextDocumentProxy.documentContextBeforeInput. Apple doesn't expose its predictive model directly to developers, but a keyboard extension does get access to the surrounding document context.
The spell-checking API on iOS is UITextChecker: guesses(forWordRange:in:language:) returns replacement candidates and completions(forPartialWordRange:in:language:) returns word completions. (NSSpellChecker is the macOS/AppKit counterpart and is not available on iOS.)
On Android, TextServicesManager and SpellCheckerSession provide similar access, InputMethodService is the base class for custom IMEs, and SuggestionSpan displays suggestions inline.
Custom Predictor on TFLite
When platform APIs don't fit (domain-specific vocabulary, corporate jargon, non-standard context), you need your own model.
A typical architecture for next-word prediction is an LSTM or a small Transformer. GPT-2 small has ~124M parameters: roughly 250 MB in fp16 and about 125 MB after int8 quantization. On recent hardware (iPhone 14 and later), inference can run under 50 ms per prediction.
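The size arithmetic is simply parameter count times bytes per weight. A quick sanity check, assuming ~124M parameters for GPT-2 small:

```python
def model_size_mb(params: int, bits_per_weight: int) -> float:
    """Approximate weights-only checkpoint size in megabytes."""
    return params * bits_per_weight / 8 / 1e6

gpt2_small = 124_000_000
print(model_size_mb(gpt2_small, 16))  # fp16: 248.0 MB
print(model_size_mb(gpt2_small, 8))   # int8: 124.0 MB
```

Note that int8 halves fp16, it does not quarter it; to get near 60 MB you would need 4-bit quantization.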
import TensorFlowLite

struct WordSuggestion {
    let word: String
    let score: Float
}

final class PredictiveTextEngine {
    private let interpreter: Interpreter
    private let tokenizer: WordpieceTokenizer
    private let vocabSize = 30522

    init(modelPath: String, tokenizer: WordpieceTokenizer) throws {
        self.interpreter = try Interpreter(modelPath: modelPath)
        self.tokenizer = tokenizer
        try interpreter.allocateTensors()
    }

    func predict(context: String, topK: Int = 3) throws -> [WordSuggestion] {
        // The model's window is 128 tokens, so tokenize first,
        // then keep only the tail of the context.
        let tokens = Array(tokenizer.encode(context).suffix(128))
        let inputData = tokens.map { Int32($0) }
            .withUnsafeBufferPointer { Data(buffer: $0) }

        // Feed token ids in, get logits over the vocabulary out.
        try interpreter.copy(inputData, toInputAt: 0)
        try interpreter.invoke()

        let outputTensor = try interpreter.output(at: 0)
        let logits = outputTensor.data.withUnsafeBytes {
            Array($0.bindMemory(to: Float.self).prefix(vocabSize))
        }

        return topKIndices(logits, k: topK).map { idx in
            WordSuggestion(word: tokenizer.decode(idx), score: logits[idx])
        }
    }
}
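The topKIndices helper used above is just a partial sort over the logits. A platform-neutral sketch in Python (heapq.nlargest avoids fully sorting the 30k-entry vocabulary):

```python
import heapq

def top_k_indices(logits, k):
    """Indices of the k largest logits, best first."""
    return heapq.nlargest(k, range(len(logits)), key=logits.__getitem__)

print(top_k_indices([0.1, 2.5, 0.3, 1.9], 2))  # [1, 3]
```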
The tokenizer is a separate task. WordPiece works well for English; for Russian, use SentencePiece with a BPE model trained on a Russian corpus. Vocabulary size directly affects speed: going from 64k to 32k tokens halves the embedding matrix and the output softmax.
Search Autocomplete: Trie vs ML
For search over a fixed catalog (products, cities, users), a Trie on the client is faster and more predictable than ML. Prefix match runs in O(k), where k is the query length. SQLite FTS5 (MATCH 'query*') handles catalogs up to ~1M items, with fuzzy matching available via the spellfix1 extension.
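A minimal client-side Trie with prefix completion, as a platform-neutral Python sketch (the descent loop is the O(k) part):

```python
class TrieNode:
    __slots__ = ("children", "terminal")
    def __init__(self):
        self.children = {}   # char -> TrieNode
        self.terminal = False

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.terminal = True

    def complete(self, prefix, limit=5):
        # O(k) descent to the prefix node, then DFS for completions.
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        out, stack = [], [(node, prefix)]
        while stack and len(out) < limit:
            n, word = stack.pop()
            if n.terminal:
                out.append(word)
            for ch in sorted(n.children, reverse=True):
                stack.append((n.children[ch], word + ch))
        return out
```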
ML is needed where ranking by personal relevance matters, not just text matching.
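Before reaching for a model, even a simple re-rank of lexical candidates by a per-user prior goes a long way. A sketch, where the alpha weight and the click_counts structure are illustrative assumptions, not a prescribed API:

```python
def personalize(candidates, click_counts, alpha=0.7):
    """Re-rank prefix matches by a per-user click prior.

    alpha blends personal frequency against lexical order (position
    in the original candidate list).
    """
    n = len(candidates)
    def score(item):
        pos, word = item
        lexical = (n - pos) / n                # earlier match = higher
        personal = click_counts.get(word, 0)   # hypothetical per-user counts
        return alpha * personal + (1 - alpha) * lexical
    return [w for _, w in sorted(enumerate(candidates), key=score, reverse=True)]
```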
Caching and Latency
Predictive input must respond in under 100 ms, otherwise the user has already typed the next character. Cache the last N predictions in memory and invalidate them on context change. On iOS, use a DispatchQueue with qos: .userInteractive; on Android, Dispatchers.Main.immediate inside a coroutine.
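The in-memory cache can be a small LRU keyed by the trailing context string. A sketch (the capacity of 64 is an arbitrary illustration):

```python
from collections import OrderedDict

class PredictionCache:
    """Tiny LRU cache: context string -> list of suggestions."""
    def __init__(self, capacity=64):
        self.capacity = capacity
        self._store = OrderedDict()

    def get(self, context):
        if context not in self._store:
            return None
        self._store.move_to_end(context)      # mark as most recently used
        return self._store[context]

    def put(self, context, suggestions):
        self._store[context] = suggestions
        self._store.move_to_end(context)
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)   # evict least recently used
```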
Debounce: don't call the predictor on every character; wait 150–200 ms after the last keystroke. This cuts load without a noticeable UX impact.
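The debounce itself is a cancel-and-restart timer. An asyncio sketch (the 150 ms default matches the guideline above):

```python
import asyncio

class Debouncer:
    """Invoke fn only after `delay` seconds of input silence."""
    def __init__(self, fn, delay=0.15):
        self.fn, self.delay = fn, delay
        self._task = None

    def call(self, *args):
        # Each keystroke cancels the pending prediction and restarts the timer.
        if self._task is not None:
            self._task.cancel()
        self._task = asyncio.get_running_loop().create_task(self._fire(*args))

    async def _fire(self, *args):
        await asyncio.sleep(self.delay)
        self.fn(*args)
```

Only the final context of a fast typing burst reaches the predictor; intermediate keystrokes are dropped.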
Implementation Process
1. Analyze the domain: what text, what context, is personalization needed?
2. Choose the approach: platform APIs, Trie + FTS, or an ML model.
3. Prepare training data (if building a custom model).
4. Quantize and optimize for mobile inference.
5. Integrate into the UI with proper debounce and caching.
6. Test across device classes.
Timeline Guidelines
Search autocomplete via Trie/FTS takes 2–4 days. A custom ML model for next-word prediction, with quantization and CoreML/TFLite inference, takes 3–5 weeks (including data preparation).