Tech Support Assistant Bot in Mobile App
A support bot is one of the few cases where LLMs are economically justified without complex tuning. With a structured knowledge base, RAG (retrieval-augmented generation) delivers accurate answers with minimal hallucination. Without it, LLMs invent non-existent solutions in a confident tone.
RAG as the Foundation of Support Bot
Classic flow: documentation, knowledge base articles, solved tickets → vector database → on user query, search relevant fragments → pass to LLM as context → model formulates answer based on real data.
A stack that works well for support:
Knowledge base (Confluence, Notion, MDX files)
↓ Chunking + Embedding (text-embedding-3-small / BGE-m3)
Qdrant / pgvector
↓ Semantic search (top-5 chunks)
GPT-4o mini / Claude 3.5 Haiku
↓ Answer generation with context
Mobile client
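The generation step of the pipeline above comes down to packing the top-matching chunks into the LLM prompt. A minimal sketch of that assembly (the function name and the chunk schema with `text`/`source` keys are hypothetical; adapt them to your vector store's payload):

```python
def build_rag_prompt(question: str, chunks: list[dict]) -> list[dict]:
    """Turn retrieved chunks into chat messages for the LLM.

    Each chunk is assumed to carry 'text' and 'source' keys
    (hypothetical schema; adjust to your vector store's payload).
    """
    context = "\n\n".join(
        f"[{c['source']}]\n{c['text']}" for c in chunks
    )
    system = (
        "Answer the user's question using ONLY the context below. "
        "If the context does not contain the answer, say you don't know.\n\n"
        f"Context:\n{context}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]
```

Citing the source of each chunk in the context also makes it easy to show "based on article X" links in the mobile client.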
Critical: chunks must be semantic, not mechanical 500-character cuts. Splitting an article mid-paragraph loses relevant context. Use RecursiveCharacterTextSplitter with heading and paragraph separators.
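To illustrate the idea, here is a simplified stand-in for RecursiveCharacterTextSplitter: it respects paragraph boundaries (blank lines) first and falls back to hard length cuts only when a single block is oversized. This is a sketch, not the library's actual implementation:

```python
def split_semantic(text: str, max_len: int = 500) -> list[str]:
    """Split on paragraph boundaries first; hard-cut by length
    only when a single block alone exceeds max_len."""
    chunks: list[str] = []
    current = ""
    for block in text.split("\n\n"):  # paragraph boundaries first
        candidate = f"{current}\n\n{block}".strip() if current else block
        if len(candidate) <= max_len:
            current = candidate  # paragraph still fits in the chunk
        else:
            if current:
                chunks.append(current)
            # oversized single paragraph: hard-cut as a last resort
            while len(block) > max_len:
                chunks.append(block[:max_len])
                block = block[max_len:]
            current = block
    if current:
        chunks.append(current)
    return chunks
```

The real RecursiveCharacterTextSplitter takes an ordered separator list (e.g. headings, then paragraphs, then sentences), but the principle is the same: prefer semantic boundaries, cut by length only as a fallback.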
Escalation to Live Agent
The bot shouldn't pretend to know everything. If confidence falls below a threshold, or the model explicitly says "I don't know", escalate. The switch should be transparent: the user sees "Connecting specialist", and the dialog history is passed to the agent.
Integrations for live chat handoff: Zendesk Chat API, Intercom, AmoCRM, custom ticketing. Technically, it's a webhook carrying the dialog history and contact info.
Key pattern: the bot keeps working while the user waits for an agent. If the queue is long, it retries the knowledge-base search and suggests helpful articles in the meantime.
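The escalation decision and the handoff payload can be sketched as follows; the threshold value, refusal phrases, and payload schema are all illustrative assumptions, not a fixed API:

```python
CONFIDENCE_THRESHOLD = 0.6  # hypothetical value, tune on real dialogs

def should_escalate(confidence: float, answer: str) -> bool:
    """Escalate on low retrieval confidence or an explicit refusal."""
    refusals = ("i don't know", "i'm not sure")
    return confidence < CONFIDENCE_THRESHOLD or any(
        phrase in answer.lower() for phrase in refusals
    )

def build_handoff_payload(user_id: str, history: list[dict]) -> dict:
    """Webhook body for the ticketing system (schema is illustrative)."""
    return {
        "user_id": user_id,
        "transcript": history,  # full dialog so the agent has context
        "source": "mobile_support_bot",
    }
```

The payload is what gets POSTed to Zendesk/Intercom/a custom endpoint; the exact field names depend on the target system's webhook contract.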
Ticket Classification and Prioritization
If the bot can't solve it, it classifies before handoff:
CATEGORY_PROMPT = """
Classify user's request into category:
- billing (payment questions, invoice, refund)
- technical (errors, feature not working)
- account (account, password, access)
- other
Return only category name, no explanation.
"""
async def classify_ticket(user_message: str) -> str:
response = await openai_client.chat.completions.create(
model="gpt-4o-mini",
messages=[
{"role": "system", "content": CATEGORY_PROMPT},
{"role": "user", "content": user_message}
],
temperature=0
)
return response.choices[0].message.content.strip()
Category determines queue and priority. temperature=0 is non-negotiable: classification needs determinism, not creativity.
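The category-to-queue mapping itself can be a plain routing table; the queue names and priority values below are hypothetical placeholders:

```python
ROUTING = {  # hypothetical queue names and priorities (1 = highest)
    "billing":   {"queue": "finance",  "priority": 2},
    "technical": {"queue": "tech-l1",  "priority": 1},
    "account":   {"queue": "accounts", "priority": 2},
    "other":     {"queue": "general",  "priority": 3},
}

def route_ticket(category: str) -> dict:
    """Map a model-predicted category to a queue; fall back to
    'other' if the model returns something unexpected."""
    return ROUTING.get(category, ROUTING["other"])
```

The fallback matters: even at temperature=0 the model can occasionally return an off-list label, and the router must not crash on it.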
Support-Specific UI Features
Attachments. Users want to attach screenshots and videos. On iOS, use PHPickerViewController with media type restrictions; on Android, ActivityResultContracts.GetContent(). Files are uploaded to S3/Cloudinary, and a preview is displayed in the message.
Ticket status. Once a ticket is created, the user sees its number and can track it via the same bot: "Status of my request #12345".
Answer rating. After each bot response, show thumbs up/down. The data feeds analytics and helps refine the knowledge base.
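On the backend, the rating data only needs a message id and a boolean per vote; a minimal in-memory sketch (production would persist this to a database):

```python
from dataclasses import dataclass, field

@dataclass
class FeedbackLog:
    """In-memory sketch; production would write to a database."""
    votes: list[tuple[str, bool]] = field(default_factory=list)

    def record(self, message_id: str, helpful: bool) -> None:
        self.votes.append((message_id, helpful))

    def helpful_rate(self) -> float:
        """Share of thumbs-up; a low rate flags knowledge-base gaps."""
        if not self.votes:
            return 0.0
        return sum(h for _, h in self.votes) / len(self.votes)
```

Grouping downvotes by the retrieved source article points directly at which knowledge-base pages need rewriting.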
Development Process
Knowledge base audit: volume, format, freshness.
Pipeline setup: document parsing, chunking, embedding, vector database.
System prompt development, with instructions not to go beyond the knowledge base.
Escalation logic and ticketing system integration.
Mobile client with attachment support and ticket status.
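For the system prompt step, the key instruction is to keep the model inside the knowledge base and to fail in a predictable way that the escalation logic can detect. The wording below is illustrative, not a proven prompt:

```python
# Illustrative system prompt; the exact wording should be iterated
# against real user queries.
SUPPORT_SYSTEM_PROMPT = """\
You are a support assistant for our mobile app.
Answer ONLY from the provided knowledge-base context.
If the context does not cover the question, reply exactly:
"I don't know, let me connect you with a specialist."
Never invent steps, settings, or menu names.
Keep answers short and actionable.
"""
```

Forcing an exact refusal phrase makes the escalation check a simple string match instead of a second model call.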
Timeline Estimates
A bot with RAG on a ready knowledge base, without ticketing integration: about 1 week. A full bot with RAG, classification, agent escalation, and analytics: 3–4 weeks.







