Implementing an AI Bot for IoT Device Monitoring in a Mobile App
IoT monitoring is no longer just a dashboard with graphs. A question like "why did the temperature on sensor 7 rise by 3 degrees?" once required opening a web panel, finding the sensor, building a historical graph, and correlating it with event logs. An AI bot in a mobile app answers the same question in chat, pulling data from the necessary sources automatically.
Architecture: LLM + Function Calling + IoT API
The key mechanism is Function Calling (Tool Use) in OpenAI GPT-4o or Anthropic Claude 3.5 Sonnet. The model has no direct access to sensor data; instead it emits structured requests to invoke functions you declare. The mobile app acts as the client: it receives these tool calls from the model, executes them against the IoT backend, and returns the results.
// Android: handling tool_calls from GPT-4o
import kotlinx.serialization.json.*

data class ChatMessage(
    val role: String,               // "user", "assistant" or "tool"
    val content: String? = null,
    val toolCalls: List<ToolCall>? = null,
    val toolCallId: String? = null, // links a tool result back to the call that produced it
    val name: String? = null
)

class IoTChatRepository(
    private val openAiApi: OpenAiApi,
    private val iotApi: IoTDeviceApi
) {
    // Tool parameters are declared as JSON Schema, which is what the OpenAI tools API expects
    private val tools = listOf(
        Tool(
            type = "function",
            function = ToolFunction(
                name = "get_sensor_readings",
                description = "Get current and historical readings from IoT sensors",
                parameters = buildJsonObject {
                    put("type", "object")
                    putJsonObject("properties") {
                        putJsonObject("sensor_ids") {
                            put("type", "array")
                            putJsonObject("items") { put("type", "string") }
                        }
                        putJsonObject("from_timestamp") {
                            put("type", "string"); put("description", "ISO8601 datetime")
                        }
                        putJsonObject("to_timestamp") {
                            put("type", "string"); put("description", "ISO8601 datetime")
                        }
                        putJsonObject("aggregation") {
                            put("type", "string")
                            putJsonArray("enum") { add("avg"); add("min"); add("max"); add("last") }
                        }
                    }
                    putJsonArray("required") { add("sensor_ids") }
                }
            )
        ),
        Tool(
            type = "function",
            function = ToolFunction(
                name = "get_device_alerts",
                description = "Get active or historical alerts for devices",
                parameters = buildJsonObject {
                    put("type", "object")
                    putJsonObject("properties") {
                        putJsonObject("device_ids") {
                            put("type", "array")
                            putJsonObject("items") { put("type", "string") }
                        }
                        putJsonObject("severity") {
                            put("type", "string")
                            putJsonArray("enum") { add("critical"); add("warning"); add("info") }
                        }
                        putJsonObject("limit") { put("type", "integer") }
                    }
                }
            )
        )
    )
    suspend fun chat(userMessage: String, history: List<ChatMessage>): Flow<String> = flow {
        var messages = history + ChatMessage(role = "user", content = userMessage)
        var response = openAiApi.chatCompletion(messages, tools)
        // Tool-call loop: keep executing tools until the model returns a final answer
        while (!response.toolCalls.isNullOrEmpty()) {
            val toolResults = response.toolCalls!!.map { call ->
                val result = when (call.function.name) {
                    "get_sensor_readings" -> iotApi.getSensorReadings(call.function.arguments)
                    "get_device_alerts" -> iotApi.getAlerts(call.function.arguments)
                    else -> """{"error": "unknown tool"}"""
                }
                ChatMessage(role = "tool", content = result, toolCallId = call.id, name = call.function.name)
            }
            // Accumulate the assistant's tool calls and their results so later turns see them
            messages = messages + ChatMessage(role = "assistant", toolCalls = response.toolCalls) + toolResults
            response = openAiApi.chatCompletion(messages, tools)
        }
        emit(response.content ?: "")
    }
}
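The model delivers tool arguments as a JSON string, so before hitting the IoT backend it is worth decoding them into a typed object. A minimal sketch with kotlinx.serialization; the SensorReadingsArgs name and its fields mirror the get_sensor_readings schema and are illustrative assumptions, not a fixed API:

```kotlin
import kotlinx.serialization.Serializable
import kotlinx.serialization.json.Json

// Mirrors the parameters of the get_sensor_readings tool (hypothetical type)
@Serializable
data class SensorReadingsArgs(
    val sensor_ids: List<String>,
    val from_timestamp: String? = null,
    val to_timestamp: String? = null,
    val aggregation: String? = null
)

// ignoreUnknownKeys guards against the model emitting extra fields
private val json = Json { ignoreUnknownKeys = true }

fun parseSensorArgs(raw: String): SensorReadingsArgs =
    json.decodeFromString(SensorReadingsArgs.serializer(), raw)
```

With this in place, the `"get_sensor_readings"` branch in the repository can validate the arguments before calling the backend instead of forwarding a raw string.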
Response Streaming and Chat UX
GPT-4o supports streaming over Server-Sent Events (pass stream: true in the request). On Android with Retrofit, use the @Streaming annotation and parse the text/event-stream body line by line; tokens appear as they are generated instead of after the full response. On iOS, URLSession.AsyncBytes does the same job.
// iOS: streaming tokens from the OpenAI SSE endpoint
func streamResponse(messages: [ChatMessage]) -> AsyncThrowingStream<String, Error> {
    AsyncThrowingStream { continuation in
        Task {
            do {
                var request = URLRequest(url: URL(string: "https://api.openai.com/v1/chat/completions")!)
                request.httpMethod = "POST"
                request.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
                request.setValue("application/json", forHTTPHeaderField: "Content-Type")
                request.httpBody = try JSONEncoder().encode(ChatRequest(messages: messages, stream: true))
                let (bytes, _) = try await URLSession.shared.bytes(for: request)
                for try await line in bytes.lines {
                    // SSE frames look like "data: {...}"; the stream ends with "data: [DONE]"
                    guard line.hasPrefix("data: "), line != "data: [DONE]" else { continue }
                    let json = line.dropFirst(6)
                    if let chunk = try? JSONDecoder().decode(StreamChunk.self, from: Data(json.utf8)),
                       let delta = chunk.choices.first?.delta.content {
                        continuation.yield(delta)
                    }
                }
                continuation.finish()
            } catch {
                // Propagate network or encoding errors to the stream's consumer
                continuation.finish(throwing: error)
            }
        }
    }
}
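The Android counterpart reads the same frames from a Retrofit @Streaming response body. The frame filtering itself is plain string work; a sketch, where the interface names are assumptions and chunk decoding is left out:

```kotlin
// Retrofit interface sketch (names are assumptions):
// @Streaming
// @POST("v1/chat/completions")
// suspend fun streamChat(@Body body: ChatRequest): ResponseBody

// Extracts the JSON payload from one SSE line, or null if the line
// is a comment, a keep-alive, or the terminating "data: [DONE]" marker.
fun parseSseLine(line: String): String? {
    if (!line.startsWith("data: ")) return null
    val payload = line.removePrefix("data: ").trim()
    return payload.takeUnless { it == "[DONE]" || it.isEmpty() }
}
```

Each non-null payload is then decoded into a StreamChunk and its delta content emitted to the UI, mirroring the iOS loop above.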
Context and Security
The system prompt provides context: the user's device list with names and IDs, timezone, and preferred units. This lets the bot resolve phrases like "sensor in the boiler room" without explicit IDs.
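Such a system prompt can be assembled from the user's profile at session start. A sketch; the Device type and the exact wording are illustrative assumptions:

```kotlin
data class Device(val id: String, val name: String)

// Builds the per-user system prompt with the device map, timezone and units
fun buildSystemPrompt(devices: List<Device>, timezone: String, units: String): String = buildString {
    appendLine("You are an IoT monitoring assistant.")
    appendLine("User timezone: $timezone. Units: $units.")
    appendLine("User's devices (name -> id):")
    devices.forEach { appendLine("- ${it.name} -> ${it.id}") }
    append("Resolve device references like \"sensor in the boiler room\" to these ids before calling tools.")
}
```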
Important: IoT API functions are called on behalf of the current user, with that user's access rights. The bot cannot retrieve data from devices the user lacks access to, because authorization lives at the backend level, not the prompt level.
Chat history is kept to the last 20–30 messages in context. Older messages are compressed via summarization: a cheap model such as gpt-4o-mini is prompted to "Summarize this conversation history briefly", and the summary replaces the old messages, saving tokens.
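The windowing logic behind this is small: keep the last N messages verbatim, hand the rest to the summarizer, and re-inject the summary as a single system message. A sketch, where the 25-message window and the function names are assumptions:

```kotlin
data class Msg(val role: String, val content: String)

// Splits history into (olderMessagesToSummarize, recentMessagesKeptVerbatim)
fun splitForSummarization(history: List<Msg>, window: Int = 25): Pair<List<Msg>, List<Msg>> {
    if (history.size <= window) return emptyList<Msg>() to history
    return history.dropLast(window) to history.takeLast(window)
}

// summarize() would call gpt-4o-mini in a real app; here it is injected for testability
fun compressHistory(history: List<Msg>, summarize: (List<Msg>) -> String): List<Msg> {
    val (older, recent) = splitForSummarization(history)
    if (older.isEmpty()) return recent
    return listOf(Msg("system", "Conversation so far: " + summarize(older))) + recent
}
```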
Local Model as Fallback
For offline scenarios or cost reduction: llama.cpp with a Phi-3 Mini or Mistral 7B model via android-llamacpp or LLM.swift. A local model handles function calling unreliably, but it can answer basic questions from cached sensor data.
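Routing between the cloud and the local model can be a small decision function, with cached readings inlined into the offline prompt since the local model cannot call tools. A sketch; the backend names, CachedReading type, and prompt wording are assumptions:

```kotlin
enum class Backend { CLOUD_GPT4O, LOCAL_LLM }

data class CachedReading(val sensorId: String, val value: Double, val unit: String)

// Cloud when online; otherwise fall back to the on-device model
fun chooseBackend(isOnline: Boolean): Backend =
    if (isOnline) Backend.CLOUD_GPT4O else Backend.LOCAL_LLM

// Offline path: inline cached readings directly into the prompt,
// since the local model has no reliable function calling
fun buildOfflinePrompt(question: String, cache: List<CachedReading>): String = buildString {
    appendLine("Answer from this cached data only; say if it is insufficient.")
    cache.forEach { appendLine("${it.sensorId}: ${it.value} ${it.unit}") }
    append("Question: $question")
}
```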
Developing an AI bot for IoT monitoring with Function Calling, streaming, and IoT API integration: 4–6 weeks on top of existing IoT app. Pricing is calculated individually.