Voice Control of IoT Devices via Mobile App

NOVASOLUTIONS.TECHNOLOGY develops, supports, and maintains iOS, Android, and PWA mobile applications. We have extensive experience and expertise in publishing mobile applications in popular marketplaces such as Google Play, the App Store, Amazon, AppGallery, and others.
Development and support of all types of mobile applications:
Information and entertainment mobile applications
News apps, games, reference guides, online catalogs, weather apps, fitness and health apps, travel apps, educational apps, social networks and messengers, quizzes, blogs and podcasts, forums, aggregators
E-commerce mobile applications
Online stores, B2B apps, marketplaces, online exchanges, cashback services, dropshipping platforms, loyalty programs, food and goods delivery, payment systems
Business process management mobile applications
CRM systems, ERP systems, project management, sales team tools, financial management, production management, logistics and delivery management, HR management, data monitoring systems
Electronic services mobile applications
Classified ads platforms, online schools, online cinemas, electronic service platforms, cashback platforms, video hosting, thematic portals, online booking and scheduling platforms, online trading platforms

These are just some of the types of mobile applications we work with, and each of them may have its own specific features and functionality, tailored to the specific needs and goals of the client.


Voice Control of IoT Devices via Mobile Application

Voice control of IoT devices in a mobile app is not just "add SiriKit" or "integrate Google Assistant". It is a separate logic layer: speech recognition, intent extraction, mapping to device commands, and feedback. Each step breaks in its own way.
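That layer can be sketched as a small pipeline where each stage is replaceable and can fail independently. All names here (recognize, parse, to_command, the topic scheme) are illustrative, not a real API:

```python
# Sketch of the voice-control layer. Each stage is a seam where a
# real implementation (SFSpeechRecognizer, Vosk, an NLU model, an
# MQTT client) plugs in. All names are illustrative.

def recognize(audio: bytes) -> str:
    # Real app: on-device or cloud speech recognition.
    return "turn off kitchen light"

def parse(text: str) -> dict:
    # Real app: rule-based matcher or an NLU model.
    return {"intent": "turn_off", "device_type": "light", "location": "kitchen"}

def to_command(intent: dict) -> tuple:
    # Map the parsed intent onto a hypothetical MQTT topic and payload.
    topic = "home/{}/{}/set".format(intent["location"], intent["device_type"])
    payload = "OFF" if intent["intent"] == "turn_off" else "ON"
    return topic, payload

def handle(audio: bytes) -> tuple:
    return to_command(parse(recognize(audio)))

print(handle(b""))  # ('home/kitchen/light/set', 'OFF')
```

The point of the sketch is the seams: each stage can be swapped (cloud vs. on-device recognition, rules vs. a model) without touching the others.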

Two Fundamentally Different Approaches

Built-in voice assistants (Siri Shortcuts, Google Assistant Actions) work through the cloud and require explicit permission. Siri Shortcuts on iOS are available via system intents such as INPlayMediaIntent and INSendMessageIntent, but arbitrary IoT commands require the App Intents framework (iOS 16+), a Swift framework for describing intents. Example: "Hey Siri, turn off the kitchen light" → Siri calls TurnOffLightIntent in your app, which sends an MQTT command. Latency is 2–4 seconds through the Apple cloud, with no guarantees when offline.

Local recognition is a different level. On iOS this is SFSpeechRecognizer with SFSpeechAudioBufferRecognitionRequest. Since iOS 13 it supports an on-device mode (requiresOnDeviceRecognition = true) that does not send audio to the cloud. On Android there is the SpeechRecognizer API (which goes through the Google cloud) or Vosk / Whisper.cpp for fully offline recognition.

For IoT apps where operation on the local network without internet matters, the choice is clear: local recognition + offline NLU.

NLU: From Text to Device Command

The recognizer returned "turn on kitchen light and raise temperature to twenty two" — now extract:

  • intent: turn_on, set_temperature
  • entities: device_type=light, location=kitchen, device_type=thermostat, value=22

For simple cases, a rule-based approach suffices: a verb-to-intent dictionary plus a device/room dictionary built from the user's database. Build a regex or a simple intent matcher against the existing device list.
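A minimal sketch of such a rule-based matcher. The dictionaries are hypothetical; in a real app the device and room lists come from the user's device database:

```python
import re
from typing import Optional

# Hypothetical dictionaries; a real app builds DEVICES and ROOMS
# from the user's actual device list.
INTENT_VERBS = {"turn on": "turn_on", "turn off": "turn_off", "raise": "set_increase"}
DEVICES = {"light": "light", "thermostat": "thermostat"}
ROOMS = ("kitchen", "bedroom")

def match(text: str) -> Optional[dict]:
    text = text.lower()
    intent = next((i for verb, i in INTENT_VERBS.items() if verb in text), None)
    device = next((d for word, d in DEVICES.items() if word in text), None)
    room = next((r for r in ROOMS if r in text), None)
    number = re.search(r"\b(\d+)\b", text)
    if intent is None or device is None:
        return None  # let the app ask the user to clarify
    return {"intent": intent, "device_type": device, "location": room,
            "value": int(number.group(1)) if number else None}

print(match("turn on kitchen light"))
```

Substring matching against a known device list is crude but robust: it only recognizes devices that actually exist, so there is no "hallucinated" target.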

For complex scenarios, use Rasa NLU (self-hosted) or Duckling for numeric values. In Flutter, integrate via an HTTP request to a local server on the home network, or via dart:ffi for an embedded model.

A real example: a smart-apartment project, 35 devices, Russian language. We trained a simple fastText model on ~500 command examples, converted it to .tflite, and ran it via tflite_flutter. Accuracy on household commands was 94%. Misses were on compound commands (two actions in one phrase); we solved this with preprocessing that splits the phrase on the conjunctions "and", "then", "later".
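The conjunction-splitting preprocessing can be sketched like this (the conjunction list here is the English equivalent of the one used in the project):

```python
import re

# Split a compound phrase into single-action commands before NLU,
# so each part carries exactly one intent.
CONJUNCTIONS = r"\b(?:and|then|later)\b"

def split_commands(text: str) -> list:
    parts = re.split(CONJUNCTIONS, text.lower())
    return [p.strip() for p in parts if p.strip()]

print(split_commands("turn on kitchen light and raise temperature to twenty two"))
# → ['turn on kitchen light', 'raise temperature to twenty two']
```

Each fragment then goes through the intent matcher separately, which is far easier than teaching the model multi-intent phrases.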

Feedback and Edge Cases

Push-to-talk vs always-on. Always-on listening on mobile is a battery killer. We recommend a push-to-talk button in the app plus an optional wake word via the Porcupine SDK (Picovoice). Porcupine works locally and consumes <5% CPU at idle.

What if a device is not recognized? Don't stay silent. Return a voice response via AVSpeechSynthesizer (iOS) / TextToSpeech (Android): list what was understood and ask the user to clarify. The user isn't looking at the screen, so they need audio feedback.
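A sketch of that fallback: build the phrase to hand to the TTS engine from whatever was understood. The wording and structure are illustrative; in a real app this string goes to AVSpeechSynthesizer / TextToSpeech:

```python
from typing import Optional

def feedback(intent: Optional[dict]) -> str:
    # Build the phrase the TTS engine will speak. Illustrative wording.
    if intent is None:
        return "I didn't catch a device name. Which device did you mean?"
    if intent.get("location") is None:
        # Echo back what was understood, then ask for the missing part.
        return "I understood: {}. In which room?".format(intent["device_type"])
    return "Done."

print(feedback(None))
print(feedback({"intent": "turn_on", "device_type": "light", "location": None}))
```

Echoing back the understood parts matters: it tells the user whether recognition failed or just missed one slot.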

In Flutter, use flutter_tts for synthesis and speech_to_text as a unified API over the platform engines. Important: on Android 11+, SpeechRecognizer requires the RECORD_AUDIO permission with an explicit rationale shown in the permission request flow (handled via onRequestPermissionsResult). Without a clear rationale, Google Play can flag the app as a policy violation.

MQTT Integration

Voice command → NLU → device command → MQTT topic publish. Latency from button press to device response: on-device recognition ~300–800 ms, NLU ~50 ms, MQTT publish <50 ms on a local broker. In total it feels instant.
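A sketch of the last step, mapping a parsed command onto an MQTT topic and payload. The home/<room>/<device>/set topic scheme is an assumption for illustration; the actual publish would go through an MQTT client library (e.g. paho-mqtt, or mqtt_client in Flutter):

```python
import json

def to_mqtt(cmd: dict) -> tuple:
    # Hypothetical topic scheme: home/<room>/<device>/set
    topic = "home/{}/{}/set".format(cmd["location"], cmd["device_type"])
    if cmd["intent"] == "set_temperature":
        payload = json.dumps({"target": cmd["value"]})
    else:
        payload = "ON" if cmd["intent"] == "turn_on" else "OFF"
    return topic, payload

topic, payload = to_mqtt({"intent": "set_temperature", "device_type": "thermostat",
                          "location": "kitchen", "value": 22})
print(topic, payload)  # home/kitchen/thermostat/set {"target": 22}
```

Keeping this mapping as a pure function makes it trivial to unit-test the whole voice path without a broker or a microphone.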

Cloud recognition adds 1.5–3 seconds. For Russian, cloud Google Speech-to-Text works well; Apple Speech does worse on IoT-specific terms like "dimmer", "receiver", "relay".

Timeline

Push-to-talk with cloud recognition and simple command mapping takes 2–3 weeks. Offline recognition + NLU + wake word + TTS feedback takes 6–10 weeks. Pricing depends on languages, platforms, and offline requirements.