AI Camera Translation (AR Translation) in Mobile Apps
Google Translate's instant camera translation is AR translation in action: the camera sees text, and a translation appears in real time, overlaid on the image as if printed there natively. Implementing it independently is harder than it looks: it takes OCR, translation, inpainting the background beneath the erased source text, and rendering the new text at a matching font and size.
AR Translation Pipeline Architecture
Each camera frame passes through multiple stages:
Frame → Text Detection → OCR → Translation → Inpainting → Text Overlay → Render
Text Detection. Find text bounding boxes in the frame. On iOS: VNRecognizeTextRequest (Vision framework) with recognitionLevel: .fast for real-time use. On Android: ML Kit Text Recognition v2. Both work on-device, no network required. Vision returns VNRecognizedTextObservation with a bounding box in normalized coordinates — convert to screen coordinates, accounting for buffer orientation.
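The coordinate conversion is easy to get wrong: Vision's normalized rectangles have their origin at the bottom-left, while screen coordinates start at the top-left, so the y-axis must be flipped. A platform-neutral sketch of the math (Python for illustration; in the app the same arithmetic runs in Swift or Kotlin):

```python
def vision_rect_to_pixels(norm_rect, image_width, image_height):
    """Convert a Vision-style normalized bounding box (origin at the
    bottom-left, values in 0..1) to top-left-origin pixel coordinates."""
    x, y, w, h = norm_rect
    px = x * image_width
    # Flip the y-axis: Vision's origin is bottom-left, the screen's is top-left.
    py = (1.0 - y - h) * image_height
    return (px, py, w * image_width, h * image_height)
```

On iOS, `VNImageRectForNormalizedRect` does this scaling for you, but the y-flip for UIKit coordinates still has to be applied by hand.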
OCR. VNRecognizeTextRequest with recognitionLevel: .accurate is too slow for every frame. Strategy: use .fast for detection, .accurate only when text stabilizes (user tap or stationary phone). Stable frame detection: compare bounding boxes between frames — if deviation < 5px → text is stable → run accurate OCR.
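The stability check described above can be sketched platform-neutrally (Python for illustration; the 5 px tolerance is the value suggested in the text and should be tuned per device):

```python
def boxes_stable(prev_boxes, curr_boxes, tolerance_px=5.0):
    """Return True when every bounding box moved less than `tolerance_px`
    on every coordinate since the previous frame -- i.e. the text is
    stable enough to justify the slower accurate-OCR pass."""
    if len(prev_boxes) != len(curr_boxes):
        return False  # text appeared or disappeared: not stable
    for (px, py, pw, ph), (cx, cy, cw, ch) in zip(prev_boxes, curr_boxes):
        if max(abs(px - cx), abs(py - cy),
               abs(pw - cw), abs(ph - ch)) >= tolerance_px:
            return False
    return True
```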
Translation. Two options:
| | On-device (ML Kit Translate) | Cloud API (DeepL, Google Cloud) |
|---|---|---|
| Latency | 10–50 ms | 200–800 ms |
| Quality | Adequate | High (DeepL especially) |
| Offline | Yes (~30 MB model) | No |
| Cost | Free | Per request |
For a live camera stream, on-device translation is the only workable option. For a "photograph → translate" mode, a cloud API (DeepL in particular) gives better quality.
Inpainting and Text Overlay — Most Complex Part
Simple approach: draw a background-colored rectangle over the source text and write the translation on top. The result is a crude white rectangle that doesn't blend into the image. The correct approach:
Background Color Detection. Sample pixels around bounding box, compute median color — fill rectangle with it. Works for uniform backgrounds (white wall, paper sheet).
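A minimal sketch of the border-sampling idea (Python for illustration; `pixels` as a plain list of RGB rows is a stand-in for a real frame buffer):

```python
from statistics import median

def border_median_color(pixels, box, margin=3):
    """Estimate the background color behind a text box by taking the
    per-channel median of the pixels in a thin ring around it.
    `pixels` is a row-major list of rows of (r, g, b) tuples;
    `box` is (x, y, w, h) in pixel coordinates."""
    x, y, w, h = box
    samples = []
    for row in range(max(0, y - margin), min(len(pixels), y + h + margin)):
        for col in range(max(0, x - margin), min(len(pixels[0]), x + w + margin)):
            inside = x <= col < x + w and y <= row < y + h
            if not inside:  # sample only the ring, never the text itself
                samples.append(pixels[row][col])
    return tuple(int(median(ch)) for ch in zip(*samples))
```

The median (rather than the mean) keeps stray dark pixels from the text's anti-aliased edges from tinting the fill color.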
Texture Inpainting for Complex Backgrounds. A custom Core Image kernel or a lightweight inpainting model (e.g. via Core ML) can fill the region with surrounding texture. Too slow for real time — use it only in static photo mode.
Font Matching. Determine source text size from bounding box, select UIFont / TextPaint with similar size. Identifying specific font from OCR result — unsolved for most cases. Use system sans-serif.
Right-to-Left (RTL) Languages. Arabic and Hebrew flow right-to-left: set semanticContentAttribute = .forceRightToLeft on UILabel (on Android, textDirection = View.TEXT_DIRECTION_RTL on TextView). When drawing onto the image, set NSMutableParagraphStyle.baseWritingDirection = .rightToLeft.
Stabilization and Performance
Running the full pipeline on every frame at 30 FPS is not feasible. Throttle each stage:
- Text detection: every 3–5 frames
- OCR: only on stabilization or tap
- Translation: debounce 500 ms on text change
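The debounce step can be sketched as a small state machine (Python for illustration; the 500 ms delay is the value suggested above, and the explicit `now` parameter exists only to make the logic testable):

```python
import time

class Debouncer:
    """Emit a text exactly once, after it has stopped changing for
    `delay` seconds -- e.g. re-translate only once OCR output settles."""

    def __init__(self, delay=0.5):
        self.delay = delay
        self._last_text = None
        self._last_change = 0.0
        self._fired = False

    def feed(self, text, now=None):
        """Call on every OCR result. Returns the text once per stable
        value, or None while the text is still changing or already sent."""
        now = time.monotonic() if now is None else now
        if text != self._last_text:
            self._last_text = text
            self._last_change = now
            self._fired = False
            return None
        if not self._fired and now - self._last_change >= self.delay:
            self._fired = True
            return text
        return None
```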
On iPhone 12 and later, the Vision pipeline runs hardware-accelerated (GPU / Neural Engine) automatically. On Android, TensorFlow Lite's GPU delegate can accelerate custom models; ML Kit's bundled models manage acceleration internally.
Cache results by OCR text hash: don't translate same text twice in session.
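A session cache keyed by a hash of the recognized text might look like this (Python sketch; `translate_fn` is a placeholder for whichever backend you wire in, and the strip/lowercase normalization is an assumption that helps dedupe jittery OCR output):

```python
import hashlib

class TranslationCache:
    """Memoize translations by a hash of the recognized text, so the
    same sign is never sent to the translator twice in one session."""

    def __init__(self, translate_fn):
        self._translate = translate_fn  # placeholder for the real backend
        self._cache = {}

    def get(self, text):
        # Normalize before hashing so minor OCR jitter hits the same key.
        key = hashlib.sha256(text.strip().lower().encode("utf-8")).hexdigest()
        if key not in self._cache:
            self._cache[key] = self._translate(text)
        return self._cache[key]
```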
What's Included
- Architecture selection: on-device vs cloud, live camera vs photo mode
- OCR + translation pipeline implementation
- UI for language selection (with source language auto-detection)
- Translation text overlay on image
- Offline mode with downloadable language models (ML Kit)
Timeline: basic AR translation for static photos takes 3–5 weeks; real-time live-camera translation with on-device ML and offline mode takes 6–10 weeks. Cost is estimated per project.







