Camera Document Scanning in Mobile App

NOVASOLUTIONS.TECHNOLOGY is engaged in the development, support and maintenance of iOS, Android, PWA mobile applications. We have extensive experience and expertise in publishing mobile applications in popular markets like Google Play, App Store, Amazon, AppGallery and others.
Development and support of all types of mobile applications:
Information and entertainment mobile applications
News apps, games, reference guides, online catalogs, weather apps, fitness and health apps, travel apps, educational apps, social networks and messengers, quizzes, blogs and podcasts, forums, aggregators
E-commerce mobile applications
Online stores, B2B apps, marketplaces, online exchanges, cashback services, exchanges, dropshipping platforms, loyalty programs, food and goods delivery, payment systems.
Business process management mobile applications
CRM systems, ERP systems, project management, sales team tools, financial management, production management, logistics and delivery management, HR management, data monitoring systems
Electronic services mobile applications
Classified ads platforms, online schools, online cinemas, electronic service platforms, cashback platforms, video hosting, thematic portals, online booking and scheduling platforms, online trading platforms

These are just some of the types of mobile applications we work with, and each of them may have its own specific features and functionality, tailored to the specific needs and goals of the client.

Showing 1 of 1 servicesAll 1735 services
Camera Document Scanning in Mobile App
Medium
~3-5 business days
FAQ
Our competencies:
Development stages
Latest works
  • image_mobile-applications_feedme_467_0.webp
    Development of a mobile application for FEEDME
    756
  • image_mobile-applications_xoomer_471_0.webp
    Development of a mobile application for XOOMER
    624
  • image_mobile-applications_rhl_428_0.webp
    Development of a mobile application for RHL
    1054
  • image_mobile-applications_zippy_411_0.webp
    Development of a mobile application for ZIPPY
    947
  • image_mobile-applications_affhome_429_0.webp
    Development of a mobile application for Affhome
    862
  • image_mobile-applications_flavors_409_0.webp
    Development of a mobile application for the FLAVORS company
    445

Implementing Document Scanning via Mobile App Camera

Users hold their phone over a contract, the app automatically detects the edges, corrects perspective, and delivers a clean PDF. This isn't just "take a photo and crop"—it involves edge detection, homographic transformation, and post-processing. Each step has failure points.

Common Breaking Points

Edge Detection Fails on Glare and Shadows

VNDetectRectanglesRequest (iOS Vision) returns VNRectangleObservation with four corner points in normalized coordinates. The problem: on glossy paper under direct light, the algorithm confuses glare with the paper edge. Solution—before detection, apply CIFilter with CIColorControls (reduce inputSaturation) and CIHighlightShadowAdjust. This removes glare as a color artifact.

On Android, Vision API (com.google.android.gms:play-services-mlkit-document-scanner) handles shadows better but requires Google Play Services. Alternative without GMS dependency—OpenCV findContours + approxPolyDP with filtering by area and aspect ratio. Threshold minArea = 30% of frame area removes background objects.

Perspective Correction

After obtaining four points, apply perspective transform. iOS: CIPerspectiveCorrection with explicit inputTopLeft, inputTopRight, inputBottomLeft, inputBottomRight in image coordinates (not preview coordinates). Common mistake—using preview layer coordinates directly without conversion via VNImagePointForNormalizedPoint.

Android: getPerspectiveTransform + warpPerspective from OpenCV, or matrix transformation via android.graphics.Matrix.setPolyToPoly. The second option works without OpenCV but is limited to affine transformations—not suitable for heavy perspective distortion.

Flutter: use cunning_document_scanner package (wrapper over native SDKs) or implement manually via image + manual homography calculation in Dart. The latter is tedious; native channel is preferable.

Post-Processing: Readability Over Aesthetics

After straightening, process the document for readability when printing or OCR:

  • Adaptive binarizationcv::adaptiveThreshold with Gaussian method outperforms Otsu on documents with uneven lighting
  • Deskew—if the document is rotated 1–2° after transformation, Hough Lines detect text line inclination and correct it
  • SharpnessCISharpenLuminance (iOS) or Sharpness filter (Android) with moderate value (0.4–0.6), not excessive

Offer color modes to the user: "Auto", "Document" (black and white), "Photo" (full color). In "Document" mode—apply binarization. In "Auto"—analyze histogram: if the document contains <5% saturated pixels, apply monochrome processing.

Multi-Page Scanning and PDF

Collect UIImage[] / Bitmap[], export via PDFKit (iOS 11+) or android.graphics.pdf.PdfDocument. On Flutter—use pdf package (pub.dev). Optimize PDF size: JPEG compression at 85% is sufficient for readability, with an A4 page taking ~150–250 KB instead of 2–4 MB for PNG.

Real-time preview: display the outline over AVCaptureVideoPreviewLayer / PreviewView via CAShapeLayer / SurfaceView. Update the outline every 3–5 frames (not every frame!)—otherwise the detector consumes CPU and preview stutters.

Our Workflow

Requirements audit → SDK selection (native Vision/ML Kit vs OpenCV) → preview integration with overlay → correction + post-processing → PDF export → testing on 10+ document types (passport, contract, receipt, book spread).

Timeline: 3–5 working days depending on platform and quality requirements. If OCR recognition integration is needed—add 2–3 days.