AI 3D model generation from photo in mobile app

NOVASOLUTIONS.TECHNOLOGY is engaged in the development, support and maintenance of iOS, Android, PWA mobile applications. We have extensive experience and expertise in publishing mobile applications in popular markets like Google Play, App Store, Amazon, AppGallery and others.

AI 3D Model Generation from Photo in Mobile Apps

Generating a 3D object from a single photo is one of the most resource-intensive tasks in mobile AI. Classical approaches require dozens of photos (photogrammetry) or special hardware (LiDAR). Neural single-image reconstruction has been practical since 2024, but quality is significantly limited when everything runs on-device.

Architectural Variants

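Fully on-device: a depth-estimation model such as DepthPro (Apple, 2024) produces a depth map that is converted to a point cloud; a mobile-sized single-image reconstruction model in the style of One-2-3-45 is an alternative where one is available. The result is a rough 3D structure, suitable for AR preview but not for professional export.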

Hybrid: the depth map and object segmentation are computed on-device, and full 3D reconstruction runs on a server using Zero123++, One-2-3-45, or TripoSR. The server returns an .obj or .glb file.

LiDAR-augmented: iPhone 12 Pro and later Pro models, and iPad Pro (2020 and later), have a LiDAR scanner. ARKit exposes the real scene mesh through ARMeshAnchor. LiDAR mesh + camera texture + AI texture inpainting gives a good-quality result without a server.
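
The choice between the three variants can be sketched as a small decision function. The names below (ReconstructionPath, choosePath and its parameters) are hypothetical, and the priority order is an assumption read off the trade-offs above rather than a fixed rule.

```swift
/// Hypothetical type mirroring the three architectural variants.
enum ReconstructionPath {
    case lidar      // ARKit scene mesh on LiDAR devices
    case hybrid     // on-device depth + server reconstruction
    case onDevice   // depth map + point cloud only
}

/// Picks a variant: LiDAR when present, server when export quality
/// is needed and reachable, otherwise the rough on-device path.
func choosePath(hasLiDAR: Bool,
                serverReachable: Bool,
                needsExportQuality: Bool) -> ReconstructionPath {
    if hasLiDAR { return .lidar }
    if serverReachable && needsExportQuality { return .hybrid }
    return .onDevice
}
```

On iOS, hasLiDAR could be derived from ARWorldTrackingConfiguration.supportsSceneReconstruction(.mesh).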

On-Device: DepthPro for Initial Depth

Apple DepthPro (2024) is a foundation model for metric depth estimation. Once converted to Core ML, it can be called through the Xcode-generated model class:

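import CoreML

// DepthPro is the class Xcode generates from the converted model; the input
// name "image" and output name "depth" depend on how the model was converted.
let model = try DepthPro(configuration: MLModelConfiguration())

// Input image → pixel buffer sized to the model's image constraint
let constraint = model.model.modelDescription.inputDescriptionsByName["image"]!.imageConstraint!
let inputImage = try MLFeatureValue(cgImage: sourceImage.cgImage!, constraint: constraint, options: nil)
let prediction = try model.prediction(image: inputImage.imageBufferValue!)

// prediction.depth: MLMultiArray with metric depth values (in meters)
let depthArray = prediction.depth  // shape [1, H, W]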

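Depth map → point cloud: for each pixel (x, y) with known depth Z, compute the 3D coordinate via the pinhole camera model: X = (x - cx) * Z / fx, Y = (y - cy) * Z / fy, where (cx, cy) is the principal point and fx, fy are the focal lengths in pixels. The EXIF focal length is given in millimeters, so it has to be converted to pixels first. The result is a point cloud.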
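
The unprojection step can be sketched in plain Swift. The Intrinsics type and both function names are hypothetical helpers for this example. The focal length conversion assumes the EXIF 35 mm-equivalent value and maps the 36 mm full-frame width onto the image width, which is only an approximation for cropped or non-3:2 images.

```swift
/// Camera intrinsics in pixels (hypothetical helper type).
struct Intrinsics {
    var fx: Float, fy: Float   // focal lengths in pixels
    var cx: Float, cy: Float   // principal point in pixels
}

/// Converts an EXIF 35 mm-equivalent focal length to pixels.
/// Assumes the 36 mm frame width corresponds to the full image width.
func focalLengthInPixels(focal35mm: Float, imageWidth: Int) -> Float {
    return focal35mm / 36 * Float(imageWidth)
}

/// Unprojects a row-major depth map (meters) into camera-space points
/// using the pinhole model. Pixels with invalid depth are skipped.
func pointCloud(depth: [Float], width: Int, height: Int,
                intrinsics k: Intrinsics) -> [SIMD3<Float>] {
    precondition(depth.count == width * height, "depth must hold width * height values")
    var points: [SIMD3<Float>] = []
    points.reserveCapacity(depth.count)
    for y in 0..<height {
        for x in 0..<width {
            let z = depth[y * width + x]
            guard z.isFinite, z > 0 else { continue }
            let px = (Float(x) - k.cx) * z / k.fx
            let py = (Float(y) - k.cy) * z / k.fy
            points.append(SIMD3<Float>(px, py, z))
        }
    }
    return points
}
```

The output array has the [SIMD3<Float>] shape that RealityKit position buffers accept.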

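Visualize the point cloud in AR via RealityKit, using a ModelEntity built from a custom MeshDescriptor. MeshDescriptor supports triangle and polygon primitives but has no point primitive, so each point is drawn as a tiny triangle:

// Each point becomes a small single-sided triangle ("splat")
let size: Float = 0.002  // splat size in meters
var positions: [SIMD3<Float>] = []
for p in points {  // points: [SIMD3<Float>]
    positions += [p, p + SIMD3<Float>(size, 0, 0), p + SIMD3<Float>(0, size, 0)]
}

var descriptor = MeshDescriptor(name: "pointCloud")
descriptor.positions = MeshBuffers.Positions(positions)
descriptor.primitives = .triangles(Array(0..<UInt32(positions.count)))

let mesh = try MeshResource.generate(from: [descriptor])
let entity = ModelEntity(mesh: mesh, materials: [UnlitMaterial(color: .white)])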

This is a point cloud rather than a full 3D mesh: it works visually for a demo but needs meshing before export.

Meshing: Poisson or Marching Cubes

Point cloud → polygonal mesh via the Poisson Surface Reconstruction algorithm. On mobile this runs through Open3D (a C++ library, called via an Objective-C++ bridge on iOS) or custom Metal compute shaders. Poisson reconstruction needs a normal at each point; estimate normals from each point's local neighborhood via PCA.

This is non-trivial on mobile: Open3D compiled for iOS/Android adds roughly 15 MB to the binary, requires C++17, and should run on a background thread. The result is an .obj file with the mesh.
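
The PCA normal estimation mentioned above can be illustrated without any library. This is a minimal sketch, not Open3D's implementation: the function name estimateNormal is hypothetical, and it finds the eigenvector with the smallest eigenvalue by power iteration on (trace * I - C), a swapped-in shortcut that avoids a full eigen-decomposition. The sign of the returned normal is arbitrary and would still need to be oriented, for example toward the camera.

```swift
/// Estimates a unit normal for a neighborhood of points via PCA.
/// The normal is the covariance eigenvector with the smallest eigenvalue.
func estimateNormal(neighborhood: [SIMD3<Float>], iterations: Int = 50) -> SIMD3<Float>? {
    guard neighborhood.count >= 3 else { return nil }
    let n = Float(neighborhood.count)

    var centroid = SIMD3<Float>(repeating: 0)
    for p in neighborhood { centroid += p }
    centroid /= n

    // Rows of the symmetric 3x3 covariance matrix C.
    var r0 = SIMD3<Float>(repeating: 0)
    var r1 = SIMD3<Float>(repeating: 0)
    var r2 = SIMD3<Float>(repeating: 0)
    for p in neighborhood {
        let d = p - centroid
        r0 += d * d.x
        r1 += d * d.y
        r2 += d * d.z
    }
    r0 /= n
    r1 /= n
    r2 /= n

    let trace = r0.x + r1.y + r2.z
    guard trace > 0 else { return nil }   // all points coincide

    // The largest eigenvector of (trace * I - C) is the smallest of C.
    // Caveat: fails to converge if the start vector is exactly
    // orthogonal to the true normal.
    var v = SIMD3<Float>(1, 1, 1) / Float(3).squareRoot()
    for _ in 0..<iterations {
        let cv = SIMD3<Float>((r0 * v).sum(), (r1 * v).sum(), (r2 * v).sum())
        let next = trace * v - cv
        let length = (next * next).sum().squareRoot()
        guard length > 0 else { return nil }
        v = next / length
    }
    return v
}
```

In a real pipeline the neighborhood would come from a k-nearest-neighbor query per point.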

LiDAR Path: ARKit ARMeshAnchor

On an iPhone with LiDAR, the most reliable path is ARKit:

let configuration = ARWorldTrackingConfiguration()
// Scene reconstruction is only available on LiDAR devices
if ARWorldTrackingConfiguration.supportsSceneReconstruction(.meshWithClassification) {
    configuration.sceneReconstruction = .meshWithClassification
}

// In ARSession delegate
func session(_ session: ARSession, didUpdate anchors: [ARAnchor]) {
    for anchor in anchors.compactMap({ $0 as? ARMeshAnchor }) {
        let geometry = anchor.geometry
        // geometry.vertices, geometry.faces, geometry.normals: ready mesh
        let obj = exportToOBJ(geometry: geometry, transform: anchor.transform)
        // Store or merge `obj` per anchor as the scan progresses
    }
}

ARMeshAnchor.geometry.vertices is an ARGeometrySource backed by a Metal buffer. Export to .obj:

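func exportToOBJ(geometry: ARMeshGeometry, transform: simd_float4x4) -> String {
    var obj = ""
    let vertices = geometry.vertices
    // Vertices are packed as three Floats, so walk the MTLBuffer with the
    // source's own offset and stride; SIMD3<Float> has a 16-byte stride
    // and would misread the data.
    let base = vertices.buffer.contents().advanced(by: vertices.offset)
    for i in 0..<vertices.count {
        let p = base.advanced(by: i * vertices.stride).assumingMemoryBound(to: Float.self)
        // Convert from anchor-local to world coordinates
        let world = transform * SIMD4<Float>(p[0], p[1], p[2], 1)
        obj += "v \(world.x) \(world.y) \(world.z)\n"
    }
    // Similarly for faces (indices); note that OBJ indices are 1-based
    return obj
}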
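
The face section of the file follows the same pattern. The sketch below is a framework-free version that takes plain arrays, so the OBJ layout is easy to see; makeOBJ is a hypothetical helper, and filling indices from geometry.faces (an ARGeometryElement with its own buffer, bytesPerIndex and indexCountPerPrimitive) is left out here.

```swift
/// Serializes vertices and triangles to Wavefront OBJ text.
/// `indices` holds three 0-based vertex indices per triangle.
func makeOBJ(vertices: [SIMD3<Float>], indices: [UInt32]) -> String {
    precondition(indices.count % 3 == 0, "expected three indices per triangle")
    var lines: [String] = []
    lines.reserveCapacity(vertices.count + indices.count / 3)
    for v in vertices {
        lines.append("v \(v.x) \(v.y) \(v.z)")
    }
    for t in stride(from: 0, to: indices.count, by: 3) {
        // OBJ face indices start at 1, not 0
        lines.append("f \(indices[t] + 1) \(indices[t + 1] + 1) \(indices[t + 2] + 1)")
    }
    return lines.joined(separator: "\n") + "\n"
}
```

When several mesh anchors go into one file, each anchor's face indices also need to be offset by the number of vertices written before it.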

Mesh texturing means projecting the camera frame onto the mesh to obtain UV coordinates for each vertex. This is a separate task; without it the mesh stays gray.
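
The core of that projection is the pinhole model applied per vertex. The following is a minimal sketch with a hypothetical function name; it assumes a computer-vision camera frame (+X right, +Y down, +Z forward) and ignores occlusion. ARKit's camera space looks along -Z with +Y up, so the y and z signs would need flipping when starting from ARKit camera coordinates.

```swift
/// Projects a camera-space point to normalized texture coordinates in [0, 1].
/// Returns nil when the point is behind the camera or outside the frame.
func textureCoordinate(for point: SIMD3<Float>,
                       fx: Float, fy: Float, cx: Float, cy: Float,
                       imageWidth: Float, imageHeight: Float) -> SIMD2<Float>? {
    guard point.z > 0 else { return nil }
    // Pinhole projection to pixels, then normalization by image size
    let u = (fx * point.x / point.z + cx) / imageWidth
    let v = (fy * point.y / point.z + cy) / imageHeight
    guard (0...1).contains(u), (0...1).contains(v) else { return nil }
    return SIMD2<Float>(u, v)
}
```

A full texturing pass would also test visibility, so that vertices hidden behind other geometry do not pick up color from the surface in front of them.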

Server Generation: TripoSR and Zero123++

For high quality without LiDAR, use a server pipeline. TripoSR (Stability AI and Tripo AI, 2024) takes one photo and generates a mesh in roughly 0.5–1 second on an A10-class GPU. Calling the API:

func generateModel(from image: UIImage) async throws -> URL {
    let imageData = image.jpegData(compressionQuality: 0.9)!
    // Unique boundary string separating the multipart sections
    let boundary = "Boundary-\(UUID().uuidString)"
    var request = URLRequest(url: URL(string: "https://api.example.com/triposr")!)
    request.httpMethod = "POST"
    request.setValue("multipart/form-data; boundary=\(boundary)", forHTTPHeaderField: "Content-Type")
    // ... upload + poll
}
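
The request body for such an upload has to follow the multipart/form-data layout. The helper below is a sketch of one file part; makeMultipartBody is a hypothetical name, and the field name the server expects ("image" in the usage note) is an assumption, since it depends on the API.

```swift
import Foundation

/// Builds a multipart/form-data body containing a single file part.
func makeMultipartBody(boundary: String, fieldName: String, fileName: String,
                       mimeType: String, fileData: Data) -> Data {
    var body = Data()
    // Part header: boundary line, disposition, content type, blank line
    let header = "--\(boundary)\r\n"
        + "Content-Disposition: form-data; name=\"\(fieldName)\"; filename=\"\(fileName)\"\r\n"
        + "Content-Type: \(mimeType)\r\n\r\n"
    body.append(Data(header.utf8))
    body.append(fileData)
    // Closing boundary ends the multipart payload
    body.append(Data("\r\n--\(boundary)--\r\n".utf8))
    return body
}
```

The result of makeMultipartBody(boundary: boundary, fieldName: "image", fileName: "photo.jpg", mimeType: "image/jpeg", fileData: imageData) can be assigned to request.httpBody or passed to URLSession's upload(for:from:).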

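The result is a .glb file. RealityKit loads USD-family files (.usdz, .usd, .reality) rather than glTF, so on iOS either have the server return .usdz as well or convert the .glb first, then load the model with Entity.loadModel(contentsOf:). Displaying a .glb directly requires a third-party glTF loader.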

AR Preview of Result

Every variant ends the same way: show the 3D object in AR via RealityKit (ARView) or SceneKit (ARSCNView). The user can place the object on a real surface, rotate it, and scale it. This covers scenarios such as "see how furniture looks in the room" or "show a product in AR".

Export: .usdz for iOS (native Apple format, supports AR Quick Look), .glb for Android and web.

Process

Choose the architecture for the task (LiDAR, on-device depth, or server), then implement the capture and processing pipeline, the AR preview, and export to the required formats. Separately, test on difficult objects: glass surfaces, thin details, and uniformly colored surfaces.

Timeline Estimates

LiDAR-based scanning with iOS export takes 3–5 weeks. Full pipeline with on-device depth + server reconstruction + AR preview on both platforms requires 8–14 weeks.