AI Virtual Clothing Try-On in Mobile Apps

Virtual try-on overlays clothing on a photo or a live camera feed. It sounds simple, but technically it involves body segmentation, pose estimation, deforming the clothing texture to fit the person's anatomy, and realistic lighting. None of these parts is trivial on its own.

Two Modes: Photo and Real-Time AR

Photo try-on: the user uploads a photo, selects a garment, and gets a result within seconds. Quality is higher and the models are heavier, so server-side inference is justified.

Real-time AR: the camera runs live and the clothing is rendered directly on the preview. The budget is a strict 30+ fps, so only lightweight on-device models or a mesh-based approach driven by pose estimation are practical.
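
To make the 30 fps constraint concrete: each frame has roughly 33 ms for everything, including pose estimation and rendering. The sketch below only does that arithmetic. The stage names and timings are hypothetical placeholders, not measurements.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at the given frame rate."""
    return 1000.0 / fps


def fits_frame_budget(stage_ms: dict, fps: float = 30.0) -> bool:
    """Return True if the summed per-frame stage timings fit the budget."""
    return sum(stage_ms.values()) <= frame_budget_ms(fps)


# Hypothetical per-frame timings in milliseconds (assumed, not measured).
light_pipeline = {"pose": 12.0, "mesh_update": 3.0, "render": 8.0}
heavy_pipeline = {"pose": 12.0, "gan_synthesis": 1500.0, "render": 8.0}

print(round(frame_budget_ms(30), 1))      # 33.3
print(fits_frame_budget(light_pipeline))  # True
print(fits_frame_budget(heavy_pipeline))  # False
```

A synthesis step that takes seconds cannot fit a per-frame budget, which is why the heavier models belong to the photo mode.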

Photo Try-On Architecture

The pipeline has four stages:

  1. Human parsing: segment the person into body and clothing regions (upper, lower, sleeves, collar, background). Models: Self-Correction Human Parsing (SCHP), CDGNet.
  2. Pose estimation: 17–33 body keypoints, depending on the model. MediaPipe Pose on-device, OpenPose on the server.
  3. Warping: deform the clothing image to match the pose and body shape, using TPS (Thin Plate Spline) warping based on keypoint correspondences.
  4. Try-on synthesis: final generation that accounts for shadows, wrinkles, and lighting. Models: VITON-HD, HR-VITON, LaDI-VTON.

In a mobile app, stages 1 and 2 run on-device (MediaPipe for pose, a converted parsing model for segmentation), and stages 3 and 4 run on the server.
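
The warping stage can be illustrated with a minimal thin plate spline fit. This is a sketch in pure Python for readability, not the implementation used by any of the models above: it fits a TPS that maps garment control points onto body keypoints and then warps a single point. The function names and the sample coordinates are assumptions made for the example; a real pipeline would vectorize this and apply it to every pixel of the garment image.

```python
import math


def tps_kernel(r2: float) -> float:
    """TPS radial basis U(r) = r^2 * log(r^2), with U(0) defined as 0."""
    return r2 * math.log(r2) if r2 > 0.0 else 0.0


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting.

    a is n x n and b is n x k, both as lists of lists; returns x as n x k.
    """
    n = len(a)
    m = [ra[:] + rb[:] for ra, rb in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: control points may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, len(m[r])):
                m[r][c] -= f * m[col][c]
    k = len(m[0]) - n
    x = [[0.0] * k for _ in range(n)]
    for r in range(n - 1, -1, -1):  # back substitution
        for j in range(k):
            acc = m[r][n + j] - sum(m[r][c] * x[c][j] for c in range(r + 1, n))
            x[r][j] = acc / m[r][r]
    return x


def fit_tps(src, dst):
    """Fit a 2D TPS mapping src control points onto dst control points.

    Returns (n + 3) x 2 parameters: n kernel weights, then the affine part.
    """
    n = len(src)
    size = n + 3
    a = [[0.0] * size for _ in range(size)]
    b = [[0.0, 0.0] for _ in range(size)]
    for i, (xi, yi) in enumerate(src):
        for j, (xj, yj) in enumerate(src):
            a[i][j] = tps_kernel((xi - xj) ** 2 + (yi - yj) ** 2)
        a[i][n:] = [1.0, xi, yi]
        a[n][i], a[n + 1][i], a[n + 2][i] = 1.0, xi, yi
        b[i] = [dst[i][0], dst[i][1]]
    return solve_linear(a, b)


def warp_point(params, src, point):
    """Apply a fitted TPS to one (x, y) point."""
    n = len(src)
    x, y = point
    out = []
    for k in range(2):
        v = params[n][k] + params[n + 1][k] * x + params[n + 2][k] * y
        for i, (xi, yi) in enumerate(src):
            v += params[i][k] * tps_kernel((x - xi) ** 2 + (y - yi) ** 2)
        out.append(v)
    return out[0], out[1]


# Hypothetical correspondences in normalized coordinates:
# points on the flat garment image -> matching points on the body.
garment_pts = [(0.2, 0.1), (0.8, 0.1), (0.25, 0.9), (0.75, 0.9), (0.5, 0.5)]
body_pts = [(0.3, 0.25), (0.7, 0.27), (0.35, 0.7), (0.68, 0.72), (0.5, 0.48)]

params = fit_tps(garment_pts, body_pts)
print(warp_point(params, garment_pts, (0.5, 0.3)))  # a garment point between the controls
```

The spline passes exactly through the control points and interpolates smoothly between them, which is what makes it suitable for stretching a flat garment image over a posed torso.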

On-Device: Pose and Parsing via MediaPipe

import MediaPipeTasksVision
import UIKit

// MediaPipe Pose Landmarker, configured for still images
guard let modelPath = Bundle.main.path(forResource: "pose_landmarker_full", ofType: "task") else {
    fatalError("pose_landmarker_full.task is missing from the app bundle")
}
let options = PoseLandmarkerOptions()
options.baseOptions.modelAssetPath = modelPath
options.runningMode = .image
options.numPoses = 1
options.minPoseDetectionConfidence = 0.5
options.minPosePresenceConfidence = 0.5
options.minTrackingConfidence = 0.5

let poseLandmarker = try PoseLandmarker(options: options)

// From a photo (sourcePhoto is a UIImage supplied by the caller)
let mpImage = try MPImage(uiImage: sourcePhoto)
let result = try poseLandmarker.detect(image: mpImage)

// result.landmarks[0]: array of 33 NormalizedLandmark for the first detected person
// Key points: LEFT_SHOULDER (11), RIGHT_SHOULDER (12), LEFT_HIP (23), RIGHT_HIP (24)
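
The landmarks come back normalized to the 0..1 range, so they have to be scaled to pixel coordinates before they can drive warping. A minimal sketch of that step, written in Python to keep it platform-neutral; the helper names and the synthetic landmark values are assumptions for illustration.

```python
from typing import List, Tuple

# MediaPipe Pose landmark indices used for the torso.
LEFT_SHOULDER, RIGHT_SHOULDER, LEFT_HIP, RIGHT_HIP = 11, 12, 23, 24


def to_pixels(landmark: Tuple[float, float], width: int, height: int) -> Tuple[float, float]:
    """Convert a normalized (x, y) landmark in [0, 1] to pixel coordinates."""
    return landmark[0] * width, landmark[1] * height


def torso_box(landmarks: List[Tuple[float, float]], width: int, height: int):
    """Axis-aligned box (left, top, right, bottom) around the four torso keypoints."""
    pts = [to_pixels(landmarks[i], width, height)
           for i in (LEFT_SHOULDER, RIGHT_SHOULDER, LEFT_HIP, RIGHT_HIP)]
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return min(xs), min(ys), max(xs), max(ys)


# Synthetic example: 33 landmarks, only the four torso points are meaningful.
landmarks = [(0.0, 0.0)] * 33
landmarks[LEFT_SHOULDER] = (0.60, 0.30)
landmarks[RIGHT_SHOULDER] = (0.40, 0.30)
landmarks[LEFT_HIP] = (0.57, 0.60)
landmarks[RIGHT_HIP] = (0.43, 0.60)

box = torso_box(landmarks, width=1000, height=2000)
print(box)
```

The resulting box gives a rough placement region for an upper-body garment before any finer warping is applied.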

Human parsing runs on-device through an SCHP model converted to Core ML or TFLite. The model is about 15 MB after quantization, and inference takes 300–500 ms for a 512×512 image on an iPhone 13.

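// Android: human parsing via TFLite
import org.tensorflow.lite.Interpreter
import org.tensorflow.lite.gpu.GpuDelegate
import org.tensorflow.lite.support.common.FileUtil

val interpreter = Interpreter(
    FileUtil.loadMappedFile(context, "schp_parsing.tflite"),
    Interpreter.Options().apply { addDelegate(GpuDelegate()) }
)

// Input: one 512×512 RGB image as floats; fill it with normalized pixel values before run()
val input = Array(1) { Array(512) { Array(512) { FloatArray(3) } } }
// Output: per-pixel scores for 20 classes
val output = Array(1) { Array(512) { Array(512) { FloatArray(20) } } }

interpreter.run(input, output)
// output[0][y][x]: score vector over the classes (upper-body, lower-body, etc.);
// the index of the largest value is the pixel's label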
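
The raw output is a score vector per pixel, so one more step is needed to get a usable mask: take the highest-scoring class at each pixel, then select the class of interest. A minimal sketch on a tiny synthetic score map; the class index for upper-body clothing is an assumption based on the LIP label order and must be checked against the converted model.

```python
from typing import List

UPPER_CLOTHES = 5  # Assumed index (LIP label order); verify for your model.


def argmax_labels(scores: List[List[List[float]]]) -> List[List[int]]:
    """Collapse an H x W x C score map into an H x W map of class indices."""
    return [[max(range(len(px)), key=px.__getitem__) for px in row] for row in scores]


def class_mask(labels: List[List[int]], class_id: int) -> List[List[int]]:
    """Binary mask (1 or 0) of the pixels assigned to class_id."""
    return [[1 if lab == class_id else 0 for lab in row] for row in labels]


def one_hot(index: int, num_classes: int = 20) -> List[float]:
    """Synthetic score vector with all weight on one class (test data only)."""
    return [1.0 if j == index else 0.0 for j in range(num_classes)]


# A 2 x 2 "image": background, upper clothes, upper clothes, class 9.
scores = [[one_hot(0), one_hot(5)],
          [one_hot(5), one_hot(9)]]

labels = argmax_labels(scores)
mask = class_mask(labels, UPPER_CLOTHES)
print(labels)  # [[0, 5], [5, 9]]
print(mask)    # [[0, 1], [1, 0]]
```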

Server Try-On: HR-VITON

HR-VITON is an established high-resolution model for photo try-on and works at up to 1024×768. Its inputs are a person photo, a clothing photo (on a white background or with a mask), a human parsing mask, and the pose.
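
User photos rarely arrive at the model's resolution, so they need to be scaled and padded first. The sketch below computes letterbox parameters that fit an image into the target size without distortion. It assumes that 1024×768 means height × width (portrait); the function name and return layout are illustrative.

```python
def letterbox_params(src_w: int, src_h: int, dst_w: int = 768, dst_h: int = 1024):
    """Scale and padding that fit a source image inside the target size.

    Returns (scale, new_w, new_h, pad_left, pad_top). The aspect ratio is
    preserved; the remaining space is split evenly as padding.
    """
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w = round(src_w * scale)
    new_h = round(src_h * scale)
    pad_left = (dst_w - new_w) // 2
    pad_top = (dst_h - new_h) // 2
    return scale, new_w, new_h, pad_left, pad_top


# A 3:4 phone photo fills the target exactly.
print(letterbox_params(3024, 4032)[1:])  # (768, 1024, 0, 0)

# A 9:16 photo is narrower and gets horizontal padding.
print(letterbox_params(1080, 1920)[1:])  # (576, 1024, 96, 0)
```

The same scale and padding have to be applied to the parsing mask and the keypoints so that all inputs stay aligned.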

Server-side API (FastAPI + PyTorch):

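from fastapi import FastAPI, UploadFile
from fastapi.responses import Response

app = FastAPI()

# load_image, run_human_parsing, run_pose_estimation, hrviton_model and
# image_to_bytes are project helpers that are not shown here.

@app.post("/tryon")
async def virtual_tryon(
    person_image: UploadFile,
    clothing_image: UploadFile
):
    person = load_image(await person_image.read())
    clothing = load_image(await clothing_image.read())

    # Parsing and pose: use pre-computed values or compute them here
    parse_map = run_human_parsing(person)
    keypoints = run_pose_estimation(person)

    # HR-VITON inference
    result = hrviton_model(person, clothing, parse_map, keypoints)

    # image_to_bytes is expected to return encoded JPEG bytes,
    # so a plain Response is enough (no streaming needed)
    return Response(content=image_to_bytes(result), media_type="image/jpeg")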

Generation takes 1.5–3 seconds on an A10 GPU and 15–30 seconds on a CPU (suitable for testing only).
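
These latencies translate directly into capacity. The following back-of-the-envelope sketch assumes each GPU handles one request at a time with no batching; the peak load of 100 requests per minute is a hypothetical figure.

```python
import math


def requests_per_minute(seconds_per_request: float, gpus: int = 1) -> float:
    """Throughput when each GPU processes requests strictly one after another."""
    return 60.0 / seconds_per_request * gpus


def gpus_needed(peak_rpm: float, seconds_per_request: float) -> int:
    """Smallest number of GPUs that sustains the given peak load."""
    return math.ceil(peak_rpm * seconds_per_request / 60.0)


print(requests_per_minute(3.0))  # 20.0 (slow end of the range)
print(requests_per_minute(1.5))  # 40.0 (fast end of the range)
print(gpus_needed(100, 3.0))     # 5
print(gpus_needed(100, 1.5))     # 3
```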

Real-Time AR: Mesh-Based Approach

For real-time use without a heavy GAN, a simplified mesh-warping approach works:

  1. Run MediaPipe Pose in real time (30+ fps on-device).
  2. Build a 2D body mesh from the keypoints (triangles via Delaunay triangulation).
  3. Deform the clothing texture over the mesh via Metal.

// Metal vertex shader for clothing warping
#include <metal_stdlib>
using namespace metal;

struct VertexOut {
    float4 position [[position]];
    float2 texCoord;
};

vertex VertexOut clothingWarpVertex(
    uint vid [[vertex_id]],
    constant float2 *clothingUVs [[buffer(0)]],  // UV coords on source clothing
    constant float2 *bodyPositions [[buffer(1)]] // Clip-space positions derived from pose landmarks
) {
    VertexOut out;
    out.position = float4(bodyPositions[vid], 0, 1);
    out.texCoord = clothingUVs[vid];
    return out;
}
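
The shader writes the body positions straight into the output position, so they must already be in clip space (-1..1, y up), while pose landmarks are normalized image coordinates (0..1, y down). A minimal sketch of that conversion, shown in Python for brevity. Instead of a full Delaunay mesh it builds the simplest possible case, a single torso quad made of two triangles. It assumes an unmirrored image of a person facing the camera, so the person's right shoulder is on the image's left side.

```python
def to_ndc(x: float, y: float):
    """Map normalized image coords (origin top-left, y down) to clip space
    (origin at the center, y up)."""
    return 2.0 * x - 1.0, 1.0 - 2.0 * y


def torso_quad(l_shoulder, r_shoulder, l_hip, r_hip):
    """Vertex data for a two-triangle quad stretched over the torso.

    Returns (positions, uvs, indices). Vertex order is image top-left,
    top-right, bottom-left, bottom-right; UV (0, 0) is the texture's
    top-left corner.
    """
    positions = [to_ndc(*p) for p in (r_shoulder, l_shoulder, r_hip, l_hip)]
    uvs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    indices = [0, 1, 2, 2, 1, 3]
    return positions, uvs, indices


positions, uvs, indices = torso_quad(
    l_shoulder=(0.60, 0.30), r_shoulder=(0.40, 0.30),
    l_hip=(0.57, 0.60), r_hip=(0.43, 0.60),
)
print(positions)
print(to_ndc(0.5, 0.5))  # (0.0, 0.0), the image center
```

The positions and UVs correspond to the two buffers the vertex shader reads; a denser mesh only adds more vertices and triangles to the same layout.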

Quality is significantly lower than with the GAN approach: there are no realistic wrinkles or shadows, and body volume is not taken into account. In return, it runs at 30 fps even on an iPhone 11.

Clothing Catalog Management

Each catalog item needs dedicated preparation: a photo on a white background, a silhouette mask, and a category (upper, lower, dress). The content pipeline is: upload → automatic segmentation via RemBG → mask validation → CDN storage.
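
The mask validation step can start as a simple coverage check that catches the most common segmentation failures. This is a minimal sketch; the thresholds and the function name are hypothetical and should be tuned on the actual catalog.

```python
def validate_mask(mask, min_coverage: float = 0.05, max_coverage: float = 0.90):
    """Check a binary garment mask (H x W lists of 0/1).

    Returns (ok, reason). Thresholds are illustrative defaults.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    if height == 0 or width == 0:
        return False, "empty mask"
    filled = sum(sum(row) for row in mask)
    coverage = filled / (height * width)
    if coverage < min_coverage:
        return False, "garment too small or segmentation failed"
    if coverage > max_coverage:
        return False, "background was not removed"
    return True, "ok"


# 10 x 10 test masks: empty, full, and a 4 x 4 garment in the middle.
empty = [[0] * 10 for _ in range(10)]
full = [[1] * 10 for _ in range(10)]
centered = [[1 if 3 <= x < 7 and 3 <= y < 7 else 0 for x in range(10)] for y in range(10)]

print(validate_mask(empty))     # (False, 'garment too small or segmentation failed')
print(validate_mask(full))      # (False, 'background was not removed')
print(validate_mask(centered))  # (True, 'ok')
```

Items that fail the check can be routed to manual review instead of being published to the CDN.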

The mobile app loads only preview images (compressed JPEG); the full try-on data is handled by the server.

Process

The work covers auditing the clothing catalog and quality requirements, choosing the architecture (photo vs. AR), configuring the server pipeline with HR-VITON or an alternative, building the on-device components (pose, parsing), designing the try-on flow UI, and caching results.
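
Result caching works because the same person photo combined with the same garment gives the same output. A minimal sketch of a cache key and lookup, assuming an in-memory dictionary as the store; the function names, the garment ID format, and the model version string are hypothetical.

```python
import hashlib
from typing import Callable, Dict


def tryon_cache_key(person_image: bytes, garment_id: str, model_version: str) -> str:
    """Deterministic key for one (photo, garment, model) combination."""
    h = hashlib.sha256()
    h.update(model_version.encode("utf-8"))
    h.update(b"\x00")  # separator so adjacent fields cannot run together
    h.update(garment_id.encode("utf-8"))
    h.update(b"\x00")
    h.update(person_image)
    return h.hexdigest()


def get_or_compute(cache: Dict[str, bytes], key: str, compute: Callable[[], bytes]) -> bytes:
    """Return the cached result, running the expensive step only on a miss."""
    if key not in cache:
        cache[key] = compute()
    return cache[key]


cache: Dict[str, bytes] = {}
calls = []


def fake_inference() -> bytes:
    """Stand-in for the real try-on call; records how often it runs."""
    calls.append(1)
    return b"jpeg-bytes"


key = tryon_cache_key(b"photo-bytes", "shirt-001", "model-v1")
get_or_compute(cache, key, fake_inference)
get_or_compute(cache, key, fake_inference)
print(len(calls))  # 1, the second request was served from the cache
```

Including the model version in the key means cached results are invalidated automatically when the model is replaced.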

Timeline Estimates

Photo try-on with server inference on one platform takes 4–6 weeks. A full implementation with real-time AR, both platforms, and the catalog pipeline requires 10–16 weeks.