AI Image Generation (Stable Diffusion) for Mobile App

NOVASOLUTIONS.TECHNOLOGY develops, supports, and maintains iOS, Android, and PWA mobile applications. We have extensive experience publishing mobile applications in popular markets such as Google Play, the App Store, Amazon, AppGallery, and others.
Development and support of all types of mobile applications:
Information and entertainment mobile applications
News apps, games, reference guides, online catalogs, weather apps, fitness and health apps, travel apps, educational apps, social networks and messengers, quizzes, blogs and podcasts, forums, aggregators
E-commerce mobile applications
Online stores, B2B apps, marketplaces, online exchanges, cashback services, dropshipping platforms, loyalty programs, food and goods delivery, payment systems.
Business process management mobile applications
CRM systems, ERP systems, project management, sales team tools, financial management, production management, logistics and delivery management, HR management, data monitoring systems
Electronic services mobile applications
Classified ads platforms, online schools, online cinemas, electronic service platforms, cashback platforms, video hosting, thematic portals, online booking and scheduling platforms, online trading platforms

These are just some of the types of mobile applications we work with; each may have its own specific features and functionality, tailored to the client's needs and goals.


Implementing AI Image Generation (Stable Diffusion) in a Mobile App

Stable Diffusion offers more control than DALL-E: negative prompts, ControlNet, LoRA, step tuning, CFG scale, and a choice between SDXL and SD 1.5. But this adds complexity: you must choose a provider (or self-host), understand the parameters that directly impact quality, and properly organize an async pipeline, since generation takes 10–30 seconds.

Integration options

Replicate — cloud inference via REST API. Supports SDXL, SD 1.5, and many LoRAs. Async model: POST → get a prediction_id → poll or receive a webhook with the result.

FAL.ai — lower latency than Replicate; offers both sync and async modes; supports SDXL, SD3, Flux.

Stability AI API — the official provider; reliable but pricier.

Self-hosting — ComfyUI or AUTOMATIC1111 on a GPU server. Maximum control, no vendor lock-in, economical at scale.

For a mobile app with moderate load, Replicate or FAL is the pragmatic choice: no infrastructure costs.
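
If you do self-host, ComfyUI exposes an HTTP API: POST a workflow graph to /prompt and it returns a prompt_id you can later look up via /history. A minimal sketch of building that request (the host and the workflow payload are placeholders; real workflows are exported as JSON from the ComfyUI editor, and the helper name is ours):

```swift
import Foundation

/// Builds the request that queues a workflow on a self-hosted ComfyUI server.
/// `workflow` is the node graph exported from the ComfyUI editor; 8188 is
/// ComfyUI's default port.
func makeComfyUIRequest(host: String, workflow: [String: Any]) throws -> URLRequest {
    var request = URLRequest(url: URL(string: "http://\(host):8188/prompt")!)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    // ComfyUI expects the graph under the "prompt" key and replies with a
    // prompt_id that you can check via /history/{prompt_id}.
    request.httpBody = try JSONSerialization.data(withJSONObject: ["prompt": workflow])
    return request
}
```

In practice this request would come from your backend rather than the app itself, so the GPU server never has to be exposed to mobile clients directly.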

Replicate integration (SDXL)

Replicate uses an async model: first create a prediction, then poll for its status:

import Foundation

class ReplicateSDXLService {
    private let baseURL = "https://api.replicate.com/v1"
    private let modelVersion = "7762fd07cf82c948538e41f63f77d685e02b063e0ccecb39397596b78813f88f" // SDXL
    private let apiKey: String

    init(apiKey: String) {
        self.apiKey = apiKey
    }

    struct Prediction: Decodable {
        let id: String
    }

    struct PredictionStatus: Decodable {
        let status: String
        let output: [String]?
        let error: String?
    }

    enum SDError: Error {
        case generationFailed(String)
        case invalidOutput
        case timeout
    }

    func generate(prompt: String, negativePrompt: String = "", steps: Int = 30) async throws -> URL {
        // 1. Create prediction
        let createBody: [String: Any] = [
            "version": modelVersion,
            "input": [
                "prompt": prompt,
                "negative_prompt": negativePrompt,
                "num_inference_steps": steps,
                "guidance_scale": 7.5,
                "width": 1024,
                "height": 1024
            ]
        ]

        var createRequest = URLRequest(url: URL(string: "\(baseURL)/predictions")!)
        createRequest.httpMethod = "POST"
        createRequest.setValue("Token \(apiKey)", forHTTPHeaderField: "Authorization")
        createRequest.setValue("application/json", forHTTPHeaderField: "Content-Type")
        createRequest.httpBody = try JSONSerialization.data(withJSONObject: createBody)

        let (createData, _) = try await URLSession.shared.data(for: createRequest)
        let prediction = try JSONDecoder().decode(Prediction.self, from: createData)

        // 2. Poll until complete
        return try await pollUntilComplete(predictionId: prediction.id)
    }

    private func pollUntilComplete(predictionId: String) async throws -> URL {
        var attempts = 0
        while attempts < 60 {
            try await Task.sleep(nanoseconds: 2_000_000_000) // 2 seconds
            let statusURL = URL(string: "\(baseURL)/predictions/\(predictionId)")!
            var request = URLRequest(url: statusURL)
            request.setValue("Token \(apiKey)", forHTTPHeaderField: "Authorization")

            let (data, _) = try await URLSession.shared.data(for: request)
            let status = try JSONDecoder().decode(PredictionStatus.self, from: data)

            switch status.status {
            case "succeeded":
                guard let first = status.output?.first, let url = URL(string: first) else {
                    throw SDError.invalidOutput
                }
                return url
            case "failed", "canceled":
                throw SDError.generationFailed(status.error ?? "Unknown error")
            default:
                attempts += 1 // "starting" / "processing" — keep waiting
            }
        }
        throw SDError.timeout
    }
}

Instead of polling, you can use webhooks ("webhook": "https://your-backend.com/webhook"), but for a mobile app, polling at 2-second intervals is simpler.
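
If you do take the webhook route, the create-prediction body just gains two fields. A sketch (the helper name and the backend URL are placeholders):

```swift
import Foundation

/// Builds a Replicate create-prediction body that requests a webhook
/// callback instead of polling. `webhookURL` points at your own backend.
func makeWebhookBody(version: String, prompt: String, webhookURL: String) -> [String: Any] {
    [
        "version": version,
        "input": ["prompt": prompt],
        "webhook": webhookURL,                  // Replicate POSTs status updates here
        "webhook_events_filter": ["completed"]  // only notify on terminal states
    ]
}
```

Your backend then pushes the result to the device (for example via a push notification), which saves the app from keeping a polling loop alive in the background.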

Parameters that actually impact results

num_inference_steps — number of diffusion steps. 20–30 for production (speed/quality balance). 50+ shows no noticeable improvement, just slower.

guidance_scale (CFG scale) — how strictly to follow the prompt. 7–8 for realistic images, 10–12 for stylized. >15 produces artifacts.

negative_prompt — what to exclude. Standard set: "blurry, low quality, distorted, deformed, ugly, duplicate, watermark". Not magic, but works.

For portraits: "((best quality)), detailed face, sharp focus" in positive + "bad anatomy, distorted face, extra fingers, mutation" in negative.
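
These recommendations can be captured as presets, so the rest of the app never touches raw numbers. A sketch (the struct and its values are ours, derived from the ranges above):

```swift
import Foundation

/// Parameter presets mapping the tuning advice above (steps, CFG scale,
/// negative prompt) to ready-to-send Replicate input dictionaries.
struct SDPreset {
    let steps: Int
    let guidanceScale: Double
    let negativePrompt: String

    static let realistic = SDPreset(
        steps: 30, guidanceScale: 7.5,
        negativePrompt: "blurry, low quality, distorted, deformed, ugly, duplicate, watermark")

    static let portrait = SDPreset(
        steps: 30, guidanceScale: 7.0,
        negativePrompt: "bad anatomy, distorted face, extra fingers, mutation")

    /// Assembles the "input" payload for a generation request.
    func input(prompt: String) -> [String: Any] {
        [
            "prompt": prompt,
            "negative_prompt": negativePrompt,
            "num_inference_steps": steps,
            "guidance_scale": guidanceScale
        ]
    }
}
```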

ControlNet for structure/pose-guided generation

ControlNet lets you specify image structure: body pose (OpenPose), edges (Canny), depth. This is a key difference from DALL-E:

let controlNetBody: [String: Any] = [
    "version": "...", // ControlNet SDXL version
    "input": [
        "prompt": prompt,
        "image": base64EncodedPoseImage, // OpenPose skeleton
        "controlnet_conditioning_scale": 0.8,
        "control_mode": "balanced"
    ]
]

User takes a photo or selects a pose → send it as control image → model generates character in that pose. Popular in fashion, fitness, avatar apps.
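
Replicate accepts image inputs either as public URLs or as data URIs, so a pose image captured on-device can be embedded directly in the request. A minimal sketch (the helper name is ours):

```swift
import Foundation

/// Wraps PNG bytes in a data URI, which Replicate accepts anywhere an
/// image input (such as the ControlNet pose image) is expected.
func pngDataURI(_ data: Data) -> String {
    "data:image/png;base64," + data.base64EncodedString()
}
```

On iOS you would pass `uiImage.pngData()` here; keep the control image modest in size (e.g. 1024 px on the long side), since base64 inflates the payload by roughly a third.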

On-device: Core ML and ONNX

Distilled, few-step variants (SDXL-Turbo, LCM) generate in 4–8 steps and run on an iPhone 15 Pro via Core ML in roughly 10–15 seconds. Apple publishes converted Core ML Stable Diffusion models on Hugging Face.

// Core ML SD via Apple's ml-stable-diffusion Swift package
// (API shown matches its earlier releases; check the repo for the current signature)
let pipeline = try StableDiffusionPipeline(
    resourcesAt: modelURL,
    controlNet: [],
    configuration: config
)
let images = try pipeline.generateImages(
    prompt: prompt,
    imageCount: 1,
    stepCount: 4, // SDXL-Turbo: 4 steps sufficient
    seed: 42
)

On Android — ONNX Runtime with mobile-optimized SD models (~400 MB). Expect 20–40 seconds on an average 2024 device, so on-device generation is realistic only for offline scenarios.

Timeline

A Replicate SDXL integration with a basic UI (prompt + result) takes 3–5 days. Adding ControlNet, LoRA selection, parameter controls (CFG, steps), generation history, and sharing takes 2–3 weeks.