AI static photo animation in mobile app


AI Photo Animation in Mobile Apps

"Bringing a static photo to life" means synthesizing motion where none exists: eyes that blink, a head that turns slightly, hair swaying in the wind. This is a generative-model task, and as of 2024 implementing it fully on-device remains non-trivial.

Two Architectural Approaches

Server inference: the model lives on the backend. The app uploads a photo and receives a video. This is simpler to deploy, has no model size constraints, and can use SadTalker, LivePortrait, or AnimateDiff. The downsides are that it needs an internet connection, adds 3–15 seconds of latency, and incurs GPU time cost.

On-device: lighter, specialized models. The options are face reenactment via landmark-based warping (a mobile version of First Order Motion Model) or simple animation via optical flow. This works offline, but the quality is lower.

Most implementations choose a hybrid: a quick low-quality preview on-device and the final result from the server.

On-Device: Facial Animation via Keypoints

A lightweight approach without a generative network: use MediaPipe Face Mesh (468 face points) to build a mesh, then deform the source image along a given motion trajectory.

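// MediaPipe FaceLandmarker on iOS
import MediaPipeTasksVision
import UIKit

// Assumes face_landmarker.task is bundled with the app
guard let modelPath = Bundle.main.path(forResource: "face_landmarker", ofType: "task") else {
    fatalError("face_landmarker.task is missing from the app bundle")
}

let options = FaceLandmarkerOptions()
options.baseOptions.modelAssetPath = modelPath
options.runningMode = .image  // single still photo, not a video stream
options.numFaces = 1
options.minFaceDetectionConfidence = 0.5

let faceLandmarker = try FaceLandmarker(options: options)
// sourcePhoto: the UIImage selected by the user
let result = try faceLandmarker.detect(image: MPImage(uiImage: sourcePhoto))

// result.faceLandmarks.first is [NormalizedLandmark]: the 468 mesh points
// (current face_landmarker models append 10 iris points, 478 in total)
// Build deformation via TPS (Thin Plate Spline) or affine warp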

The animation follows either a pre-recorded head motion trajectory (mocap data) or a synthetic one: sinusoidal oscillations of the keypoints with different amplitudes. The deformed image is rendered with Metal Performance Shaders in a few milliseconds per frame.

The result is a 3–5 second animation exported to .mp4 via AVAssetWriter. The quality is sufficient for a "living portrait", but artifacts at the face edges and in the background are inevitable without a full GAN.
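
The synthetic variant can be sketched in a few lines. This is a minimal illustration, not a production implementation: Point2D, Oscillation, and both functions are hypothetical names introduced here (they are not part of MediaPipe), and the amplitudes, frequencies, and phases would need tuning per face region.

```swift
import Foundation

/// A 2D point in normalized image coordinates (0...1), mirroring the x/y of a face landmark.
struct Point2D: Equatable {
    var x: Double
    var y: Double
}

/// Hypothetical per-keypoint motion parameters (illustrative names, not a MediaPipe type).
struct Oscillation {
    var amplitudeX: Double   // peak horizontal displacement, normalized units
    var amplitudeY: Double   // peak vertical displacement, normalized units
    var frequencyHz: Double  // oscillations per second
    var phase: Double        // phase offset in radians
}

/// Returns the keypoints displaced for the frame at `time` seconds.
func displacedKeypoints(_ keypoints: [Point2D], motion: [Oscillation], time: Double) -> [Point2D] {
    precondition(keypoints.count == motion.count, "one Oscillation per keypoint")
    return zip(keypoints, motion).map { (point, m) -> Point2D in
        let s = sin(2 * Double.pi * m.frequencyHz * time + m.phase)
        return Point2D(x: point.x + m.amplitudeX * s, y: point.y + m.amplitudeY * s)
    }
}

/// Builds the target keypoints for every frame of a clip.
func keypointTrajectory(_ keypoints: [Point2D], motion: [Oscillation],
                        durationSeconds: Double, fps: Int) -> [[Point2D]] {
    let frameCount = Int((durationSeconds * Double(fps)).rounded())
    return (0..<frameCount).map { frame in
        displacedKeypoints(keypoints, motion: motion, time: Double(frame) / Double(fps))
    }
}
```

Each element of the trajectory is the set of target positions for one frame; the warp step (TPS or affine) then maps the original landmarks onto those targets. Giving neighbouring keypoints slightly different phases tends to look less mechanical than moving the whole face in lockstep.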

First Order Motion Model (FOMM): Mobile Version

FOMM generates motion from a driving (donor) video and a source image. On mobile it runs via TFLite or ONNX Runtime, but even after optimization the model is 40–80 MB. On iPhone 12 and later, inference for one 256×256 frame takes ~200–400 ms, so a 30-frame animation (1 second at 30 fps) needs 6–12 seconds of processing. This is one-time generation, not real-time.
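
The arithmetic above is worth putting in a helper so the app can show a realistic progress estimate. The following is a rough sketch: the function name is made up for illustration, and it assumes per-frame latency stays constant, ignoring model warm-up and video encoding.

```swift
import Foundation

/// Estimated wall-clock time, in seconds, to generate a clip frame by frame.
func estimatedProcessingSeconds(clipSeconds: Double, fps: Int, msPerFrame: Double) -> Double {
    let frames = (clipSeconds * Double(fps)).rounded()
    return frames * msPerFrame / 1000.0
}

// 1-second clip at 30 fps with the per-frame latencies quoted above
let bestCase = estimatedProcessingSeconds(clipSeconds: 1, fps: 30, msPerFrame: 200)   // 6.0
let worstCase = estimatedProcessingSeconds(clipSeconds: 1, fps: 30, msPerFrame: 400)  // 12.0

// A 3-second clip at the slower latency already takes more than half a minute
let threeSecondClip = estimatedProcessingSeconds(clipSeconds: 3, fps: 30, msPerFrame: 400)  // 36.0
```

Under these assumptions a 3–5 second clip takes roughly 18–60 seconds on-device, which is one reason to generate fewer frames for the preview or move the final render to the server.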

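// Android: ONNX Runtime with FOMM
import ai.onnxruntime.OnnxTensor
import ai.onnxruntime.OrtEnvironment
import java.nio.FloatBuffer

val env = OrtEnvironment.getEnvironment()
// The path must point to a readable file, e.g. the model copied from assets to app storage
val session = env.createSession("fomm_optimized.onnx")

// Model inputs: source frame (1, 3, 256, 256) + driving frame (1, 3, 256, 256).
// Input names and any extra keypoint inputs depend on how the model was exported.
// sourceArray / drivingArray: FloatArray of 3 * 256 * 256 values in CHW order
val shape = longArrayOf(1, 3, 256, 256)
val sourceInput = OnnxTensor.createTensor(env, FloatBuffer.wrap(sourceArray), shape)
val drivingInput = OnnxTensor.createTensor(env, FloatBuffer.wrap(drivingArray), shape)

val result = session.run(mapOf("source" to sourceInput, "driving" to drivingInput))
// Result: deformed source with applied motion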

Loop over the driving frames (a pre-recorded motion clip) to get a sequence of output frames, then assemble them into a video.
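
Both input tensors have the shape (1, 3, 256, 256), so each frame has to be repacked from the interleaved RGBA bytes that a bitmap usually provides into planar channel-first floats. The sketch below shows one way to do that in Swift. rgbaToCHW is a hypothetical helper, and scaling to 0...1 is an assumption: check the normalization your exported model actually expects.

```swift
/// Converts interleaved RGBA8 pixels (row-major) into a planar float array
/// laid out as (1, 3, height, width): all R values, then all G, then all B.
/// Values are scaled to 0...1 (an assumption about the model's preprocessing).
func rgbaToCHW(_ rgba: [UInt8], width: Int, height: Int) -> [Float] {
    precondition(rgba.count == width * height * 4, "expected 4 bytes per pixel")
    let planeSize = width * height
    var chw = [Float](repeating: 0, count: 3 * planeSize)
    for pixel in 0..<planeSize {
        for channel in 0..<3 {  // the alpha byte at index 3 is dropped
            chw[channel * planeSize + pixel] = Float(rgba[pixel * 4 + channel]) / 255.0
        }
    }
    return chw
}
```

For a 256×256 frame the result has 3 × 256 × 256 = 196,608 values. The source frame only needs to be converted once; only the driving frame changes on each iteration of the loop.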

Server Option: SadTalker and LivePortrait

For quality facial animation driven by audio (a talking head), use SadTalker: it takes a photo and an audio track and generates a video in which the face speaks in sync with the speech. On a server with an A100, generation takes 30–60 seconds per minute of video. The app uploads the photo and audio and receives an mp4.

LivePortrait (2024) is a faster, higher-quality option: its authors report about 12.8 ms per frame on an RTX 4090 with PyTorch. It can be exposed through an API wrapper built with FastAPI or hosted on Replicate.

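// Upload photo to server for animation
// createMultipartBody and AnimationResponse (Decodable, with videoURL: URL) are defined elsewhere in the app
func uploadPhotoForAnimation(image: UIImage, audio: URL?) async throws -> URL {
    var request = URLRequest(url: URL(string: "https://api.example.com/animate")!)
    request.httpMethod = "POST"
    // multipart/form-data: image + optional audio
    let boundary = UUID().uuidString
    // The server needs the boundary to parse the multipart body
    request.setValue("multipart/form-data; boundary=\(boundary)", forHTTPHeaderField: "Content-Type")
    request.httpBody = createMultipartBody(image: image, audio: audio, boundary: boundary)

    let (data, response) = try await URLSession.shared.data(for: request)
    // Fail early on non-2xx responses instead of trying to decode an error page
    guard let http = response as? HTTPURLResponse, (200..<300).contains(http.statusCode) else {
        throw URLError(.badServerResponse)
    }
    return try JSONDecoder().decode(AnimationResponse.self, from: data).videoURL
}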

Use task status polling or a WebSocket notification to signal readiness; the choice depends on the generation time.
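
If polling is chosen, a capped exponential backoff keeps the number of status requests reasonable for long generations. Here is a minimal sketch of the delay schedule only; pollingDelays and its parameters are illustrative names, and the status endpoint itself is not shown because the source does not define it.

```swift
import Foundation

/// Delays (in seconds) to wait before each status request, using capped
/// exponential backoff, for as long as the total wait stays within `timeout`.
func pollingDelays(initial: Double, factor: Double, maxDelay: Double, timeout: Double) -> [Double] {
    precondition(initial > 0 && factor >= 1 && maxDelay >= initial, "invalid backoff parameters")
    var delays: [Double] = []
    var next = initial
    var elapsed = 0.0
    while elapsed + next <= timeout {
        delays.append(next)
        elapsed += next
        next = min(next * factor, maxDelay)  // grow the delay, but never beyond the cap
    }
    return delays
}

// Start at 1 s, double each time, cap at 8 s, give up after 30 s
let schedule = pollingDelays(initial: 1, factor: 2, maxDelay: 8, timeout: 30)
// schedule == [1, 2, 4, 8, 8]
```

In the app, each delay would be followed by a status request, for example after try await Task.sleep(nanoseconds:). Once the schedule is exhausted without a "ready" status, the timeout fallback applies.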

Export and Playback

The animation result is an .mp4 (H.264 or H.265). On iOS it plays via AVPlayer and is saved to Photos via PHPhotoLibrary. For a looped animation (a "living photo"), convert it to .gif via CGImageDestination or to the Live Photo format via PHLivePhoto.

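Apple Live Photo: you need both a video file (.mov) and a photo file (.jpg) that carry the same content identifier. In the photo it is stored under key "17" of the kCGImagePropertyMakerAppleDictionary metadata dictionary; in the video it is the QuickTime metadata item com.apple.quicktime.content.identifier. Without a matching identifier, the system Photos app does not treat the pair as a Live Photo.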

Process

Choose the architecture (on-device vs server), prepare the model or API integration, implement the UI for choosing an animation "style", then add export and sharing. The server variant also needs a task queue, a readiness status, and a timeout fallback.

Timeline Estimates

On-device landmark-based animation for one platform takes 3–4 weeks. Server integration with SadTalker/LivePortrait on both platforms requires 4–7 weeks.