AI-Powered Location-Based Photo Grouping for Mobile Apps
Location-based grouping looks simple: most photos carry GPS coordinates. The real task is harder. Coordinates scattered over 10–50 meters have to be merged into one "place" (a café, a park, a beach), separate trips to the same place have to be split by time, and thousands of photos without GPS have to be assigned to a location from context.
Step 1: CLLocation Clustering
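Most modern smartphone photos carry GPS coordinates in EXIF. On iOS, read them via PHAsset.location:
import Photos
import CoreLocation

// Requires photo library authorization (PHPhotoLibrary.requestAuthorization).
let fetchOptions = PHFetchOptions()
let photos = PHAsset.fetchAssets(with: .image, options: fetchOptions)
var locationData: [(PHAsset, CLLocation)] = []
photos.enumerateObjects { asset, _, _ in
    // location is nil when the photo has no GPS metadata
    if let location = asset.location {
        locationData.append((asset, location))
    }
}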
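On Android, read the coordinates with ExifInterface (getLatLong(), or the TAG_GPS_LATITUDE / TAG_GPS_LONGITUDE tags). The MediaStore.Images.Media.LATITUDE and LONGITUDE columns are deprecated since API 29 and no longer return location data. On Android 10+, request the ACCESS_MEDIA_LOCATION permission and open the original file via MediaStore.setRequireOriginal() before reading EXIF.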
Cluster the GPS coordinates with DBSCAN, measuring distance in meters (haversine) rather than in raw degrees:
func haversineDistance(_ a: CLLocationCoordinate2D, _ b: CLLocationCoordinate2D) -> Double {
    let R = 6371000.0 // meters
    let dLat = (b.latitude - a.latitude) * .pi / 180
    let dLon = (b.longitude - a.longitude) * .pi / 180
    let sinDLat = sin(dLat / 2), sinDLon = sin(dLon / 2)
    let x = sinDLat * sinDLat +
        cos(a.latitude * .pi / 180) * cos(b.latitude * .pi / 180) * sinDLon * sinDLon
    return R * 2 * atan2(sqrt(x), sqrt(1 - x))
}
A cluster radius of eps = 200 meters works well for urban shooting. For tourist trips, 500–1000 meters is better.
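The clustering step itself is not shown above, so here is a minimal brute-force DBSCAN sketch over haversine distance. It is one possible implementation, not a reference one. GeoPoint, haversineMeters, dbscan, and the minPoints parameter are hypothetical names introduced for illustration; GeoPoint replaces CLLocationCoordinate2D only so the sketch runs with Foundation alone.

```swift
import Foundation

// Hypothetical lightweight point type so the sketch runs without CoreLocation.
struct GeoPoint {
    let latitude: Double
    let longitude: Double
}

// Great-circle distance in meters (same formula as haversineDistance above).
func haversineMeters(_ a: GeoPoint, _ b: GeoPoint) -> Double {
    let r = 6_371_000.0
    let dLat = (b.latitude - a.latitude) * .pi / 180
    let dLon = (b.longitude - a.longitude) * .pi / 180
    let x = sin(dLat / 2) * sin(dLat / 2)
        + cos(a.latitude * .pi / 180) * cos(b.latitude * .pi / 180)
        * sin(dLon / 2) * sin(dLon / 2)
    return r * 2 * atan2(sqrt(x), sqrt(max(0, 1 - x)))
}

// Returns one label per point: 0, 1, 2, ... for clusters, -1 for noise.
// Brute-force neighbor search, so O(n^2) distance computations.
func dbscan(_ points: [GeoPoint], eps: Double, minPoints: Int) -> [Int] {
    let unvisited = -2, noise = -1
    var labels = [Int](repeating: unvisited, count: points.count)
    var nextCluster = 0

    // Indices of all points within eps meters of point i (including i itself).
    func neighbors(of i: Int) -> [Int] {
        points.indices.filter { haversineMeters(points[i], points[$0]) <= eps }
    }

    for i in points.indices where labels[i] == unvisited {
        let seeds = neighbors(of: i)
        if seeds.count < minPoints {
            labels[i] = noise  // may be claimed later as a border point
            continue
        }
        labels[i] = nextCluster
        var queue = seeds
        var head = 0
        while head < queue.count {
            let j = queue[head]
            head += 1
            if labels[j] == noise { labels[j] = nextCluster }  // border point
            guard labels[j] == unvisited else { continue }
            labels[j] = nextCluster
            let reachable = neighbors(of: j)
            if reachable.count >= minPoints {
                queue.append(contentsOf: reachable)  // j is a core point: expand
            }
        }
        nextCluster += 1
    }
    return labels
}
```

Calling dbscan(points, eps: 200, minPoints: 2) matches the urban setting from the text. The quadratic neighbor search is fine for a few thousand photos; for larger libraries a spatial index (grid or k-d tree) would likely be needed.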
Splitting by Time: One Place, Multiple Visits
Suppose the user visited Barcelona three times. The photos form a single GPS cluster, but they belong to three separate visits. Split each cluster by time gaps:
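func splitByTimeGap(assets: [PHAsset], maxGapHours: Double = 12) -> [[PHAsset]] {
    // creationDate is optional: drop undated assets instead of force-unwrapping
    let dated = assets
        .compactMap { asset in asset.creationDate.map { (asset: asset, date: $0) } }
        .sorted { $0.date < $1.date }
    // An empty input returns no groups instead of crashing on index 0
    guard let first = dated.first else { return [] }
    var groups: [[PHAsset]] = [[first.asset]]
    for (previous, current) in zip(dated, dated.dropFirst()) {
        let gapHours = current.date.timeIntervalSince(previous.date) / 3600
        if gapHours > maxGapHours {
            groups.append([current.asset]) // gap too long: start a new visit
        } else {
            groups[groups.count - 1].append(current.asset)
        }
    }
    return groups
}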
The 12-hour gap is a default and can be tuned: within one city, 6 hours is enough; for a multi-day trip, use 24 hours.
Reverse Geocoding: Coordinates → Place Name
Each cluster now has a centroid (lat, lon). The UI needs a human-readable name: "Barcelona, Spain" or "Central Park, New York".
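The centroid can be taken as the plain average of a cluster's coordinates. A minimal sketch, assuming the cluster spans at most a few kilometers and does not cross the antimeridian (otherwise averaging longitudes breaks down). GeoPoint and centroid(of:) are hypothetical names used only here.

```swift
// Hypothetical lightweight coordinate type (stand-in for CLLocationCoordinate2D).
struct GeoPoint {
    let latitude: Double
    let longitude: Double
}

// Arithmetic mean of latitudes and longitudes; nil for an empty cluster.
// Assumption: the points are close together and do not straddle ±180° longitude.
func centroid(of points: [GeoPoint]) -> GeoPoint? {
    guard !points.isEmpty else { return nil }
    let n = Double(points.count)
    let lat = points.reduce(0.0) { $0 + $1.latitude } / n
    let lon = points.reduce(0.0) { $0 + $1.longitude } / n
    return GeoPoint(latitude: lat, longitude: lon)
}
```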
Apple Core Location: CLGeocoder().reverseGeocodeLocation(). Free, but rate-limited, so batch requests can fail. As a rule of thumb, stay at or below one request per second.
Google Places API: paid, but returns the venue name (café, hotel), not just the street.
A CLGeocoder-based version:
import CoreLocation

// Minimal result and error types used by reverseGeocode
struct PlaceName {
    let city: String?
    let country: String?
    let name: String?
}
enum GeoError: Error { case noResult }

func reverseGeocode(coordinate: CLLocationCoordinate2D) async throws -> PlaceName {
    let location = CLLocation(latitude: coordinate.latitude, longitude: coordinate.longitude)
    let placemarks = try await CLGeocoder().reverseGeocodeLocation(location)
    guard let placemark = placemarks.first else { throw GeoError.noResult }
    return PlaceName(
        city: placemark.locality,
        country: placemark.country,
        name: placemark.name
    )
}
Cache the geocoding results: the same coordinates can appear in hundreds of photos. A dictionary [String: PlaceName] keyed by the rounded coordinates, for example String(format: "%.1f_%.1f", lat, lon), is enough. One decimal place is about 11 km of latitude, which suffices for city-level grouping.
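A minimal sketch of such a cache, under stated assumptions: CachedPlace stands in for PlaceName, GeocodeCache and cacheKey are hypothetical names, and the lookup closure stands in for the real (asynchronous) geocoder call so the example stays synchronous.

```swift
import Foundation

// Hypothetical stand-in for the PlaceName type.
struct CachedPlace: Equatable {
    let city: String?
    let country: String?
}

// Rounds both coordinates to one decimal place, so nearby points share a key.
func cacheKey(latitude: Double, longitude: Double) -> String {
    String(format: "%.1f_%.1f", latitude, longitude)
}

// Hypothetical in-memory cache; `lookup` stands in for the geocoder call.
struct GeocodeCache {
    private var storage: [String: CachedPlace] = [:]
    private(set) var misses = 0  // number of times the geocoder was actually hit

    mutating func place(latitude: Double, longitude: Double,
                        lookup: (Double, Double) -> CachedPlace) -> CachedPlace {
        let key = cacheKey(latitude: latitude, longitude: longitude)
        if let hit = storage[key] { return hit }
        misses += 1
        let place = lookup(latitude, longitude)
        storage[key] = place
        return place
    }
}
```

With this key, two photos taken a couple of kilometers apart in the same city usually resolve to one geocoder request. Points that sit on either side of a rounding boundary still produce two keys, which costs an extra request but does not affect correctness.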
Photos Without GPS: Visual Classification
15–30% of photos lack GPS (old photos, indoor shooting with location services disabled). For these, use Vision's VNClassifyImageRequest plus a dictionary that maps its categories to sections with geosemantic meaning.
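A minimal sketch of the category dictionary, kept independent of Vision so it runs anywhere. LabelScore mirrors the identifier and confidence that a classification observation exposes; the label strings, section names, and the 0.5 threshold are assumptions for illustration, not a verified copy of the Vision taxonomy.

```swift
// Hypothetical (label, confidence) pair mirroring a classification observation.
struct LabelScore {
    let identifier: String
    let confidence: Float
}

// Assumed mapping from classifier labels to gallery sections.
let sectionForLabel: [String: String] = [
    "beach": "Nature",
    "mountain": "Nature",
    "cityscape": "Cities",
]

// Picks the section of the most confident mapped label at or above the threshold.
// Returns nil when no mapped label is confident enough.
func section(for scores: [LabelScore], minConfidence: Float = 0.5) -> String? {
    scores
        .filter { $0.confidence >= minConfidence && sectionForLabel[$0.identifier] != nil }
        .max { $0.confidence < $1.confidence }
        .flatMap { sectionForLabel[$0.identifier] }
}
```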
If a photo is classified as "beach", "mountain", or "cityscape", show it in a "Nature" or "Cities" section without a specific address. If it has a timestamp and there are GPS-tagged photos taken close in time, assign it to the cluster of the nearest one (within ±1 hour).
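The time-based assignment can be sketched as a nearest-neighbor search over timestamps. TaggedPhoto and nearestCluster are hypothetical names; the one-hour window is the value from the text.

```swift
import Foundation

// Hypothetical minimal record: capture time plus the GPS cluster it belongs to.
struct TaggedPhoto {
    let date: Date
    let clusterID: Int
}

// Returns the cluster of the GPS-tagged photo closest in time,
// or nil if none was taken within `window` seconds of `date`.
func nearestCluster(for date: Date, among tagged: [TaggedPhoto],
                    window: TimeInterval = 3600) -> Int? {
    let closest = tagged.min {
        abs($0.date.timeIntervalSince(date)) < abs($1.date.timeIntervalSince(date))
    }
    guard let match = closest,
          abs(match.date.timeIntervalSince(date)) <= window else { return nil }
    return match.clusterID
}
```

Photos that get nil here stay in the visual sections. The linear scan is adequate for a sketch; with many untagged photos, sorting the tagged photos by date and using binary search would avoid repeated full passes.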
UI: Display
Two patterns for UI:
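- Map-based: MKMapView with cluster pins. Tapping a pin opens that location's gallery
- List-based: sorted by date, with one section per location, like "Memories" in Apple Photos
For the map-based pattern, MapKit on iOS clusters pins natively on zoom-out: set clusteringIdentifier on the annotation views and nearby pins collapse into an MKClusterAnnotation. On Android, use ClusterManager from the Maps SDK for Android Utility Library.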
Timelines
GPS clustering with reverse geocoding and a basic UI takes 1–1.5 weeks. A full implementation with trip splitting, visual grouping of non-GPS photos, and a map takes 3–4 weeks. Cost is calculated individually.







