Photo Stock Service Development
A photo stock is a media library with monetization. Technically, this intersects three domains: managing large binary objects, searching by metadata and visual content, and licensing transactions. Each is non-trivial independently; together they require carefully planned architecture from the start.
Stock Types and Technical Differences
Before designing a system, understand the business model:
Microstock (iStock, Shutterstock-style)—massive library, low price, subscription or credits. Key metric is search and conversion. Requires powerful visual search.
Macrostock / Editorial—smaller content, high prices, licensing on demand. Focus on rights management and documentation.
Niche stock—photos of a specific topic (architecture, medicine, food). Simpler moderation, easier SEO.
Internal company photo archive—not public stock, but internal DAM (Digital Asset Management). Different requirements: CMS integration, role-based access.
File Upload and Storage
Minimum requirements for a typical stock: JPEG/TIFF/PNG, minimum 4 MP, up to 200 MB. Video—MP4/MOV up to 4K, up to 2 GB. At these sizes, routing uploads through the application server is ruled out—clients upload to storage directly.
Multipart upload scheme via S3:
Client → presigned URL (S3) → direct upload to S3
S3 Event → SQS → Worker: generate previews, validate, metadata
Worker → DB: record asset status=processing → status=ready
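The status flow above can be sketched as a small state machine; this is a minimal in-memory stand-in (queue.Queue in place of SQS, a dict in place of the assets table), not the production wiring:

```python
import queue

# Hypothetical in-memory stand-ins for SQS and the assets table.
events = queue.Queue()
assets = {}  # asset_id -> status

def on_upload_complete(asset_id: str) -> None:
    """Called from the S3 event notification: record the asset, enqueue work."""
    assets[asset_id] = "processing"
    events.put(asset_id)

def worker() -> None:
    """Drain the queue: generate previews, validate, extract metadata, flip status."""
    while not events.empty():
        asset_id = events.get()
        # ... generate previews, validate file, parse EXIF/IPTC here ...
        assets[asset_id] = "ready"

on_upload_complete("img-001")
worker()
print(assets["img-001"])  # ready
```

The point of the two-phase status is that search and catalog pages only ever show status=ready assets, so a half-processed upload is never visible.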
Storage: AWS S3 or S3-compatible (MinIO for self-hosted). Bucket structure:
- originals/ — source files, private access, signed URLs only
- previews/ — watermarked, public CDN
- thumbnails/ — multiple sizes (400px, 800px, 1600px), generated on upload
For preview generation: ImageMagick or libvips (4–8x faster). Apply the watermark during preview generation rather than on every delivery—per-request watermarking burns CPU; the trade-off is that a branding change requires regenerating previews.
Metadata and Search
Media file metadata lives in two places: EXIF/IPTC embedded in the file, and the database. On upload, parse the embedded tags (ExifTool), copy them into the DB, and let the author add more manually.
Metadata structure:
assets (id, uuid, author_id, title, description, status, license_type, uploaded_at)
asset_tags (asset_id, tag_id)
asset_categories (asset_id, category_id)
asset_metadata (asset_id, key, value) -- EXIF, IPTC, custom fields
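The schema above maps directly onto SQL; here it is exercised with SQLite as a stand-in (sample titles, tag names, and EXIF keys are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE assets (id INTEGER PRIMARY KEY, uuid TEXT, author_id INTEGER,
                     title TEXT, description TEXT, status TEXT, license_type TEXT);
CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE asset_tags (asset_id INTEGER, tag_id INTEGER,
                         PRIMARY KEY (asset_id, tag_id));
CREATE TABLE asset_metadata (asset_id INTEGER, key TEXT, value TEXT);
""")
conn.execute("INSERT INTO assets (id, uuid, title, status) "
             "VALUES (1, 'u-1', 'Old town street', 'ready')")
conn.execute("INSERT INTO tags (id, name) VALUES (1, 'architecture')")
conn.execute("INSERT INTO asset_tags VALUES (1, 1)")
conn.execute("INSERT INTO asset_metadata VALUES (1, 'exif:FocalLength', '35mm')")

# One asset with its tags, aggregated in a single query
row = conn.execute("""
    SELECT a.title, GROUP_CONCAT(t.name) AS tags
    FROM assets a
    JOIN asset_tags at ON at.asset_id = a.id
    JOIN tags t ON t.id = at.tag_id
    WHERE a.id = 1
    GROUP BY a.id
""").fetchone()
print(row)  # ('Old town street', 'architecture')
```

The key/value asset_metadata table keeps arbitrary EXIF/IPTC fields queryable without schema migrations for every new tag.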
Full-text search: Elasticsearch with Russian analyzer. Index: title, description, tags, categories, author name, IPTC keywords. Boost by field: tags > title > description.
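The boost ordering translates into a multi_match query body; field names here follow the index layout sketched above and are assumptions, not a fixed contract:

```python
def build_search_query(text: str, size: int = 20) -> dict:
    """Elasticsearch _search body with per-field boosts: tags > title > description."""
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": text,
                "fields": ["tags^3", "title^2", "description",
                           "author_name", "iptc_keywords"],
            }
        },
    }

q = build_search_query("winter forest")
```

The `^N` suffix is Elasticsearch's standard per-field boost syntax, so relevance tuning stays in one place instead of being spread across application code.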
Visual search (find similar images): generate perceptual hash (pHash) on upload. Find similar—hamming distance between hashes. Advanced: CLIP embeddings via OpenAI API or local model, store in vector DB (pgvector or Qdrant).
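The hamming-distance comparison is trivial once hashes are held as 64-bit integers (e.g. produced by a library such as imagehash); a minimal sketch:

```python
def hamming(h1: int, h2: int) -> int:
    """Number of differing bits between two 64-bit perceptual hashes."""
    return bin(h1 ^ h2).count("1")

def is_near_duplicate(h1: int, h2: int, threshold: int = 10) -> bool:
    """Rule of thumb for pHash: distance up to ~10 usually means the same image."""
    return hamming(h1, h2) <= threshold

print(hamming(0b1011_0001, 0b1010_0101))  # 2
```

At scale, a linear scan over all hashes stops being viable; BK-trees or storing hashes in a database with a bit_count function are common next steps.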
Color search: extract dominant colors via k-means (Pillow / ColorThief), store HEX palette, index in Elasticsearch as keyword field with boost.
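In production the pixels come from a downscaled image via Pillow/ColorThief; the k-means step itself can be sketched in pure Python (the deterministic "first k distinct pixels" initialization is a simplification for illustration):

```python
def kmeans_colors(pixels, k=3, iters=10):
    """Naive k-means over RGB tuples; returns the k dominant colors as HEX
    strings ordered by cluster size. A toy stand-in for ColorThief."""
    # Deterministic init: first k distinct pixel values
    centers = list(dict.fromkeys(pixels))[:k]
    k = len(centers)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in pixels:
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[i].append(p)
        centers = [
            tuple(sum(ch) // len(cl) for ch in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
    order = sorted(range(k), key=lambda i: -len(clusters[i]))
    return ["#%02x%02x%02x" % centers[i] for i in order]

pixels = [(250, 10, 10)] * 50 + [(10, 10, 250)] * 30 + [(10, 250, 10)] * 20
print(kmeans_colors(pixels))  # ['#fa0a0a', '#0a0afa', '#0afa0a']
```

The resulting HEX palette is what gets stored per asset and indexed in Elasticsearch as a keyword field.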
Licensing and Transactions
Licenses are core business logic. Minimum license types:
| Type | Description | Implementation |
|---|---|---|
| RF (Royalty Free) | One-time payment, multiple uses | Simple purchase, record in licenses |
| RM (Rights Managed) | Payment per use, depends on circulation | Calculator on purchase, detailed usage record |
| Editorial | News/editorial only, not ads | Flag in asset + check at checkout |
| Extended | Unlimited circulation, resale | Separate price, manual moderation |
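The "check at checkout" column amounts to a validation step before payment; a minimal sketch (field names like is_editorial and the intended_use values are illustrative, not a fixed schema):

```python
def validate_checkout(asset: dict, license_type: str, intended_use: str) -> None:
    """Reject license/usage combinations the license table disallows."""
    if asset.get("is_editorial") and intended_use == "commercial":
        raise ValueError("Editorial assets cannot be licensed for commercial use")
    if license_type == "extended" and not asset.get("extended_approved"):
        raise ValueError("Extended license requires manual moderation approval")
```

Running this server-side at checkout (not only in the UI) is what actually enforces the table.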
File delivery after payment—one-time signed URL with a 15–60 minute TTL, not a direct S3 link. Each delivery is logged: user_id, asset_id, timestamp, IP.
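In practice this role is played by S3 presigned URLs; the mechanism itself—expiry plus an HMAC over the parameters—can be shown with the standard library (the secret and URL shape are illustrative):

```python
import hashlib
import hmac
import time

SECRET = b"server-side-secret"  # hypothetical; store and rotate outside code

def signed_download_url(asset_id: str, user_id: int, ttl: int = 900, now=None) -> str:
    """Time-limited download URL: expiry timestamp + HMAC over the params."""
    expires = int(now if now is not None else time.time()) + ttl
    payload = f"{asset_id}:{user_id}:{expires}".encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return f"/download/{asset_id}?uid={user_id}&exp={expires}&sig={sig}"

def verify(asset_id: str, user_id: int, expires: int, sig: str, now=None) -> bool:
    """Reject expired links and forged signatures."""
    if (now if now is not None else time.time()) > expires:
        return False
    payload = f"{asset_id}:{user_id}:{expires}".encode()
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig)
```

Because user_id is part of the signed payload, a leaked link also identifies which account leaked it—useful together with the delivery log.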
Subscriptions and Credits System
Two monetization models often coexist:
Subscription: X downloads/month, specific permissions, rollover or reset. Via Stripe Subscriptions + webhooks. On download, check subscription.downloads_remaining, decrement atomically (Redis DECR).
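Redis DECR is one way to make the decrement atomic; in SQL the same property comes from a conditional UPDATE whose WHERE clause is the quota check—shown here with SQLite as a stand-in:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE subscriptions "
             "(user_id INTEGER PRIMARY KEY, downloads_remaining INTEGER)")
conn.execute("INSERT INTO subscriptions VALUES (1, 2)")

def try_consume_download(user_id: int) -> bool:
    """Check-and-decrement in a single atomic statement: the WHERE clause
    prevents two concurrent downloads from both passing a separate check."""
    cur = conn.execute(
        "UPDATE subscriptions SET downloads_remaining = downloads_remaining - 1 "
        "WHERE user_id = ? AND downloads_remaining > 0",
        (user_id,),
    )
    return cur.rowcount == 1

print([try_consume_download(1) for _ in range(3)])  # [True, True, False]
```

The naive alternative—SELECT the counter, check it in application code, then UPDATE—has a race window in which two requests both see 1 remaining.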
Credits: user buys credit pack, spends on download. Different files cost different credits (by resolution, license type). Transactions in separate table with balance history—never store balance as mutable field without history.
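The "never a mutable balance field" rule means the balance is always derived from the transaction ledger; a sketch with SQLite (table and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE credit_transactions (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    amount INTEGER NOT NULL,   -- positive = purchase, negative = spend
    reason TEXT NOT NULL,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP)""")

def balance(user_id: int) -> int:
    """Balance is derived from the ledger, never stored as a bare field."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM credit_transactions WHERE user_id = ?",
        (user_id,)).fetchone()
    return row[0]

def spend(user_id: int, cost: int, reason: str) -> bool:
    """Record a spend if the balance covers it; in production, wrap the
    check and the insert in one DB transaction to close the race window."""
    if balance(user_id) < cost:
        return False
    conn.execute("INSERT INTO credit_transactions (user_id, amount, reason) "
                 "VALUES (?, ?, ?)", (user_id, -cost, reason))
    return True

conn.execute("INSERT INTO credit_transactions (user_id, amount, reason) "
             "VALUES (1, 100, 'pack-100')")
spend(1, 30, "download asset 42")
print(balance(1))  # 70
```

Beyond correctness, the ledger gives every support question ("where did my credits go?") an answer for free.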
Author Upload Tool
Author dashboard—separate system part. Key requirements:
- Batch upload: 50–200 files at once with progress bars
- Bulk metadata editing: select multiple, apply tags/categories to all
- Moderation pipeline: uploaded → reviewing → approved/rejected with comment
- Author analytics: views, downloads, revenue, top files
For batch upload: <input multiple> + chunked upload via tus protocol (resumable uploads). Client library—tus-js-client. Server—tusd or custom Laravel implementation.
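The core of the tus idea is that every chunk carries its byte offset, so an interrupted upload resumes from the last acknowledged offset; a protocol-free sketch (send_chunk stands in for the PATCH request a tus client would issue):

```python
def resumable_upload(data: bytes, send_chunk, offset: int = 0, chunk_size: int = 5):
    """Upload from `offset`, tus-style: a dropped connection resumes by
    calling this again with the server's last known offset."""
    while offset < len(data):
        chunk = data[offset:offset + chunk_size]
        send_chunk(offset, chunk)
        offset += len(chunk)
    return offset  # equals total size when the upload is complete

received = bytearray()

def fake_patch(offset, chunk):
    # The server validates Upload-Offset against what it has already stored
    assert offset == len(received)
    received.extend(chunk)

final = resumable_upload(b"0123456789abcdef", fake_patch, chunk_size=6)
print(final, bytes(received))  # 16 b'0123456789abcdef'
```

tus-js-client and tusd implement exactly this handshake (plus creation, checksums, and parallel uploads) so it rarely needs hand-rolling.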
Content Moderation
Automatic pre-moderation speeds manual review:
- NSFW detector: Google Cloud Vision SafeSearch or open model (NudeNet)—filter explicit content
- Duplicates: pHash comparison with approved files, hamming distance ≤ 10
- Technical issues: check minimum resolution, noise, sharpness via ImageMagick identify
After the auto-checks, files go to a moderator queue with priority (new authors are reviewed more strictly).
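One way to implement that priority is a heap keyed on the author's approved-file count, so new authors surface first; the scheme is an assumption for illustration:

```python
import heapq
import itertools

counter = itertools.count()  # tie-breaker keeps FIFO order within a priority
review_queue = []

def enqueue(asset_id: str, author_approved_count: int) -> None:
    # Fewer approved files -> lower priority value -> popped earlier
    heapq.heappush(review_queue, (author_approved_count, next(counter), asset_id))

def next_for_review() -> str:
    return heapq.heappop(review_queue)[2]

enqueue("a1", author_approved_count=500)  # veteran author
enqueue("a2", author_approved_count=0)    # brand-new author
enqueue("a3", author_approved_count=12)
print(next_for_review())  # a2
```

In a real system the queue lives in the database so moderator sessions can share it, but the ordering logic is the same.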
SEO and Indexing
Asset pages—primary SEO traffic. Each asset page:
- URL: /photos/{category}/{slug}-{id}—readable, no hash
- Title: {title} — stock photo #{id} (unique)
- Structured data: ImageObject schema.org with contentUrl, author, license, keywords
- Related photos: internal links by tags and categories
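The ImageObject structured data is a JSON-LD blob embedded in the asset page; a sketch of rendering it (the asset dict keys are illustrative):

```python
import json

def image_object_jsonld(asset: dict) -> str:
    """Render schema.org ImageObject structured data for an asset page."""
    data = {
        "@context": "https://schema.org",
        "@type": "ImageObject",
        "name": asset["title"],
        "contentUrl": asset["preview_url"],
        "author": {"@type": "Person", "name": asset["author_name"]},
        "license": asset["license_url"],
        "keywords": ", ".join(asset["tags"]),
    }
    return json.dumps(data, ensure_ascii=False)

jsonld = image_object_jsonld({
    "title": "Winter forest at dawn",
    "preview_url": "https://cdn.example.com/previews/abc.jpg",
    "author_name": "Jane Doe",
    "license_url": "https://example.com/license/rf",
    "tags": ["winter", "forest"],
})
```

The output goes into a `script type="application/ld+json"` tag; image search crawlers use contentUrl and license to populate rich results.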
For catalogs with millions of assets—XML sitemap split into an index plus per-category files, updated incrementally.
Performance
- Catalog pages cached at CDN (Cloudflare) with Cache-Control: stale-while-revalidate
- Previews served via CDN with immutable cache (filename includes content hash)
- Search—Elasticsearch, no SQL for queries
- Lazy load previews: Intersection Observer API, placeholder—dominant color from metadata
Timeline
- MVP (upload, tag search, RF license purchase, Stripe): 8–12 weeks
- Full stock (subscriptions, visual search, author dashboard with moderation, SEO): 20–30 weeks
- CLIP search or duplicate detection integration adds 2–4 weeks
Photo stocks are often underestimated as a "catalog with files". The difference becomes clear once licensing logic and storage scale come into play.