Setting up cohort analysis for mobile app
Retention — the most important product metric. Day-1 retention 40% and Day-7 retention 20% mean different products with different problems. Cohort analysis answers: how do users who installed the app in the same week behave after 1, 7, 14, 30 days?
Aggregated DAU hides degradation: if new users arrive faster than old leave, DAU grows — but retention falls. Cohorts reveal this.
What's needed for cohort analysis
Two mandatory conditions: stable user_id and activation event. Without them, cohort doesn't build correctly.
User ID must be same on app reinstall — if generating new each time, user always in "new" cohort. Solutions:
- iOS:
Keychainto store generated UUID (survives app deletion) - Android:
AccountManageror server-side ID after registration - After authorization:
Analytics.setUserId(serverUserId)→ user tied to account
Activation event — first action showing product value. For different apps different:
| App type | Activation event |
|---|---|
| Marketplace | first_purchase |
| Streaming | content_played (3+ minutes) |
| Fitness | workout_completed |
| Game | level_2_started |
| Social network | first_post or 5_connections |
Activation event choice affects what cohort shows. app_open — too broad, includes random users. premium_purchase — too narrow for full audience retention analysis.
Implementing cohort analysis
Firebase / BigQuery
Firebase builds Retention Chart in Analytics section itself, but with limited flexibility. For deep analysis — export raw data to BigQuery via Firebase → Integrations → BigQuery. Then SQL:
-- Cohort by installation weeks, 7-day retention
WITH cohorts AS (
SELECT
user_pseudo_id,
DATE_TRUNC(MIN(PARSE_DATE('%Y%m%d', event_date)), WEEK) AS cohort_week,
MIN(event_timestamp) AS first_open_ts
FROM `project.analytics_*.events_*`
WHERE event_name = 'first_open'
GROUP BY user_pseudo_id
),
activity AS (
SELECT DISTINCT
user_pseudo_id,
DATE_TRUNC(PARSE_DATE('%Y%m%d', event_date), WEEK) AS activity_week
FROM `project.analytics_*.events_*`
WHERE event_name = 'session_start'
)
SELECT
c.cohort_week,
DATE_DIFF(a.activity_week, c.cohort_week, WEEK) AS week_number,
COUNT(DISTINCT c.user_pseudo_id) AS cohort_size,
COUNT(DISTINCT a.user_pseudo_id) AS retained_users,
ROUND(COUNT(DISTINCT a.user_pseudo_id) / COUNT(DISTINCT c.user_pseudo_id) * 100, 1) AS retention_pct
FROM cohorts c
LEFT JOIN activity a ON c.user_pseudo_id = a.user_pseudo_id
GROUP BY 1, 2
ORDER BY 1, 2
This query gives retention table by weeks.
Amplitude
In Amplitude, cohort analysis — native tool in Retention Analysis section. Configure:
-
Starting Event —
first_openor activation event -
Return Event —
session_startorapp_open - Grouping by days/weeks
- Breakdown by: install source, platform, app version
Amplitude allows comparing cohorts side-by-side — convenient to see if retention improved after product update.
Mixpanel
In Mixpanel Retention section builds classic retention matrix. Additionally Lifecycle — shows what percentage of users "resurrect" after long absence. For casual games this is important metric.
Behavioral cohorts
Besides time cohorts (by install date), useful behavioral cohorts — groups of users who did certain action:
# Example logic in BigQuery:
# Cohort: users who completed onboarding
# Question: what's their retention vs users who skipped onboarding?
If completed onboarding retention 2x higher — proof that onboarding needs improvement, not cutting.
What's included in the work
- Audit of user_id stability in current implementation
- Determining activation events with product manager
- Setting up cohort analysis in Firebase + BigQuery / Amplitude / Mixpanel
- SQL queries for custom cohort reports
- Setting up behavioral cohorts for key hypotheses
- Documentation and team handoff
Timeline
Cohort analysis setup in ready tool (Amplitude/Mixpanel): 1–2 days. BigQuery + custom SQL queries: 2–4 days. Cost calculated individually.







