Cohort Retention Analysis for Mobile Apps
The most common scenario: marketing reports growth in installs, but MAU stays flat. Without cohort retention analysis, the product team can't see exactly when users leave: on day 2, after a week, or after the first transaction. Metrics like Day 1, Day 7, and Day 30 retention aren't just dashboard numbers; they are diagnostic tools that point to the exact drop-off point.
What is a Cohort and Why Aggregate Retention Lies
A cohort is a group of users united by a single event date. Most often it's the installation date (install_date); less often, the date of first purchase or registration.
Without a cohort breakdown, retention is calculated as: users active today / users active in the period. This averages across all users, mixing new and old. An app can show a stable 30-day retention of 20% while recent cohorts degrade to 8%, because old loyal users pull the average up.
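To make the averaging effect concrete, here is a minimal sketch with hypothetical numbers (the cohort labels and sizes are assumptions chosen for illustration, not data from a real app). It compares the blended Day 30 figure with the per-cohort figures.

```python
# Hypothetical cohorts: (label, cohort_size, users_active_on_day_30)
cohorts = [
    ("older installs", 6000, 1680),
    ("recent installs", 4000, 320),
]


def retention(retained: int, size: int) -> float:
    """Share of a cohort that is still active."""
    return retained / size


def blended_retention(rows) -> float:
    """Retention computed over all users at once, ignoring cohort boundaries."""
    total_size = sum(size for _, size, _ in rows)
    total_retained = sum(retained for _, _, retained in rows)
    return total_retained / total_size


per_cohort = {label: retention(retained, size) for label, size, retained in cohorts}
blended = blended_retention(cohorts)

print(f"blended: {blended:.0%}")
for label, rate in per_cohort.items():
    print(f"{label}: {rate:.0%}")
```

With these inputs the blended figure is 20%, while the recent cohort sits at 8% and the older one at 28%. Only the per-cohort view exposes the degradation.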
Cohort analysis calculates retention for each cohort separately:
Day N Retention = unique_users_active_on_day_N / cohort_size
Here Day 0 is the install date and Day 1 is the next calendar day (not the next 24 hours). The difference in interpretation matters: Firebase defaults to calendar days in the user's timezone, while Amplitude and Mixpanel can be configured either way.
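The two interpretations of "Day N" can give different numbers for the same users. The sketch below is illustrative only: the function names and sample timestamps are assumptions, and timestamps are treated as already converted to the user's local time.

```python
from datetime import datetime, timedelta


def day_n_calendar(install: datetime, activity: datetime) -> int:
    """Day index by calendar date: Day 1 is the next calendar day."""
    return (activity.date() - install.date()).days


def day_n_rolling(install: datetime, activity: datetime) -> int:
    """Day index by full 24-hour windows elapsed since the install moment."""
    return (activity - install) // timedelta(hours=24)


def day_n_retention(installs, events, n, day_fn) -> float:
    """Day N Retention = unique users active on day N / cohort size."""
    retained = {
        user for user, ts in events
        if user in installs and day_fn(installs[user], ts) == n
    }
    return len(retained) / len(installs)


# Hypothetical cohort installed on 2024-01-01 (local time)
installs = {
    "u1": datetime(2024, 1, 1, 23, 0),
    "u2": datetime(2024, 1, 1, 9, 0),
}
events = [
    ("u1", datetime(2024, 1, 2, 8, 0)),   # 9 hours after install
    ("u2", datetime(2024, 1, 2, 10, 0)),  # 25 hours after install
]

calendar_d1 = day_n_retention(installs, events, 1, day_n_calendar)
rolling_d1 = day_n_retention(installs, events, 1, day_n_rolling)
print(calendar_d1, rolling_d1)
```

Under the calendar-day rule both users count toward Day 1. Under the 24-hour rule only u2 does, because u1 returned within the first 24 hours.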
Instrumentation and Event Architecture
What to Track on Client
Minimum set for retention analysis:
- app_open: launch fact (Firebase Analytics logs this automatically as session_start)
- user_engagement: Firebase calculates this, but it is better to define your own meaningful_action, an action showing the user found value
- install: install attribution, needed to correctly define cohort_date
On iOS via Firebase SDK:
import FirebaseAnalytics

// AppDelegate or SceneDelegate
Analytics.logEvent("meaningful_action", parameters: [
    "action_type": "first_purchase" as NSObject,
    "item_category": product.category as NSObject
])
On Android (Kotlin):
firebaseAnalytics.logEvent("meaningful_action") {
    param("action_type", "first_purchase")
    param("item_category", product.category)
}
A key mistake is measuring retention on app_open instead of meaningful_action. Retention then reflects app launches rather than real usage: a user who opened the app by accident counts as "retained" that day.
BigQuery + Firebase: Real Cohort Analysis
Firebase Console shows retention only as an averaged curve. For cohort tables you need the BigQuery export (the free Spark plan supports it, with limits).
After connecting BigQuery, events flow into daily tables named events_YYYYMMDD. A query for a Day 0–30 cohort table:
-- Cohort date = date of each user's earliest first_open, in the reporting timezone
WITH installs AS (
  SELECT
    user_pseudo_id,
    MIN(DATE(TIMESTAMP_MICROS(event_timestamp), "America/New_York")) AS cohort_date
  FROM `project.analytics_XXXXXXXXX.events_*`
  WHERE event_name = 'first_open'
  GROUP BY user_pseudo_id
),
-- Cohort size is computed once per cohort, independent of later activity
cohort_sizes AS (
  SELECT cohort_date, COUNT(*) AS cohort_size
  FROM installs
  GROUP BY cohort_date
),
activity AS (
  SELECT DISTINCT
    user_pseudo_id,
    DATE(TIMESTAMP_MICROS(event_timestamp), "America/New_York") AS activity_date
  FROM `project.analytics_XXXXXXXXX.events_*`
  WHERE event_name = 'session_start'
)
-- Days with zero retained users produce no row; treat missing cells as 0
SELECT
  i.cohort_date,
  s.cohort_size,
  DATE_DIFF(a.activity_date, i.cohort_date, DAY) AS day_n,
  COUNT(DISTINCT a.user_pseudo_id) AS retained_users,
  ROUND(COUNT(DISTINCT a.user_pseudo_id) / s.cohort_size, 3) AS retention_rate
FROM installs i
JOIN cohort_sizes s
  ON i.cohort_date = s.cohort_date
JOIN activity a
  ON i.user_pseudo_id = a.user_pseudo_id
  AND a.activity_date BETWEEN i.cohort_date AND DATE_ADD(i.cohort_date, INTERVAL 30 DAY)
GROUP BY 1, 2, 3
ORDER BY 1, 3
The query returns one row per cohort and day, with the retention rate. Build the heatmap in Looker Studio (formerly Data Studio) or directly in Metabase.
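If the BI tool cannot pivot on its own, the long-format rows can be reshaped before charting. The following is a minimal sketch, assuming rows arrive as (cohort_date, cohort_size, day_n, retention_rate) tuples. The sample rows and the `pivot` and `render` helpers are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows in the shape described above:
# (cohort_date, cohort_size, day_n, retention_rate)
rows = [
    ("2024-01-01", 420, 0, 1.0),
    ("2024-01-01", 420, 1, 0.38),
    ("2024-01-01", 420, 7, 0.14),
    ("2024-01-08", 380, 0, 1.0),
    ("2024-01-08", 380, 1, 0.41),
]


def pivot(rows, days=(1, 3, 7)):
    """Turn one-row-per-(cohort, day) data into one dict per cohort."""
    table = defaultdict(dict)
    for cohort_date, cohort_size, day_n, rate in rows:
        table[cohort_date]["size"] = cohort_size
        if day_n in days:
            table[cohort_date][day_n] = rate
    return dict(table)


def render(table, days=(1, 3, 7)):
    """Render the pivoted data as a pipe-delimited table; missing days stay blank."""
    header = "| Cohort | Size | " + " | ".join(f"Day {d}" for d in days) + " |"
    lines = [header, "|---" * (len(days) + 2) + "|"]
    for cohort in sorted(table):
        cells = []
        for d in days:
            rate = table[cohort].get(d)
            cells.append(f"{rate:.0%}" if rate is not None else "")
        lines.append(f"| {cohort} | {table[cohort]['size']} | " + " | ".join(cells) + " |")
    return "\n".join(lines)


heatmap = pivot(rows)
print(render(heatmap))
```

Missing cells are left blank rather than filled with zero, so a cohort that has not yet reached Day 7 is not confused with one that retained nobody.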
Amplitude and Mixpanel as Alternative
For teams without BigQuery expertise, Amplitude is more convenient: the built-in Retention Analysis builds cohort tables in a few clicks. But the User ID must be configured properly.
On iOS, a stable identifier must be passed before the first identify call:
Amplitude.instance().setUserId(user.stableId)
Amplitude.instance().logEvent("meaningful_action")
If userId isn't set, Amplitude falls back to a device-based identity, so one user with two devices counts as two users and retention is underestimated.
Common Cohort Setup Mistakes
Timezone mixing. If the server logs events in UTC while Firebase counts Day N in the user's local time, cohorts scatter. Example: a user installs at 01:30 Moscow time (UTC+3), and the server records 22:30 UTC of the previous day. The cohort shifts by a day.
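The one-day shift is easy to reproduce. The sketch below assumes a fixed UTC+3 offset for Moscow and uses a hypothetical install timestamp. It derives the cohort date from the same instant in two timezones.

```python
from datetime import date, datetime, timedelta, timezone

# Assumption: Moscow time modelled as a fixed UTC+3 offset
MSK = timezone(timedelta(hours=3), "MSK")


def cohort_date(event_utc: datetime, tz: timezone) -> date:
    """Calendar date of an event as seen in the given timezone."""
    return event_utc.astimezone(tz).date()


# Hypothetical install: 22:30 UTC on Jan 14, which is 01:30 on Jan 15 in Moscow
install_utc = datetime(2024, 1, 14, 22, 30, tzinfo=timezone.utc)

utc_cohort = cohort_date(install_utc, timezone.utc)
msk_cohort = cohort_date(install_utc, MSK)
print(utc_cohort, msk_cohort)
```

The same install lands in the Jan 14 cohort under UTC and in the Jan 15 cohort under Moscow time. Picking one timezone and applying it to both the cohort date and the activity date avoids the mismatch.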
Recalculating cohort_date on reinstall. After a delete and reinstall, Firebase generates a new instance_id and a new first_open, so the user enters a new cohort. If this isn't accounted for, the retention of the original cohort is underestimated, because returning users look like new ones.
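One possible way to account for reinstalls is to key cohorts on a stable account identifier rather than the app instance. This is a sketch under assumptions: the `instance_to_user` mapping is hypothetical (for example, collected at login), and users without a mapping fall back to their instance ID.

```python
from datetime import date

# Hypothetical first_open rows: (app instance id, first_open date)
first_opens = [
    ("inst_a", date(2024, 1, 1)),
    ("inst_b", date(2024, 1, 20)),  # same person after a reinstall
    ("inst_c", date(2024, 1, 5)),
]

# Hypothetical mapping from app instance to a stable account id
instance_to_user = {"inst_a": "user_1", "inst_b": "user_1", "inst_c": "user_2"}


def stable_cohort_dates(first_opens, instance_to_user):
    """Assign each user the earliest first_open across all of their instances."""
    cohort = {}
    for instance_id, opened in first_opens:
        # Fall back to the instance id when no stable id is known
        user = instance_to_user.get(instance_id, instance_id)
        if user not in cohort or opened < cohort[user]:
            cohort[user] = opened
    return cohort


cohort_dates = stable_cohort_dates(first_opens, instance_to_user)
print(cohort_dates)
```

Here user_1 stays in the Jan 1 cohort despite the Jan 20 reinstall, so later activity counts as retention rather than as a new install.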
Small cohorts and statistical noise. A cohort of 15 users gives meaningless numbers: ±1 user is about ±7 percentage points of retention. Cohort analysis gives reliable data from roughly 200–300 users per cohort.
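A confidence interval makes the noise visible. The sketch below uses the Wilson score interval, which is one common choice for proportions and is swapped in here as an illustration (the article does not prescribe a method). The cohort numbers are hypothetical.

```python
from math import sqrt


def wilson_interval(retained: int, cohort_size: int, z: float = 1.96):
    """Wilson score interval for a retention rate (z = 1.96 is about 95% confidence)."""
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")
    p = retained / cohort_size
    z2 = z * z
    denom = 1 + z2 / cohort_size
    center = (p + z2 / (2 * cohort_size)) / denom
    half = z * sqrt(p * (1 - p) / cohort_size + z2 / (4 * cohort_size ** 2)) / denom
    return center - half, center + half


# Same observed 20% retention, very different certainty
small = wilson_interval(3, 15)
large = wilson_interval(60, 300)
print(f"n=15:  {small[0]:.1%} to {small[1]:.1%}")
print(f"n=300: {large[0]:.1%} to {large[1]:.1%}")
```

For 3 of 15 users the interval spans roughly 7% to 45%. For 60 of 300 it narrows to roughly 16% to 25%, which is why cohorts of a few hundred users are a more practical minimum.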
Visualization and Product Insights
A standard heatmap looks like this:
| Cohort | Size | Day 1 | Day 3 | Day 7 | Day 14 | Day 30 |
|---|---|---|---|---|---|---|
| 2024-01-01 | 420 | 38% | 22% | 14% | 9% | 6% |
| 2024-01-08 | 380 | 41% | 25% | 16% | 11% | 7% |
| 2024-01-15 | 510 | 29% | 18% | 11% | 7% | 4% |
The Jan 15 cohort is sharply worse, which coincides with the version 2.3.0 release. The product team sees this immediately and can roll back or fix the release before the degradation spreads across the user base.
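Spotting such a drop can also be automated. The sketch below flags cohorts whose Day 1 retention falls well below the average of all earlier cohorts. The 20% relative-drop threshold and the `flag_degraded` helper are assumptions for illustration and should be tuned to the product.

```python
# Day 1 retention per cohort, taken from the example table above
day1_by_cohort = [
    ("2024-01-01", 0.38),
    ("2024-01-08", 0.41),
    ("2024-01-15", 0.29),
]


def flag_degraded(series, max_relative_drop: float = 0.2):
    """Return cohorts whose rate is more than max_relative_drop below the prior average."""
    flagged = []
    for i, (cohort, rate) in enumerate(series):
        previous = [r for _, r in series[:i]]
        if not previous:
            continue  # the first cohort has no baseline
        baseline = sum(previous) / len(previous)
        if baseline > 0 and (baseline - rate) / baseline > max_relative_drop:
            flagged.append(cohort)
    return flagged


degraded = flag_degraded(day1_by_cohort)
print(degraded)
```

With the table's values only the 2024-01-15 cohort is flagged: its 29% is about 27% below the 39.5% average of the two earlier cohorts.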
What's Included
- Audit current event schema, verify first_open / meaningful_action presence
- BigQuery export configuration from Firebase or Amplitude Retention setup
- SQL queries for cohort tables accounting for timezone
- Dashboard in Looker Studio / Metabase / Redash
- Documentation: event dictionary, cohort_date logic description
Timeline
Setup from scratch: 3–5 days (depending on the current state of analytics and BigQuery access). If events are already configured: 1–2 days for queries and the dashboard. Cost is calculated individually after requirements analysis.







