ClickHouse for Web Application Analytics
ClickHouse is a columnar database built for analytical queries (OLAP). It aggregates billions of rows in seconds where PostgreSQL takes minutes. It is not a replacement for a transactional database but a complement to one: PostgreSQL for operational data, ClickHouse for analytics.
When ClickHouse is Needed
Typical queries PostgreSQL struggles with:
- "Daily audience last 90 days by traffic source"
- "Conversion funnel by steps for quarter"
- "Top-1000 products by revenue for year by region"
- "Cohort analysis: user retention by registration month"
For event tables > 100M rows, ClickHouse provides 10-100x speedup.
Table Schema
ClickHouse uses specialized table engines; MergeTree is the primary one:
-- Events table (page views, clicks, purchases)
CREATE TABLE events (
    event_date Date,
    event_time DateTime,
    event_type LowCardinality(String), -- enum-like: 'pageview', 'click', 'purchase'
    user_id UInt64,
    session_id String,
    tenant_id UInt32,
    page_url String,
    referrer String,
    country LowCardinality(String),
    device_type LowCardinality(String), -- 'desktop', 'mobile', 'tablet'
    properties String -- JSON for arbitrary attributes
) ENGINE = MergeTree()
ORDER BY (tenant_id, event_date, event_type, user_id)
PARTITION BY toYYYYMM(event_date);
-- Skip index: a bloom filter lets ClickHouse skip data blocks that
-- cannot contain a given session_id (the primary key already gets
-- min/max marks automatically)
ALTER TABLE events ADD INDEX idx_session session_id TYPE bloom_filter(0.01) GRANULARITY 4;
ORDER BY defines the sort key ClickHouse uses for on-disk ordering and its sparse primary index. Queries that filter by tenant_id and event_date use it to prune data blocks.
LowCardinality applies dictionary encoding, which pays off for columns with roughly fewer than 10k distinct values: each distinct string is stored once, and rows hold only small integer codes.
Data Insertion
Never insert one row at a time — ClickHouse optimizes for batch inserts:
// Node.js + @clickhouse/client
import { createClient } from '@clickhouse/client';

const client = createClient({
  host: process.env.CLICKHOUSE_HOST,
  username: process.env.CLICKHOUSE_USER,
  password: process.env.CLICKHOUSE_PASSWORD,
  database: 'analytics',
});
// Buffer events; flush every N seconds or M events
const eventBuffer = [];

async function flushEvents() {
  if (eventBuffer.length === 0) return;
  // Drain the buffer before awaiting, so events pushed during the
  // insert land in the next batch instead of being silently dropped
  const batch = eventBuffer.splice(0, eventBuffer.length);
  await client.insert({
    table: 'events',
    values: batch,
    format: 'JSONEachRow',
  });
}

setInterval(flushEvents, 10000); // 10 seconds
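The comment above also mentions flushing on a size threshold. A sketch of a buffer that flushes on either trigger; `insertFn` is a stand-in for the real `client.insert` call:

```javascript
// Flushes when the buffer reaches maxSize rows, or every flushMs milliseconds.
class EventBuffer {
  constructor(insertFn, { maxSize = 1000, flushMs = 10000 } = {}) {
    this.insertFn = insertFn; // e.g. batch => client.insert({ table: 'events', values: batch, format: 'JSONEachRow' })
    this.maxSize = maxSize;
    this.rows = [];
    this.timer = setInterval(() => this.flush(), flushMs);
  }

  push(row) {
    this.rows.push(row);
    if (this.rows.length >= this.maxSize) this.flush();
  }

  async flush() {
    if (this.rows.length === 0) return;
    const batch = this.rows;
    this.rows = []; // swap first so new events go to a fresh batch
    await this.insertFn(batch);
  }

  stop() { clearInterval(this.timer); }
}
```

In production the flush should also retry or re-queue the batch on insert failure; that is omitted here for brevity.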
Analytical Queries
-- Daily active users by country
SELECT
    event_date,
    country,
    COUNT(DISTINCT user_id) AS dau
FROM events
WHERE event_date >= today() - 90
GROUP BY event_date, country
ORDER BY event_date DESC, dau DESC;
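What this query computes, as a minimal in-memory JavaScript equivalent (for intuition only; ClickHouse does the same over billions of rows):

```javascript
// Group events by (date, country) and count distinct users per group.
function dailyActiveUsers(events) {
  const groups = new Map(); // "date|country" -> Set of user_ids
  for (const e of events) {
    const key = `${e.event_date}|${e.country}`;
    if (!groups.has(key)) groups.set(key, new Set());
    groups.get(key).add(e.user_id);
  }
  return [...groups].map(([key, users]) => {
    const [event_date, country] = key.split('|');
    return { event_date, country, dau: users.size };
  });
}
```

COUNT(DISTINCT user_id) is exactly the `Set` size per group; ClickHouse also offers approximate variants (e.g. uniq) when exactness is not required.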
-- Cohort retention: distinct users from each cohort active N days after first event
-- (first_event is derived per user via a self-join on the events table)
SELECT
    registration_cohort,
    days_since_registration,
    COUNT(DISTINCT user_id) AS active_users
FROM (
    SELECT
        e.user_id,
        toDate(f.first_event) AS registration_cohort,
        dateDiff('day', toDate(f.first_event), e.event_date) AS days_since_registration
    FROM events AS e
    INNER JOIN (
        SELECT user_id, min(event_time) AS first_event
        FROM events
        GROUP BY user_id
    ) AS f USING (user_id)
)
GROUP BY registration_cohort, days_since_registration
ORDER BY registration_cohort, days_since_registration;
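The same computation in plain JavaScript, to make the two passes explicit (first derive each user's cohort, then count distinct active users per offset):

```javascript
// Returns a Map of "cohort|daysSinceFirstEvent" -> Set of active user_ids.
function cohortRetention(events) {
  // Pass 1: each user's cohort = date of their earliest event
  const firstSeen = new Map(); // user_id -> earliest event_date (ISO string)
  for (const e of events) {
    const cur = firstSeen.get(e.user_id);
    if (cur === undefined || e.event_date < cur) firstSeen.set(e.user_id, e.event_date);
  }
  // Pass 2: bucket every event by (cohort, days since first event)
  const DAY = 24 * 60 * 60 * 1000;
  const counts = new Map();
  for (const e of events) {
    const cohort = firstSeen.get(e.user_id);
    const days = Math.round((Date.parse(e.event_date) - Date.parse(cohort)) / DAY);
    const key = `${cohort}|${days}`;
    if (!counts.has(key)) counts.set(key, new Set());
    counts.get(key).add(e.user_id);
  }
  return counts;
}
```

Dividing each bucket's size by the cohort's day-0 size gives the retention percentage.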
-- Purchase funnel: distinct users at each step
SELECT
    'view_product' AS step,
    COUNT(DISTINCT user_id) AS users
FROM events
WHERE event_type = 'pageview' AND page_url LIKE '/product/%'
UNION ALL
SELECT
    'add_to_cart' AS step,
    COUNT(DISTINCT user_id) AS users
FROM events
WHERE event_type = 'add_to_cart'
UNION ALL
SELECT
    'purchase' AS step,
    COUNT(DISTINCT user_id) AS users
FROM events
WHERE event_type = 'purchase';
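Once the per-step counts come back, conversion rates are cheap to derive on the application side; a hypothetical helper:

```javascript
// Given ordered funnel rows [{ step, users }], compute each step's
// conversion rate relative to the previous step and to the first step.
function funnelRates(steps) {
  return steps.map((s, i) => ({
    ...s,
    fromPrev: i === 0 ? 1 : s.users / steps[i - 1].users,
    fromFirst: s.users / steps[0].users,
  }));
}

funnelRates([
  { step: 'view_product', users: 1000 },
  { step: 'add_to_cart', users: 250 },
  { step: 'purchase', users: 100 },
]);
// add_to_cart converts 25% of viewers; purchase converts 40% of carts, 10% overall
```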
Integration with Web Apps
// Send events to ClickHouse via API endpoint
// userId, sessionId, tenantId are assumed to come from the app's session state
async function trackEvent(eventData) {
  const now = new Date();
  const event = {
    event_date: now.toISOString().slice(0, 10),
    event_time: now.toISOString().replace('T', ' ').slice(0, 19), // DateTime format
    event_type: eventData.type,
    user_id: userId,
    session_id: sessionId,
    tenant_id: tenantId,
    page_url: window.location.pathname,
    properties: JSON.stringify(eventData),
  };
  await fetch('/api/events', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(event),
  });
}

// Track on page load, clicks, form submissions
trackEvent({ type: 'pageview' });
document.addEventListener('click', (e) => {
  // Use the matched [data-track] element, not e.target,
  // which may be a child node of it
  const el = e.target.closest('[data-track]');
  if (el) trackEvent({ type: 'click', element: el.dataset.track || el.id });
});
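On the server side, the /api/events handler should validate and normalize incoming events before buffering them for insertion. A hypothetical sketch of that step (the function name and signature are this sketch's own; field names follow the events table schema above):

```javascript
// Validate a client-submitted event and fill in server-derived fields.
// Returns a row shaped like the `events` table, or throws on bad input.
function normalizeEvent(body, { userId, tenantId }) {
  const allowed = new Set(['pageview', 'click', 'purchase']);
  if (!allowed.has(body.event_type)) {
    throw new Error(`unknown event_type: ${body.event_type}`);
  }
  const ts = body.event_time ? new Date(body.event_time) : new Date();
  return {
    event_date: ts.toISOString().slice(0, 10),                    // Date column
    event_time: ts.toISOString().replace('T', ' ').slice(0, 19),  // DateTime column
    event_type: body.event_type,
    user_id: userId,   // trusted server-side identity, never client input
    session_id: String(body.session_id ?? ''),
    tenant_id: tenantId,
    page_url: String(body.page_url ?? ''),
    properties: JSON.stringify(body.properties ?? {}),
  };
}
```

Taking user_id and tenant_id from the authenticated session rather than the request body prevents one tenant from writing events into another tenant's partition of the sort key.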
Delivery Time
Setting up a ClickHouse cluster, schema design, and the data pipeline: 3–5 business days.