Setting Up Synthetic Monitoring (Regular Critical Path Checks)
Synthetic monitoring simulates real user actions on a schedule. A script opens a browser, walks through a scenario (for example, checkout), and records whether each step works, how long it takes, and where errors occur. This lets you detect problems before users notice them.
Difference from Uptime Monitoring
A simple ping of GET / is not synthetic monitoring: it only checks that the server responds. Synthetic monitoring checks that business functions work:
- User can register
- Search returns results
- Add to cart works
- Payment form opens
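The difference can be sketched in a few lines of Python. This is an illustration only: the function names and the response shape (a JSON object with a `results` list) are assumptions, not part of any real API.

```python
import json


def uptime_check(status_code: int) -> bool:
    """Uptime-style check: passes as soon as the server answers with 2xx."""
    return 200 <= status_code < 300


def search_check(status_code: int, body: str) -> bool:
    """Synthetic-style check: the search must actually return results."""
    if not uptime_check(status_code):
        return False
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    if not isinstance(payload, dict):
        return False
    results = payload.get("results")
    return isinstance(results, list) and len(results) > 0


# A server that answers 200 but returns an empty result set
broken_body = '{"results": []}'
print(uptime_check(200))               # True: the server responds
print(search_check(200, broken_body))  # False: the business function is broken
```

The same response passes the uptime check and fails the functional one, which is exactly the class of problem a ping cannot see.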
Critical Paths for Monitoring
Prioritize the paths that matter most for revenue and retention:
E-commerce:
- Search product → product page → add to cart → start checkout
- Login → account → order history
- User registration
SaaS application:
- Login → dashboard → key feature
- Create new object (project/document/task)
- API endpoints (for B2B clients)
Content site:
- Search → results → article
- Newsletter signup form
- Contact form
Playwright-based Synthetic Monitoring with Checkly
// checkly: checkout-flow.spec.js
const { chromium } = require('playwright')
const { expect } = require('@playwright/test')

async function checkoutFlow() {
  const browser = await chromium.launch()
  const page = await browser.newPage()
  try {
    // Step 1: Open catalog
    await page.goto('https://example.com/catalog')
    await expect(page.locator('.product-grid')).toBeVisible()

    // Step 2: Open first product
    await page.locator('.product-card').first().click()
    await expect(page.locator('[data-testid="product-title"]')).toBeVisible()

    // Step 3: Add to cart
    await page.locator('[data-testid="add-to-cart"]').click()
    await expect(page.locator('[data-testid="cart-count"]')).toContainText('1')

    // Step 4: Go to cart
    await page.goto('https://example.com/cart')
    await expect(page.locator('.cart-items')).toContainText('1 item')

    console.log('Checkout flow: PASS')
  } finally {
    // Always close the browser, even when an assertion throws
    await browser.close()
  }
}

// Run the flow; a failed assertion rejects the promise and fails the check
checkoutFlow().catch((error) => {
  console.error('Checkout flow: FAIL', error)
  process.exit(1)
})
Datadog Synthetic Tests
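# Create a Synthetic browser test via the Datadog API
import os

import requests

# Credentials come from the environment (the variable names are an assumption)
API_KEY = os.environ["DD_API_KEY"]
APP_KEY = os.environ["DD_APP_KEY"]

test_config = {
    "name": "Checkout Flow - Production",
    "type": "browser",
    # Notification text sent with the alert (wording is an assumption)
    "message": "Checkout flow is failing in production",
    "config": {
        "assertions": [],
        "request": {
            "url": "https://example.com",
            "method": "GET"
        }
    },
    "options": {
        "device_ids": ["laptop_large"],  # assumed device; browser tests run on at least one
        "tick_every": 300,  # Every 5 minutes
        "min_failure_duration": 120,  # Time in failure (seconds) before alerting
        "min_location_failed": 2,  # Locations that must fail before alerting
        "retry": {"count": 2, "interval": 300}  # interval is in milliseconds
    },
    "locations": [
        "aws:eu-west-1",
        "aws:us-east-1",
        "aws:ap-northeast-1"
    ],
    "status": "live",
    "tags": ["team:frontend", "env:production"]
}

response = requests.post(
    "https://api.datadoghq.com/api/v1/synthetics/tests/browser",
    headers={"DD-API-KEY": API_KEY, "DD-APPLICATION-KEY": APP_KEY},
    json=test_config,
    timeout=30,
)
response.raise_for_status()  # Fail loudly if the test was not created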
Playwright + GitHub Actions: Self-Hosted Synthetic Monitoring
On a minimal budget, a scheduled (cron) GitHub Actions workflow can run the checks:
# .github/workflows/synthetic-monitoring.yml
name: Synthetic Monitoring
on:
  schedule:
    - cron: '*/5 * * * *' # Every 5 minutes
  workflow_dispatch:
jobs:
  check-critical-paths:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      # Assumes the repository has a package-lock.json that includes @playwright/test
      - name: Install dependencies
        run: npm ci
      - name: Install Playwright
        run: npx playwright install --with-deps chromium
      - name: Run synthetic checks
        env:
          APP_URL: ${{ vars.PRODUCTION_URL }}
          TEST_USER_EMAIL: ${{ secrets.SYNTHETIC_USER_EMAIL }}
          TEST_USER_PASSWORD: ${{ secrets.SYNTHETIC_USER_PASSWORD }}
        run: npx playwright test tests/synthetic/
      - name: Notify on failure
        if: failure()
        run: |
          curl -X POST "$SLACK_WEBHOOK" \
            -H 'Content-Type: application/json' \
            -d '{"text": "Synthetic monitoring FAILED: checkout flow"}'
        env:
          SLACK_WEBHOOK: ${{ secrets.SLACK_WEBHOOK }}
Limitation: GitHub Actions cron does not guarantee exact execution time. Scheduled runs can start 5-15 minutes late, and the shortest supported interval is 5 minutes. For precise 1-minute monitoring, use a managed service.
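Because scheduled runs can be late or skipped, it can help to check the schedule itself. The following is a minimal sketch, assuming you can export the start times of past runs; the function name `find_schedule_gaps` and the tolerance value are hypothetical choices for illustration.

```python
from datetime import datetime, timedelta


def find_schedule_gaps(run_times, expected_interval, tolerance):
    """Return (previous, current, gap) tuples for consecutive runs that are
    further apart than expected_interval + tolerance."""
    ordered = sorted(run_times)
    limit = expected_interval + tolerance
    gaps = []
    for previous, current in zip(ordered, ordered[1:]):
        gap = current - previous
        if gap > limit:
            gaps.append((previous, current, gap))
    return gaps


# Runs expected every 5 minutes; one of them started 16 minutes after the previous
start = datetime(2024, 1, 1, 12, 0)
runs = [start + timedelta(minutes=m) for m in (0, 5, 11, 27, 32)]

for previous, current, gap in find_schedule_gaps(
    runs, timedelta(minutes=5), timedelta(minutes=2)
):
    print(f"gap of {gap} between {previous:%H:%M} and {current:%H:%M}")
# prints: gap of 0:16:00 between 12:11 and 12:27
```

A gap in the run history means the site was effectively unmonitored for that period, which is worth alerting on separately from check failures.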
Managing Test Data
Synthetic tests require test accounts and data:
- Separate [email protected] user with a production account
- Payment method: Stripe test card (4242 4242 4242 4242); Stripe accepts test cards only in test mode
- Tag orders placed by the synthetic user and exclude them from business analytics
- Regularly clean up test data (cart, drafts)
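Excluding synthetic orders from analytics can be as simple as filtering on the tag. This is a sketch under assumptions: the tag name `synthetic`, the order dictionary shape, and totals stored in cents are all hypothetical.

```python
SYNTHETIC_TAG = "synthetic"  # hypothetical tag set on orders from the synthetic user


def is_synthetic(order: dict) -> bool:
    """True when the order carries the synthetic-monitoring tag."""
    return SYNTHETIC_TAG in order.get("tags", [])


def business_orders(orders):
    """Orders that should count towards business analytics."""
    return [order for order in orders if not is_synthetic(order)]


orders = [
    {"id": 1, "total_cents": 4999, "tags": []},
    {"id": 2, "total_cents": 1999, "tags": ["synthetic"]},
    {"id": 3, "total_cents": 2500},  # no tags field at all
]

revenue_cents = sum(order["total_cents"] for order in business_orders(orders))
print(revenue_cents)  # 7499: the synthetic order is not counted
```

The same predicate can drive the cleanup job, so analytics and cleanup agree on what counts as test data.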
Metrics and Alerts
Key synthetic monitoring metrics:
- Availability %—percentage of successful runs
- Step duration—time per individual step
- Total flow duration—total scenario time
- First failure step—where it broke
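These metrics can be derived from raw run results. The sketch below assumes each run is recorded as an ordered list of steps with a duration and a pass flag; the class and field names are hypothetical, not those of any monitoring product.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StepResult:
    name: str
    duration_ms: int
    passed: bool


@dataclass
class RunResult:
    steps: List[StepResult]

    @property
    def passed(self) -> bool:
        return all(step.passed for step in self.steps)

    @property
    def total_duration_ms(self) -> int:
        """Total flow duration: sum of all step durations."""
        return sum(step.duration_ms for step in self.steps)

    @property
    def first_failure_step(self) -> Optional[str]:
        """Name of the first step that failed, or None if the run passed."""
        for step in self.steps:
            if not step.passed:
                return step.name
        return None


def availability_percent(runs: List[RunResult]) -> float:
    """Percentage of runs in which every step passed."""
    if not runs:
        return 0.0
    return 100.0 * sum(run.passed for run in runs) / len(runs)


ok = RunResult([StepResult("open_catalog", 800, True),
                StepResult("add_to_cart", 400, True)])
bad = RunResult([StepResult("open_catalog", 900, True),
                 StepResult("add_to_cart", 3000, False)])

print(availability_percent([ok, ok, ok, bad]))  # 75.0
print(ok.total_duration_ms)                     # 1200
print(bad.first_failure_step)                   # add_to_cart
```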
Alert: if checks from 2 of 3 regions fail at the same time, send a critical alert to PagerDuty.
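The multi-region rule reduces to counting failed regions. A minimal sketch, assuming the latest result per region is available as a mapping from region name to pass/fail; `should_page` is a hypothetical helper, and the actual PagerDuty call is left out.

```python
def should_page(region_results: dict, min_failed_regions: int = 2) -> bool:
    """True when at least min_failed_regions regions report a failed check."""
    failed = [region for region, passed in region_results.items() if not passed]
    return len(failed) >= min_failed_regions


# Two of three regions failing: treat as a real outage
outage = {
    "aws:eu-west-1": False,
    "aws:us-east-1": False,
    "aws:ap-northeast-1": True,
}
# A single failing region: more likely a regional network problem
blip = {
    "aws:eu-west-1": True,
    "aws:us-east-1": False,
    "aws:ap-northeast-1": True,
}

print(should_page(outage))  # True
print(should_page(blip))    # False
```

Requiring agreement between regions trades a slightly slower alert for far fewer false pages caused by a single location's connectivity.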
Timeline
- Checkly / Datadog Synthetic (managed)—2-3 days for scenarios
- GitHub Actions cron + Playwright—2-3 days
- Test accounts + data cleanup—1 day
- Alerts + incident management integration—1 day







