Scraping Failure Alerts (Email/Telegram)

Implementing Alerts on Scraping Failures (email/Telegram)

A scraper crashes overnight; by morning the data is stale and nobody knows why. An alerting system solves this: the right person is notified at the moment of failure, with enough context to diagnose it.
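
As a rough sketch of what "enough context" can mean in practice, the helper below packs a failure into a single dict. The field names match the ones the notification code in this article reads from `event`. The function name `build_failure_event` and the choice of a UTC ISO timestamp are assumptions made for illustration.

```python
from datetime import datetime, timezone


def build_failure_event(site_name: str, url: str, exc: BaseException, attempts: int) -> dict:
    """Collect the context needed for diagnosis into one dict (hypothetical helper)."""
    return {
        "site_name": site_name,
        "url": url,
        # The exception class name is usually enough to group similar failures.
        "error_type": type(exc).__name__,
        # Some exceptions have an empty str(); fall back to repr() in that case.
        "error_message": str(exc) or repr(exc),
        "attempts": attempts,
        # Assumption: timestamps are reported in UTC, second precision.
        "timestamp": datetime.now(timezone.utc).isoformat(timespec="seconds"),
    }
```

A worker would call this in its final `except` block, once retries are exhausted, and pass the result to whichever notification channel is configured.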

What Counts as Failure

Not every error needs an alert. A single timeout is normal: the worker will retry. An alert is needed when:

  • The task exhausted all retries (moved to the DLQ / marked as permanently failed)
  • The worker itself crashed (process crash, OOM)
  • The error rate over the last 15 minutes exceeded a threshold (e.g., > 20%)
  • Scraping a site didn't complete within the expected time (watchdog timeout)
  • The page structure changed and the parser returns empty data
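
The error-rate condition from the list above can be sketched with a sliding window. This is one possible in-process approach, not a prescribed one: the class name `ErrorRateWindow` and the `min_samples` guard (which avoids alerting on a handful of tasks) are assumptions added for illustration, while the 15-minute window and 20% threshold come from the list.

```python
import time
from collections import deque
from typing import Optional


class ErrorRateWindow:
    """Tracks task outcomes over a sliding time window (hypothetical helper)."""

    def __init__(self, window_seconds: float = 15 * 60,
                 threshold: float = 0.20, min_samples: int = 20):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self.min_samples = min_samples  # assumption: ignore very small samples
        self._events = deque()          # (timestamp, is_error) pairs, oldest first
        self._errors = 0

    def _evict(self, now: float) -> None:
        # Drop outcomes that have fallen out of the window.
        cutoff = now - self.window_seconds
        while self._events and self._events[0][0] < cutoff:
            _, was_error = self._events.popleft()
            self._errors -= int(was_error)

    def record(self, is_error: bool, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        self._events.append((now, is_error))
        self._errors += int(is_error)
        self._evict(now)

    def error_rate(self, now: Optional[float] = None) -> float:
        now = time.time() if now is None else now
        self._evict(now)
        total = len(self._events)
        return self._errors / total if total else 0.0

    def should_alert(self, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        rate = self.error_rate(now)
        # Strictly greater than the threshold, matching "> 20%".
        return len(self._events) >= self.min_samples and rate > self.threshold
```

Each worker calls `record()` after every task and checks `should_alert()`; with several worker processes the counters would need to live in shared storage instead.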

Telegram Notification

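import html

import httpx

async def send_telegram_alert(bot_token: str, chat_id: str, event: dict) -> None:
    # With parse_mode="HTML", a stray "<" or "&" in an error message makes
    # Telegram reject the request, so every interpolated value is escaped.
    def esc(value) -> str:
        return html.escape(str(value), quote=False)

    # Build the message line by line: a multi-line error message cannot
    # break the layout the way it could with a dedented triple-quoted string.
    text = "\n".join([
        "🔴 <b>Scraping Failure</b>",
        "",
        f"<b>Site:</b> {esc(event['site_name'])}",
        f"<b>URL:</b> <code>{esc(event['url'])}</code>",
        f"<b>Error:</b> {esc(event['error_type'])}",
        # Truncate before escaping so an entity is never cut in half.
        f"<b>Message:</b> <code>{esc(event['error_message'][:300])}</code>",
        f"<b>Attempts:</b> {esc(event['attempts'])}",
        f"<b>Time:</b> {esc(event['timestamp'])}",
    ])

    async with httpx.AsyncClient() as client:
        response = await client.post(
            f"https://api.telegram.org/bot{bot_token}/sendMessage",
            json={"chat_id": chat_id, "text": text, "parse_mode": "HTML"},
            timeout=10,
        )
        # Surface delivery problems instead of silently dropping the alert.
        response.raise_for_status()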

Email via SMTP / SendGrid

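import os

from sendgrid import SendGridAPIClient
from sendgrid.helpers.mail import Mail

# Assumption: the API key is provided through an environment variable.
SENDGRID_API_KEY = os.environ["SENDGRID_API_KEY"]

def send_email_alert(to_email: str, event: dict) -> None:
    message = Mail(
        from_email='[email protected]',  # replace with a verified sender address
        to_emails=to_email,
        subject=f"[Scraping] Failure: {event['site_name']}",
        # render_alert_template() is a project helper that returns the HTML body.
        html_content=render_alert_template(event),
    )
    sg = SendGridAPIClient(api_key=SENDGRID_API_KEY)
    sg.send(message)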
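
The email snippet calls `render_alert_template` without showing it. A minimal sketch of such a helper follows, assuming the same event fields as the Telegram message and a plain HTML table as the layout; the real template may look quite different.

```python
import html


def render_alert_template(event: dict) -> str:
    """Return a small HTML body for the alert email (illustrative layout)."""
    rows = [
        ("Site", event["site_name"]),
        ("URL", event["url"]),
        ("Error", event["error_type"]),
        # Same 300-character cap as in the Telegram message.
        ("Message", str(event["error_message"])[:300]),
        ("Attempts", event["attempts"]),
        ("Time", event["timestamp"]),
    ]
    # Escape every value: error messages often contain raw HTML from the target site.
    cells = "\n".join(
        f'<tr><th align="left">{label}</th><td>{html.escape(str(value))}</td></tr>'
        for label, value in rows
    )
    return f"<h2>Scraping Failure</h2>\n<table>\n{cells}\n</table>"
```

Escaping matters here for the same reason as in any HTML output: a scraped page fragment inside the error text would otherwise be rendered by the mail client.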

Alert Deduplication

Without deduplication, a mass failure (for example, the proxy provider going down) produces 500 emails in a minute. The solution is to group alerts by key with a cooldown:

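import redis

# Assumption: a local Redis with default settings; adjust the connection as needed.
redis_client = redis.Redis()

def should_send_alert(site_id: int, error_type: str, cooldown_minutes: int = 30) -> bool:
    key = f"alert_sent:{site_id}:{error_type}"
    # SET with NX and EX is atomic: only the first caller within the cooldown
    # creates the key, so concurrent workers cannot both send the alert.
    return bool(redis_client.set(key, "1", nx=True, ex=cooldown_minutes * 60))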

One alert per site and error type every 30 minutes is a reasonable balance between informativeness and noise.
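
For unit tests, or a single-process scraper without Redis, the same cooldown rule can be approximated in memory. The sketch below is an assumption-laden stand-in, not a replacement for the Redis version: the class name `InMemoryCooldown` and the injectable `clock` are hypothetical, and the state is neither thread-safe nor shared between processes.

```python
import time
from typing import Callable, Dict, Tuple


class InMemoryCooldown:
    """Per-(site, error type) cooldown kept in a dict (hypothetical helper)."""

    def __init__(self, cooldown_minutes: int = 30,
                 clock: Callable[[], float] = time.monotonic):
        self._cooldown = cooldown_minutes * 60
        self._clock = clock  # injectable so tests can control time
        self._last_sent: Dict[Tuple[int, str], float] = {}

    def should_send_alert(self, site_id: int, error_type: str) -> bool:
        key = (site_id, error_type)
        now = self._clock()
        last = self._last_sent.get(key)
        # Still inside the cooldown window: suppress the alert.
        if last is not None and now - last < self._cooldown:
            return False
        self._last_sent[key] = now
        return True
```

With a fake clock, the first call for a key returns `True`, repeated calls within 30 minutes return `False`, and a different site or error type is tracked independently.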

Implementation Timeline

Telegram + email alerts with deduplication: 1–2 business days.