Multi-exchange market data aggregation system

Multi-Exchange Data Aggregation System Development

Trading systems that work on multiple exchanges simultaneously face a fundamental problem: each exchange has its own WebSocket API, its own data format, its own rate limits, and its own behavioral quirks. An aggregator transforms this zoo into a single normalized stream.

Aggregator Architecture

The system is built on the fan-in principle: multiple data sources are collected into a single normalized stream.

Exchange Connectors: a separate module for each exchange. Each one is responsible for setting up the WebSocket connection, subscribing to the required channels, handling reconnects and errors, and parsing the exchange's raw format into the normalized format.

Normalization Layer: transforms exchange-specific formats into a unified schema. Field names and units collide across exchanges. Binance's book ticker reports the best bid price in a field named b; Kraken's ticker also uses b, but with different semantics (an array of price and volumes). Timestamps differ as well: OKX and Bitfinex report milliseconds (OKX as a string), whereas Kraken's v1 feed reports seconds with a fractional part.

Distribution Layer: publishes normalized events to a bus (Redis Streams, Kafka) for downstream consumers.
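The fan-in flow can be sketched with asyncio alone. This is a minimal illustration, not a production design: the asyncio.Queue stands in for the real bus (Redis Streams or Kafka), and connector and run_fan_in are hypothetical names that replay canned prices instead of reading a WebSocket.

```python
import asyncio


async def connector(exchange: str, prices: list[float], bus: asyncio.Queue) -> None:
    """Hypothetical stand-in for an exchange connector: emits normalized events."""
    for price in prices:
        await bus.put({"exchange": exchange, "symbol": "BTC/USDT", "last": price})
        await asyncio.sleep(0)  # yield control so connectors interleave


async def run_fan_in(feeds: dict[str, list[float]]) -> list[dict]:
    """Run one connector per exchange and merge their output into one stream."""
    bus: asyncio.Queue = asyncio.Queue()  # stands in for Redis Streams / Kafka
    await asyncio.gather(
        *(connector(name, prices, bus) for name, prices in feeds.items())
    )
    merged = []
    while not bus.empty():
        merged.append(bus.get_nowait())
    return merged


events = asyncio.run(
    run_fan_in({"binance": [43250.75, 43251.0], "kraken": [43249.9]})
)
```

Each connector only knows about the bus, so adding an exchange means adding one more producer without touching the consumers.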

Normalized Format

Universal ticker event schema:

{
  "exchange": "binance",
  "symbol": "BTC/USDT",
  "timestamp": 1704067200000,
  "received_at": 1704067200045,
  "bid": 43250.50,
  "ask": 43251.00,
  "last": 43250.75,
  "volume_24h": 28450.123,
  "open_24h": 42800.00
}

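The received_at field is the time at which the aggregator received the data, as opposed to the exchange's own timestamp. The difference between the two approximates the latency from the exchange to the aggregator (it also includes any clock offset between the two hosts), which makes it a useful monitoring metric.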
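A normalizer for this schema might look like the following sketch. The payload shapes are assumptions based on the exchanges' public documentation (a Binance book ticker with string prices in b and a and an optional event time E, and a Kraken v1 ticker as a four-element array) and should be checked against the current API docs. Ticker, SYMBOL_MAP and the function names are hypothetical, and only a subset of the schema fields is populated.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical mapping from exchange-native symbols to the unified form.
SYMBOL_MAP = {
    ("binance", "BTCUSDT"): "BTC/USDT",
    ("kraken", "XBT/USD"): "BTC/USD",
}


@dataclass(frozen=True)
class Ticker:
    exchange: str
    symbol: str
    timestamp: Optional[int]  # exchange time in ms, None if the feed omits it
    received_at: int          # aggregator receive time in ms
    bid: float
    ask: float


def normalize_binance_book_ticker(msg: dict, received_at: int) -> Ticker:
    # Assumed shape: {"s": "BTCUSDT", "b": "43250.50", "a": "43251.00", "E": 1704067200000}
    return Ticker(
        exchange="binance",
        symbol=SYMBOL_MAP[("binance", msg["s"])],
        timestamp=msg.get("E"),
        received_at=received_at,
        bid=float(msg["b"]),
        ask=float(msg["a"]),
    )


def normalize_kraken_ticker(msg: list, received_at: int) -> Ticker:
    # Assumed shape: [channel_id, {"a": [price, ...], "b": [price, ...]}, "ticker", pair]
    _, data, _, pair = msg
    return Ticker(
        exchange="kraken",
        symbol=SYMBOL_MAP[("kraken", pair)],
        timestamp=None,  # this payload carries no exchange timestamp
        received_at=received_at,
        bid=float(data["b"][0]),
        ask=float(data["a"][0]),
    )


def latency_ms(ticker: Ticker) -> Optional[int]:
    """received_at minus exchange timestamp; includes any clock offset."""
    if ticker.timestamp is None:
        return None
    return ticker.received_at - ticker.timestamp
```

Both functions read a field called b, but one parses a string and the other takes the first element of an array, which is exactly the kind of difference the normalization layer hides from consumers.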

Working with Rate Limits

Each exchange limits the number of requests. Exchanges usually do not cap the number of messages they push over a WebSocket connection, but they do limit the number of subscriptions per connection (Binance: 1024 streams per connection) and the rate at which a client may send subscription commands.

A proper connector manages its subscription queue with these limits in mind:

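import asyncio

class ExchangeConnector:
    MAX_SUBSCRIPTIONS_PER_CONN = 1000  # headroom below Binance's 1024-stream cap
    SUBSCRIPTION_RATE_LIMIT = 10  # subscribe commands per second

    async def create_connection(self):
        """Open a WebSocket connection; implemented per exchange."""
        raise NotImplementedError

    async def subscribe_symbols(self, symbols: list[str]) -> None:
        size = self.MAX_SUBSCRIPTIONS_PER_CONN
        # Split symbols into chunks that fit on one connection each
        for i in range(0, len(symbols), size):
            conn = await self.create_connection()
            await conn.subscribe(symbols[i:i + size])
            # Pace subscribe commands to stay under the rate limit
            await asyncio.sleep(1 / self.SUBSCRIPTION_RATE_LIMIT)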
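The pacing of subscription commands can be factored into a small reusable limiter that works with async with. The sketch below is one possible approach, assuming evenly spaced slots are acceptable (it allows no bursts); RateLimiter and demo are hypothetical names, not part of any library.

```python
import asyncio
import time


class RateLimiter:
    """Allow at most `rate` entries per second by spacing them evenly."""

    def __init__(self, rate: float) -> None:
        self._interval = 1.0 / rate
        self._next_slot = 0.0  # monotonic time of the next free slot
        self._lock = asyncio.Lock()

    async def __aenter__(self) -> "RateLimiter":
        async with self._lock:  # serialize waiters so slots are handed out in order
            now = time.monotonic()
            wait = self._next_slot - now
            if wait > 0:
                await asyncio.sleep(wait)
            self._next_slot = max(now, self._next_slot) + self._interval
        return self

    async def __aexit__(self, *exc_info) -> None:
        return None


async def demo() -> float:
    limiter = RateLimiter(rate=100)  # created inside the running event loop
    start = time.monotonic()
    for _ in range(3):
        async with limiter:
            pass  # a real connector would send one subscribe command here
    return time.monotonic() - start


elapsed = asyncio.run(demo())
```

Three entries at 100 per second need two intervals of waiting, so elapsed comes out at roughly 20 ms or more.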

Reconnection and Data Gaps

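WebSocket connections break. Exchanges also send ping frames and expect a pong within a strictly defined time; Binance, for example, has documented a 10-minute pong timeout, after which it disconnects the client. The exact window depends on the API and should be checked against the current documentation. A proper connector: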

  • Automatically responds to ping frames
  • Tracks the time of the last message (heartbeat check)
  • On disconnection, reconnects with exponential backoff and jitter
  • On recovery, resubscribes to all symbols
  • Publishes a GAP_DETECTED event with the time range of the missing data

Downstream consumers must handle GAP events correctly, especially if using rolling aggregations.
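The reconnect schedule, heartbeat check and gap event from the list above can be sketched as follows. This assumes the "full jitter" variant of exponential backoff (a uniform random delay between zero and a capped exponential ceiling); backoff_delays, is_stale and the GapDetected fields are hypothetical names chosen for illustration.

```python
import itertools
import random
from dataclasses import dataclass
from typing import Iterator, Optional


def backoff_delays(
    base: float = 0.5, cap: float = 30.0, rng: Optional[random.Random] = None
) -> Iterator[float]:
    """Yield reconnect delays in seconds: exponential growth with full jitter."""
    rng = rng or random.Random()
    attempt = 0
    while True:
        ceiling = min(cap, base * 2 ** attempt)
        yield rng.uniform(0, ceiling)
        attempt = min(attempt + 1, 32)  # stop growing once the cap dominates


def is_stale(last_message_ms: int, now_ms: int, timeout_ms: int = 30_000) -> bool:
    """Heartbeat check: True if nothing arrived within the timeout."""
    return now_ms - last_message_ms > timeout_ms


@dataclass(frozen=True)
class GapDetected:
    exchange: str
    start_ms: int  # time of the last message before the disconnect
    end_ms: int    # time of the first message after resubscription
    type: str = "GAP_DETECTED"


delays = list(itertools.islice(backoff_delays(rng=random.Random(42)), 8))
gap = GapDetected("binance", start_ms=1704067200000, end_ms=1704067203500)
```

Jitter matters because many clients tend to lose their connections at the same moment; without it they would all retry in lockstep.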

Latency and Time Synchronization

When comparing prices across exchanges, time synchronization is critical. The server's system clock must be synchronized via NTP to within 1–5 ms. Most cloud providers offer accurate NTP sources, but the actual offset should be verified.

Network latency to exchanges varies widely: from about 1 ms with co-location to 50–100 ms from regular servers. Arbitrage strategies must account for this delay.
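One simple way to account for timing differences is to refuse to compare quotes whose exchange timestamps are too far apart. The sketch below uses the fields of the normalized ticker schema; cross_exchange_spread and the 100 ms default tolerance are illustrative assumptions, not recommended values.

```python
from typing import Optional


def cross_exchange_spread(
    buy_quote: dict, sell_quote: dict, max_skew_ms: int = 100
) -> Optional[float]:
    """Return sell-side bid minus buy-side ask, or None if timestamps diverge."""
    skew = abs(buy_quote["timestamp"] - sell_quote["timestamp"])
    if skew > max_skew_ms:
        return None  # quotes describe different moments; comparison is unreliable
    return sell_quote["bid"] - buy_quote["ask"]


binance = {"exchange": "binance", "timestamp": 1704067200000, "bid": 43250.5, "ask": 43251.0}
kraken = {"exchange": "kraken", "timestamp": 1704067200040, "bid": 43255.0, "ask": 43256.0}
late = {"exchange": "kraken", "timestamp": 1704067200500, "bid": 43255.0, "ask": 43256.0}

spread = cross_exchange_spread(binance, kraken)
rejected = cross_exchange_spread(binance, late)
```

Here the first pair is 40 ms apart and yields a spread, while the second pair is 500 ms apart and is rejected.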

Data Quality Monitoring

Metric               Description
Message rate         Messages per second per exchange/symbol
Latency (p50/p99)    Delay from exchange to aggregator
Gap rate             Number of data breaks per hour
Reconnect count      Reconnection frequency
Stale data alerts    Symbols without updates for more than X seconds

Prometheus + Grafana is the standard stack for this kind of monitoring.
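Two of these metrics are easy to compute directly, as in this standard-library sketch. In a real deployment the values would more likely be exported to Prometheus than computed in process; latency_percentiles and stale_symbols are hypothetical helper names.

```python
import statistics


def latency_percentiles(latencies_ms: list[float]) -> tuple[float, float]:
    """Return (p50, p99) of exchange-to-aggregator latency samples."""
    # n=100 yields 99 cut points; index 49 is the median, index 98 is p99.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return cuts[49], cuts[98]


def stale_symbols(
    last_update_ms: dict[str, int], now_ms: int, max_age_ms: int
) -> list[str]:
    """Symbols whose last update is older than max_age_ms."""
    return sorted(
        symbol
        for symbol, updated in last_update_ms.items()
        if now_ms - updated > max_age_ms
    )


p50, p99 = latency_percentiles([float(x) for x in range(1, 102)])
stale = stale_symbols(
    {"BTC/USDT": 1_000, "ETH/USDT": 9_000}, now_ms=10_000, max_age_ms=5_000
)
```

Tracking p99 alongside p50 matters because a healthy median can hide occasional multi-second stalls on a single exchange.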

Libraries and Ready Solutions

CCXT Pro: the WebSocket extension of CCXT, with support for 50+ exchanges. It is a good starting point for prototyping, but production systems often need custom connectors for performance reasons or exchange-specific requirements.

cryptofeed (Python): a specialized library for cryptocurrency feeds with support for 30+ exchanges, data normalization, and backends for Kafka, Redis, RabbitMQ, and PostgreSQL.

For high-performance systems (< 1 ms processing latency), write connectors from scratch in Rust or Go.