Order book data processing pipeline for ML


Order Book Data ML Pipeline Development

Order book data is among the richest sources of market structure information. The full order stack shows expected supply and demand that OHLCV data cannot reveal. However, the volume and structure of this data require a specialized pipeline.

Order book levels:

  • Level 1 (Top of Book): best bid and best ask with their volumes. Smallest data volume, highest relevance.
  • Level 2 (Full Depth): all price levels with aggregated volumes. Binance, for example, serves depth snapshots of up to 5000 levels; updates arrive via a WebSocket diff stream.
  • Level 3 (Full Order Feed): every individual order with its ID. Maximum detail, but not available on all exchanges.
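To make the Level 2 diff stream concrete, here is a minimal sketch of keeping a local book in sync with incremental updates. It assumes each update carries an absolute quantity per price and that a quantity of 0 removes the level, which is the common exchange convention; the names `apply_diff` and `top_of_book` are hypothetical. Sequence-number checks and snapshot resynchronization are deliberately left out.

```python
from typing import Dict, Iterable, Tuple

# One side of the book: price -> quantity resting at that price level.
# Floats keep the sketch short; a real collector would more likely use Decimal.
BookSide = Dict[float, float]


def apply_diff(side: BookSide, updates: Iterable[Tuple[float, float]]) -> None:
    """Apply (price, quantity) updates in place; quantity 0 deletes the level."""
    for price, qty in updates:
        if qty == 0.0:
            side.pop(price, None)  # level removed (or was never present)
        else:
            side[price] = qty      # absolute new quantity, not a delta


def top_of_book(bids: BookSide, asks: BookSide):
    """Derive the Level 1 view: ((best_bid, qty), (best_ask, qty))."""
    best_bid = max(bids)
    best_ask = min(asks)
    return (best_bid, bids[best_bid]), (best_ask, asks[best_ask])


bids = {100.0: 2.0, 99.5: 1.0}
asks = {100.5: 1.5, 101.0: 3.0}
apply_diff(bids, [(100.0, 0.0), (99.8, 4.0)])  # best bid cancelled, new level added
apply_diff(asks, [(100.5, 0.5)])               # best ask partially consumed
l1 = top_of_book(bids, asks)
```

Level 1 falls out of the Level 2 state for free, which is one reason to collect the deeper feed even when only top-of-book features are needed at first.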

Order Book Imbalance (OBI) is among the most studied features for short-term forecasting:

OBI = (bid_volume - ask_volume) / (bid_volume + ask_volume)

Positive OBI indicates buying pressure, negative indicates selling pressure.
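The formula translates directly into code. The sketch below assumes each side is a list of (price, quantity) pairs sorted best-first and that an empty book is treated as balanced; the function name and the `depth` parameter are illustrative choices, not part of any library.

```python
from typing import Sequence, Tuple

Level = Tuple[float, float]  # (price, quantity)


def order_book_imbalance(bids: Sequence[Level], asks: Sequence[Level], depth: int = 1) -> float:
    """OBI over the top `depth` levels of each side; the result lies in [-1, 1]."""
    bid_volume = sum(qty for _, qty in bids[:depth])
    ask_volume = sum(qty for _, qty in asks[:depth])
    total = bid_volume + ask_volume
    if total == 0:
        return 0.0  # assumption: treat an empty book as balanced
    return (bid_volume - ask_volume) / total


bids = [(100.0, 3.0), (99.5, 1.0)]   # sorted best (highest) first
asks = [(100.5, 1.0), (101.0, 3.0)]  # sorted best (lowest) first
obi_l1 = order_book_imbalance(bids, asks, depth=1)  # (3 - 1) / (3 + 1) = 0.5
obi_l2 = order_book_imbalance(bids, asks, depth=2)  # (4 - 4) / (4 + 4) = 0.0
```

The same book can look bid-heavy at the top level and balanced two levels deep, which is why OBI is usually computed at several depths.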

Feature engineering from the order book: OBI computed at several depths, OBI moving average, OBI change between updates, spread dynamics, depth stability, weighted mid price, and depth asymmetry.
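A few of these features can be sketched as a small rolling calculator over top-of-book observations. The class name `BookFeatures` and the window size are hypothetical, and the weighted mid price uses one common definition (each price weighted by the opposite side's quantity); other definitions exist and the source does not specify which one is meant.

```python
from collections import deque
from typing import Deque, Dict, Optional


class BookFeatures:
    """Rolling features computed from successive top-of-book observations."""

    def __init__(self, window: int = 3) -> None:
        self._obi_window: Deque[float] = deque(maxlen=window)
        self._prev_obi: Optional[float] = None

    def update(self, bid_px: float, bid_qty: float,
               ask_px: float, ask_qty: float) -> Dict[str, float]:
        total = bid_qty + ask_qty
        obi = (bid_qty - ask_qty) / total if total else 0.0
        # Weighted mid: a heavier bid side pulls the estimate toward the ask.
        if total:
            wmid = (bid_px * ask_qty + ask_px * bid_qty) / total
        else:
            wmid = (bid_px + ask_px) / 2
        self._obi_window.append(obi)
        obi_change = 0.0 if self._prev_obi is None else obi - self._prev_obi
        self._prev_obi = obi
        return {
            "spread": ask_px - bid_px,
            "weighted_mid": wmid,
            "obi": obi,
            "obi_ma": sum(self._obi_window) / len(self._obi_window),
            "obi_change": obi_change,
        }


features = BookFeatures(window=3)
first = features.update(100.0, 3.0, 101.0, 1.0)   # bid-heavy book
second = features.update(100.0, 1.0, 101.0, 1.0)  # balanced book
```

In the first update the weighted mid (100.75) sits above the plain mid (100.5) because the bid side is heavier; in the second the two coincide.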

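Storage: ClickHouse suits order book data thanks to its high write throughput, efficient columnar storage, and fast aggregations. Level 2 snapshots taken every 100 ms produce about 69M records per day, which corresponds to 864,000 snapshots at roughly 80 level rows each.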
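The daily row count is simple arithmetic, and a small helper shows how it scales with snapshot interval, stored depth, and number of instruments. The figure of 40 levels per side is an assumption chosen because it reproduces the ~69M estimate; the source does not state the stored depth, and `rows_per_day` is a hypothetical helper.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000


def rows_per_day(snapshot_interval_ms: int, levels_per_side: int, symbols: int = 1) -> int:
    """Rows written per day when every level of every snapshot is one row."""
    snapshots_per_day = MS_PER_DAY // snapshot_interval_ms
    rows_per_snapshot = 2 * levels_per_side  # bids plus asks
    return snapshots_per_day * rows_per_snapshot * symbols


# Assumed depth: 40 levels per side, one instrument, 100 ms snapshots.
estimate = rows_per_day(snapshot_interval_ms=100, levels_per_side=40)
# 864_000 snapshots * 80 rows = 69_120_000 rows

# The same settings for ten instruments.
estimate_10 = rows_per_day(snapshot_interval_ms=100, levels_per_side=40, symbols=10)
```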

Short-term price prediction: predict the mid-price change over the next N order book updates (about 1 second) using OBI and depth features. Gradient-boosted trees (LightGBM or XGBoost) serve as the model.
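The prediction target can be built by looking N updates ahead in the mid-price series. This is a minimal sketch with a hypothetical helper `future_mid_change`; it covers only label construction, and the resulting feature and label rows would then be handed to whichever gradient-boosting library is chosen.

```python
from typing import List, Optional


def future_mid_change(mids: List[float], horizon: int) -> List[Optional[float]]:
    """Label for update i is mids[i + horizon] - mids[i]; None where the future is unknown."""
    labels: List[Optional[float]] = []
    for i, mid in enumerate(mids):
        j = i + horizon
        labels.append(mids[j] - mid if j < len(mids) else None)
    return labels


mids = [100.0, 100.5, 100.25, 101.0, 100.75]  # mid price after each book update
labels = future_mid_change(mids, horizon=2)

# Keep only rows whose label is known before training.
train_rows = [(i, y) for i, y in enumerate(labels) if y is not None]
```

The last `horizon` rows have no label and are dropped; using any information from after update i as a feature for row i would leak the future into training.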

We develop the complete order book ML pipeline: a WebSocket collector with incremental updates, ClickHouse storage, feature engineering from OBI and depth data, short-term prediction model training, and real-time inference.