Historical order data storage development

We design and develop full-cycle blockchain solutions: from smart contract architecture to launching DeFi protocols, NFT marketplaces and crypto exchanges. Security audits, tokenomics, integration with existing infrastructure.
Showing 1 of 1 servicesAll 1306 services
Historical order data storage development
Complex
~1-2 weeks
FAQ
Blockchain Development Services
Blockchain Development Stages
Latest works
  • image_web-applications_feedme_466_0.webp
    Development of a web application for FEEDME
    1170
  • image_ecommerce_furnoro_435_0.webp
    Development of an online store for the company FURNORO
    1092
  • image_logo-advance_0.png
    B2B Advance company logo design
    563
  • image_crm_enviok_479_0.webp
    Development of a web application for Enviok
    830
  • image_logo-aider_0.jpg
    AIDER company logo development
    763
  • image_crm_chasseurs_493_0.webp
    CRM development for Chasseurs
    878

Historical Order Data Storage Development

Order history is not just an archive. It's the primary data source for execution efficiency analysis, backtesting, commission calculation, tax reporting, and trading strategy audit. Properly designed storage should provide high write speed in real-time and analytical queries across multiple dimensions on years of data.

Order Data Structure

An order in a trading system is not simply a record "bought 1 BTC at 50000". The complete model includes several types of events:

Order Lifecycle Events:

  • ORDER_CREATED — initial creation with parameters
  • ORDER_ACCEPTED — exchange confirmed receipt
  • ORDER_PARTIALLY_FILLED — partial execution
  • ORDER_FILLED — complete execution
  • ORDER_CANCELLED — cancellation (with reason)
  • ORDER_REJECTED — exchange rejection (with error code)
  • ORDER_EXPIRED — timeout expired

Storing these events separately (event sourcing) provides complete reproducibility: you can always recover the state of any order at any point in time.

Database Schema

For order time series, TimescaleDB (PostgreSQL extension) or ClickHouse are optimal.

TimescaleDB — a good choice if you already use PostgreSQL. Automatically partitions tables by time (hypertables), supports continuous aggregations, and compression policies.

CREATE TABLE order_events (
    event_id        UUID DEFAULT gen_random_uuid(),
    event_time      TIMESTAMPTZ NOT NULL,
    order_id        UUID NOT NULL,
    exchange        VARCHAR(32) NOT NULL,
    symbol          VARCHAR(32) NOT NULL,
    event_type      VARCHAR(32) NOT NULL,
    side            VARCHAR(8),
    order_type      VARCHAR(16),
    price           NUMERIC(24, 8),
    quantity        NUMERIC(24, 8),
    filled_qty      NUMERIC(24, 8),
    avg_fill_price  NUMERIC(24, 8),
    commission      NUMERIC(24, 8),
    commission_asset VARCHAR(16),
    client_order_id VARCHAR(64),
    strategy_id     VARCHAR(64),
    metadata        JSONB
);

SELECT create_hypertable('order_events', 'event_time',
    chunk_time_interval => INTERVAL '1 day');

Indexes for typical query patterns:

CREATE INDEX ON order_events (order_id, event_time DESC);
CREATE INDEX ON order_events (exchange, symbol, event_time DESC);
CREATE INDEX ON order_events (strategy_id, event_time DESC);
CREATE INDEX ON order_events USING GIN (metadata);

Query Patterns and Optimization

Order state recovery — a frequent operation. Instead of recomputing from events each time, maintain a materialized orders table with current state:

CREATE TABLE orders (
    order_id        UUID PRIMARY KEY,
    exchange        VARCHAR(32) NOT NULL,
    symbol          VARCHAR(32) NOT NULL,
    status          VARCHAR(32) NOT NULL,
    side            VARCHAR(8) NOT NULL,
    order_type      VARCHAR(16) NOT NULL,
    price           NUMERIC(24, 8),
    quantity        NUMERIC(24, 8),
    filled_qty      NUMERIC(24, 8) DEFAULT 0,
    avg_fill_price  NUMERIC(24, 8),
    total_commission NUMERIC(24, 8) DEFAULT 0,
    strategy_id     VARCHAR(64),
    created_at      TIMESTAMPTZ NOT NULL,
    updated_at      TIMESTAMPTZ NOT NULL
);

Table updates occur on each new event through a trigger or application-side logic.

Analytical queries — typically aggregations by strategy, instrument, period:

-- P&L by strategies for period
SELECT
    strategy_id,
    symbol,
    SUM(CASE WHEN side = 'BUY' THEN -filled_qty * avg_fill_price ELSE filled_qty * avg_fill_price END) as realized_pnl,
    SUM(total_commission) as total_fees,
    COUNT(*) as order_count
FROM orders
WHERE created_at BETWEEN '2024-01-01' AND '2024-12-31'
  AND status = 'FILLED'
GROUP BY strategy_id, symbol
ORDER BY realized_pnl DESC;

TimescaleDB continuous aggregates allow pre-computing these aggregations and updating them incrementally:

CREATE MATERIALIZED VIEW orders_daily_summary
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 day', created_at) AS day,
    exchange,
    symbol,
    strategy_id,
    COUNT(*) AS order_count,
    SUM(filled_qty * avg_fill_price) AS volume,
    SUM(total_commission) AS fees
FROM orders
WHERE status = 'FILLED'
GROUP BY 1, 2, 3, 4;

Storing Fills Separately

For detailed execution quality analysis, it's critical to store individual fills (partial executions) separately from orders:

CREATE TABLE order_fills (
    fill_id         UUID DEFAULT gen_random_uuid(),
    fill_time       TIMESTAMPTZ NOT NULL,
    order_id        UUID NOT NULL REFERENCES orders(order_id),
    exchange        VARCHAR(32) NOT NULL,
    symbol          VARCHAR(32) NOT NULL,
    price           NUMERIC(24, 8) NOT NULL,
    quantity        NUMERIC(24, 8) NOT NULL,
    commission      NUMERIC(24, 8),
    commission_asset VARCHAR(16),
    is_maker        BOOLEAN,
    trade_id        VARCHAR(64)
);

SELECT create_hypertable('order_fills', 'fill_time',
    chunk_time_interval => INTERVAL '1 day');

This allows calculating execution VWAP, comparing with mid-price at execution time (market impact), and analyzing maker/taker ratio by strategy.

Retention Policies and Archiving

Hot data (last 30 days) is stored without compression for maximum write and read speed. Older data is compressed:

ALTER TABLE order_events SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'exchange, symbol',
    timescaledb.compress_orderby = 'event_time DESC'
);

SELECT add_compression_policy('order_events',
    INTERVAL '30 days');

TimescaleDB compression provides 10–20x size reduction for time series with repetitive values. In practice, an order_events table of 100GB compresses to 5–10GB.

Data older than 2 years can be exported to Parquet files on S3 using pg_parquet or custom ETL, while maintaining historical analysis capability through Athena or ClickHouse.

Ingestion Pipeline

High-frequency writes require batching. Instead of INSERT per event:

COPY for bulk inserts — 10–50x faster than individual INSERTs. Accumulate events in memory (100ms or 1000 events, whichever comes first) and write with a single COPY.

Unlogged tables for intermediate buffer — WAL is not written, write speed is much higher. After aggregation, data is moved to the main logged table.

Connection pooling through PgBouncer — trading systems open many short connections. PgBouncer in transaction mode allows serving thousands of clients through a small pool of actual PostgreSQL connections.

Monitoring and Alerts

Key metrics to monitor storage:

Metric Normal Alert
Write latency (p99) < 10ms > 50ms
Query latency (p99) < 100ms > 500ms
Replication lag < 1s > 10s
Disk usage growth Predictable Anomalous growth
Failed inserts 0 Any

Order loss is always a critical incident. The system must have a reconciliation mechanism: periodically compare local history with exchange data via REST API and fill gaps.

Replication and Fault Tolerance

Production storage operates in PostgreSQL streaming replication mode: primary for writes, replica for analytical queries. If primary fails — failover through Patroni or PgBouncer with automatic switching. RPO (Recovery Point Objective) with proper synchronous_commit configuration — zero.