AI System for Passenger Flow Forecasting
Passenger flow forecasting addresses the core operational questions of public transport: when additional cars are needed, where to place extra staff, and how to optimize schedules. A typical ML system reaches a MAPE of 8-15% at a 1-hour horizon, which is accurate enough to support operational decisions 1-2 hours ahead.
Tasks by Transport Type
Metro/Subway:
- Forecast inbound/outbound flow at each station in 15-minute intervals
- Forecast train car occupancy on segments
- Optimize train dispatch intervals
Surface Transport (bus, trolleybus, tram):
- Forecast passenger flow at stops
- Forecast occupancy by route
- Plan vehicle fleet deployment
Rail and Air:
- Forecast ticket sales (closely related to demand forecasting)
- Forecast passenger flow at stations/airports for staffing
Data Sources
Transaction Data:
- AFC (Automatic Fare Collection): gate data — time, station, ticket type
- Bus validators: validator ID, route, time
- Ticket sales through mobile apps and counters
Technical Data:
- CCTV with vision-based people counting
- Wi-Fi tracking: anonymized device sessions
- APC (Automatic Passenger Counting): door sensors on vehicles
External Data:
- Sports events, concerts (event calendar)
- Weather
- City events (parades, demonstrations)
- Mode switching caused by disruptions (e.g., metro line closures)
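Before any modeling, raw transaction records have to be aggregated into the 15-minute station intervals used for forecasting. A minimal sketch, assuming each AFC gate record has already been reduced to a `(timestamp, station)` pair; the function names and the record layout are illustrative, not part of any specific AFC system:

```python
from collections import Counter
from datetime import datetime


def bin_start(ts: datetime, minutes: int = 15) -> datetime:
    """Round a timestamp down to the start of its interval."""
    return ts.replace(minute=ts.minute - ts.minute % minutes, second=0, microsecond=0)


def aggregate_taps(taps):
    """Count gate taps per (station, interval_start).

    taps: iterable of (timestamp, station) pairs (hypothetical record layout).
    """
    return Counter((station, bin_start(ts)) for ts, station in taps)


# Example with made-up taps
taps = [
    (datetime(2024, 1, 8, 8, 1), "S1"),
    (datetime(2024, 1, 8, 8, 14, 59), "S1"),
    (datetime(2024, 1, 8, 8, 15), "S1"),
    (datetime(2024, 1, 8, 8, 5), "S2"),
]
counts = aggregate_taps(taps)
```

The resulting counts per station and interval form the time series that the lag features and models operate on.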
Forecasting Models
Temporal Patterns: Station flow follows stable, recurring patterns:
- Hourly: morning peak 07:30-09:30, evening 17:30-19:30
- Daily: weekdays vs. weekends fundamentally differ
- Seasonal: summer flow decreases by 15-25%; holidays also shift demand
# LightGBM with a rich feature set
# (right-hand-side names are placeholders for values computed upstream)
features = {
    # Lags (15-minute intervals, so 1 hour = 4 steps back)
    'passengers_lag_15min': passengers_t_minus_1,
    'passengers_lag_1h': passengers_t_minus_4,
    'passengers_same_time_yesterday': passengers_same_period_yesterday,
    'passengers_same_time_last_week': passengers_same_period_week_ago,
    # Time
    'hour': hour,
    'minute': minute,
    'day_of_week': dow,
    'is_holiday': holiday_flag,
    'month': month,
    # External
    'weather_rain': rain_intensity,
    'temperature_c': temperature,
    'stadium_event_distance_time': event_proximity_score,
    # Station/route
    'station_type': encode(terminal_transfer_intermediate),
    'line_id': line_embedding
}
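The lag and time features above can be derived directly from a 15-minute count series. A minimal sketch, assuming the counts for one station are held in a dict keyed by interval start; `build_features` and the dict layout are hypothetical, and external features (weather, events, holidays) are left out:

```python
from datetime import datetime, timedelta
from typing import Optional

STEP = timedelta(minutes=15)


def build_features(counts: dict, ts: datetime) -> Optional[dict]:
    """Build lag and time features for the interval starting at ts.

    counts maps interval start -> passenger count (hypothetical layout).
    Returns None when any required lag is missing from the history.
    """
    lag_times = {
        "passengers_lag_15min": ts - STEP,
        "passengers_lag_1h": ts - 4 * STEP,
        "passengers_same_time_yesterday": ts - timedelta(days=1),
        "passengers_same_time_last_week": ts - timedelta(weeks=1),
    }
    if any(t not in counts for t in lag_times.values()):
        return None
    row = {name: counts[t] for name, t in lag_times.items()}
    row.update(hour=ts.hour, minute=ts.minute, day_of_week=ts.weekday(), month=ts.month)
    return row
```

Rows with missing lags are skipped rather than imputed here; a production pipeline would need an explicit policy for gaps in the data.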
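Graph Neural Network: For metro systems, the network is modeled as a graph, with stations as nodes and the connections between them as edges. Flow at a station depends on its neighbors: passengers transfer between lines, and closing one station redistributes flow to adjacent ones.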
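The core idea of a graph model, each station's representation being updated from its neighbors, can be illustrated with a single aggregation step. This is a sketch with fixed weights on a made-up four-station network; in an actual GNN the weights are learned and the aggregation is applied to feature vectors, not raw counts:

```python
# Hypothetical toy network: station -> directly connected stations
ADJACENCY = {
    "A": ["B"],
    "B": ["A", "C", "D"],
    "C": ["B"],
    "D": ["B"],
}


def aggregate_neighbors(flows, adjacency, self_weight=0.5):
    """Blend each station's flow with the mean flow of its neighbors."""
    out = {}
    for station, neighbors in adjacency.items():
        if neighbors:
            neighbor_mean = sum(flows[n] for n in neighbors) / len(neighbors)
        else:
            neighbor_mean = flows[station]  # isolated station keeps its own value
        out[station] = self_weight * flows[station] + (1 - self_weight) * neighbor_mean
    return out


flows = {"A": 100.0, "B": 400.0, "C": 200.0, "D": 0.0}
mixed = aggregate_neighbors(flows, ADJACENCY)
```

Station D has no flow of its own (for example, it is closed), yet after one step its value reflects the load at the adjacent transfer station B, which is the kind of spatial dependency a per-station model cannot see.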
Anomalous Events and Adjustments
Event Detection: A sudden surge in passenger flow, for example before a station closure or after a concert, is flagged as an anomaly.
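import pandas as pd

def detect_flow_anomaly(actual: pd.Series, predicted: pd.Series, threshold_sigma: float = 3.0) -> pd.Series:
    # Window of 168 points: one week if the series is hourly (672 for 15-minute data)
    residuals = actual - predicted
    rolling = residuals.rolling(168)
    z_score = (residuals - rolling.mean()) / rolling.std()
    return z_score.abs() > threshold_sigma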
Detected anomaly → operations center operator receives alert → correction to operations plan.
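For real-time alerting the same z-score test can be applied one observation at a time. The sketch below is a streaming variant, not the batch function above: it scores each new residual against the previous residuals only, so a surge does not inflate its own baseline. The class name and the `min_history` parameter are assumptions for illustration:

```python
from collections import deque
from statistics import mean, stdev


class ResidualMonitor:
    """Flag residuals that deviate strongly from the trailing window."""

    def __init__(self, window=168, threshold_sigma=3.0, min_history=10):
        self.history = deque(maxlen=window)
        self.threshold_sigma = threshold_sigma
        self.min_history = max(2, min_history)  # stdev needs at least 2 points

    def update(self, actual, predicted):
        """Score the new residual against past residuals, then store it."""
        residual = actual - predicted
        is_anomaly = False
        if len(self.history) >= self.min_history:
            mu = mean(self.history)
            sigma = stdev(self.history)
            if sigma > 0:
                is_anomaly = abs(residual - mu) > self.threshold_sigma * sigma
        self.history.append(residual)
        return is_anomaly
```

During the warm-up period (fewer than `min_history` residuals) the monitor never raises an alert, which avoids false positives right after deployment.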
Planned Events: Scheduled events are fed to the model as known future covariates, as in a Temporal Fusion Transformer (TFT). For example, the system can automatically predict a +30% increase in passenger flow in the hour after a concert ends at "Luzhniki".
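A known future covariate is simply a feature whose value can be computed for future intervals from the event schedule. A minimal sketch of one possible encoding, assuming a one-hour influence window with linear decay after the event ends; both the window length and the decay shape are illustrative choices, and the model learns the actual effect size from history:

```python
from datetime import datetime, timedelta


def event_covariate(ts, event_end, window=timedelta(hours=1)):
    """Return 1.0 at the event end, decaying linearly to 0.0 over the window.

    Outside the window (before the end or after it has elapsed) the value is 0.0.
    """
    elapsed = ts - event_end
    if elapsed < timedelta(0) or elapsed >= window:
        return 0.0
    return 1.0 - elapsed / window


# Hypothetical concert ending at 22:00
concert_end = datetime(2024, 6, 1, 22, 0)
values = [event_covariate(concert_end + timedelta(minutes=m), concert_end)
          for m in (0, 15, 30, 45, 60)]
```

Because the schedule is known in advance, this feature is available for every forecast horizon, unlike lagged passenger counts.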
Operational Applications
Interval Regulation: When a peak is predicted, the operations center receives a recommendation: "In 45 minutes at 'Sportivnaya' station, flow is expected to be 85% above normal. Recommend reducing the interval from 3 to 1.5 min."
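The arithmetic behind such a recommendation can be sketched as follows, assuming line capacity scales inversely with the dispatch interval, so that keeping the load per train constant requires dividing the interval by (1 + uplift). For the example above, 3 / 1.85 ≈ 1.62 min, which rounds down to 1.5 min on a 0.5-minute grid. The grid step and the minimum technically feasible interval are hypothetical parameters:

```python
import math


def recommend_headway(current_headway_min, flow_uplift,
                      step_min=0.5, min_headway_min=1.5):
    """Suggest a dispatch interval that keeps load per train roughly constant.

    flow_uplift is the forecast increase as a fraction (0.85 for +85%).
    step_min and min_headway_min are illustrative operating constraints.
    """
    required = current_headway_min / (1.0 + flow_uplift)
    # Round down to the scheduling grid so capacity is not undershot
    rounded = math.floor(required / step_min) * step_min
    return max(rounded, min_headway_min)


recommended = recommend_headway(3.0, 0.85)
```

When the required interval falls below the minimum, the function returns the minimum; in that case other measures such as crowd control at entrances would be needed.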
Station Staffing: Passenger flow forecast → staffing requirements for ticket agents and controllers → shift planning.
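The step from forecast to headcount can be as simple as dividing expected demand by a service rate. A minimal sketch for ticket agents; the share of passengers who buy at the counter and the number served per agent per hour are hypothetical inputs that would have to be measured for each station:

```python
import math


def agents_needed(forecast_passengers_per_hour, counter_share,
                  served_per_agent_per_hour, min_agents=1):
    """Estimate ticket agents required for one hour of forecast flow.

    counter_share: fraction of passengers buying at the counter (assumed).
    served_per_agent_per_hour: service rate of one agent (assumed).
    """
    demand = forecast_passengers_per_hour * counter_share
    return max(min_agents, math.ceil(demand / served_per_agent_per_hour))


# Illustrative numbers only
agents = agents_needed(4100, 0.25, 50)
```

Running this per station and per hour of the forecast horizon yields the staffing profile that feeds shift planning.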
Operations Center Dashboard:
- Real-time passenger flow heatmap across network vs. forecast
- Forecast for 1/2/4 hours ahead
- Alerts on expected anomalies
- Forecast accuracy history
Integration:
- ASUPO (Passenger Operations Management System) — Russian standard for metro
- ACS (Access Control System): API for transaction data
- ECTS (Unified Transport Service Center)
Metrics:
- MAPE for 15-min forecast: < 10%
- Peak-hour MAPE: < 15% (peak hours are the hardest to forecast)
- Early warning time: operational alert 60-90 min ahead
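For reference, MAPE (mean absolute percentage error) averages the absolute error relative to the actual value. A minimal sketch; skipping intervals with zero actual flow is one common convention and is an assumption here, since such intervals make the percentage undefined:

```python
def mape(actual, predicted):
    """Mean absolute percentage error in percent, ignoring zero actuals."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when all actual values are zero")
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)


# Made-up 15-minute counts for illustration
actual = [100, 200, 400]
predicted = [110, 180, 400]
error_pct = mape(actual, predicted)
```

Peak-hour MAPE is the same calculation restricted to intervals inside the peak windows, for example 07:30-09:30 and 17:30-19:30.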
Timeline: A basic model on AFC data for one station or route takes 3-4 weeks. A network-wide system with a GNN, event awareness, and operations center integration takes 4-5 months.







