AI-based forecasting of demand for fashion collections
Fashion is one of the hardest markets for demand forecasting. Short SKU lifecycles (6-12 weeks), strong dependence on trends and weather, and the absence of sales history for new SKUs make traditional planning methods ineffective. Machine learning approaches can reduce overstocks and stockouts by 20-35%.
Features of fashion forecasting
Cold start problem: a new collection has no sales history. Possible solutions:
- Attribute-based forecasting: forecasting based on characteristics (color, pattern, category, price segment)
- Transfer learning: use a similar article from last season as an anchor
- Analogous items: assign new items to clusters of existing SKUs that have sales history
Seasonality + Fashion Trend:
# Decomposition of the sales signal
# Sales = Seasonal × Category Trend × Fashion Trend × Price Effect × Random
# Fashion Trend: external signals (Instagram, Vogue, runway)
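As a worked illustration of the multiplicative decomposition above, the sketch below multiplies a baseline weekly volume by each factor. The baseline (`base_units`) and all factor values are hypothetical numbers chosen for illustration, not figures from real data, and the random component is left out, so the result should be read as an expected value.

```python
from math import prod

def expected_weekly_sales(base_units, factors):
    """Multiply a baseline weekly volume by each multiplicative factor.

    A factor of 1.0 means "no effect"; 1.2 means +20%, 0.8 means -20%.
    """
    return base_units * prod(factors.values())

# Hypothetical values for one SKU in one week (illustrative only).
factors = {
    "seasonal": 1.25,       # peak-season week
    "category_trend": 1.0,  # category is flat year over year
    "fashion_trend": 1.2,   # external signals point upward
    "price_effect": 0.8,    # priced above the category average
}
forecast = expected_weekly_sales(100, factors)
print(round(forecast, 1))
```

Because the factors multiply, a weak fashion trend can cancel out a strong seasonal peak, which is why the trend signals are modeled separately from seasonality.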
Short SKU life: classical time series models require a long history, which fashion SKUs do not have. Instead, we use cross-sectional models at the SKU level.
Data sources
Internal:
- POS data by week: sales, returns, discounts
- Inventory data: balances, out-of-stock dates
- Product characteristics: category, brand, color, material, size, price
External trend signals:
- Google Trends: Search query dynamics by category
- Instagram/Pinterest: Engagement with fashion content (via API or scraping)
- Runway analysis: detecting trends from fashion shows (computer vision on photos from ModaOperandi, Vogue Runway)
- Weather data: Temperature directly affects jacket/swimsuit sales
Social Listening:
# The right-hand names are placeholders for values computed upstream
trend_features = {
    'google_trends_category_4w': trends_api_value,
    'instagram_hashtag_growth': hashtag_weekly_growth_rate,
    'search_volume_brand': keyword_planner_volume,
    'temperature_deviation': weather_vs_seasonal_norm,
    'competitor_stockout_signal': scraped_inventory_depletion,
}
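To make two of these features concrete, here is a minimal sketch of how a weekly growth rate and a temperature deviation might be computed. The function names, the exact definitions (week-over-week relative change, mean temperature minus a seasonal norm), and the sample numbers are assumptions for illustration; the source does not specify how the features are derived.

```python
from statistics import mean

def weekly_growth_rate(counts):
    """Relative change between the last two weekly counts (e.g. hashtag posts)."""
    previous, current = counts[-2], counts[-1]
    if previous == 0:
        return 0.0  # assumption: growth from a zero base is reported as 0
    return (current - previous) / previous

def temperature_deviation(observed_temps, seasonal_norm):
    """Mean observed temperature minus the seasonal norm, in the same units."""
    return mean(observed_temps) - seasonal_norm

# Hypothetical inputs: weekly hashtag counts and daily temperatures in °C.
features = {
    "instagram_hashtag_growth": weekly_growth_rate([800, 1000, 1200]),
    "temperature_deviation": temperature_deviation([14.0, 16.0, 18.0], 12.0),
}
print(features)
```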
Forecasting models
Attribute-based LightGBM: For each new product, peak week sales and sell-through rates are predicted based on attributes and trend features. Training is based on historical collections.
Cluster + Analogous Item:
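import numpy as np
from sklearn.cluster import KMeans

# Cluster historical items by attribute embedding, then rank the analogs
# that share a cluster with the new item
def find_analogous_items(new_item_features, historical_items, n_clusters=50):
    # Assumes each row of the 'features' column holds one numeric vector
    X = np.vstack(historical_items['features'].to_numpy())
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    labels = kmeans.fit_predict(X)
    new_vec = np.asarray(new_item_features, dtype=float)
    new_cluster = kmeans.predict(new_vec.reshape(1, -1))[0]
    mask = labels == new_cluster
    analogs = historical_items[mask].copy()
    # Assumed similarity: negative Euclidean distance to the new item
    analogs['similarity_score'] = -np.linalg.norm(X[mask] - new_vec, axis=1)
    return analogs.sort_values('similarity_score', ascending=False).head(5)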
SKU life cycle (sales curve): SKUs do not all sell the same way. Clustering life cycle curves yields typical shapes:
- Type A: fast start → gradual decline (bestseller)
- Type B: slow start → peak at week 4 (niche item)
- Type C: flat sales (basic articles)
Forecasting the curve shape determines how the order is distributed over time.
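The link between curve shape and order timing can be sketched as follows. The weekly share profiles in `CURVE_PROFILES` are hypothetical numbers invented for this example (each sums to 1.0 over an assumed 8-week life); in practice they would come from clustering historical sales curves.

```python
# Hypothetical weekly sales shares per life-cycle type (illustrative only).
CURVE_PROFILES = {
    "A": [0.25, 0.20, 0.15, 0.12, 0.10, 0.08, 0.06, 0.04],  # fast start, gradual decline
    "B": [0.05, 0.10, 0.15, 0.25, 0.18, 0.12, 0.09, 0.06],  # slow start, peak in week 4
    "C": [0.125] * 8,                                        # flat, basic article
}

def weekly_demand(total_units, curve_type):
    """Split a total season forecast into expected weekly demand."""
    return [total_units * share for share in CURVE_PROFILES[curve_type]]

def cumulative_need(total_units, curve_type):
    """Units that must have been delivered by the end of each week."""
    running, needs = 0.0, []
    for units in weekly_demand(total_units, curve_type):
        running += units
        needs.append(running)
    return needs

plan_a = cumulative_need(1000, "A")
plan_b = cumulative_need(1000, "B")
print(plan_a[1], plan_b[1])  # stock needed by the end of week 2
```

Under these assumed profiles, a Type A item needs 45% of its season volume on hand by the end of week 2, while a Type B item needs only 15%, so more of the Type B order can be delivered later.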
Pre-Season Planning vs. In-Season Adjustment
Pre-Season (6-9 months before the start):
- Initial order based on attribute forecast
- Buy quantities split by size according to the size curve model
- Open-to-buy budget by category
In-season adjustment (weekly): after the first 2-3 weeks of actual sales, a Bayesian update is applied to the initial forecast:
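import numpy as np

def bayesian_forecast_update(prior_forecast, observed_sales, sell_through_weeks):
    """
    Update the forecast using the first weeks of actual sales.
    The sell-through rate in the first 2 weeks is a strong predictor of the final result.

    prior_forecast: weekly unit forecast; observed_sales: total units sold
    over the first `sell_through_weeks` weeks.
    """
    prior_forecast = np.asarray(prior_forecast, dtype=float)
    early_st_rate = observed_sales / prior_forecast[:sell_through_weeks].sum()
    scaling_factor = early_st_rate ** 0.7  # exponent < 1 damps the ratio (regression to the mean)
    return prior_forecast * scaling_factor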
Reorder and Markdown triggers:
- If sell-through > 70% in week 4 → reorder (if the production cycle allows)
- If sell-through < 30% in week 6 → start markdowns according to the markdown calendar
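The two trigger rules above can be written as a small decision function. This is a minimal sketch: the function name, the `can_reorder` flag (standing in for the production cycle check), and the `"hold"` fallback are assumptions added for illustration; the thresholds and weeks are the ones stated in the rules.

```python
def replenishment_action(week, sell_through, can_reorder=True,
                         reorder_threshold=0.70, markdown_threshold=0.30):
    """Return 'reorder', 'markdown', or 'hold' for one SKU.

    week:         weeks since launch (1-based)
    sell_through: cumulative units sold / units received, in [0, 1]
    can_reorder:  whether the production lead time still allows a reorder
    """
    if week == 4 and sell_through > reorder_threshold:
        return "reorder" if can_reorder else "hold"
    if week == 6 and sell_through < markdown_threshold:
        return "markdown"
    return "hold"

print(replenishment_action(4, 0.75))
print(replenishment_action(6, 0.20))
```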
Size distribution
Size curve modeling: a historical baseline for a category might be XS:S:M:L:XL = 5:20:35:25:15. An ML model adjusts this ratio for region, channel, and price segment:
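import pandas as pd

# Assumes `lgbm` is a fitted lightgbm.LGBMClassifier whose classes are sizes
# and which was trained on these same categorical columns
X = pd.DataFrame({
    'category': [category] * 2,
    'price_tier': [price_tier] * 2,
    'channel': ['online', 'store'],
    'region': [region] * 2,
}).astype('category')
size_curve = lgbm.predict_proba(X)  # one row of size shares per channel
# → optimal size ratio for the order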
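Once a size ratio is available, it has to be turned into whole units per size. The sketch below does this for the historical 5:20:35:25:15 ratio using largest-remainder rounding, so the quantities always add up to the order total. The function name and the order size of 240 units are assumptions for illustration.

```python
def allocate_by_size(total_units, size_ratio):
    """Split total_units across sizes in proportion to size_ratio.

    Uses largest-remainder rounding so the integer quantities
    sum exactly to total_units.
    """
    weight_sum = sum(size_ratio.values())
    exact = {s: total_units * w / weight_sum for s, w in size_ratio.items()}
    alloc = {s: int(q) for s, q in exact.items()}
    leftover = total_units - sum(alloc.values())
    # Hand the remaining units to the sizes with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda s: exact[s] - alloc[s], reverse=True)
    for s in by_remainder[:leftover]:
        alloc[s] += 1
    return alloc

ratio = {"XS": 5, "S": 20, "M": 35, "L": 25, "XL": 15}
order = allocate_by_size(240, ratio)
print(order)
```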
Last-size problem: when a customer's size is out of stock, the entire sale is lost. Mitigation: hold a small buffer for the sizes most likely to run out.
Evaluation Metrics
| Metric | Value |
|---|---|
| WAPE (Weighted APE) | < 30% for new articles |
| Sell-through rate accuracy | ±10 pp |
| Stockout reduction | -25% vs. baseline |
| Overstock reduction | -20% vs. baseline |
| Markdown depth reduction | -3-5 pp |
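
For reference, WAPE is the sum of absolute forecast errors divided by the sum of actual sales. A minimal sketch follows; the sample sales and forecast figures are made up for illustration.

```python
def wape(actual, forecast):
    """Weighted absolute percentage error: sum|a - f| / sum|a|."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    total_actual = sum(abs(a) for a in actual)
    if total_actual == 0:
        raise ValueError("WAPE is undefined when total actual sales are zero")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / total_actual

# Hypothetical unit sales for four new SKUs and their forecasts.
actual_units = [120, 0, 35, 60]
forecast_units = [100, 10, 40, 50]
score = wape(actual_units, forecast_units)
print(f"WAPE = {score:.1%}")
```

Here the absolute errors sum to 45 against 215 units sold, so WAPE is about 21%. Unlike a per-SKU percentage error, the metric stays defined when an individual SKU has zero sales, which is common for new articles.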
Timeframe: attribute-based forecasting, analogous item matching, and in-season updates take 6-8 weeks. A full system with size curves, social trend signals, and markdown optimization takes 3-4 months.







