Price Aggregator Development
A price aggregator collects prices for identical products from different stores and shows them side by side. The user sees the cheapest offer and clicks through. Technically this means parsing, data normalization, and product matching. Each stage is non-trivial at scale.
Data Sources
Data arrives three ways:
Price Lists and Feeds — the store provides a YML, XML, or CSV file with its current assortment. The most reliable option: structured data, an official partnership, no ban risk. Yandex.Market YML is the de facto standard for the Russian market.
Partner APIs — some stores provide a REST API. Documentation is often weak and request limits are strict.
Web Parsing — for stores without feeds. High risk: captchas, rate limiting, markup changes, IP blocking. Requires constant maintenance.
Start with feeds and APIs — they are more stable. Use parsing selectively, for key sources only.
Data Collector Architecture
Scheduler (Celery Beat / Laravel Scheduler)
↓ every N hours
FeedFetcher workers (one per source)
↓
RawData storage (S3 or local FS)
↓
Parser workers (XML/CSV/JSON → normalized)
↓
Normalizer (unit conversion, text cleanup)
↓
Matcher (map to DB products)
↓
PriceHistory (timeseries write)
↓
ElasticsearchIndexer (update index)
Queue: Celery + Redis for Python, Laravel Horizon for PHP. Each feed is processed independently, so an error in one source doesn't block the others.
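The per-source isolation can be sketched without a queue at all — a minimal loop where each fetcher is tried independently and a failure is logged rather than propagated (in production each entry would be a separate Celery task; the registry shape here is a hypothetical simplification):

```python
import logging
from typing import Callable

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("collector")

def run_all_feeds(fetchers: dict[str, Callable[[], list]]) -> dict[str, list]:
    """Run every feed fetcher; one broken source must not stop the rest."""
    results: dict[str, list] = {}
    for name, fetch in fetchers.items():
        try:
            results[name] = fetch()
        except Exception as exc:
            # Log and continue — the other sources still get collected
            log.error("feed %s failed: %s", name, exc)
            results[name] = []
    return results
```

A real queue gives the same guarantee with retries and concurrency on top; the point is that failure handling lives at the per-source boundary.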
Product Matching
The hardest part. The task: determine that "Samsung Galaxy A55 128GB Blue" from shop A and "Smartphone Samsung Galaxy A55 (SM-A556B) 128 GB blue" from shop B are the same product.
Deterministic:
- GTIN/EAN: if both offers have a barcode — exact match
- MPN: manufacturer part number, unique within a brand
- URL canonicalization: some stores include the GTIN in the URL
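Barcode matching is an exact lookup, but the codes must be normalized first — EAN-13 and UPC-A differ only in leading zeros, so padding to GTIN-14 makes them comparable. A minimal sketch (the catalog dict standing in for a real index table is an assumption):

```python
from typing import Optional

def normalize_gtin(code: str) -> str:
    """Strip non-digit noise and left-pad to GTIN-14 so EAN-13/UPC-A compare equal."""
    digits = "".join(ch for ch in code if ch.isdigit())
    return digits.zfill(14)

def match_by_gtin(offer_gtin: str, catalog: dict[str, int]) -> Optional[int]:
    """catalog maps normalized GTIN -> internal product id (hypothetical shape)."""
    return catalog.get(normalize_gtin(offer_gtin))
```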
Fuzzy:
from rapidfuzz import fuzz

def match_score(title_a: str, title_b: str, brand_a: str, brand_b: str) -> float:
    # Different brands never match, however similar the titles look
    if brand_a.lower() != brand_b.lower():
        return 0.0
    # token_sort_ratio is order-insensitive: "Galaxy A55 Samsung" ~ "Samsung Galaxy A55"
    title_similarity = fuzz.token_sort_ratio(title_a, title_b)
    return title_similarity / 100
Thresholds: 0.85+ — auto-match, 0.65–0.85 — manual review, below that — create a new product.
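The three-way decision above fits in a few lines; keeping the thresholds as named constants makes them easy to tune against review-queue volume (the function and constant names are illustrative):

```python
AUTO_MATCH_THRESHOLD = 0.85
REVIEW_THRESHOLD = 0.65

def route_match(score: float) -> str:
    """Map a match score onto the auto / review / new-product decision."""
    if score >= AUTO_MATCH_THRESHOLD:
        return "auto_match"
    if score >= REVIEW_THRESHOLD:
        return "manual_review"
    return "new_product"
```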
ML approach: product name embeddings (sentence-transformers, ruBERT) + cosine similarity. Much more accurate, especially for differently worded titles. The model is trained on confirmed matches.
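The comparison step reduces to cosine similarity between two vectors. A sketch of just that step — in practice the vectors would come from a sentence-transformers model (`model.encode(title)`); the placeholder arrays here are not real embeddings:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity in [-1, 1]; ~1.0 means the titles embed almost identically."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

The same 0.85 / 0.65 style thresholds apply, though they must be re-calibrated for embedding space — cosine scores are distributed differently from token-sort ratios.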
Price History
The main value is not the current price but the history of its changes. Each price change is recorded, never overwritten.
price_history (
id BIGSERIAL,
source_offer_id BIGINT,
price NUMERIC(12,2),
in_stock BOOLEAN,
recorded_at TIMESTAMPTZ DEFAULT NOW()
)
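The insert-only discipline means a new row is written only when price or availability actually changed, otherwise the table fills with duplicates on every collection run. A minimal in-memory sketch of that check (a list stands in for the `price_history` table; the row shape mirrors the schema above):

```python
from datetime import datetime, timezone

def record_price(history: list[dict], source_offer_id: int,
                 price: float, in_stock: bool) -> bool:
    """Append a row only if price or availability changed; returns True on insert."""
    last = next((row for row in reversed(history)
                 if row["source_offer_id"] == source_offer_id), None)
    if last and last["price"] == price and last["in_stock"] == in_stock:
        return False  # nothing changed -> no new row
    history.append({
        "source_offer_id": source_offer_id,
        "price": price,
        "in_stock": in_stock,
        "recorded_at": datetime.now(timezone.utc),
    })
    return True
```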
For PostgreSQL time series use TimescaleDB — the extension auto-partitions by time and speeds up queries. Alternatives: InfluxDB, or ClickHouse for high loads.
The graph is a standard product page component. Chart.js or Recharts, aggregated by day: SELECT date_trunc('day', recorded_at), min(price) FROM price_history GROUP BY 1.
SEO Strategy
Aggregators generate organic traffic on product pages. Key queries: "[product name] buy", "[product name] price", "[product name] cheap".
- Each canonical product page: unique title with price range
- Structured data: Product + AggregateOffer with lowPrice, highPrice, offerCount
- Static category pages with aggregated stats
- Blog reviews and curations — long-term SEO traffic
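The structured-data bullet translates into schema.org JSON-LD markup generated from the collected offers. A sketch of building that payload (the function name and currency default are illustrative; the `@type` and property names are standard schema.org vocabulary):

```python
import json

def aggregate_offer_jsonld(name: str, prices: list[float],
                           currency: str = "RUB") -> str:
    """Build schema.org Product + AggregateOffer JSON-LD from collected prices."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "offers": {
            "@type": "AggregateOffer",
            "lowPrice": min(prices),
            "highPrice": max(prices),
            "offerCount": len(prices),
            "priceCurrency": currency,
        },
    }
    return json.dumps(data, ensure_ascii=False)
```

The resulting string goes into a `<script type="application/ld+json">` tag on the product page, which is what makes price ranges eligible for rich results.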
Timeline
- MVP: feeds from 3–5 sources, manual matching, product pages, basic search — 8–12 weeks
- Full aggregator: automatic matching (fuzzy + ML), price graphs, store cabinet, partner tracking — 20–30 weeks
- Each new source (parsing): 3–7 working days, depending on complexity
An aggregator requires ongoing operational support: sources change their structure, products need re-matching, new stores get connected. It is not a one-off project but a platform with a support team.







