Automatic Product Attributes Mapping with Website Filters

Our company is engaged in the development, support and maintenance of sites of any complexity. From simple one-page sites to large-scale cluster systems built on micro services. Experience of developers is confirmed by certificates from vendors.
Development and maintenance of all types of websites:
Informational websites or web applications
Business card websites, landing pages, corporate websites, online catalogs, quizzes, promo websites, blogs, news resources, informational portals, forums, aggregators
E-commerce websites or web applications
Online stores, B2B portals, marketplaces, online exchanges, cashback websites, exchanges, dropshipping platforms, product parsers
Business process management web applications
CRM systems, ERP systems, corporate portals, production management systems, information parsers
Electronic service websites or web applications
Classified ads platforms, online schools, online cinemas, website builders, portals for electronic services, video hosting platforms, thematic portals

These are just some of the technical types of websites we work with, and each of them can have its own specific features and functionality, as well as be customized to meet the specific needs and goals of the client.

Showing 1 of 1 servicesAll 2065 services
Automatic Product Attributes Mapping with Website Filters
Complex
~5 business days
FAQ
Our competencies:
Development stages
Latest works
  • image_web-applications_feedme_466_0.webp
    Development of a web application for FEEDME
    1161
  • image_ecommerce_furnoro_435_0.webp
    Development of an online store for the company FURNORO
    1041
  • image_crm_enviok_479_0.webp
    Development of a web application for Enviok
    822
  • image_crm_chasseurs_493_0.webp
    CRM development for Chasseurs
    847
  • image_website-sbh_0.png
    Website development for SBH Partners
    999
  • image_website-_0.png
    Website development for Red Pear
    451

Implementing Automatic Attribute Filter Mapping

Suppliers call the same attributes differently: "color", "Color of product", "colour", "RAL". On site there's one filter "Color". Manual mapping of attributes from dozens of suppliers to site filters is hundreds of hours that must be automated.

Mapping System Architecture

Three abstraction levels:

Supplier Attribute  →  Site Attribute  →  Filter Value
"Color: Red"        →  color           →  "Red"
"color: Red"        →  color           →  "Red"  (with normalization)
"RAL 3020"          →  color           →  "Red"  (via RAL reference)

Attribute name mapping table:

CREATE TABLE attribute_name_mapping (
    id               serial PRIMARY KEY,
    supplier_id      int,          -- NULL = universal mapping
    supplier_name    varchar(200), -- how supplier calls it
    site_attribute   varchar(100), -- internal attribute key
    created_at       timestamptz DEFAULT now(),
    UNIQUE (supplier_id, supplier_name)
);

Attribute value mapping table:

CREATE TABLE attribute_value_mapping (
    id               serial PRIMARY KEY,
    site_attribute   varchar(100),
    supplier_value   varchar(500),
    site_value       varchar(500),  -- normalized value
    supplier_id      int,           -- NULL = for all
    UNIQUE (site_attribute, supplier_value, supplier_id)
);

Mapping Algorithm

class AttributeMapper
{
    public function mapAttribute(
        string $supplierName,
        string $supplierValue,
        ?int   $supplierId = null
    ): ?MappedAttribute {
        // Step 1: find attribute name mapping
        $siteAttribute = $this->resolveAttributeName($supplierName, $supplierId);
        if (!$siteAttribute) {
            $this->logUnknownAttribute($supplierName, $supplierId);
            return null;
        }

        // Step 2: find value mapping
        $siteValue = $this->resolveAttributeValue($siteAttribute, $supplierValue, $supplierId);
        if (!$siteValue) {
            // Try auto-normalize without mapping
            $siteValue = $this->autoNormalizeValue($siteAttribute, $supplierValue);
        }

        return new MappedAttribute($siteAttribute, $siteValue);
    }

    private function resolveAttributeName(string $name, ?int $supplierId): ?string
    {
        $normalized = mb_strtolower(trim($name));

        // First supplier-specific mapping
        if ($supplierId) {
            $mapping = AttributeNameMapping::where([
                'supplier_id'   => $supplierId,
                'supplier_name' => $normalized,
            ])->value('site_attribute');
            if ($mapping) return $mapping;
        }

        // Then universal
        return AttributeNameMapping::whereNull('supplier_id')
            ->where('supplier_name', $normalized)
            ->value('site_attribute');
    }
}

Auto-Normalization of Values

Some values don't require dictionary mapping — can be brought to standard automatically:

private function autoNormalizeValue(string $attribute, string $raw): string
{
    return match ($attribute) {
        'spec_weight_kg' => $this->parseWeight($raw),
        'spec_dimensions' => $this->parseDimensions($raw),
        'spec_voltage'  => $this->parseVoltage($raw),
        default         => $this->capitalizeFirst($raw),
    };
}

private function parseWeight(string $raw): string
{
    // "2.5 kg" | "2500 g" | "2.5kg" → "2.5"
    if (preg_match('/(\d+[.,]\d+|\d+)\s*(g|kg)/iu', $raw, $m)) {
        $value = (float) str_replace(',', '.', $m[1]);
        $unit = strtolower($m[2]);
        return (string) ($unit === 'g' ? $value / 1000 : $value);
    }
    return $raw;
}

Fuzzy Value Mapping

For new values not yet in mapping, use pg_trgm similarity:

public function suggestValueMapping(string $attribute, string $supplierValue): array
{
    $normalized = mb_strtolower(trim($supplierValue));

    return DB::select(
        "SELECT site_value,
                similarity(lower(supplier_value), ?) AS score
         FROM attribute_value_mapping
         WHERE site_attribute = ?
           AND similarity(lower(supplier_value), ?) > 0.6
         ORDER BY score DESC
         LIMIT 5",
        [$normalized, $attribute, $normalized]
    );
}

Suggest to operator: "Looks like "Red." = "Red" (score 0.82). Create mapping?"

Auto-Learning from Confirmed Mappings

When operator accepts suggested mapping — it goes into attribute_value_mapping. On next import from this supplier, value maps automatically.

public function confirmMapping(
    string $attribute,
    string $supplierValue,
    string $siteValue,
    ?int   $supplierId
): void {
    AttributeValueMapping::updateOrCreate(
        [
            'site_attribute'  => $attribute,
            'supplier_value'  => mb_strtolower(trim($supplierValue)),
            'supplier_id'     => $supplierId,
        ],
        ['site_value' => $siteValue]
    );
}

Filter Management: Binding Attributes and Facets

Each mapped attribute bound to site filter:

CREATE TABLE filter_attributes (
    filter_id    int REFERENCES filters(id),
    attribute    varchar(100),
    display_name varchar(200),
    sort         smallint,
    PRIMARY KEY (filter_id, attribute)
);

After mapping, values written to denormalized product_filter_values table — exactly what faceted search uses:

CREATE TABLE product_filter_values (
    product_id   int,
    filter_id    int,
    value        varchar(500),
    value_slug   varchar(500),
    PRIMARY KEY (product_id, filter_id, value)
);
CREATE INDEX pfv_filter_value_idx ON product_filter_values (filter_id, value_slug);

Unrecognized Attributes: Processing Queue

All attributes without mapping accumulate in queue:

CREATE TABLE unmapped_attributes (
    id             serial PRIMARY KEY,
    supplier_id    int,
    attribute_name varchar(200),
    sample_values  text[],       -- up to 10 examples
    occurrences    int DEFAULT 1,
    first_seen_at  timestamptz DEFAULT now()
);

In admin — table with frequency. Most frequent attributes are first candidates for mapping creation.

Implementation Timeline

  • Basic attribute name mapping (dict + query) + product_filter_values write — 3 days
  • Auto-normalization of numeric values (weight, sizes, voltage) — +1 day
  • Fuzzy suggestions + unmapped queue + admin UI — +2–3 days
  • Full faceted search integration and recompute on mapping changes — +1–2 days