Implementing Automatic Attribute Filter Mapping
Suppliers call the same attributes differently: "color", "Color of product", "colour", "RAL". On site there's one filter "Color". Manual mapping of attributes from dozens of suppliers to site filters is hundreds of hours that must be automated.
Mapping System Architecture
Three abstraction levels:
Supplier Attribute → Site Attribute → Filter Value
"Color: Red" → color → "Red"
"color: Red" → color → "Red" (with normalization)
"RAL 3020" → color → "Red" (via RAL reference)
Attribute name mapping table:
CREATE TABLE attribute_name_mapping (
id serial PRIMARY KEY,
supplier_id int, -- NULL = universal mapping
supplier_name varchar(200), -- how supplier calls it
site_attribute varchar(100), -- internal attribute key
created_at timestamptz DEFAULT now(),
UNIQUE (supplier_id, supplier_name)
);
Attribute value mapping table:
CREATE TABLE attribute_value_mapping (
id serial PRIMARY KEY,
site_attribute varchar(100),
supplier_value varchar(500),
site_value varchar(500), -- normalized value
supplier_id int, -- NULL = for all
UNIQUE (site_attribute, supplier_value, supplier_id)
);
Mapping Algorithm
class AttributeMapper
{
public function mapAttribute(
string $supplierName,
string $supplierValue,
?int $supplierId = null
): ?MappedAttribute {
// Step 1: find attribute name mapping
$siteAttribute = $this->resolveAttributeName($supplierName, $supplierId);
if (!$siteAttribute) {
$this->logUnknownAttribute($supplierName, $supplierId);
return null;
}
// Step 2: find value mapping
$siteValue = $this->resolveAttributeValue($siteAttribute, $supplierValue, $supplierId);
if (!$siteValue) {
// Try auto-normalize without mapping
$siteValue = $this->autoNormalizeValue($siteAttribute, $supplierValue);
}
return new MappedAttribute($siteAttribute, $siteValue);
}
private function resolveAttributeName(string $name, ?int $supplierId): ?string
{
$normalized = mb_strtolower(trim($name));
// First supplier-specific mapping
if ($supplierId) {
$mapping = AttributeNameMapping::where([
'supplier_id' => $supplierId,
'supplier_name' => $normalized,
])->value('site_attribute');
if ($mapping) return $mapping;
}
// Then universal
return AttributeNameMapping::whereNull('supplier_id')
->where('supplier_name', $normalized)
->value('site_attribute');
}
}
Auto-Normalization of Values
Some values don't require dictionary mapping — can be brought to standard automatically:
private function autoNormalizeValue(string $attribute, string $raw): string
{
return match ($attribute) {
'spec_weight_kg' => $this->parseWeight($raw),
'spec_dimensions' => $this->parseDimensions($raw),
'spec_voltage' => $this->parseVoltage($raw),
default => $this->capitalizeFirst($raw),
};
}
private function parseWeight(string $raw): string
{
// "2.5 kg" | "2500 g" | "2.5kg" → "2.5"
if (preg_match('/(\d+[.,]\d+|\d+)\s*(g|kg)/iu', $raw, $m)) {
$value = (float) str_replace(',', '.', $m[1]);
$unit = strtolower($m[2]);
return (string) ($unit === 'g' ? $value / 1000 : $value);
}
return $raw;
}
Fuzzy Value Mapping
For new values not yet in mapping, use pg_trgm similarity:
public function suggestValueMapping(string $attribute, string $supplierValue): array
{
$normalized = mb_strtolower(trim($supplierValue));
return DB::select(
"SELECT site_value,
similarity(lower(supplier_value), ?) AS score
FROM attribute_value_mapping
WHERE site_attribute = ?
AND similarity(lower(supplier_value), ?) > 0.6
ORDER BY score DESC
LIMIT 5",
[$normalized, $attribute, $normalized]
);
}
Suggest to operator: "Looks like "Red." = "Red" (score 0.82). Create mapping?"
Auto-Learning from Confirmed Mappings
When operator accepts suggested mapping — it goes into attribute_value_mapping. On next import from this supplier, value maps automatically.
public function confirmMapping(
string $attribute,
string $supplierValue,
string $siteValue,
?int $supplierId
): void {
AttributeValueMapping::updateOrCreate(
[
'site_attribute' => $attribute,
'supplier_value' => mb_strtolower(trim($supplierValue)),
'supplier_id' => $supplierId,
],
['site_value' => $siteValue]
);
}
Filter Management: Binding Attributes and Facets
Each mapped attribute bound to site filter:
CREATE TABLE filter_attributes (
filter_id int REFERENCES filters(id),
attribute varchar(100),
display_name varchar(200),
sort smallint,
PRIMARY KEY (filter_id, attribute)
);
After mapping, values written to denormalized product_filter_values table — exactly what faceted search uses:
CREATE TABLE product_filter_values (
product_id int,
filter_id int,
value varchar(500),
value_slug varchar(500),
PRIMARY KEY (product_id, filter_id, value)
);
CREATE INDEX pfv_filter_value_idx ON product_filter_values (filter_id, value_slug);
Unrecognized Attributes: Processing Queue
All attributes without mapping accumulate in queue:
CREATE TABLE unmapped_attributes (
id serial PRIMARY KEY,
supplier_id int,
attribute_name varchar(200),
sample_values text[], -- up to 10 examples
occurrences int DEFAULT 1,
first_seen_at timestamptz DEFAULT now()
);
In admin — table with frequency. Most frequent attributes are first candidates for mapping creation.
Implementation Timeline
- Basic attribute name mapping (dict + query) +
product_filter_valueswrite — 3 days - Auto-normalization of numeric values (weight, sizes, voltage) — +1 day
- Fuzzy suggestions + unmapped queue + admin UI — +2–3 days
- Full faceted search integration and recompute on mapping changes — +1–2 days







