Implementing Automatic Product Description and Specification Updates
Product descriptions and specifications are content that suppliers regularly expand and refine. A new GOST standard, corrected technical parameters, added certificates: all of this should reach the catalog without manual work by editors. At the same time, if a content manager has rewritten a description by hand, automation must not overwrite it.
Data Structure for Managed Updates
Key principle: keep the data source (the supplier) separate from the final content (what the site shows), and mark manual edits with a flag.
CREATE TABLE product_content (
    product_id         int REFERENCES products(id),
    source             varchar(30),           -- supplier_id or 'manual'
    field              varchar(50),           -- description | spec_weight | spec_color ...
    value              text,                  -- what the site shows
    is_manual_override boolean DEFAULT false,
    supplier_value     text,                  -- last value from supplier
    updated_at         timestamptz NOT NULL DEFAULT now(),
    PRIMARY KEY (product_id, field)
);
On auto-update, if is_manual_override = true, only supplier_value is updated and value stays untouched. The content manager sees the discrepancy in the interface and decides whether to accept the supplier's change.
Description Sources
Supplier XML Feed
Many manufacturers provide an XML feed with extended attributes:
<product article="ABC-123">
<description lang="ru"><![CDATA[Detailed description...]]></description>
<attributes>
<attribute name="weight" unit="kg">2.5</attribute>
<attribute name="color">Black</attribute>
<attribute name="material">Stainless Steel</attribute>
</attributes>
</product>
Parser in PHP:
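class XmlDescriptionSource implements DescriptionSourceInterface
{
    public function __construct(private string $url)
    {
    }

    /** Yields one parsed product array at a time. */
    public function fetch(): iterable
    {
        $xml = new \XMLReader();
        if (!$xml->open($this->url)) {
            throw new \RuntimeException("Cannot open feed: {$this->url}");
        }

        try {
            while ($xml->read()) {
                if ($xml->nodeType === \XMLReader::ELEMENT && $xml->name === 'product') {
                    // Only the current <product> subtree is materialized in memory
                    $node = new \SimpleXMLElement($xml->readOuterXml());
                    yield $this->parseProduct($node);
                }
            }
        } finally {
            $xml->close();
        }
    }

    private function parseProduct(\SimpleXMLElement $node): array
    {
        $data = [
            'sku' => (string) $node['article'],
            'description' => (string) $node->description,
            'attributes' => [],
        ];

        // xpath() yields an empty list when <attributes> is missing
        foreach ($node->xpath('attributes/attribute') ?: [] as $attr) {
            $data['attributes'][(string) $attr['name']] = (string) $attr;
        }

        return $data;
    }
}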
XMLReader reads the file as a stream and never loads the entire XML into memory, which is critical for catalogs of 100,000+ items.
API with Partial Update
If the supplier provides a changes endpoint:
GET /products/updates?fields=description,attributes&since=2024-01-15T10:00:00Z
It returns only products where at least one of the specified fields has changed since the given timestamp, which significantly reduces the processing volume.
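A minimal sketch of building such a request in PHP follows. The path and parameter names are taken from the example request above; the base URL and the function name buildUpdatesUrl are placeholders for illustration, not part of any real supplier API.

```php
<?php
declare(strict_types=1);

/**
 * Build the URL for an incremental "changes since" request.
 * $baseUrl is a placeholder, not a real supplier address.
 */
function buildUpdatesUrl(string $baseUrl, array $fields, DateTimeInterface $since): string
{
    $query = http_build_query([
        'fields' => implode(',', $fields),
        // Send the timestamp in UTC with a trailing "Z", as in the example request.
        'since'  => DateTimeImmutable::createFromInterface($since)
            ->setTimezone(new DateTimeZone('UTC'))
            ->format('Y-m-d\TH:i:s\Z'),
    ]);

    return rtrim($baseUrl, '/') . '/products/updates?' . $query;
}

$url = buildUpdatesUrl(
    'https://supplier.example',
    ['description', 'attributes'],
    new DateTimeImmutable('2024-01-15T10:00:00Z'),
);
echo $url, PHP_EOL;
```

http_build_query percent-encodes the comma and the colons, which a server decodes back to the original values. For the next run, it is usually safer to store the time the request started rather than the time processing finished, so that changes made while the import was running are not skipped.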
Updating Specs with Normalization
The supplier sends specifications in its own format, so they need to be mapped to the site's internal schema:
class AttributeNormalizer
{
    // Mapping supplier attribute names → internal keys
    private array $nameMap = [
        'weight'    => 'spec_weight_kg',
        'масса'     => 'spec_weight_kg',
        'вес нетто' => 'spec_weight_kg',
        'color'     => 'spec_color',
        'цвет'      => 'spec_color',
    ];

    public function normalize(string $supplierName, mixed $value): ?array
    {
        $key = $this->nameMap[mb_strtolower(trim($supplierName))] ?? null;
        if ($key === null) {
            return null; // unknown attribute: skip it
        }

        return ['key' => $key, 'value' => $this->castValue($key, $value)];
    }

    private function castValue(string $key, mixed $raw): mixed
    {
        return match (true) {
            // Accept a decimal comma ("2,5") as well as a decimal point
            str_starts_with($key, 'spec_weight') => (float) str_replace(',', '.', (string) $raw),
            default => (string) $raw,
        };
    }
}
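The sample feed tags weight with unit="kg", but the parser shown earlier keeps only the attribute text, so a supplier that reports grams would be stored as kilograms. One possible way to handle this is a small conversion helper. The function name weightToKilograms and the unit table are assumptions for illustration and would need to match the units real feeds actually use.

```php
<?php
declare(strict_types=1);

/**
 * Convert a supplier weight to kilograms using the feed's unit attribute.
 * Returns null for an unknown unit or a non-numeric value so the caller
 * can skip or log the field instead of storing a wrong number.
 */
function weightToKilograms(string $raw, string $unit): ?float
{
    // Multipliers to kilograms (hypothetical set of units).
    $factors = ['kg' => 1.0, 'g' => 0.001, 'кг' => 1.0, 'г' => 0.001];

    $factor = $factors[mb_strtolower(trim($unit))] ?? null;
    $number = str_replace(',', '.', trim($raw));

    if ($factor === null || !is_numeric($number)) {
        return null;
    }

    return (float) $number * $factor;
}

var_dump(weightToKilograms('2,5', 'kg'));   // decimal comma, already in kg
var_dump(weightToKilograms('750', 'g'));    // grams converted to kg
var_dump(weightToKilograms('heavy', 'kg')); // not a number: null
```

To use it, the parser would have to pass the unit attribute along with the value, for example as a ['value' => ..., 'unit' => ...] pair per attribute.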
Job Chain for Content Update
Description updates are heavier than price updates: the content is large, attributes must be normalized, and each field has to be checked against its override flag. A reasonable scheme is a separate queue with low parallelism.
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;

class UpdateProductDescriptionsJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public int $tries = 3;
    public int $backoff = 60; // seconds between retries

    public function handle(
        DescriptionSourceInterface $source,
        AttributeNormalizer $normalizer,
        ContentUpdater $updater,
    ): void {
        foreach ($source->fetch() as $item) {
            $productId = Product::where('sku', $item['sku'])->value('id');
            if (!$productId) {
                continue; // SKU is not in the catalog
            }

            // Description
            $updater->updateField($productId, 'description', $item['description']);

            // Specifications
            foreach ($item['attributes'] as $name => $value) {
                $normalized = $normalizer->normalize($name, $value);
                if ($normalized) {
                    $updater->updateField($productId, $normalized['key'], $normalized['value']);
                }
            }
        }
    }
}
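The job above looks up each product ID with its own query, which adds up on a large feed. One way to reduce the number of queries is to group the streamed items into chunks and resolve all SKUs of a chunk at once. The helper below is a sketch of that idea; chunkIterable is a hypothetical name, and the feed is replaced with a stand-in generator so the example runs on its own.

```php
<?php
declare(strict_types=1);

/**
 * Group a stream of items into arrays of at most $size elements
 * without loading the whole stream into memory.
 */
function chunkIterable(iterable $items, int $size): Generator
{
    if ($size < 1) {
        throw new InvalidArgumentException('Chunk size must be at least 1.');
    }

    $chunk = [];
    foreach ($items as $item) {
        $chunk[] = $item;
        if (count($chunk) === $size) {
            yield $chunk;
            $chunk = [];
        }
    }

    if ($chunk !== []) {
        yield $chunk; // final, partially filled chunk
    }
}

// Stand-in for $source->fetch(): any iterable of parsed feed items.
$feed = (function (): Generator {
    foreach (['A-1', 'A-2', 'A-3', 'A-4', 'A-5'] as $sku) {
        yield ['sku' => $sku];
    }
})();

foreach (chunkIterable($feed, 2) as $chunk) {
    $skus = array_column($chunk, 'sku');
    // In the job, one query per chunk could resolve all IDs, e.g.
    // Product::whereIn('sku', $skus)->pluck('id', 'sku')
    echo implode(',', $skus), PHP_EOL;
}
```

The chunk size is a trade-off between memory use and query count; a few hundred items per chunk is a common starting point, to be tuned against the actual feed.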
ContentUpdater Logic
// Assumes the ProductContent model sets $incrementing = false and has
// timestamps configured to match the product_content table
class ContentUpdater
{
    public function updateField(int $productId, string $field, mixed $newValue): void
    {
        $key = ['product_id' => $productId, 'field' => $field];
        $existing = ProductContent::where($key)->first();

        if (!$existing) {
            ProductContent::create($key + [
                'value' => $newValue,
                'supplier_value' => $newValue,
            ]);
            return;
        }

        // Always update supplier_value so the discrepancy stays visible
        $changes = ['supplier_value' => $newValue, 'updated_at' => now()];

        // Overwrite the published value only if there is no manual override
        if (!$existing->is_manual_override) {
            $changes['value'] = $newValue;
        }

        // Update through the query builder: Eloquent's save() targets a
        // single-column key and cannot address (product_id, field)
        ProductContent::where($key)->update($changes);
    }
}
Schedule and Priorities
| Data Type | Frequency | Reason |
|---|---|---|
| Specifications (sizes, weight) | Once daily | Change rarely |
| Descriptions | Once daily | Large volume, not urgent |
| Certificate statuses | Once weekly | Change even less often |
| Prices | Every 15–30 min | High volatility |
Discrepancy Interface in Admin
If value != supplier_value AND is_manual_override = true, show a warning on the product page in the admin: "Supplier changed value. Current: X, new from supplier: Y. Accept?" with "Accept" and "Keep" buttons.
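The two buttons could be handled roughly as in the sketch below. Note that with the condition exactly as stated, a field the manager chose to keep still satisfies it, because value continues to differ from supplier_value. The sketch therefore assumes an extra column, acknowledged_supplier_value, that is not part of the schema above. It also assumes that "Accept" clears the override flag so that later supplier updates flow through again. Both are design assumptions, and the function names are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Decide whether the admin should show the "supplier changed value" warning.
 * $row mirrors a product_content record; 'acknowledged_supplier_value' is a
 * hypothetical extra column.
 */
function hasPendingDiscrepancy(array $row): bool
{
    return $row['is_manual_override']
        && $row['supplier_value'] !== $row['value']
        && $row['supplier_value'] !== ($row['acknowledged_supplier_value'] ?? null);
}

/**
 * Apply the content manager's choice and return the updated record.
 */
function resolveDiscrepancy(array $row, string $action): array
{
    switch ($action) {
        case 'accept':
            // Take the supplier's text and (assumption) resume automatic updates.
            $row['value'] = $row['supplier_value'];
            $row['is_manual_override'] = false;
            break;
        case 'keep':
            // Keep the manual text; remember which supplier value was reviewed.
            $row['acknowledged_supplier_value'] = $row['supplier_value'];
            break;
        default:
            throw new InvalidArgumentException("Unknown action: {$action}");
    }

    return $row;
}

$row = [
    'value' => 'Hand-written description',
    'supplier_value' => 'Supplier description v2',
    'is_manual_override' => true,
];

var_dump(hasPendingDiscrepancy($row));                             // bool(true)
var_dump(hasPendingDiscrepancy(resolveDiscrepancy($row, 'keep'))); // bool(false)
```

With this approach the warning reappears only when the supplier sends a value the manager has not yet reviewed.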
Implementation Timeline
- One XML source, description and attribute updates without overrides: 2–3 days
- Attribute normalizer with a mapping table, plus the override mechanism: +2 days
- Discrepancy dashboard in the admin: +1–2 days







